VirtualString

This class is an abstract representation of an arbitrarily sized byte array, with methods to load parts of it on demand. It is useful to represent a program's virtual memory and lets metasm work on it while reading bytes only when they are actually needed.

The base class is defined in metasm/os/main.rb.

Basics

The API of the object is designed to be compatible with a standard String (ASCII-8BIT). The main restriction is that the size of this string cannot be changed: concatenation / shortening is not supported.

The main operations on the object are [] and []=, that is, reading some subpart of the string or overwriting some substring. The arguments are the same as for a String, except that []= raises an IndexError if the rewrite would change the string length.

A few methods are written specifically with the VirtualString semantics, others are redirected to a temporary real String generated with realstring.

The VirtualString works with a page concept: a page is an arbitrary chunk of data that can actually be read from the underlying target, e.g. a memory page (4096 bytes) when mapping a process virtual address space. Each implementation defines a pagelength that makes sense for its target.

Whenever a substring is requested from a VirtualString, if the substring length is less than the page size, an actual read is made and a String is returned.

If the length is greater however, a new VirtualString is created to map this new view without actually reading.

To force the conversion to a String, use the realstring or to_str method. The latter is preferred, as it works on both Strings and VirtualStrings.

To force the creation of a new VirtualString, use the dup(start, len) method.
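The read and view rules above can be illustrated with a toy model. This is only a sketch, not metasm's implementation: the class ToyVirtualString, its constructor arguments, the reads counter and the tiny page size of 16 are all hypothetical, and treating a length equal to the page size as a "large" request is an assumption.

```ruby
# Toy model only: NOT metasm's VirtualString. Every name below is hypothetical.
class ToyVirtualString
  attr_reader :length, :reads

  def initialize(backing, start = 0, length = backing.bytesize, pagelength = 16)
    @backing = backing        # stands in for the remote target; assumed binary/ASCII
    @start = start
    @length = length
    @pagelength = pagelength  # tiny page size so the example stays readable
    @reads = 0                # number of actual reads done through this view
  end

  # Always create a new lazy view; nothing is read from the target.
  def dup(start = 0, len = @length)
    ToyVirtualString.new(@backing, @start + start, len, @pagelength)
  end

  # Force the conversion to a real String.
  def realstring
    read(0, @length)
  end
  alias to_str realstring

  # Short requests are read at once; longer ones stay lazy.
  # Treating len == pagelength as "long" is an assumption of this sketch.
  def [](from, len)
    len < @pagelength ? read(from, len) : dup(from, len)
  end

  # Overwrite in place; the total length must not change.
  def []=(from, len, val)
    raise IndexError, 'cannot change the string length' if val.bytesize != len
    @backing[@start + from, len] = val
  end

  private

  def read(from, len)
    @reads += 1
    @backing.byteslice(@start + from, len)
  end
end

vs = ToyVirtualString.new(('0123456789' * 10).b)
small = vs[2, 4]      # a real String
view  = vs[10, 50]    # a new ToyVirtualString, no read performed yet
text  = view.to_str   # the read happens here
```

Code that may receive either a String or a lazy view can call to_str on the result and carry on, which is why to_str is the safer of the two conversion methods.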

When reading actual bytes, a local page cache is used. By default it holds only 4 pages, and it can be invalidated using invalidate. The cache is automatically invalidated when part of the string is written to.
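As a rough illustration of why the cache must be invalidated, here is a minimal sketch of a small page cache. It is not metasm's code: ToyPageCache, its fetch block and the most-recently-used eviction order are assumptions made for the example.

```ruby
# Toy model only: a small page cache in the spirit of the one described above.
class ToyPageCache
  attr_reader :fetches

  def initialize(max_pages = 4, &fetch)
    @max_pages = max_pages
    @fetch = fetch   # block that really reads one page from the target
    @pages = []      # [[page_addr, data], ...], most recently used first
    @fetches = 0     # number of real reads performed
  end

  # Return the page at addr, reading it from the target only on a cache miss.
  def page(addr)
    if (hit = @pages.assoc(addr))
      @pages.delete(hit)
      @pages.unshift(hit)               # move the hit to the front
      return hit[1]
    end
    @fetches += 1
    data = @fetch.call(addr)
    @pages.unshift([addr, data])
    @pages.pop if @pages.length > @max_pages  # drop the least recently used page
    data
  end

  # Forget everything, e.g. after the target was written to or has resumed.
  def invalidate
    @pages.clear
  end
end

memory = 'A' * 128
cache = ToyPageCache.new(4) { |addr| memory[addr, 16] }
cache.page(0)
cache.page(0)        # served from the cache, no second fetch
memory[0] = 'B'      # the target changes behind the cache's back
stale = cache.page(0)
cache.invalidate
fresh = cache.page(0)
```

When the target can change on its own (a running process, for instance), the cached pages go stale silently, so an explicit invalidate is needed before trusting later reads.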

The VirtualString may index invalid pages (e.g. an unmapped memory range in a process address space); you can check for that with page_invalid?, passing an index as parameter.

Creation

To create your own flavor of VirtualString, you must:

Feel free to override any other method with an optimized version. For example, the default realstring repeatedly calls get_page for each page in the range 0..length; you may have a more efficient alternative.

You can alter the cache size by rewriting the @pagecache_len variable after calling super() in initialize. The default value is 4, which you may want to increase.

See the WindowsRemoteString source for a simple example (ignore the open_pid method).
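The shape of such a subclass can be sketched without metasm. The code below is a hypothetical, heavily simplified stand-in: ToyBase only mimics the behaviour described in this section (a page-by-page realstring and a @pagecache_len of 4), BufferString is an invented subclass backed by a plain String, and the method signatures are assumptions rather than the real base class API.

```ruby
# Stand-in base class: a hypothetical, simplified model, not metasm's code.
class ToyBase
  attr_reader :length, :pagecache_len

  def initialize(addr_start, length)
    @addr_start = addr_start
    @length = length
    @pagelength = 4096
    @pagecache_len = 4
  end

  # Default behaviour: assemble the whole string one page at a time.
  def realstring
    out = String.new(encoding: Encoding::BINARY)
    off = 0
    while off < @length
      len = [@pagelength, @length - off].min
      page = get_page(@addr_start + off, len)
      raise IndexError, "invalid page at offset #{off}" unless page
      out << page
      off += len
    end
    out
  end
end

# Hypothetical subclass whose target is an in-memory binary String.
class BufferString < ToyBase
  def initialize(buffer, addr_start = 0, length = buffer.bytesize)
    super(addr_start, length)
    @buffer = buffer
    @pagecache_len = 16   # enlarge the cache after calling super()
  end

  # Return the bytes of one page, or nil when the range is not backed.
  def get_page(addr, len = @pagelength)
    return nil if addr < 0 || addr + len > @buffer.bytesize
    @buffer.byteslice(addr, len)
  end

  # Optimized override: a single slice instead of one get_page call per page.
  def realstring
    @buffer.byteslice(@addr_start, @length)
  end
end

bs = BufferString.new(('abcdefgh' * 1500).b, 100, 9000)
whole = bs.realstring
```

The override pays off whenever the target can serve a large contiguous range in one request; when it cannot, the page-by-page default is a reasonable fallback.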

Standard subclasses

VirtualFile

Defined in metasm/os/main.rb.

This class maps over an open file descriptor, and allows reading data on demand. It implements the read class method, similar to File.read, with the file opened in binary mode. For a small file (<= 4096 bytes), the content is returned directly as a String; otherwise a VirtualString is created.
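The small-file shortcut can be sketched with the Ruby standard library alone. The helper read_or_map and the LazyFile class are hypothetical names used for illustration; they are not part of metasm, and the sketch leaves out caching and file handle cleanup.

```ruby
# Hypothetical lazy wrapper: keeps the file open and reads ranges on demand.
class LazyFile
  attr_reader :length

  def initialize(io, length)
    @io = io
    @length = length
  end

  # Read len bytes starting at byte offset from.
  def [](from, len)
    @io.seek(from)
    @io.read(len) || ''.b   # IO#read returns nil at end of file
  end
end

# Return the content directly for small files, a lazy wrapper otherwise.
def read_or_map(path, threshold = 4096)
  size = File.size(path)
  return File.binread(path) if size <= threshold
  LazyFile.new(File.open(path, 'rb'), size)
end
```

A caller that only needs the header of a large executable then touches a few bytes of the file instead of loading all of it.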

This class is used by the default ExeFormat decode_file[_header] methods.

LinuxRemoteString

Defined in metasm/os/linux.rb.

This class maps over the virtual memory of a Linux process. Reads go through /proc/<pid>/mem. The Linux kernel requires that the target process be ptraced before this file can be read, so the object uses the debugger instance passed to the constructor, or creates a new PTrace object (see core/PTrace.txt) to stop the process and read its memory during get_page.

If a Debugger object (see core/Debugger.txt) was given, get_page returns nil when the debugger indicates that the target is not stopped.

Writing is done through PTrace#writemem using PTRACE_POKEDATA.

WindowsRemoteString

Defined in metasm/os/windows.rb.

This class maps over the virtual memory of a Windows process.

The memory accesses are done using the Read/WriteProcessMemory API.

The class method open_pid first tries to OpenProcess with read/write access, and falls back to read-only mode.

GdbRemoteString

Defined in metasm/os/remote.rb.

This class maps over the virtual memory of a remote process debugged with a GdbClient instance (see core/GdbClient.txt), using setmem and getmem.