r2 wrote:
On Jul 27, 9:06 am, Peter Otten <__pete...@web.de> wrote:
r2 wrote:
I have a memory dump from a machine I am trying to analyze. I can view
the file in a hex editor to see text strings in the binary code. I
don't see a way to save these ascii representations of the binary, so
I went digging into Python to see if there were any modules to help.
I found one I think might do what I want it to do - the binascii
module. Can anyone describe to me how to convert a raw binary file to
an ascii file using this module. I've tried? Boy, I've tried.
That won't work because a text editor doesn't need any help to convert the
bytes into characters. If it expects ascii it will just be puzzled by bytes
that are not valid ascii. Also, it will happily display byte sequences that
are valid ascii, but that you as a user will see as gibberish because they
were meant to be binary data by the program that wrote them.

Am I correct in assuming I can get the converted binary to ascii text
I see in a hex editor using this module? I'm new to this forensics
thing and it's quite possible I am mixing technical terms. I am not
new to Python, however. Thanks for your help.
Unix has the "strings" command-line tool to extract text from a binary.
Get hold of a copy of the MinGW tools if you are on Windows.

Peter

Okay. Thanks for the guidance. I have a machine with Linux, so I
should be able to do what you describe above. Could Python extract the
strings from the binary as well? Just wondering.

Yes, you could do the same thing in Python easily enough. And with the advantage that you could define your own meanings for "characters."

The memory dump could be storing characters that are strictly ASCII. Or it could have EBCDIC, or UTF-8. And it could be Unicode, 16 bit or 32 bits, and big-endian or little-endian. Or the characters could be in some other format specific to a particular program.
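To make that concrete, here is a small sketch (Python 3) of how one short piece of text ends up as different bytes depending on the encoding. The sample text and the helper name encode_all are just illustrations, and cp500 is used here as one of several EBCDIC variants Python knows about; a real dump may use something else entirely.

```python
# Illustrative only: the sample text and the list of encodings are assumptions.
def encode_all(s):
    """Return the raw bytes of s under several encodings."""
    encodings = ["ascii", "utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "cp500"]
    return {name: s.encode(name) for name in encodings}

for name, raw in encode_all("Hello").items():
    # Show each byte as two hex digits, the way a hex editor would
    print(name.ljust(10), " ".join("%02x" % b for b in raw))
```

Note that a scanner looking only for runs of plain ASCII bytes would find the first two forms, but not the 16-bit, 32-bit, or EBCDIC ones.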

However, it's probably very useful to see what a "strings" program might look like, because you can quickly code variations on it, to suit your particular data.
Something like the following (totally untested):

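def isprintable(char):
    # char is an integer byte value; 0x7f (DEL) is a control code, so stop at 0x7e
    return 0x20 <= char <= 0x7e

def string(filename):
    # Python 3: iterating over a bytes object yields integers
    with open(filename, "rb") as f:
        data = f.read()
    count = 0
    line = ""
    for ch in data:
        if isprintable(ch):
            count += 1
            line = line + chr(ch)
        else:
            if count > 4:  # cutoff, don't print strings smaller than this because they're probably just coincidence
                print(line)
            # start over after every non-printable byte, whether or not we printed
            count = 0
            line = ""
    if count > 4:
        print(line)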


Now you can change the definition of what's "printable", you can change the min-length that you care about. And of course you can fine-tune things like max-length lines and such.
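As one example of such a variation, here is a rough sketch (Python 3) that does the scanning with a regular expression and can optionally look for ASCII text stored as 16-bit little-endian characters. The function name extract_strings, its parameters, and the sample bytes are all made up for illustration, and it only handles the simple case where every 16-bit character has a zero high byte.

```python
import re

def extract_strings(data, min_length=5, utf16=False):
    """Return runs of printable ASCII found in data (a bytes object).

    With utf16=True, look instead for ASCII characters stored as UTF-16
    little-endian, i.e. each printable byte followed by a zero byte.
    """
    unit = rb"(?:[\x20-\x7e]\x00)" if utf16 else rb"[\x20-\x7e]"
    # Require at least min_length consecutive characters
    pattern = re.compile(unit + b"{%d,}" % min_length)
    encoding = "utf-16-le" if utf16 else "ascii"
    return [m.group().decode(encoding) for m in pattern.finditer(data)]

# A made-up stand-in for a memory dump
sample = (b"\x01\x02hello world\xff\xfeabc\x03"
          + "wide text".encode("utf-16-le") + b"\x04")
print(extract_strings(sample))
print(extract_strings(sample, utf16=True))
```

For a real dump you would pass in open(filename, "rb").read() instead of the sample. The 16-bit search can pick up a stray leading character when a printable byte happens to sit right before a zero byte, so treat its output as a hint rather than an exact answer.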

DaveA

--
http://mail.python.org/mailman/listinfo/python-list
