"cjl" <[EMAIL PROTECTED]> writes:

> Fredrik Lundh wrote:
>
>> something like this could work:
>>
>>     import re
>>
>>     text = open(file, "rb").read()
>>
>>     for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text):
>>         print m.start(), repr(m.group(1))
>
> Hey...that worked. I actually modified:
>
> for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text):
>
> to
>
> for m in re.finditer("([\x20-\x7f]{4,})", text):
>
> and now the output is nearly identical to 'strings'. One problem
> exists, in that if the binary file contains a string
> "monkey/chicken/dog/cat" it is printed as "monkey//chicken//dog//cat",
> and I don't know enough to figure out where the extra "/" is coming
> from.

Are you sure it's monkey/chicken/dog/cat, and not
monkey\chicken\dog\cat? The latter will print as monkey\\chicken...
because repr() escapes each backslash.
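
A quick way to see the effect, as a small sketch in Python 3 syntax (the
sample string is made up for illustration, not taken from your file):

```python
# Hypothetical sample: a single backslash between each word.
s = "monkey\\chicken\\dog\\cat"

plain = s         # what print(s) shows: single backslashes
shown = repr(s)   # what print(repr(s)) shows: quoted, backslashes doubled

print(plain)
print(shown)
```

Forward slashes are left alone by repr(), so doubled separators in the
output point to backslashes in the data.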

Also, you probably want it as [\x20-\x7e] (the DEL character \x7f
isn't printable). You're also missing tabs (\t).

The GNU binutils strings utility looks for \t or [\x20-\x7e].
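
Putting those two changes together, something like the following might be
closer to what you want. This is only a sketch in Python 3 syntax, assuming
the file contents are already in memory as bytes; the names PRINTABLE_RUN
and find_strings and the sample data are made up for illustration, and the
minimum run length of 4 is carried over from the pattern quoted above.

```python
import re

# Tab or printable ASCII (space through tilde), at least four in a row.
# \x7f (DEL) is deliberately excluded.
PRINTABLE_RUN = re.compile(rb"[\t\x20-\x7e]{4,}")

def find_strings(data):
    """Yield (offset, text) for each run of printable bytes in data."""
    for m in PRINTABLE_RUN.finditer(data):
        yield m.start(), m.group().decode("ascii")

# Hypothetical sample standing in for the contents of a binary file.
sample = b"\x00\x01monkey\\chicken\x7f\x00abc\x00tab\there\x00"
results = list(find_strings(sample))

for offset, text in results:
    # print the text itself rather than repr(text), so backslashes
    # are not doubled
    print(offset, text)
```

Here "abc" is skipped because it is shorter than four characters, and the
run containing the tab is kept as a single match.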

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list
