SpreadTooThin wrote:
On Apr 13, 2:37 pm, Grant Edwards <inva...@invalid> wrote:
On 2009-04-13, Grant Edwards <inva...@invalid> wrote:
On 2009-04-13, SpreadTooThin <bjobrie...@gmail.com> wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
are the same.
Perhaps I'm being dim, but how else are you going to decide if
two files are the same unless you compare the bytes in the
files?
You could hash them and compare the hashes, but that's a lot
more work than just comparing the two byte streams.
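For reference, the hashing approach mentioned above might look roughly like the sketch below. It assumes a current Python 3 and SHA-256; the helper names file_digest and same_by_hash are made up for illustration and are not part of filecmp. Both files still have to be read in full, which is why it is more work than comparing the byte streams directly.

```python
import hashlib


def file_digest(path, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_by_hash(path1, path2):
    """Hypothetical helper: treat files as equal if their digests match."""
    return file_digest(path1) == file_digest(path2)
```

Hashing mostly pays off when one file has to be compared against many others, since each digest only needs to be computed once.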
What should I be using if not filecmp.cmp?
I don't understand what you've got against comparing the files
when you stated that what you wanted to do was compare the files.
Doh! I misread your post and thought you weren't getting a
warm fuzzy feeling _because_ it was doing a byte-byte
compare. Now I'm a bit confused. Are you under the impression
it's _not_ doing a byte-byte compare? Here's the code:
def _do_cmp(f1, f2):
    bufsize = BUFSIZE
    fp1 = open(f1, 'rb')
    fp2 = open(f2, 'rb')
    while True:
        b1 = fp1.read(bufsize)
        b2 = fp2.read(bufsize)
        if b1 != b2:
            return False
        if not b1:
            return True
It looks like a byte-by-byte comparison to me. Note that when
this function is called the file lengths have already been
compared and found to be equal.
--
Grant Edwards (grante at visi.com)
Yow! Alright, you!! Imitate a WOUNDED SEAL pleading for a PARKING SPACE!!
I am indeed under the impression that it is not always doing a byte by
byte comparison...
As well, the documentation states:
Compare the files named f1 and f2, returning True if they seem equal,
False otherwise.
That word... Seeeeem... makes me wonder.
Thanks for the code! :)
Some of this discussion depends on the version of Python, but nobody said
so. In version 2.6.1, the code is different (and more complex) than
what's listed above. The docs are different too. In this version, at
least, you'll want to explicitly pass the shallow=False parameter. It
defaults to 1, by which they must mean True. I think it's a bad
default, but it's still a useful function. Just be careful to include
that parameter in your call.
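To make the difference concrete, here is a minimal sketch of what the shallow default can hide. It assumes a current Python 3 rather than the 2.6 discussed in this thread, and files_identical and make_lookalike_pair are made-up helper names. The two files have the same size and the same (forced) modification time but different bytes, so their os.stat() signatures match.

```python
import filecmp
import os
import tempfile


def files_identical(path1, path2):
    """Return True only if the two files have identical contents."""
    # shallow=False makes filecmp read both files and compare their bytes
    # rather than accepting matching os.stat() signatures as "equal".
    return filecmp.cmp(path1, path2, shallow=False)


def make_lookalike_pair(directory):
    """Create two files with equal size and mtime but different bytes."""
    path1 = os.path.join(directory, "a.bin")
    path2 = os.path.join(directory, "b.bin")
    with open(path1, "wb") as f:
        f.write(b"\x00\x01\x02\x03")
    with open(path2, "wb") as f:
        f.write(b"\x00\x01\x02\xff")
    # Force identical timestamps so the stat signatures match.
    os.utime(path1, (1000000000, 1000000000))
    os.utime(path2, (1000000000, 1000000000))
    return path1, path2


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        p1, p2 = make_lookalike_pair(tmp)
        print("shallow compare:", filecmp.cmp(p1, p2, shallow=True))
        print("content compare:", files_identical(p1, p2))
```

With the shallow comparison the pair is reported as equal because only file type, size and modification time are looked at; with shallow=False the differing last byte is found.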
Further, you'll want to check the copy that ships with your version of Python. The
file filecmp.py is in the Lib directory, so it's no trouble to check it.
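If you'd rather not hunt through the Lib directory by hand, something like the following sketch can report where filecmp.py lives and what defaults cmp has in your installation. It assumes Python 3.3 or later (for inspect.signature); describe_cmp is a made-up helper name.

```python
import filecmp
import inspect


def describe_cmp():
    """Return the path of the filecmp source file and cmp's signature."""
    # getsourcefile() may return None if the source is not available.
    source_path = inspect.getsourcefile(filecmp)
    signature = str(inspect.signature(filecmp.cmp))
    return source_path, signature


if __name__ == "__main__":
    path, sig = describe_cmp()
    print("source file:", path)
    print("filecmp.cmp" + sig)
```

To read the comparison code itself, inspect.getsource(filecmp) returns the module source as a string.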