Re: Reading the first MB of a binary file

2009-01-25 Thread Marc 'BlackJack' Rintsch
On Sun, 25 Jan 2009 08:37:07 -0800, Max Leason wrote:

> I'm attempting to read the first MB of a binary file and then compute an
> MD5 hash of it so that I can find the file later even if it has been
> moved or renamed. These files are large (350-1400 MB) video files and
> are often located on a different computer, and I figure there is a low
> risk of two files generating the same hash. The problem occurs in the
> read call, which returns all \x00s. Any ideas why this is happening?
> 
> Code:
> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00\x00'

As MRAB says, maybe the first 1024 bytes actually *are* all zero. Wild
guess: the file was created by a BitTorrent client that preallocates its
files, and this one hasn't finished downloading yet?
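
One way to test that guess is to scan the whole file and count how many chunks consist only of zero bytes. This is just a rough sketch: the helper name `count_zero_chunks` and the 1 MiB chunk size are made up for illustration, and a mostly-zero file only hints at preallocation, it doesn't prove it.

```python
CHUNK = 1024 ** 2  # 1 MiB per read; an arbitrary choice for this sketch


def count_zero_chunks(path, chunk_size=CHUNK):
    """Return (zero_chunks, total_chunks) for the file at `path`.

    Hypothetical helper: a chunk counts as "zero" when every byte
    in it is \\x00.
    """
    zero = total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file
                break
            total += 1
            # bytes.count() returns the number of zero bytes in the chunk
            if chunk.count(b"\x00") == len(chunk):
                zero += 1
    return zero, total
```

If nearly every chunk comes back as zero, the download is probably incomplete and hashing the start of the file won't identify it yet.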

Ciao,
Marc 'BlackJack' Rintsch
--
http://mail.python.org/mailman/listinfo/python-list


Re: Reading the first MB of a binary file

2009-01-25 Thread MRAB

Max Leason wrote:
> Hi,
>
> I'm attempting to read the first MB of a binary file and then compute
> an MD5 hash of it so that I can find the file later even if it has
> been moved or renamed. These files are large (350-1400 MB) video files
> and are often located on a different computer, and I figure there is a
> low risk of two files generating the same hash. The problem occurs in
> the read call, which returns all \x00s. Any ideas why this is
> happening?
>
> Code:
> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00\x00'
>
You're reading only the first 1024 bytes, which is 1 KB rather than 1 MB.
Perhaps the first 1024 bytes of the file _are_ all zero!

Try reading more and checking those, e.g.:


>>> SIZE = 1024 ** 2
>>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(SIZE) == b'\x00' * SIZE
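
Putting the check and the hash together might look something like the sketch below. The function name `first_mb_md5` is invented for illustration, and returning None for an all-zero (or empty) prefix is just one possible convention, not anything the original poster specified.

```python
import hashlib

SIZE = 1024 ** 2  # 1 MiB, the same size as in the check above


def first_mb_md5(path, size=SIZE):
    """Return the hex MD5 of the first `size` bytes of `path`.

    Hypothetical helper: returns None when the bytes read are all
    zero (or the file is empty), since such a hash would not
    distinguish one file from another.
    """
    with open(path, "rb") as f:  # the with block closes the file
        data = f.read(size)      # may be shorter than `size`
    if data == b"\x00" * len(data):
        return None
    return hashlib.md5(data).hexdigest()
```

Calling `first_mb_md5("Chuck.S01E01.HDTV.XViD-YesTV.avi")` would then give either a hex digest or None if the start of the file is still all zeros.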



Reading the first MB of a binary file

2009-01-25 Thread Max Leason
Hi,

I'm attempting to read the first MB of a binary file and then compute an
MD5 hash of it so that I can find the file later even if it has been
moved or renamed. These files are large (350-1400 MB) video files and are
often located on a different computer, and I figure there is a low risk
of two files generating the same hash. The problem occurs in the read
call, which returns all \x00s. Any ideas why this is happening?

Code:
open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
b'\x00\x00\x00\x00\x00\x00\x00'