Re: Reading the first MB of a binary file
On Sun, 25 Jan 2009 08:37:07 -0800, Max Leason wrote:

> I'm attempting to read the first MB of a binary file and then do an MD5
> hash on it so that I can find the file later despite it being moved or
> any file name changes that may have been made to it. These files are
> large (350-1400MB) video files and are often located on a different
> computer, and I figure that there is a low risk of generating the same
> hash between two files. The problem occurs in the read command, which
> returns all \x00s. Any ideas why this is happening?
>
> Code:
> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00\x00'

As MRAB says, maybe the first 1024 bytes actually *are* all zero bytes.

Wild guess: that's a file created by a BitTorrent client which preallocates
its files, and the file above isn't completely downloaded yet!?

Ciao,
    Marc 'BlackJack' Rintsch
--
http://mail.python.org/mailman/listinfo/python-list
Re: Reading the first MB of a binary file
Max Leason wrote:

> Hi,
>
> I'm attempting to read the first MB of a binary file and then do an
> MD5 hash on it so that I can find the file later despite it being
> moved or any file name changes that may have been made to it. These
> files are large (350-1400MB) video files and are often located on a
> different computer, and I figure that there is a low risk of
> generating the same hash between two files. The problem occurs in the
> read command, which returns all \x00s. Any ideas why this is
> happening?
>
> Code:
> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00\x00'
>

You're reading only the first 1024 bytes, which is 1 KB rather than 1 MB.
Perhaps the first 1024 bytes of the file _are_ all zero! Try reading more
and checking those, e.g.:

>>> SIZE = 1024 ** 2
>>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(SIZE) == b'\x00' * SIZE
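
The one-liner above can be wrapped up so that it also closes the file and copes with files shorter than 1 MB. This is only a rough sketch; the function name is_all_zero_prefix is made up for illustration, and it compares against the number of bytes actually read rather than the requested size.

```python
SIZE = 1024 ** 2  # 1 MB, as in the check above


def is_all_zero_prefix(path, size=SIZE):
    """Return True if the first `size` bytes of the file are all zero bytes.

    Hypothetical helper: an empty file is reported as False, since there
    is nothing in it to inspect.
    """
    with open(path, "rb") as f:
        data = f.read(size)  # may return fewer bytes if the file is short
    # Compare against a zero-filled buffer of the length actually read.
    return bool(data) and data == b"\x00" * len(data)
```

If this returns True for 1 MB but False for a larger size, the file does contain real data further in, which would fit a partially written or preallocated file.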
Reading the first MB of a binary file
Hi,

I'm attempting to read the first MB of a binary file and then do an MD5
hash on it so that I can find the file later despite it being moved or
any file name changes that may have been made to it. These files are
large (350-1400MB) video files and are often located on a different
computer, and I figure that there is a low risk of generating the same
hash between two files. The problem occurs in the read command, which
returns all \x00s. Any ideas why this is happening?

Code:
open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
b'\x00\x00\x00\x00\x00\x00\x00'
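
For reference, the approach described in this post (hash only the first MB) might look roughly like the sketch below. The function name prefix_md5 and the constant name are invented for illustration; hashlib.md5 is the standard-library call. Note that read(1024) fetches 1 KB, so the size here is 1024 ** 2.

```python
import hashlib

ONE_MB = 1024 ** 2  # read(1024) would fetch only 1 KB


def prefix_md5(path, size=ONE_MB):
    """Return the hex MD5 digest of the first `size` bytes of the file.

    Hypothetical helper: files shorter than `size` are hashed in full.
    """
    with open(path, "rb") as f:
        data = f.read(size)
    return hashlib.md5(data).hexdigest()
```

Two files whose first MB is identical, for example all zero bytes, will produce the same digest, so this only works as an identifier when the leading bytes really differ between files.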