[Tutor] reading binary file on windows and linux
Hello,

I've got some trouble reading binary files with struct.unpack on Windows. According to the documentation of the binary file's content, the file begins with some simple bytes (labeled as 'UChar: 8-bit unsigned byte'). Within those bytes there's a sequence to check the file's sanity. The sequence is (in ASCII C notation): \n \r \n

I downloaded the file from the same website on two machines. One is Windows 7 64-bit, the other is a virtual Linux machine. The trouble is that on Linux everything is fine, while on Windows the carriage return does not appear when reading the file with struct.unpack.

The file sizes on Linux and Windows are exactly the same, and my script also determines the file sizes correctly on both platforms (according to the OS). When I open the file on Windows in an editor and display the whitespace, the linefeed and carriage return are shown as expected.

The code I'm using to check the first 80 bytes of the file is:

    import struct
    import sys

    with open(sys.argv[1]) as source:
        size = struct.calcsize('80B')
        raw_data = struct.unpack('80B', source.read(size))
        for i, data in enumerate(raw_data):
            print i, data, chr(data)
        source.seek(0, 2)
        print source.tell()

Any suggestions are highly appreciated.

Cheers,
Jan
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] reading binary file on windows and linux
On 9 May 2010 18:33, Jan Jansen knack...@googlemail.com wrote:
Hello, I've got some trouble reading binary files with struct.unpack on windows. [...]

I'd guess that it's because a newline on Windows is \r\n and on Linux it's just \n. If you read the file as binary rather than text, it should work the same on both platforms, i.e. use:

    open(sys.argv[1], 'rb')

HTH,
Adam.
Re: [Tutor] reading binary file on windows and linux
On Sun, May 9, 2010 at 7:33 PM, Jan Jansen knack...@googlemail.com wrote:
Hello, I've got some trouble reading binary files with struct.unpack on windows. [...] The code I'm using to check the first 80 bytes of the file is: [...]

Since the file is binary, you should use the b mode when opening it:

    with open(sys.argv[1], 'rb') as source:

Otherwise, the file is opened in text mode, which converts newline characters to/from a platform-specific representation when reading or writing. On Windows, that representation is \r\n, meaning that this sequence is converted to just \n when you read from the file. That is why the carriage return disappears.

Hugo
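The translation described above can be reproduced on any platform with a small experiment. This is only a sketch and not code from the thread: the sample bytes and the helper name read_both_ways are made up for illustration, and it uses io.open with newline=None (newline translation switched on explicitly) to stand in for what text mode does on Windows. Note that Python 3 applies this translation when reading text on every platform, whereas the thread concerns Python 2 on Windows.

```python
import io
import os
import tempfile

# Hypothetical header bytes: filler around the "\n \r \n" check sequence.
SAMPLE = b"AB\n\r\nCD"


def read_both_ways(payload):
    """Write payload to a temp file, then read it back in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        # Binary mode: the bytes come back exactly as stored.
        with io.open(path, "rb") as handle:
            as_binary = handle.read()
        # Text mode with newline translation enabled (newline=None).
        # latin-1 maps every byte to one character, so decoding cannot fail.
        with io.open(path, "r", encoding="latin-1", newline=None) as handle:
            as_text = handle.read()
    finally:
        os.remove(path)
    return as_binary, as_text


as_binary, as_text = read_both_ways(SAMPLE)
print(len(as_binary), repr(as_binary))
print(len(as_text), repr(as_text))
```

The text-mode result is one character shorter than the binary one because the \r\n pair has collapsed into a single \n, which is the missing carriage return from the original question.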
Re: [Tutor] reading binary file on windows and linux
On Mon, 10 May 2010 03:33:51 am Jan Jansen wrote:
Hello, I've got some trouble reading binary files with struct.unpack on windows. [...] The code I'm using to check the first 80 bytes of the file is:

    import struct
    import sys
    with open(sys.argv[1]) as source:

You're opening the file in text mode. On Linux, there's no difference, but on Windows, text mode translates line endings: each \r\n is read as a single \n. You need to open the file in binary mode:

    open(sys.argv[1], 'rb')

--
Steven D'Aprano
Re: [Tutor] reading binary file on windows and linux
On Sun, 9 May 2010 19:33:51 +0200 Jan Jansen knack...@googlemail.com wrote:
Hello, I've got some trouble reading binary files with struct.unpack on windows. [...]

I guess (but am not 100% sure, because I never use 'b') that the issue will be solved using:

    with open(sys.argv[1], 'rb') as source:

The reason is that by default files are opened in read ('r') and text mode. In text mode, whatever character sequence a given OS uses as its line separator ('\r\n' under Windows) is silently converted by Python to a canonical form made of the single '\n' (char #0xa). So, in your case, in the header sub-sequence '\r'+'\n' you lose the '\r'. In so-called binary mode ('b'), Python no longer performs this replacement, so you get the raw byte sequence.

Hope I'm right on this and it helps.
Denis
vit esse estrany ☣ spir.wikidot.com
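Putting the replies together, the original check could look roughly like the sketch below. It is an illustration rather than the poster's actual script: the function names read_header and contains_check_sequence are invented here, and because the thread does not say where in the 80 bytes the \n \r \n sequence sits, the check simply searches the whole header for it.

```python
import struct

HEADER_FORMAT = "80B"           # 80 unsigned bytes, as in the original script
CHECK_SEQUENCE = (10, 13, 10)   # \n \r \n as byte values


def read_header(path, fmt=HEADER_FORMAT):
    """Return the header bytes of `path` as a tuple of ints."""
    size = struct.calcsize(fmt)
    with open(path, "rb") as source:   # 'rb': no newline translation
        data = source.read(size)
    if len(data) != size:
        raise ValueError("expected %d header bytes, got %d" % (size, len(data)))
    return struct.unpack(fmt, data)


def contains_check_sequence(values, sequence=CHECK_SEQUENCE):
    """Report whether `sequence` occurs contiguously anywhere in `values`."""
    width = len(sequence)
    return any(tuple(values[i:i + width]) == sequence
               for i in range(len(values) - width + 1))
```

A call such as contains_check_sequence(read_header(sys.argv[1])) should then give the same answer on Windows and Linux, since binary mode returns the stored bytes unchanged on both.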