Steven D'Aprano wrote:

[Generally fine stuff, I am elaborating rather than disagreeing.]

> On Sun, 12 Mar 2006 22:01:46 -0500, John Salerno wrote:
>
>> Erik Max Francis wrote:
>>
>>> You can use the struct module for converting fundamental types to a
>>> portable string representation for writing to binary files.
>> But if it's a string, why not just use a text file? What does a binary
>> file do that a text file doesn't, aside from not converting the end of
>> line characters?
>
> Nothing. It is all bytes under the hood.

Modeling a file as "a continuous undifferentiated string of bytes under
the hood" is a Unix-ism. There were (and are) other models.
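
To make John's question above concrete, here is a minimal sketch (not
from the thread) of why struct output wants a binary file. Packed data
can contain a byte that happens to equal '\n', and end-of-line
conversion would corrupt it. The sketch assumes Python 3 syntax, and it
simulates a CR LF platform's text-mode output with a plain bytes
replace so that it behaves the same everywhere. The helper name
simulate_text_mode_write is made up for illustration.

```python
import struct


def simulate_text_mode_write(data: bytes) -> bytes:
    """Mimic text-mode output on a CR LF platform: every LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


# The integer 10 packs to four bytes, the first of which is 0x0A (LF).
packed = struct.pack("<i", 10)

# Stored untouched (binary mode), the value round-trips cleanly.
round_trip = struct.unpack("<i", packed)[0]

# With end-of-line conversion a CR is inserted: the record grows to
# five bytes and its first four bytes no longer decode to 10.
mangled = simulate_text_mode_write(packed)
wrong_value = struct.unpack("<i", mangled[:4])[0]

print(repr(packed), round_trip)
print(repr(mangled), wrong_value)
```

Here wrong_value comes out as 2573 (0x0A0D) instead of 10, and every
later record in the file would be shifted by one byte as well.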

> When writing lines to a file, Python does not automatically append the
> line marker, so you need to do so yourself.

This is indeed the behavior with "write", but not with "print". A
"print" statement that does not end with a comma will tack an
end-of-line onto its output.

> But some other languages do -- I believe C++ is one of those languages.
> So C++ needs to know whether you are writing in text mode so it can
> append that end-of-line marker, or binary mode so it doesn't.

Actually, C++ (and C) convert any ('\12' == '\n' == LF) character to the
local file system's "line terminator" on output to a text-mode file.

> Since Python doesn't modify the line you write to the file, it doesn't
> care whether you are writing in text or binary mode, it is all the same.

Well, actually CPython uses C I/O, so it does convert the '\n' chars
just as C does.

> Operating systems such as Unix and Linux don't distinguish between binary
> and text mode, the results are the same. I'm told that Windows does
> distinguish between the two, although I couldn't tell you how they
> differ.

The way Windows differs from Unix: if the actual file data is built as:

    f = open('dead_parrot', 'wb')
    f.write('dead\r\nparrot')
    f.close()
    g = open('ex_parrot', 'w')
    g.write('Dead\nParrot')
    g.close()

then, on Windows:

    ft = open('dead_parrot', 'r')
    ft.read(6) returns 'dead\np'
    gt = open('ex_parrot', 'r')
    gt.read(6) returns 'Dead\nP'
    fb = open('dead_parrot', 'rb')
    fb.read(6) returns 'dead\r\n'
    gb = open('ex_parrot', 'rb')
    gb.read(6) returns 'Dead\r\n'

In case you didn't follow the above too precisely: apart from
capitalization, both files (dead_parrot and ex_parrot) hold exactly the
same bytes on disk.

This, by the way, is one of the few places Windows did it "by the
standard" and Unix "made up their own standard." The Unix decision
was, essentially: "there are too many ways to get in trouble with both
CR and LF determining line ending: what do you do for LF-CR pairs, what
does an LF by itself mean w/o a CR, .... Let's just treat LF as a
single-character line separator." Note how funny this is for how you
type: you type <a> <b> <c> <Enter> for a line, but <Enter> sends a CR
('\r' == '\15' == ASCII 13), which the I/O system somewhere magically
transforms into an LF ('\n' == '\12' == ASCII 10).

The C standard (which evolved with Unix) does these translations "for
you" (or "to you", depending on your mood) because it was meant to be
compatible with _many_ file systems, including those which did not
explicitly represent ends-of-lines (text files on such systems are
sequences of lines, and there is a maximum length to each line). By the
way, before you think such systems are foolish, think about how nice it
might sometimes be to get to line 20972 of a file without reading
through the entire front of the file.

--Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list
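
The point about reaching line 20972 directly can be sketched as
follows. This is only an illustration in Python 3 syntax, not how any
particular record-oriented file system works: it assumes every line is
space-padded to a made-up RECORD_LEN bytes, and write_records and
read_line are hypothetical helpers, not library functions.

```python
import os
import tempfile

RECORD_LEN = 16  # assumed fixed line length, chosen only for this sketch


def write_records(path, lines):
    """Store each line space-padded to RECORD_LEN bytes, with no terminators."""
    with open(path, "wb") as f:
        for text in lines:
            data = text.encode("ascii")
            if len(data) > RECORD_LEN:
                raise ValueError("line longer than the record length")
            f.write(data.ljust(RECORD_LEN))


def read_line(path, lineno):
    """Return line number lineno (1-based) by seeking straight to its offset."""
    with open(path, "rb") as f:
        f.seek((lineno - 1) * RECORD_LEN)
        return f.read(RECORD_LEN).rstrip(b" ").decode("ascii")


path = os.path.join(tempfile.mkdtemp(), "records.dat")
write_records(path, ["line %d" % n for n in range(1, 25001)])

# One seek and one 16-byte read; none of the preceding lines are scanned.
line = read_line(path, 20972)
print(line)
```

The price is the one mentioned above: a maximum line length, plus (in
this sketch) the loss of any trailing spaces that were part of a line.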