Re: Why 'r' mode anyway?
In article <[EMAIL PROTECTED]>, Tim Peters <[EMAIL PROTECTED]> wrote:
. . .
>reading up the bits in the index and offsets too, etc. IIRC, Unix was
>actually quite novel at the time in insisting that all files were just
>raw byte streams to the OS.

Not just "novel", but "puzzling" and even "controversial". It was far from clear that the Unix way could be successful.
. . .
>but generally where it's reasonably easy to hide. It's not easy to
>hide native file conventions, partly because Python wouldn't play well
>with *other* platform software if it did.
>
>Remember that Guido worked on ABC before Python, and Python is in
>(small) part a reaction against the extremes of ABC. ABC was 100%
>platform-independent. You could read and write files from ABC.
>However, the only files you could read from ABC were files that were
>written by ABC -- and files written by ABC were essentially unusable
>by other software. Socket semantics were also 100% portable in ABC:
>it didn't have sockets, nor any way to extend the language to add
>them. Etc -- ABC was a self-contained universe. "Plays well with
>others" was a strong motivator for Python's design, and that often
>means playing by others' rules.

At a slightly different level, that--not playing well enough with others--is what held Smalltalk back.

Again, a lot of this stuff wasn't obvious at the time, even as late as 1990. I think we understand better now that languages are secondary, in that good developers can be productive with all sorts of syntaxes and semantics; as a practical matter, daily struggles have to do with the libraries or how the languages access what is outside themselves.

-- http://mail.python.org/mailman/listinfo/python-list
Re: Why 'r' mode anyway?
Skip Montanaro wrote:
>     Tim> "Plays well with others" was a strong motivator for Python's
>     Tim> design, and that often means playing by others' rules. --
>
> My vote for QOTW... Is it too late to slip it into the Zen of Python?

It would certainly fit, and the existing koans don't really cover the concept.

Its addition also seems fitting in light of the current PEP 246 discussion which is *all* about playing well with others :)

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
Re: Why 'r' mode anyway?
    Tim> "Plays well with others" was a strong motivator for Python's
    Tim> design, and that often means playing by others' rules. --

My vote for QOTW... Is it too late to slip it into the Zen of Python?

Skip
Re: Why 'r' mode anyway?
[Tim Peters]
>> That differences may exist is reflected in the C
>> standard, and the rules for text-mode files are more restrictive
>> than most people would believe.

[Irmen de Jong]
> Apparently. Because I know only about the Unix <-> Windows
> difference (Windows converts \r\n <--> \n when using 'r' mode,
> right). So it's in the line endings.

That's one difference. The worse difference is that, in text mode on Windows, the first instance of chr(26) in a file is taken as meaning "that's the end of the file", no matter how many bytes may follow it. That's fine by the C standard, because everything about a text-mode file containing a chr(26) character is undefined.

> Is there more obscure stuff going on on the other systems you
> mentioned (Mac OS, VAX) ?

I think on Mac Classic it was *just* line end differences. Native VAX has many file formats. "Record-based" file formats used to be very popular. There the OS saves meta-information in the file: each record may contain an offset to the start of the next record, and the file may even contain an index structure to support random access to records quickly (for example, "a line" may be a record, and "read the last line" may go quickly). Read that in binary mode, and you'll be reading up the bits in the index and offsets too, etc. IIRC, Unix was actually quite novel at the time in insisting that all files were just raw byte streams to the OS.

> (That means that the bug in SimpleHTTPServer that my patch
> 839496 addressed, also occurred on those systems? Or that
> the patch may be incorrect after all??)

Don't know, and (sorry) no time to dig.

> While your argument about why Python doesn't use its own
> platform-independent file format is sound of course, I find it often
> a nuisance that platform specific things trickle through into Python
> itself and ultimately in the programs you write. I sometimes feel
> that some parts of Python expose the underlying C/os
> implementation a bit too much. Python never claimed write once
> run anywhere (as that other language does) but it would have
> been nice nevertheless ;-)
> In practice it's just not possible I guess.

It would be difficult at best. Python hides a lot of platform crap, but generally where it's reasonably easy to hide. It's not easy to hide native file conventions, partly because Python wouldn't play well with *other* platform software if it did.

Remember that Guido worked on ABC before Python, and Python is in (small) part a reaction against the extremes of ABC. ABC was 100% platform-independent. You could read and write files from ABC. However, the only files you could read from ABC were files that were written by ABC -- and files written by ABC were essentially unusable by other software. Socket semantics were also 100% portable in ABC: it didn't have sockets, nor any way to extend the language to add them. Etc -- ABC was a self-contained universe. "Plays well with others" was a strong motivator for Python's design, and that often means playing by others' rules.
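The two Windows text-mode behaviours described in this message (the \r\n translation and chr(26) acting as end-of-file) can be imitated on any platform. What follows is only a rough sketch: `simulate_windows_text_read` is a hypothetical helper written for illustration, not something Python or the C runtime provides, and it assumes the description above is the whole story.

```python
def simulate_windows_text_read(raw: bytes) -> bytes:
    """Hypothetical helper: approximate what a Windows C text-mode read hands back."""
    # Everything from the first chr(26) (Ctrl-Z) onward is treated as past end-of-file.
    visible = raw.split(b"\x1a", 1)[0]
    # Each "\r\n" pair is then collapsed to a single "\n".
    return visible.replace(b"\r\n", b"\n")


# A made-up payload: one text line, a chr(26), then data text mode would never see.
payload = b"header\r\n" + bytes([26]) + b"binary tail that text mode never sees"
seen = simulate_windows_text_read(payload)
```

Here `seen` is much shorter than `payload`, which is why binary data read through text mode appears to be truncated.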
Re: Why 'r' mode anyway?
Tim Peters wrote:
> That differences may exist is reflected in the C standard, and the
> rules for text-mode files are more restrictive than most people
> would believe.

Apparently. Because I know only about the Unix <-> Windows difference (Windows converts \r\n <--> \n when using 'r' mode, right). So it's in the line endings.
Is there more obscure stuff going on on the other systems you mentioned (Mac OS, VAX) ?

(That means that the bug in SimpleHTTPServer that my patch 839496 addressed, also occurred on those systems? Or that the patch may be incorrect after all??)

While your argument about why Python doesn't use its own platform-independent file format is sound of course, I find it often a nuisance that platform specific things trickle through into Python itself and ultimately in the programs you write. I sometimes feel that some parts of Python expose the underlying C/os implementation a bit too much. Python never claimed write once run anywhere (as that other language does) but it would have been nice nevertheless ;-)
In practice it's just not possible I guess.

Thanks,

--Irmen
Re: Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))
Irmen de Jong wrote:
> Tim Peters wrote:
>
> > Yes: regardless of platform, always open files used for pickles in
> > binary mode. That is, pass "rb" to open() when reading a pickle file,
> > and "wb" to open() when writing a pickle file. Then your pickle files
> > will work unchanged on all platforms. The same is true of files
> > containing binary data of any kind (and despite that pickle protocol 0
> > was called "text mode" for years, it's still binary data).
>
> I've been wondering why there even is the choice between binary mode
> and text mode. Why can't we just do away with the 'text mode' ?

We can't, because characters and bytes are not the same things. But I believe what you're really complaining about is that "t" mode sometimes mysteriously corrupts data when it is processed by code that expects binary files. In Python 3.0 it will be fixed because file.read will have to return different objects: bytes for "b" mode, str for "t" mode. It would be great if the file type was split into binfile and textfile, removing the need for the cryptic "b" and "t" modes, but I'm afraid that's too much of a change even for Python 3.0.

Serge.
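For readers on a current Python 3, the split Serge anticipates can be seen directly. This is a small sketch, not part of the original thread: the scratch file path and its contents are made up for illustration, and the encoding is passed explicitly so the result does not depend on the platform default.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write six raw bytes: the UTF-8 encoding of "café" followed by a newline.
with open(path, "wb") as f:
    f.write(b"caf\xc3\xa9\n")

# "b" mode: read() returns a bytes object, one item per byte on disk.
with open(path, "rb") as f:
    as_bytes = f.read()

# "t" mode: read() returns a str, decoded with the requested encoding.
with open(path, "rt", encoding="utf-8") as f:
    as_text = f.read()
```

The same file yields six bytes in one mode and five characters in the other, which is the "characters and bytes are not the same things" point in concrete form.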
Re: Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))
[Irmen de Jong]
> I've been wondering why there even is the choice between binary mode
> and text mode. Why can't we just do away with the 'text mode' ?
> What does it do, anyways? At least, if it does something, I'm sure
> that it isn't something that can be done in Python itself if
> really required to do so...

It's not Python's decision, it's the operating system's. Whether there's an actual difference between text mode and binary mode is up to the operating system, and, if there is an actual difference, every detail about what the difference(s) consist of is also up to the operating system. That differences may exist is reflected in the C standard, and the rules for text-mode files are more restrictive than most people would believe.

On Unixish systems, there's no difference. On Windows boxes, there are conceptually small differences with huge consequences, and the distinction appears to be kept just for backward-compatibility reasons. On some other systems, text and binary files are entirely different kinds of beasts.

If Python didn't offer text mode then it would be clumsy at best to use Python to write ordinary human-readable text files in the format that native software on Windows, and Mac Classic, and VAX (and ...) expects (and the native format for text mode differs across all of them).

If Python didn't offer binary mode then it wouldn't be possible to use Python to process data in binary files on Windows and Mac Classic and VAX (and ...).

If Python used its own platform-independent file format, then it would end up creating files that other programs wouldn't be able to deal with.

Live with it.
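The difference between what a program writes and what lands on disk can be sketched without a Windows machine. One assumption to flag: this uses Python 3's `newline` argument to `open()`, where Python itself performs the translation; in the Python 2 of this thread the translation was left to the platform's C library. The file name is made up for illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# newline="\r\n" asks for Windows-style line endings on any platform:
# every "\n" the program writes is stored as "\r\n".
with open(path, "w", newline="\r\n") as f:
    f.write("first\nsecond\n")

# Binary mode shows the bytes actually stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# Default text mode maps "\r\n" back to "\n" when reading.
with open(path, "r") as f:
    text = f.read()
```

The program only ever sees "\n" in text mode, while native Windows tools see the "\r\n" they expect; binary mode is the one view that shows what is really there.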
Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))
Tim Peters wrote:
> Yes: regardless of platform, always open files used for pickles in
> binary mode. That is, pass "rb" to open() when reading a pickle file,
> and "wb" to open() when writing a pickle file. Then your pickle files
> will work unchanged on all platforms. The same is true of files
> containing binary data of any kind (and despite that pickle protocol 0
> was called "text mode" for years, it's still binary data).

I've been wondering why there even is the choice between binary mode and text mode. Why can't we just do away with the 'text mode' ?
What does it do, anyways? At least, if it does something, I'm sure that it isn't something that can be done in Python itself if really required to do so...

--Irmen
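Tim's advice quoted at the top of this message comes down to a few lines of code. A minimal sketch, assuming a current Python 3; the record contents and the scratch file path are made up for illustration.

```python
import os
import pickle
import tempfile

# Made-up data, including bytes that text mode on Windows would mangle.
record = {"name": "example", "values": [1, 2, 3], "blob": b"\r\n\x1a"}

# Hypothetical scratch file; any path would do.
path = os.path.join(tempfile.mkdtemp(), "record.pickle")

# "wb": write the pickle as raw bytes, with no newline translation.
with open(path, "wb") as f:
    pickle.dump(record, f)

# "rb": read the same bytes back, again untouched by the platform.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

In Python 3 the mistake is harder to make in the first place: pickle writes bytes, so handing `pickle.dump` a file opened in text mode raises a `TypeError` instead of silently producing a file that breaks on another platform.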