Re: Why 'r' mode anyway?

2005-01-15 Thread Cameron Laird
In article <[EMAIL PROTECTED]>,
Tim Peters  <[EMAIL PROTECTED]> wrote:
.
.
.
>reading up the bits in the index and offsets too, etc.  IIRC, Unix was
>actually quite novel at the time in insisting that all files were just
>raw byte streams to the OS.
Not just "novel", but "puzzling" and even "controversial".
It was far from clear that the Unix way could be successful.
.
.
.
>but generally where it's reasonably easy to hide.  It's not easy to
>hide native file conventions, partly because Python wouldn't play well
>with *other* platform software if it did.
>
>Remember that Guido worked on ABC before Python, and Python is in
>(small) part a reaction against the extremes of ABC.  ABC was 100%
>platform-independent.  You could read and write files from ABC.
>However, the only files you could read from ABC were files that were
>written by ABC -- and files written by ABC were essentially unusable
>by other software.  Socket semantics were also 100% portable in ABC: 
>it didn't have sockets, nor any way to extend the language to add
>them.  Etc -- ABC was a self-contained universe.  "Plays well with
>others" was a strong motivator for Python's design, and that often
>means playing by others' rules.

At a slightly different level, that--not playing well enough
with others--is what held Smalltalk back.  Again, a lot of
this stuff wasn't obvious at the time, even as late as 1990.
I think we understand better now that languages are secondary,
in that good developers can be productive with all sorts of
syntaxes and semantics; as a practical matter, daily struggles
have to do with the libraries or how the languages access what
is outside themselves.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why 'r' mode anyway?

2005-01-15 Thread Nick Coghlan
Skip Montanaro wrote:
> Tim> "Plays well with others" was a strong motivator for Python's
> Tim> design, and that often means playing by others' rules.  --
>
> My vote for QOTW...  Is it too late to slip it into the Zen of Python?

It would certainly fit, and the existing koans don't really cover the
concept.

Its addition also seems fitting in light of the current PEP 246
discussion, which is *all* about playing well with others :)

Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net


Re: Why 'r' mode anyway?

2005-01-15 Thread Skip Montanaro

Tim> "Plays well with others" was a strong motivator for Python's
Tim> design, and that often means playing by others' rules.  --

My vote for QOTW...  Is it too late to slip it into the Zen of Python?

Skip



Re: Why 'r' mode anyway?

2005-01-14 Thread Tim Peters
[Tim Peters]
>> That differences may exist is reflected in the C
>> standard, and the rules for text-mode files are more restrictive
>> than most people would believe.

[Irmen de Jong]
> Apparently. Because I know only about the Unix <-> Windows
> difference (Windows converts \r\n <--> \n when using 'r' mode,
> right).  So it's in the line endings.

That's one difference.  The worse difference is that, in text mode on
Windows, the first instance of chr(26) in a file is taken as meaning
"that's the end of the file", no matter how many bytes may follow it. 
That's fine by the C standard, because everything about a text-mode
file containing a chr(26) character is undefined.
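
A minimal sketch of the line-ending half of this, assuming a modern Python 3, where the io layer does the newline translation itself instead of leaving it to the C runtime. The helper names are hypothetical, and `newline="\r\n"` is used only to imitate the Windows text-mode convention on any platform. The chr(26) end-of-file behaviour belongs to the C runtime's text mode and is not reproduced here; the sketch only shows that binary mode hands that byte back untouched.

```python
import os
import tempfile


def write_text(path, text, newline):
    # newline="\r\n" makes the text layer write every "\n" as "\r\n",
    # imitating the Windows text-mode convention (an assumption made
    # for illustration; it is not what the platform default must be).
    with open(path, "w", newline=newline, encoding="ascii") as f:
        f.write(text)


def read_raw(path):
    # Binary mode: the bytes come back exactly as stored.
    with open(path, "rb") as f:
        return f.read()


def read_text(path):
    # Default text mode (universal newlines) maps "\r\n" back to "\n".
    with open(path, "r", encoding="ascii") as f:
        return f.read()


def demo():
    # Write two lines with CRLF endings, then read them both ways.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_text(path, "one\ntwo\n", newline="\r\n")
        return read_raw(path), read_text(path)
    finally:
        os.remove(path)


def binary_round_trip(payload):
    # Write and read in binary mode; nothing is translated, so a
    # payload containing chr(26) or "\r\n" survives unchanged.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(payload)
        return read_raw(path)
    finally:
        os.remove(path)
```

Calling `demo()` returns the raw bytes and the text-mode view of the same file, so the translation can be compared side by side.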

> Is there more obscure stuff going on on the other systems you
> mentioned (Mac OS, VAX) ?

I think on Mac Classic it was *just* line end differences.  Native VAX
has many file formats.  "Record-based" file formats used to be very
popular.  There the OS saves meta-information in the file: each
record may contain an offset to the start of the next record, and the
file may even contain an index structure to support random access to
records quickly (for example, "a line" may be a record, and "read the
last line" may go quickly).  Read that in binary mode, and you'll be
reading up the bits in the index and offsets too, etc.  IIRC, Unix was
actually quite novel at the time in insisting that all files were just
raw byte streams to the OS.

> (That means that the bug in SimpleHTTPServer that my patch
> 839496 addressed, also occurred on those systems? Or that
> the patch may be incorrect after all??)

Don't know, and (sorry) no time to dig.

> While your argument about why Python doesn't use its own
> platform-independent file format is sound of course, I find it often
> a nuisance that platform-specific things trickle through into Python
> itself and ultimately in the programs you write. I sometimes feel
> that some parts of Python expose the underlying C/OS
> implementation a bit too much. Python never claimed write once
> run anywhere (as that other language does) but it would have
> been nice nevertheless ;-)
> In practice it's just not possible I guess.

It would be difficult at best.  Python hides a lot of platform crap,
but generally where it's reasonably easy to hide.  It's not easy to
hide native file conventions, partly because Python wouldn't play well
with *other* platform software if it did.

Remember that Guido worked on ABC before Python, and Python is in
(small) part a reaction against the extremes of ABC.  ABC was 100%
platform-independent.  You could read and write files from ABC.
However, the only files you could read from ABC were files that were
written by ABC -- and files written by ABC were essentially unusable
by other software.  Socket semantics were also 100% portable in ABC: 
it didn't have sockets, nor any way to extend the language to add
them.  Etc -- ABC was a self-contained universe.  "Plays well with
others" was a strong motivator for Python's design, and that often
means playing by others' rules.


Re: Why 'r' mode anyway?

2005-01-14 Thread Irmen de Jong
Tim Peters wrote:
> That differences may exist is reflected in the C
> standard, and the rules for text-mode files are more restrictive than
> most people would believe.

Apparently. Because I know only about the Unix <-> Windows difference
(Windows converts \r\n <--> \n when using 'r' mode, right).
So it's in the line endings.

Is there more obscure stuff going on on the other systems you
mentioned (Mac OS, VAX)?

(That means that the bug in SimpleHTTPServer that my patch
839496 addressed, also occurred on those systems? Or that
the patch may be incorrect after all??)

While your argument about why Python doesn't use its own
platform-independent file format is sound of course, I find it often a
nuisance that platform-specific things trickle through into Python
itself and ultimately in the programs you write. I sometimes feel that
some parts of Python expose the underlying C/OS implementation
a bit too much. Python never claimed write once run anywhere (as
that other language does) but it would have been nice nevertheless ;-)
In practice it's just not possible I guess.

Thanks,
--Irmen


Re: Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))

2005-01-14 Thread Serge Orlov
Irmen de Jong wrote:
> Tim Peters wrote:
>
> > Yes:  regardless of platform, always open files used for pickles in
> > binary mode.  That is, pass "rb" to open() when reading a pickle file,
> > and "wb" to open() when writing a pickle file.  Then your pickle files
> > will work unchanged on all platforms.  The same is true of files
> > containing binary data of any kind (and despite that pickle protocol 0
> > was called "text mode" for years, it's still binary data).
>
> I've been wondering why there even is the choice between binary mode
> and text mode. Why can't we just do away with the 'text mode' ?

We can't, because characters and bytes are not the same thing. But I
believe what you're really complaining about is that "t" mode sometimes
mysteriously corrupts data when it is processed by code that expects
binary files. In Python 3.0 that will be fixed, because file.read will
have to return different objects: bytes for "b" mode, str for "t" mode.
It would be great if the file type were split into binfile and textfile,
removing the need for the cryptic "b" and "t" modes, but I'm afraid
that's too much of a change even for Python 3.0.
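
For readers coming to this thread later, here is a small sketch of the bytes/str distinction as it works in the Python 3 releases that eventually shipped. The helper name is hypothetical, and the UTF-8 encoding is an assumption made so that the byte count and the character count visibly differ.

```python
import os
import tempfile


def read_both_ways(data):
    # Store the given bytes, then read the same file in binary mode
    # and in text mode so the two result types can be compared.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            as_bytes = f.read()   # a bytes object
        with open(path, "r", encoding="utf-8") as f:
            as_text = f.read()    # a str object, decoded from the bytes
    finally:
        os.remove(path)
    return as_bytes, as_text
```

Because the two modes return different types, code that expects bytes and is handed a text-mode file tends to fail early with a type error instead of quietly working on translated data.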

  Serge.



Re: Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))

2005-01-14 Thread Tim Peters
[Irmen de Jong]
> I've been wondering why there even is the choice between binary mode
> and text mode. Why can't we just do away with the 'text mode' ?
> What does it do, anyways? At least, if it does something, I'm sure
> that it isn't something that can be done in Python itself if
> really required to do so...

It's not Python's decision, it's the operating system's.  Whether
there's an actual difference between text mode and binary mode is up
to the operating system, and, if there is an actual difference, every
detail about what the difference(s) consists of is also up to the
operating system.  That differences may exist is reflected in the C
standard, and the rules for text-mode files are more restrictive than
most people would believe.

On Unixish systems, there's no difference.  On Windows boxes, there
are conceptually small differences with huge consequences, and the
distinction appears to be kept just for backward-compatibility
reasons.  On some other systems, text and binary files are entirely
different kinds of beasts.

If Python didn't offer text mode then it would be clumsy at best to
use Python to write ordinary human-readable text files in the format
that native software on Windows, and Mac Classic, and VAX (and ...)
expects (and the native format for text mode differs across all of
them).  If Python didn't offer binary mode then it wouldn't be
possible to use Python to process data in binary files on Windows and
Mac Classic and VAX (and ...).  If Python used its own
platform-independent file format, then it would end up creating files
that other programs wouldn't be able to deal with.

Live with it.


Why 'r' mode anyway? (was: Re: Pickled text file causing ValueError (dos/unix issue))

2005-01-14 Thread Irmen de Jong
Tim Peters wrote:
> Yes:  regardless of platform, always open files used for pickles in
> binary mode.  That is, pass "rb" to open() when reading a pickle file,
> and "wb" to open() when writing a pickle file.  Then your pickle files
> will work unchanged on all platforms.  The same is true of files
> containing binary data of any kind (and despite that pickle protocol 0
> was called "text mode" for years, it's still binary data).

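The advice quoted above can be sketched as follows, written for a current Python 3. The function names and the temporary-file handling are hypothetical scaffolding; only the "wb" and "rb" modes passed to open() come from the advice itself.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path, protocol=pickle.DEFAULT_PROTOCOL):
    # "wb": binary mode, so no newline translation touches the data.
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)


def load_pickle(path):
    # "rb": read the bytes back exactly as they were written.
    with open(path, "rb") as f:
        return pickle.load(f)


def round_trip(obj, protocol=pickle.DEFAULT_PROTOCOL):
    # Save to a temporary file and load it again.
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        save_pickle(obj, path, protocol=protocol)
        return load_pickle(path)
    finally:
        os.remove(path)
```

Passing `protocol=0` exercises the old "text mode" protocol mentioned in the quote; it goes through the same binary-mode file handling as every other protocol.
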
I've been wondering why there even is the choice between binary mode
and text mode. Why can't we just do away with the 'text mode' ?
What does it do, anyways? At least, if it does something, I'm sure
that it isn't something that can be done in Python itself if
really required to do so...
--Irmen