Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-02 Thread Nick Maclaren
Guido van Rossum [EMAIL PROTECTED] wrote:

 Does anyone else have the feeling that discussions with Mr. MacLaren
 don't usually bear any fruit?

Yes.  I do.  My ability to predict the (technical) future is good;
my ability to persuade people of it is almost non-existent.

However, when an almost identical thread to this one occurs in a
decade's time, I shall be safely retired.  And, assuming no changes
to the basic models, when one occurs in twenty years' time, I shall
be forgotten.  Such is life.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Nick Maclaren
Greg Ewing [EMAIL PROTECTED] wrote:

  I don't know PRECISELY what you mean by universal newlines mode
 
 I mean precisely what Python means by the term: any of
 \r, \n or \r\n represent a newline, and no distinction
 is made between them.

Excellent.  While this over-simplifies the issue, let's stick to
the over-simplified form, as we may be able to get somewhere.

The question is independent of what the outside system believes a
text file should look like, and is solely what Python believes a
sequence of characters should mean.  For example, does 'A\r\nB'
mean that B is separated from A by one newline or two?
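
As a minimal sketch of the "one newline" reading, assuming the Python 3 `io` module (which postdates this thread) and its default universal-newlines translation, the string can be pushed through a text-mode reader and the newlines counted. The helper name `read_universal` is made up for illustration.

```python
import io

def read_universal(raw: bytes) -> str:
    """Decode bytes the way a text file opened with newline=None would."""
    # newline=None enables universal newlines: \r\n, \r and \n all become \n.
    reader = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
    return reader.read()

text = read_universal(b"A\r\nB")
newline_count = text.count("\n")  # number of line endings seen by the program
```

Under that assumption the pair `\r\n` is seen by the program as a single `\n`; a reader configured differently (for example with `newline=""`) would hand back both characters.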

The point is that, once we know that, we can design a translator
to and from Python's conventions to any reasonable system (and,
as I say, I have done it many times).  But, if Python's own
interpretation is ambiguous, it is a sure recipe for different
translators being incompatible, even on the same system.  Which
is what has happened here.

So, damn the outside system, EXACTLY what does Python mean by
such characters, and EXACTLY what uses of them are discouraged
as having unspecified meanings?  If we could get an answer to
that precisely enough to write a parse tree with all terminals
explicit, this problem would go away.

And that is all that I say can or should be done.  The details
of how to write the translators to other file systems are then
a separate matter.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Paul Moore
On 01/10/2007, Nick Maclaren [EMAIL PROTECTED] wrote:
 So, damn the outside system, EXACTLY what does Python mean by
 such characters, and EXACTLY what uses of them are discouraged
 as having unspecified meanings?  If we could get an answer to
 that precisely enough to write a parse tree with all terminals
 explicit, this problem would go away.

Python, the language, means nothing by the characters. They are bytes
with defined values in a byte string (in 2.x, in 3.0 they are Unicode
characters, but otherwise no difference). The *language* places no
interpretation on them.

Certain library functions place an interpretation on the byte values,
but you need to read the function definition for that. And (a) they
may not all be consistent, and (b) they may say "follows platform
behaviour", but that's the way it is, so you have to live with it.
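
A small illustration of point (a), using only built-in `str` methods: two library calls that both "split into lines" can disagree about the same string. This is a sketch of the inconsistency being described, not a claim about which behaviour is right.

```python
sample = "A\r\nB"

# str.splitlines() treats \r\n as a single line boundary.
by_splitlines = sample.splitlines()

# str.split("\n") only knows about \n, so the \r stays attached to "A".
by_split = sample.split("\n")
```

Here `by_splitlines` holds two clean lines, while `by_split` keeps a trailing `\r` on the first one.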

Paul.


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Nick Maclaren
Paul Moore [EMAIL PROTECTED] wrote:
 
  So, damn the outside system, EXACTLY what does Python mean by
  such characters, and EXACTLY what uses of them are discouraged
  as having unspecified meanings?  If we could get an answer to
  that precisely enough to write a parse tree with all terminals
  explicit, this problem would go away.
 
 Python, the language, means nothing by the characters. They are bytes
 with defined values in a byte string (in 2.x, in 3.0 they are Unicode
 characters, but otherwise no difference). The *language* places no
 interpretation on them.

Actually, it's not that simple, because of the universal newline
rule and the fact that both Unix/C ASCII and Unicode DO provide
meanings for their characters, but let that pass.  Your statement
is not far off the situation.

 Certain library functions place an interpretation on the byte values,
 but you need to read the function definition for that. And (a) they
 may not all be consistent, and (b) they may say "follows platform
 behaviour", but that's the way it is, so you have to live with it.

And that is why there will continue to be confusion and inconsistency,
and why there will be similar threads to this for the foreseeable
future.  If you regard continuing problems of this sort as acceptable,
then fine, but I am pointing out that they are fairly easy to avoid.
But only by specifying a precise Python model.

Incidentally, the response (b) you give is a common one, but isn't
usually correct when it is given.  It is, after all, the cause of
the problem that started this thread.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Steve Holden
Michael Foord wrote:
 Steven Bethard wrote:
 On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
   
 Terry Reedy wrote:
 
 There are two normal ways for internal Python text to have \r\n:
 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
 same platform).
 2. Intentionally put there by a programmer.  If s/he also chooses default \n
 translation on output, \r followed by the translation of \n is correct.

   
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.
 
 Out of curiosity, why don't the Python wrappers for your Windows UI
 components do the appropriate '\r\n' <-> '\n' conversions?
   
 
 One of the great things about IronPython is that you don't *need* any 
 wrappers - you access .NET objects natively (which in fact wrap the 
 lower level win32 API) - and the .NET APIs are usually not as bad as you 
 probably assume. ;-)
 
This thread might represent an argument that you *do* need wrappers ...

 You just have to be aware that line endings are '\r\n'. I'm not sure how 
 or if pywin32 handles this.
 
Presumably that awareness should be implemented by the unnecessary 
wrappers.

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Michael Foord
Steve Holden wrote:
 Michael Foord wrote:
   
 Steven Bethard wrote:
 
 On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
   
   
 Terry Reedy wrote:
 
 
 There are two normal ways for internal Python text to have \r\n:
 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
 same platform).
 2. Intentionally put there by a programmer.  If s/he also chooses default \n
 translation on output, \r followed by the translation of \n is correct.

   
   
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.
 
 
 Out of curiosity, why don't the Python wrappers for your Windows UI
 components do the appropriate '\r\n' <-> '\n' conversions?
   
   
 One of the great things about IronPython is that you don't *need* any 
 wrappers - you access .NET objects natively (which in fact wrap the 
 lower level win32 API) - and the .NET APIs are usually not as bad as you 
 probably assume. ;-)

 
 This thread might represent an argument that you *do* need wrappers ...

   
 You just have to be aware that line endings are '\r\n'. I'm not sure how 
 or if pywin32 handles this.

 
 Presumably that awareness should be implemented by the unnecessary 
 wrappers.
   

Well, it's an OS level difference and I thought that in general Python 
*doesn't* try to protect you from OS differences.

These different line endings are returned by the components - and making 
the string type aware of where it comes from and transform itself 
accordingly seems odd. It also leaves you with all sorts of other 
problems like string comparison (do you ignore differences in line 
endings?) and string length (strings would report different lengths on 
different sides of the .NET / IronPython boundary).

It is also different from how libraries like wxPython behave - where 
they *don't* protect you from OS differences and if a textbox has '\r\n' 
line endings - that is what you get...

Michael
http://www.manning.com/foord


 regards
   Steve
   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Bill Janssen
 Well, it's an OS level difference and I thought that in general Python 
 *doesn't* try to protect you from OS differences.

I think that's the key point.  In general, Python tries to present a
translucent interface to the OS in which OS differences can show
through, in contrast to other languages (Java?) which try to present a
complete abstraction of the underlying environment.  This makes Python
in general more useful, though it also makes it harder to write
portable code in Python, because you have to be aware of the potential
differences (and they aren't particularly well documented -- it's not
clear that they can be).

Bill


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Greg Ewing
Guido van Rossum wrote:
 The best solution for IronPython is probably to have the occasional
 wrapper around .NET APIs that translates between \r\n and \n on the
 boundary between Python and .NET;

That's probably true. I was responding to the notion
that IronPython shouldn't need any wrappers. To make
that really true would require IronPython to become
a different language that has a different canonical
representation of newlines.

It's fine with me to keep things as they are.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Greg Ewing
Nick Maclaren wrote:
 if Python's own
 interpretation is ambiguous, it is a sure recipe for different
 translators being incompatible,

Python's own interpretation is not ambiguous. The
problem at hand is people wanting to use some random
mixture of Python and .NET conventions.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Greg Ewing
Michael Foord wrote:
 It is also different from how libraries like wxPython behave - where 
 they *don't* protect you from OS differences and if a textbox has '\r\n' 
 line endings - that is what you get...

That sounds like an undesirable deficiency of those library
wrappers, especially cross-platform ones like wxPython.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Terry Reedy

Nick Maclaren [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
| The question is independent of what the outside system believes a
| text file should look like, and is solely what Python believes a
| sequence of characters should mean.  For example, does 'A\r\nB'
| mean that B is separated from A by one newline or two?

The grammar presupposes that Python code is divided into lines.  Any 
successful interpreter must adjust to the external source's idea of line 
endings.  This is implementation, not language definition.

The grammar itself has no notion of structure within Python string objects. 
The split method lets one define anything as chunk separators.

The builtin compile function that uses strings as code input specifies \n and 
only \n as a line ending.  The universal line-ending model of string output 
to files does the same.  So from either viewpoint, the unambiguous answer 
to your question is 'one'.

Terry Jan Reedy





Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-10-01 Thread Guido van Rossum
Does anyone else have the feeling that discussions with Mr. MacLaren
don't usually bear any fruit?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Nick Maclaren
Greg Ewing [EMAIL PROTECTED] wrote:
 
  Grrk.  That's the problem.  You don't get back what you have written
 
 You do as long as you *don't* use universal newlines mode
 for reading. This is the best that can be done, because
 universal newlines are inherently ambiguous.

I don't know PRECISELY what you mean by universal newlines mode,
and this issue is all about the details, so any response would
merely enhance the confusion.

 If you want universal newlines, you just have to accept
 that you can't also have \r characters meaning something
 other than newlines in your files. This is true regardless
 of what programming language or I/O model is being used.

No, that is not true, and I have used more than one model where
it wasn't.  Let's stick to models where newlines are special
characters - I prefer the ones where they are not, but that is
by the way.

Model 1:  certain characters can be used only in combination.
E.g. \f must occur immediately before (or after) a \n, which
it modifies.  \r is either a newline-with-overprint or must be
associated with a \n.  In both cases, only ONE of the alternatives
is permitted in the chosen model - the other use then becomes an
error (and raises an exception).

Model 2: (BCPL) there are a variety of newline characters, \n for
plain newline, \f for newline-with-form-feed and \r for newline-
with-overprint.  ALL cause a newline, with the associated property.
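
Model 1 can be sketched roughly as follows, assuming the variant in which `\r` must be associated with a following `\n` and any other use is an error. The function name `check_model_1` is hypothetical, and the choice of `ValueError` is only one way to "raise an exception".

```python
def check_model_1(text: str) -> str:
    """Return text unchanged, or raise if a \\r is not immediately followed by \\n."""
    for i, ch in enumerate(text):
        # Slicing avoids an IndexError when \r is the last character.
        if ch == "\r" and text[i + 1:i + 2] != "\n":
            raise ValueError(f"bare \\r at offset {i}")
    return text

accepted = check_model_1("A\r\nB")  # \r is paired with \n, so this passes
```

A string such as `"A\rB"` would be rejected under this sketch, which is the point of the model: the ambiguous use never reaches the rest of the program.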

Note that the above is what the program sees - what is written
to the outside world and how input is read is another matter.

But I can assure you, from my own and many other people's experience,
that neither of the above models cause the confusion being shown by
the postings in this thread.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread skip

Michael Actually, I usually get these strings from Windows UI
Michael components. A file containing '\r\n' is read in with '\r\n'
Michael being translated to '\n'. New user input is added containing
Michael '\r\n' line endings. The file is written out and now contains a
Michael mix of '\r\n' and '\r\r\n'.

So you need a translation layer between the UI component and your code.
Treat the component as a text file and perform the desired mapping.  Yes?

Skip


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Michael Foord
[EMAIL PROTECTED] wrote:
 Michael Actually, I usually get these strings from Windows UI
 Michael components. A file containing '\r\n' is read in with '\r\n'
 Michael being translated to '\n'. New user input is added containing
 Michael '\r\n' line endings. The file is written out and now contains a
 Michael mix of '\r\n' and '\r\r\n'.

 So you need a translation layer between the UI component and your code.
 Treat the component as a text file and perform the desired mapping.  Yes?

   

Actually the problem was reported by one of the IronPython developers on 
behalf of another user. We stick to using the .NET file I/O and so don't 
have a problem. The only time it is an issue for us is our tests, where 
we have string literals in our test code (where new lines are obviously 
'\n') and we do a manual 'replace'. Not very difficult.

It is just slightly ironic that the time Python 'gets it wrong' (for 
some value of wrong) is when you are using text mode for I/O :-)

Michael

 Skip

   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Nick Maclaren
Michael Foord [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
 
  Michael Actually, I usually get these strings from Windows UI
  Michael components. A file containing '\r\n' is read in with '\r\n'
  Michael being translated to '\n'. New user input is added containing
  Michael '\r\n' line endings. The file is written out and now contains a
  Michael mix of '\r\n' and '\r\r\n'.

  So you need a translation layer between the UI component and your code.
  Treat the component as a text file and perform the desired mapping.  Yes?
 
 Actually the problem was reported by one of the IronPython developers on 
 behalf of another user. We stick to using the .NET file I/O and so don't 
 have a problem. The only time it is an issue for us is our tests, where 
 we have string literals in our test code (where new lines are obviously 
 '\n') and we do a manual 'replace'. Not very difficult.
 
 It is just slightly ironic that the time Python 'gets it wrong' (for 
 some value of wrong) is when you are using text mode for I/O :-)

Plus ca change, 

That has been the problem for as long as I have been using the byte
stream model (nearly 40 years now).  Provided that you can get
control, OR there are well-defined semantics, you can sort things
out.  The semantics "we define only the trivial case, and the
programmer must do something arcane, undefined and system-dependent
for the rest" means that it is impossible for an interface to do
the 'right' translation unless it knows what each side of it is
assuming.

As I say, there are solutions.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Greg Ewing
Nick Maclaren wrote:
 I don't know PRECISELY what you mean by universal newlines mode

I mean precisely what Python means by the term: any of
\r, \n or \r\n represent a newline, and no distinction
is made between them.

You only need to use that if you don't know what convention
is being used by the file you're reading. And if you don't
know that, you've already lost information about what the
contents of the file means, and there's nothing that any
I/O system can do to get it back.

 Model 1:  certain characters can be used only in combination.
  ...

That's all fine if you know the file adheres to those
conventions. Just open it in binary mode and go for it.
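
To make the binary-mode suggestion concrete, here is a small sketch using the Python 3 `open()` built-in (an assumption, since the thread predates it). It writes a few bytes with mixed line endings to a temporary file and reads them back both ways; the file name is arbitrary.

```python
import os
import tempfile

raw = b"a\rb\r\nc\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(raw)

    # Binary mode: the bytes come back exactly as written.
    with open(path, "rb") as f:
        binary_view = f.read()

    # Default text mode (universal newlines): every flavour collapses to \n.
    with open(path, "r", encoding="ascii") as f:
        text_view = f.read()
```

Only `binary_view` still distinguishes `\r` from `\r\n`; once the text-mode reader has run, that information is gone.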

The I/O systems of C and/or Python are designed for
environments where the files *don't* adhere to conventions
as helpful as that. They're making the best of what they're
given.

 Note that the above is what the program sees - what is written
 to the outside world and how input is read is another matter.
 
 But I can assure you, from my own and many other people's experience,
 that neither of the above models cause the confusion being shown by
 the postings in this thread.

There's no confusion about how newlines are represented
*inside* a Python program. The convention is quite clear -
a newline is \n and only \n. Confusion only arises when
people try to process strings internally that don't adhere
to that convention.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Greg Ewing
Michael Foord wrote:
 We stick to using the .NET file I/O and so don't 
 have a problem. The only time it is an issue for us is our tests, where 
 we have string literals in our test code (where new lines are obviously 
 '\n')

If you're going to do that, you really need to be consistent
about it and have IronPython use \r\n internally for line endings
*everywhere*, including string literals.

 It is just slightly ironic that the time Python 'gets it wrong' (for 
 some value of wrong) is when you are using text mode for I/O :-)

I would say IronPython is getting it wrong by using inconsistent
internal representations of line endings.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiem! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-30 Thread Guido van Rossum
On 9/30/07, Greg Ewing [EMAIL PROTECTED] wrote:
 Michael Foord wrote:
  We stick to using the .NET file I/O and so don't
  have a problem. The only time it is an issue for us is our tests, where
  we have string literals in our test code (where new lines are obviously
  '\n')

 If you're going to do that, you really need to be consistent
 about it and have IronPython use \r\n internally for line endings
 *everywhere*, including string literals.

I don't know what you mean by "internally". There's lots of portable
code that uses the \n character in string literals (either to generate
line endings or to recognize them). That code can't suddenly be made
invalid. And changing all string literals that say \n to secretly
become \r\n would be worse than the \r <-> \n swap that some old
Apple tools used to do. (If len("\n") == 2, what would len("\r\n")
be?)

  It is just slightly ironic that the time Python 'gets it wrong' (for
  some value of wrong) is when you are using text mode for I/O :-)

 I would say IronPython is getting it wrong by using inconsistent
 internal representations of line endings.

Honestly, I find it hard to see much merit in this discussion. A
number of Python libraries, including print() and io.py, use \n to
represent line endings in memory, and translate these to/from
platform-appropriate line endings when reading/writing text files.
OTOH, some other APIs, for example, sockets talking various internet
protocols (from SMTP to HTTP) as well as most (all?) native .NET APIs,
use \r\n to represent line endings. There are any number of ways to
convert between these conventions, including various invocations of
s.replace() and s.splitlines() (the latter does a
universal-newlines-like thing). Applications can take care of this,
and APIs can choose to use either convention for line endings (or
both, in the case of input).
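
A few of the conversions mentioned above, sketched with plain `str` methods. The variable names are illustrative only; the order of the two `replace()` calls matters, because `\r\n` has to be handled before a bare `\r`.

```python
mixed = "one\r\ntwo\rthree\n"

# splitlines() recognises \r\n, \r and \n alike (and a few other separators).
lines = mixed.splitlines()

# Normalise to the in-memory \n convention; handle \r\n before bare \r.
as_lf = mixed.replace("\r\n", "\n").replace("\r", "\n")

# Convert already-normalised text to the \r\n convention used on the wire.
as_crlf = as_lf.replace("\n", "\r\n")
```

Applying the last step to text that still contains `\r\n` would produce `\r\r\n`, which is the mix-up discussed elsewhere in this thread.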

Yes, occasionally users get confused. Too bad. They'll have to learn
about this issue. The issue isn't going away by wishing it to go away;
it is a fundamental difference between Windows and Unix, and neither
is likely to change or disappear. Changing Python to use the Windows
convention internally isn't going to help one bit. Changing Python to
use the platform's convention is impossible without introducing a new
string escape that would mean \r\n on Windows and \n on Unix; and
given that there are legitimate reasons to sometimes deal with \r\n
explicitly even on Unix (and with just \n even on Windows) we wouldn't
be completely isolated from the issue. Changing APIs to not represent
the line ending as a character (as the Java I/O libraries do) would be
too big a change (and how would we distinguish between readline()
returning an empty line and EOF?) -- and I'm sure the issue still pops
up in plenty of places in Java.

The best solution for IronPython is probably to have the occasional
wrapper around .NET APIs that translates between \r\n and \n on the
boundary between Python and .NET; but one must be able to turn this
off or bypass the wrappers in cases where the data retrieved from one
.NET API is just passed straight on to another .NET API (and the
translation would just cause two redundant copies being made).
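
One possible shape for such a wrapper, as a sketch only. `TextBoxWrapper`, `FakeControl`, `to_python` and `to_dotnet` are hypothetical names invented here, and the `Text` attribute merely stands in for whatever a real .NET control exposes. The `translate` flag is the bypass: with it off, strings pass through untouched.

```python
def to_python(text: str) -> str:
    """Normalise \\r\\n (and any stray \\r) to the in-memory \\n convention."""
    return text.replace("\r\n", "\n").replace("\r", "\n")

def to_dotnet(text: str) -> str:
    """Emit \\r\\n; normalise first so an existing \\r\\n is not doubled."""
    return to_python(text).replace("\n", "\r\n")

class TextBoxWrapper:
    """Hypothetical wrapper around a control exposing a str `Text` attribute."""

    def __init__(self, control, translate=True):
        self._control = control
        self.translate = translate  # False: pass strings straight through

    def get_text(self):
        raw = self._control.Text
        return to_python(raw) if self.translate else raw

    def set_text(self, value):
        self._control.Text = to_dotnet(value) if self.translate else value

class FakeControl:
    """Stand-in for a real UI component, used only for this demonstration."""
    Text = "first\r\nsecond\r\n"

box = TextBoxWrapper(FakeControl())
python_side = box.get_text()           # \n-only text for Python code
box.set_text(python_side + "third\n")  # written back to the control as \r\n
```

Code that only shuttles data between two .NET APIs could construct the wrapper with `translate=False` and avoid the two redundant copies.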

Get used to it. End of discussion.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Guido van Rossum wrote:
 [snip..]
 Python *does* have its own I/O model. There are binary files and text
 files. For binary files, you write bytes and the semantic model is
 that of an array of bytes; byte indices are seek positions.

 For text files, the contents are considered to be Unicode, encoded as
 bytes in a binary file. So a text file always has an underlying binary
 file. Two translations take place, both of which have defaults varying
 by platform. One translation is encoding Unicode text into bytes upon
 output, and decoding bytes to Unicode text upon input. This can use
 any encoding supported by the encodings package.

 The other translation deals with line endings. Upon input, any of
 \r\n, \r, or \n is translated to a single \n by default (this is nhe
 universal newlines algorithm from Python 2.x). This can be tweaked
 or disabled. Upon output, \n is translated into a platform specific
 string chosen from \r\n, \r, or \n. This can also be disabled or
 overridden. Note that \r, when written, is never treated specially; if
 you want special processing for \r on output, you can write your own
 translation layer.
   
So the question is, that when a string containing '\r\n' is written to a 
file in text mode on a Windows platform, should it be written with the 
encoded representation of '\r\n' or '\r\r\n'?

Purity would dictate the latter and practicality the former (IMO)...

However, that would mean that round tripping a string would change it 
('\r\n' would be written as '\r\n' and then read as '\n') - on the other 
hand (particularly given that we are treating the data as text and not a 
binary blob) I don't see how writing '\r\r\n' would ever actually be 
useful in text.
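
The case in question can be reproduced on any platform with a short sketch, assuming the Python 3 `io` module and forcing the Windows-style output translation with `newline="\r\n"`. The helper name `write_like_windows` is made up for illustration.

```python
import io

def write_like_windows(text: str) -> bytes:
    """Encode text as a text-mode file opened with newline='\\r\\n' would."""
    buffer = io.BytesIO()
    writer = io.TextIOWrapper(buffer, encoding="ascii", newline="\r\n")
    writer.write(text)
    writer.flush()  # push the encoded bytes into the underlying buffer
    return buffer.getvalue()

# One line that came from a file (\n) and one from a UI component (\r\n).
on_disk = write_like_windows("from file\n" + "from UI\r\n")
```

Only the `\n` is translated on output, so the second line ends up on disk as `\r\r\n`: the "pure" behaviour described above.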

+1 on just writing '\r\n' from me.

Michael Foord
http://www.manning.com/foord


 That's all. There is nothing unimplementable or confusing in these
 specifications.

 Python doesn't care about record I/O on legacy OSes; it does care
 about variability found in practice between popular OSes.

 Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
 literals) or Unicode (in text literals). Again, no support for legacy
 systems that don't use ASCII or a superset.

 Legacy OSes are called that for a reason.

   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Terry Reedy

Michael Foord [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
| Guido van Rossum wrote:

[snip first part of nice summary of Python i/o model]

|  The other translation deals with line endings. Upon input, any of
|  \r\n, \r, or \n is translated to a single \n by default (this is nhe 
[sic]
|  universal newlines algorithm from Python 2.x). This can be tweaked
|  or disabled. Upon output, \n is translated into a platform specific
|  string chosen from \r\n, \r, or \n. This can also be disabled or
|  overridden. Note that \r, when written, is never treated specially; if
|  you want special processing for \r on output, you can write your own
|  translation layer.

| So the question is, that when a string containing '\r\n' is written to a
| file in text mode on a Windows platform, should it be written with the
| encoded representation of '\r\n' or '\r\r\n'?

I think Guido pretty clearly said that on output, the default behavior is 
that \r is nothing special.  If you want a special case exception, write a 
special case translator. +1 from me.
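
A minimal sketch of the output rule described above, assuming the io module as it exists in current Python 3. The helper name written_bytes is hypothetical, and newline='\r\n' stands in for the Windows default so the result is the same on any platform:

```python
import io


def written_bytes(text, newline):
    """Return the bytes a text-mode stream emits for `text`."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Only '\n' is translated on output; the '\r' already in the string is
# passed through untouched, so '\r\n' comes out as '\r\r\n'.
windows_style = written_bytes("one\r\ntwo\n", newline="\r\n")

# newline='' disables output translation entirely.
untranslated = written_bytes("one\r\ntwo\n", newline="")

print(windows_style)
print(untranslated)
```

The doubled '\r' in the first result is the behavior being debated: \r is nothing special on output, so it is written as-is ahead of the translated \n.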

To propose otherwise is to propose that the default semantic meaning of 
Python text objects depend on the platform that it might be 
output-translated for.  I believe the point of universal newline support 
was to get away from this.

| Purity would dictate the latter and practicality the former (IMO)...

I disagree.  Special case exceptions complicate both learnability and code 
readability and maintainability.  Simplicity is practicality.  The symmetry 
of 'platform-line-endings =input=> \n =output=> platform-line-endings' is both 
pure and practical.

| However, that would mean that round tripping a string would change it
| ('\r\n' would be written as '\r\n' and then read as '\n')

Whereas \r\r\n would be read back as \r\n, which is what should happen. 
Round-trip-ability is practical to me.
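
Whether '\r\r\n' reads back as '\r\n' depends on which input translation is in effect. The sketch below, again assuming the current Python 3 io module, compares a platform-only translation ('\r\n' to '\n', emulated here with a plain replace) against the default universal-newlines mode. The helper name read_text is hypothetical:

```python
import io


def read_text(data, newline):
    """Decode `data` through a text-mode stream with the given newline mode."""
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return stream.read()


on_disk = b"A\r\r\nB"

# Platform-only translation: just the '\r\n' pair collapses to '\n',
# so the extra '\r' survives and the original '\r\n' is recovered.
platform_only = on_disk.decode("ascii").replace("\r\n", "\n")

# Universal newlines (newline=None): the lone '\r' is itself a newline,
# so the same bytes read back as two line breaks.
universal = read_text(on_disk, newline=None)

print(repr(platform_only))
print(repr(universal))
```

Under these assumptions the round trip only holds for the platform-only translation; with universal newlines on input, the written '\r\r\n' does not come back as '\r\n'.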

| - on the other
| hand (particularly given that we are treating the data as text and not a
| binary blob) I don't see how writing '\r\r\n' would ever actually be
| useful in text.

There are two normal ways for internal Python text to have \r\n:
1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
same platform).
2. Intentionally put there by a programmer.  If s/he also chooses default \n 
translation on output, \r followed by the translation of \n is correct.

That leaves
1. Bugs due to ignorance or accident.  These should be repaired.
2. Other special situations, which can be handled by disabling, overriding, 
and layering the defaults.  This seems enough flexibility to me.

Terry Jan Reedy






Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Terry Reedy wrote:
 Michael Foord [EMAIL PROTECTED] wrote in message 
 news:[EMAIL PROTECTED]
 | Guido van Rossum wrote:

 [snip first part of nice summary of Python i/o model]

 |  The other translation deals with line endings. Upon input, any of
 |  \r\n, \r, or \n is translated to a single \n by default (this is the
 |  universal newlines algorithm from Python 2.x). This can be tweaked
 |  or disabled. Upon output, \n is translated into a platform specific
 |  string chosen from \r\n, \r, or \n. This can also be disabled or
 |  overridden. Note that \r, when written, is never treated specially; if
 |  you want special processing for \r on output, you can write your own
 |  translation layer.

 | So the question is, that when a string containing '\r\n' is written to a
 | file in text mode on a Windows platform, should it be written with the
 | encoded representation of '\r\n' or '\r\r\n'?

 I think Guido pretty clearly said that on output, the default behavior is 
 that \r is nothing special.  If you want a special case exception, write a 
 special case translator. +1 from me.

 To propose otherwise is to propose that the default semantic meaning of 
 Python text objects depend on the platform that it might be 
 output-translated for.  I believe the point of universal newline support 
 was to get away from this.

 | Purity would dictate the latter and practicality the former (IMO)...

 I disagree.  Special case exceptions complicate both learnability and code 
 readability and maintainability.  Simplicity is practicality.  The symmetry 
 of 'platform-line-endings =input=> \n =output=> platform-line-endings' is both 
 pure and practical.

 | However, that would mean that round tripping a string would change it
 | ('\r\n' would be written as '\r\n' and then read as '\n')

 Whereas \r\r\n would be read back as \r\n, which is what should happen. 
 Round-trip-ability is practical to me.

 | - on the other
 | hand (particularly given that we are treating the data as text and not a
 | binary blob) I don't see how writing '\r\r\n' would ever actually be
 | useful in text.

 There are two normal ways for internal Python text to have \r\n:
 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
 same platform).
 2. Intentionally put there by a programmer.  If s/he also chooses default \n 
 translation on output, \r followed by the translation of \n is correct.
   
Actually, I usually get these strings from Windows UI components. A file 
containing '\r\n' is read in with '\r\n' being translated to '\n'. New 
user input is added containing '\r\n' line endings. The file is written 
out and now contains a mix of '\r\n' and '\r\r\n'.
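
The sequence described here can be reproduced with in-memory streams. This is only a sketch assuming the current Python 3 io module; the file contents and the user_input string are made up for illustration, and newline='\r\n' stands in for the Windows default on output:

```python
import io

# A file on disk with Windows line endings.
stored = b"first\r\n"

# Reading in text mode (universal newlines) turns '\r\n' into '\n'.
loaded = io.TextIOWrapper(io.BytesIO(stored), encoding="ascii", newline=None).read()

# Hypothetical text from a UI component that uses '\r\n' natively.
user_input = "second\r\n"
combined = loaded + user_input

# Writing in text mode translates every '\n', including the one that
# already follows a '\r' in the UI text.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
out.write(combined)
out.flush()
result = raw.getvalue()

print(result)
```

The first line keeps a clean '\r\n' while the appended line ends in '\r\r\n', which is the mix of endings reported above.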

Michael




Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Steven Bethard
On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
 Terry Reedy wrote:
  There are two normal ways for internal Python text to have \r\n:
  1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
  same platform).
  2. Intentionally put there by a programmer.  If s/he also chooses default \n
  translation on output, \r followed by the translation of \n is correct.
 
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.

Out of curiosity, why don't the Python wrappers for your Windows UI
components do the appropriate '\r\n' -> '\n' conversions?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Steven Bethard wrote:
 On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
   
 Terry Reedy wrote:
 
 There are two normal ways for internal Python text to have \r\n:
 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
 same platform).
 2. Intentionally put there by a programmer.  If s/he also chooses default \n
 translation on output, \r followed by the translation of \n is correct.

   
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.
 

 Out of curiosity, why don't the Python wrappers for your Windows UI
 components do the appropriate '\r\n' -> '\n' conversions?
   

One of the great things about IronPython is that you don't *need* any 
wrappers - you access .NET objects natively (which in fact wrap the 
lower level win32 API) - and the .NET APIs are usually not as bad as you 
probably assume. ;-)

You just have to be aware that line endings are '\r\n'. I'm not sure how 
or if pywin32 handles this.

Michael

 STeVe
   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Steven Bethard
On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
 Steven Bethard wrote:
  On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
 
  Terry Reedy wrote:
 
  There are two normal ways for internal Python text to have \r\n:
  1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
  same platform).
  2. Intentionally put there by a programmer.  If s/he also chooses default \n
  translation on output, \r followed by the translation of \n is correct.
 
 
  Actually, I usually get these strings from Windows UI components. A file
  containing '\r\n' is read in with '\r\n' being translated to '\n'. New
  user input is added containing '\r\n' line endings. The file is written
  out and now contains a mix of '\r\n' and '\r\r\n'.
 
  Out of curiosity, why don't the Python wrappers for your Windows UI
  components do the appropriate '\r\n' -> '\n' conversions?

 One of the great things about IronPython is that you don't *need* any
 wrappers - you access .NET objects natively (which in fact wrap the
 lower level win32 API) - and the .NET APIs are usually not as bad as you
 probably assume. ;-)

 You just have to be aware that line endings are '\r\n'.

Ahh, I see.  So all the .NET components function like Python 3.0's
io.open(..., newline='\n'), where no translation of \n (to or from
\r\n) is performed.
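
For reference, a small sketch of what newline='\n' does in the open() of current Python 3 (used here as a stand-in for the io.open mentioned above). The temporary file name is arbitrary:

```python
import os
import tempfile

text = "a\r\nb\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # With newline='\n' nothing is translated on output...
    with open(path, "w", encoding="ascii", newline="\n") as f:
        f.write(text)

    # ...so the bytes on disk match the string exactly.
    with open(path, "rb") as f:
        raw = f.read()

    # On input, lines end only at '\n' and are returned untranslated.
    with open(path, "r", encoding="ascii", newline="\n") as f:
        round_tripped = f.read()

print(raw)
print(repr(round_tripped))
```

Because neither direction rewrites anything, a '\r\n' in the string stays a '\r\n' in the file and comes back unchanged.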

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Steven Bethard wrote:
 On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:
   
 Steven Bethard wrote:
 
 On 9/29/07, Michael Foord [EMAIL PROTECTED] wrote:

   
 Terry Reedy wrote:

 
 There are two normal ways for internal Python text to have \r\n:
 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
 same platform).
 2. Intentionally put there by a programmer.  If s/he also chooses default \n
 translation on output, \r followed by the translation of \n is correct.


   
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.
 
 Out of curiosity, why don't the Python wrappers for your Windows UI
 components do the appropriate '\r\n' -> '\n' conversions?
   
 One of the great things about IronPython is that you don't *need* any
 wrappers - you access .NET objects natively (which in fact wrap the
 lower level win32 API) - and the .NET APIs are usually not as bad as you
 probably assume. ;-)

 You just have to be aware that line endings are '\r\n'.
 

 Ahh, I see.  So all the .NET components function like Python 3.0's
 io.open(..., newline='\n'), where no translation of \n (to or from
 \r\n) is performed.
   

Effectively yes. Although for Python compatibility, opening a file in 
text mode using the python 'open' or 'file' will behave in the usual way.

Michael

 STeVe
   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Paul Moore
 Actually, I usually get these strings from Windows UI components. A file
 containing '\r\n' is read in with '\r\n' being translated to '\n'. New
 user input is added containing '\r\n' line endings. The file is written
 out and now contains a mix of '\r\n' and '\r\r\n'.

 Out of curiosity, why don't the Python wrappers for your Windows UI
 components do the appropriate '\r\n' -> '\n' conversions?

 One of the great things about IronPython is that you don't *need* any
 wrappers - you access .NET objects natively (which in fact wrap the
 lower level win32 API) - and the .NET APIs are usually not as bad as you
 probably assume. ;-)

Given the current lengthy discussion about newline translation, maybe
it isn't such a great thing :-)

Seriously, you do need a wrapper in this particular case - to convert
the .NET line ending convention to Python's. The issue here is that
such a wrapper is so trivial that it's usually easier to simply do
the translation with ad hoc .replace('\r\n', '\n') calls. The problem
comes when you accidentally forget a translation - then you get the
clash between the .NET (\r\n) and Python (\n) models. But of course,
the solution in that case is to simply add the omitted translation,
not to change Python's IO model.
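
A sketch of how small such a wrapper can be. The function names from_dotnet and to_dotnet are hypothetical and not part of any library:

```python
def from_dotnet(text):
    """Convert .NET-style '\\r\\n' line endings to Python's '\\n'."""
    return text.replace("\r\n", "\n")


def to_dotnet(text):
    """Convert '\\n' line endings to '\\r\\n' for a .NET component.

    Normalizing first means text that already contains '\\r\\n' is not
    turned into '\\r\\r\\n'.
    """
    return from_dotnet(text).replace("\n", "\r\n")


print(repr(from_dotnet("a\r\nb")))
print(repr(to_dotnet("a\nb\r\nc")))
```

Calling from_dotnet on every string that comes out of a UI component, and to_dotnet on every string that goes in, keeps all text inside the program in the '\n' convention.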

Of course, all this grand theory is just that - theory. In my case, it
helped me understand what's going on, but that's all. For real life
code, you just add the appropriate replace() calls. Whether theory
helps you keep track of where replace() is needed, or whether you just
know, doesn't really matter much.

But regardless - the Python IO model doesn't need changing. (Not even
2.x, and the py3k model is even better in this regard).

Paul.


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Terry Reedy

Michael Foord [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
| Terry Reedy wrote:
|  There are two normal ways for internal Python text to have \r\n:
|  1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
|  same platform).
|  2. Intentionally put there by a programmer.  If s/he also chooses default \n
|  translation on output, \r followed by the translation of \n is correct.
| 
| Actually, I usually get these strings from Windows UI components. A file
| containing '\r\n' is read in with '\r\n' being translated to '\n'. New
| user input is added containing '\r\n' line endings. The file is written
| out and now contains a mix of '\r\n' and '\r\r\n'.

I covered this in the part you snipped:

2. Other special situations, which can be handled by disabling, overriding,
and layering the defaults.  This seems enough flexibility to me.

While mixing input like this may seem 'normal' to you, I believe it is 
'special' considering the total Python community.  I can think of at least 
4 decent solutions, depending on the details of the input and what you do 
with it.

tjr





Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Greg Ewing
Michael Foord wrote:
 One of the great things about IronPython is that you don't *need* any 
 wrappers - you access .NET objects natively

But it seems that you really *do* need wrappers to
deal with the line endings problem, whether they're
provided automatically or you do it yourself manually.

This is reminiscent of the C-string vs. Pascal-string
fiasco when Apple switched from Pascal to C as their
main application programming language. Some development
environments provided glue code that did the translation
automatically; others required you to do it yourself,
which was a huge nuisance.

--
Greg


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-26 Thread Michael Foord
Dino Viehland wrote:
 My understanding is that users can write code that uses only \n and Python 
 will write the end-of-line character(s) that are appropriate for the platform 
 when writing to a file.  That's what I meant by uses \n for everything 
 internally.

 But if you write \r\n to a file Python completely ignores the presence of the 
 \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting 
 stream.  My last question is simply does anyone find writing \r\r\n when the 
 original string contained \r\n a useful behavior - personally I don't see how 
 it is.

 But Guido's response makes this sound like it's a problem w/ VC++ stdio 
 implementation and not something that Python is explicitly doing.  Anyway, 
 it might be useful to have a text-mode file that you can write \r\n to and 
 only get \r\n in the resulting file.  But if the general sentiment is 
 s.replace('\r', '') is the way to go we can advise our users of the behavior 
 when interoperating w/ APIs that return \r\n in strings.
   

We always do replace('\r\n','\n') but same difference...
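
The two replacements agree on text that only contains '\r\n' pairs, but not when a lone '\r' is present. A quick sketch with a made-up sample string:

```python
sample = "a\r\nb\rc"

# Dropping every '\r' also removes the lone one between b and c.
strip_cr = sample.replace("\r", "")

# Replacing only the '\r\n' pair leaves the lone '\r' in place.
pair_only = sample.replace("\r\n", "\n")

print(repr(strip_cr))
print(repr(pair_only))
```

For strings coming from APIs that always use '\r\n', either form yields the same result.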

Michael


 -Original Message-
 From: Martin v. Löwis [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, September 26, 2007 3:01 PM
 To: Dino Viehland
 Cc: python-dev@python.org
 Subject: Re: [Python-Dev] New lines, carriage returns, and Windows

   
 This works great as long as you stay within an entirely Python world.
 Because Python uses \n for everything internally
 

 I think you misunderstand fairly significantly how this all works
 together. Python does not use \n for everything internally. Python
 is well capable of representing \r separately, and does so if you
 ask it to.

   
 So I'm curious: Is there a reason this behavior is useful that I'm
 missing?
 

 I think you are missing how it works in the first place (or else
 you failed to communicate to me what precise behavior you find
 puzzling).

 Regards,
 Martin

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk

   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-26 Thread Dino Viehland
And if this is fine for you, given that you may have the largest WinForms / 
IronPython code base, I tend to think the replace may be reasonable.  But we 
have had someone get surprised by this behavior.

-Original Message-
From: Michael Foord [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 26, 2007 3:15 PM
To: Dino Viehland
Cc: python-dev@python.org
Subject: Re: [python] Re: [Python-Dev] New lines, carriage returns, and Windows

Dino Viehland wrote:
 My understanding is that users can write code that uses only \n and Python 
 will write the end-of-line character(s) that are appropriate for the platform 
 when writing to a file.  That's what I meant by uses \n for everything 
 internally.

 But if you write \r\n to a file Python completely ignores the presence of the 
 \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting 
 stream.  My last question is simply does anyone find writing \r\r\n when the 
 original string contained \r\n a useful behavior - personally I don't see how 
 it is.

 But Guido's response makes this sound like it's a problem w/ VC++ stdio 
 implementation and not something that Python is explicitly doing.  Anyway, 
 it might be useful to have a text-mode file that you can write \r\n to and 
 only get \r\n in the resulting file.  But if the general sentiment is 
 s.replace('\r', '') is the way to go we can advise our users of the behavior 
 when interoperating w/ APIs that return \r\n in strings.


We always do replace('\r\n','\n') but same difference...

Michael


 -Original Message-
 From: Martin v. Löwis [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, September 26, 2007 3:01 PM
 To: Dino Viehland
 Cc: python-dev@python.org
 Subject: Re: [Python-Dev] New lines, carriage returns, and Windows


 This works great as long as you stay within an entirely Python world.
 Because Python uses \n for everything internally


 I think you misunderstand fairly significantly how this all works
 together. Python does not use \n for everything internally. Python
 is well capable of representing \r separately, and does so if you
 ask it to.


 So I'm curious: Is there a reason this behavior is useful that I'm
 missing?


 I think you are missing how it works in the first place (or else
 you failed to communicate to me what precise behavior you find
 puzzling).

 Regards,
 Martin

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk





Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-26 Thread Michael Foord
Dino Viehland wrote:
 And if this is fine for you, given that you may have the largest WinForms / 
 IronPython code base, I tend to think the replace may be reasonable.  But we 
 have had someone get surprised by this behavior.
   

It is a slight impedance mismatch between Python and Windows - but isn't 
restricted to IronPython, so changing Python semantics doesn't seem like 
the right answer.

Alternatively a more intelligent text mode (that writes '\n' as '\r\n' 
and '\r\n' as '\r\n' on Windows) doesn't sound like *such* a bad idea - 
but you will still get caught out by this. A file read in text mode 
will have '\r\n' translated to '\n'. Setting that string on a winforms 
component will still do the wrong thing. Better to be aware of the 
difference and use binary mode.
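
One way the "more intelligent" output translation could look, done by hand and written through a binary stream. This is a sketch only; to_crlf is a hypothetical helper, not an existing Python feature:

```python
import io
import re

# Matches a '\n' that is not already preceded by '\r'.
_LONE_LF = re.compile(r"(?<!\r)\n")


def to_crlf(text):
    """Turn lone '\\n' into '\\r\\n' and leave existing '\\r\\n' alone."""
    return _LONE_LF.sub("\r\n", text)


mixed = "a\nb\r\nc\n"

# Binary mode: no further newline translation happens on write.
raw = io.BytesIO()
raw.write(to_crlf(mixed).encode("ascii"))
written = raw.getvalue()

print(written)
```

The conversion is idempotent, so applying it to text that is already in '\r\n' form changes nothing.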

Michael

 -Original Message-
 From: Michael Foord [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, September 26, 2007 3:15 PM
 To: Dino Viehland
 Cc: python-dev@python.org
 Subject: Re: [python] Re: [Python-Dev] New lines, carriage returns, and 
 Windows

 Dino Viehland wrote:
   
 My understanding is that users can write code that uses only \n and Python 
 will write the end-of-line character(s) that are appropriate for the 
 platform when writing to a file.  That's what I meant by uses \n for 
 everything internally.

 But if you write \r\n to a file Python completely ignores the presence of 
 the \r and transforms the \n into a \r\n anyway, hence the \r\r in the 
 resulting stream.  My last question is simply does anyone find writing 
 \r\r\n when the original string contained \r\n a useful behavior - 
 personally I don't see how it is.

 But Guido's response makes this sound like it's a problem w/ VC++ stdio 
 implementation and not something that Python is explicitly doing.  Anyway, 
 it might be useful to have a text-mode file that you can write \r\n to and 
 only get \r\n in the resulting file.  But if the general sentiment is 
 s.replace('\r', '') is the way to go we can advise our users of the behavior 
 when interoperating w/ APIs that return \r\n in strings.

 

 We always do replace('\r\n','\n') but same difference...

 Michael

   
 -Original Message-
 From: Martin v. Löwis [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, September 26, 2007 3:01 PM
 To: Dino Viehland
 Cc: python-dev@python.org
 Subject: Re: [Python-Dev] New lines, carriage returns, and Windows


 
 This works great as long as you stay within an entirely Python world.
 Because Python uses \n for everything internally

   
 I think you misunderstand fairly significantly how this all works
 together. Python does not use \n for everything internally. Python
 is well capable of representing \r separately, and does so if you
 ask it to.


 
 So I'm curious: Is there a reason this behavior is useful that I'm
 missing?

   
 I think you are missing how it works in the first place (or else
 you failed to communicate to me what precise behavior you find
 puzzling).

 Regards,
 Martin

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk


 


   
