Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 10:52 PM, came the following characters from the keyboard of Martin v. Löwis: C. File on disk with the invalid surrogate code, accessed via the str interface, no decoding happens, matches in memory the file on disk with the byte that translates to the same surrogate, ac

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
On Wed, Apr 29, 2009 at 07:45, "Martin v. Löwis" wrote: > Your claim was > that PEP 383 may have unfortunate effects on Windows, No, I simply think that PEP 383 is not sufficiently specified to be able to tell. > and I'm telling > you that it won't, because the behavior of Python on Windows w

Re: [Python-Dev] Python-Dev PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> I would like utility functions to perform: > os-bytes->funny-encoded > funny-encoded->os-bytes > or explicit example code snippets for same in the PEP text. Done! Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mail
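The two conversions requested above can be sketched in a few lines. This is a minimal sketch, not the snippet that was added to the PEP: the helper names are hypothetical, and it assumes the error handler under the name it later shipped with in Python 3.1+, "surrogateescape" (the thread calls it "python-escape" or "utf-8b").

```python
# Hypothetical helper names, chosen for illustration only.
# "surrogateescape" is the name of the PEP 383 error handler in Python 3.1+.

def os_bytes_to_str(raw: bytes, encoding: str = "utf-8") -> str:
    """os-bytes -> funny-encoded: undecodable bytes become lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def str_to_os_bytes(name: str, encoding: str = "utf-8") -> bytes:
    """funny-encoded -> os-bytes: lone surrogates turn back into raw bytes."""
    return name.encode(encoding, "surrogateescape")


raw = b"caf\xe9.txt"            # Latin-1 bytes, not valid UTF-8
name = os_bytes_to_str(raw)     # the 0xE9 byte is escaped as U+DCE9
round_tripped = str_to_os_bytes(name)
```

In current Python, os.fsdecode() and os.fsencode() perform the same conversions using the file system encoding.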

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> I'm more concerned with your (yours? someone else's?) mention of shift > characters. I'm unfamiliar with these encodings: to translate such a > thing into a Latin example, is it the case that there are schemes with > valid encodings that look like: > > [SHIFT] a b c > > which would produce "A

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
>> The Python UTF-8 codec will happily encode half-surrogates; people argue >> that it is a bug that it does so, however, it would help in this >> specific case. > > Can we use this encoding scheme for writing into files as well? We've > turned the filename with undecodable bytes into a string wi

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
>>> C. File on disk with the invalid surrogate code, accessed via the str >>> interface, no decoding happens, matches in memory the file on disk with >>> the byte that translates to the same surrogate, accessed via the bytes >>> interface. Ambiguity. >> >> Is that an alternative to A and B? > > I

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Martin v. Löwis
> The wide APIs use UTF-16. UTF-16 suffers from the same problem as > UTF-8: not all sequences of words are valid UTF-16 sequences. In > particular, sequences containing isolated surrogate pairs are not > well-formed according to the Unicode standard. Therefore, the existence > of a wide charact
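The quoted point about ill-formed UTF-16 can be checked from Python. A minimal sketch, assuming a current CPython 3 rather than the 2009 py3k trunk under discussion: the strict utf-16-le codec rejects a lone surrogate code unit, while the "surrogatepass" error handler (not mentioned in the quoted message) lets it through.

```python
# A single UTF-16-LE code unit, 0xDC00: a low surrogate with no high surrogate.
lone_low_surrogate = b"\x00\xdc"

try:
    lone_low_surrogate.decode("utf-16-le")
    strict_decode_ok = True
except UnicodeDecodeError:
    # The strict codec treats the isolated surrogate as ill-formed input.
    strict_decode_ok = False

# With "surrogatepass" the code unit is passed through as a lone surrogate.
lenient = lone_low_surrogate.decode("utf-16-le", "surrogatepass")
```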

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
> > It cannot crash Python; it can only crash > hypothetical third-party programs or libraries with deficient error > checking and > unreasonable assumptions about input data. The error checking isn't necessarily deficient. For example, a safe and legitimate thing to do is for third party librar

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 4:06 PM, came the following characters from the keyboard of Cameron Simpson: I think I may be able to resolve Glenn's issues with the scheme lower down (through careful use of definitions and hand waving). Close. You at least resolved what you thought my issue was

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 7:40 PM, came the following characters from the keyboard of R. David Murray: On Tue, 28 Apr 2009 at 13:37, Glenn Linderman wrote: C. File on disk with the invalid surrogate code, accessed via the str interface, no decoding happens, matches in memory the file on disk w

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Cameron Simpson
On 28Apr2009 13:37, Glenn Linderman wrote: > On approximately 4/28/2009 1:25 PM, came the following characters from > the keyboard of Martin v. Löwis: >>> The UTF-8b representation suffers from the same potential ambiguities as >>> the PUA characters... >> >> Not at all the same ambiguities. He

[Python-Dev] Proposed: a new function-based C API for declaring Python types

2009-04-28 Thread Larry Hastings
EXECUTIVE SUMMARY I've written a patch against py3k trunk creating a new function-based API for creating extension types in C. This allows PyTypeObject to become a (mostly) private structure. THE PROBLEM Here's how you create an extension type using the current API. * First, find some cod

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Toshio Kuratomi
Martin v. Löwis wrote: >> Since the serialization of the Unicode string is likely to use UTF-8, >> and the string for such a file will include half surrogates, the >> application may raise an exception when encoding the names for a >> configuration file. These encoding exceptions will be as rare a

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Cameron Simpson
On 28Apr2009 14:37, Thomas Breuel wrote: | But the biggest problem with the proposal is that it isn't needed: if you | want to be able to turn arbitrary byte sequences into unicode strings and | back, just set your encoding to iso8859-15. That already works and it | doesn't require any changes.

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread R. David Murray
On Tue, 28 Apr 2009 at 13:37, Glenn Linderman wrote: C. File on disk with the invalid surrogate code, accessed via the str interface, no decoding happens, matches in memory the file on disk with the byte that translates to the same surrogate, accessed via the bytes interface. Ambiguity. Unles

Re: [Python-Dev] Python-Dev PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Cameron Simpson
On 28Apr2009 11:49, Antoine Pitrou wrote: | Paul Moore writes: | > | > I've yet to hear anyone claim that they would have an actual problem | > with a specific piece of code they have written. | | Yep, that's the problem. Lots of theoretical problems no one has ever encountered | bro

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Toshio Kuratomi
Zooko O'Whielacronx wrote: > On Apr 28, 2009, at 6:46 AM, Hrvoje Niksic wrote: >> If you switch to iso8859-15 only in the presence of undecodable UTF-8, >> then you have the same round-trip problem as the PEP: both b'\xff' and >> b'\xc3\xbf' will be converted to u'\u00ff' without a way to >> unambi
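The round-trip problem Hrvoje Niksic describes can be reproduced directly. A minimal sketch, assuming the "fall back to iso8859-15 only when UTF-8 decoding fails" strategy from the quoted text; the function name is hypothetical.

```python
def decode_with_latin_fallback(raw: bytes) -> str:
    """Try UTF-8 first; fall back to ISO 8859-15 for undecodable input."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso8859-15")


from_invalid_utf8 = decode_with_latin_fallback(b"\xff")      # fallback path
from_valid_utf8 = decode_with_latin_fallback(b"\xc3\xbf")    # UTF-8 path

# Two different file names on disk now map to the same string.
collision = from_invalid_utf8 == from_valid_utf8
```

Once both names have collapsed to the same string, there is no way to tell which bytes to write back.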

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 2:01 PM, came the following characters from the keyboard of MRAB: Glenn Linderman wrote: On approximately 4/28/2009 11:55 AM, came the following characters from the keyboard of MRAB: I've been thinking of "python-escape" only in terms of UTF-8, the only encoding menti

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Cameron Simpson
I think I may be able to resolve Glenn's issues with the scheme lower down (through careful use of definitions and hand waving). On 27Apr2009 23:52, Glenn Linderman wrote: > On approximately 4/27/2009 7:11 PM, came the following characters from > the keyboard of Cameron Simpson: [...] >> There

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 2:02 PM, came the following characters from the keyboard of Martin v. Löwis: Glenn Linderman wrote: On approximately 4/28/2009 1:25 PM, came the following characters from the keyboard of Martin v. Löwis: The UTF-8b representation suffers from the same potential ambigu

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Antoine Pitrou
Thomas Breuel writes: > > And, in fact, Windows Vista happily creates files with malformed UTF-16 encodings, and os.listdir() happily returns them. The PEP won't change that, so what's the problem exactly? > Under your proposal, passing the output from a correctly implemented file s

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
> > On Windows, the Wide APIs are already used throughout the code base, > e.g. SetEnvironmentVariableW/_wenviron. If you need to find out the > specific API for a specific functionality, please read the source code. > [...] > No, I don't assume that. I assume that all functions are strictly > ava

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
Glenn Linderman wrote: > On approximately 4/28/2009 1:25 PM, came the following characters from > the keyboard of Martin v. Löwis: >>> The UTF-8b representation suffers from the same potential ambiguities as >>> the PUA characters... >> >> Not at all the same ambiguities. Here, again, the two choi

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread MRAB
Glenn Linderman wrote: On approximately 4/28/2009 11:55 AM, came the following characters from the keyboard of MRAB: I've been thinking of "python-escape" only in terms of UTF-8, the only encoding mentioned in the PEP. In UTF-8, bytes 0x00 to 0x7F are decodable. UTF-8 is only mentioned in the

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> Others have made this suggestion, and it is helpful to the PEP, but not > sufficient. As implemented as an error handler, I'm not sure that the > b'\xed\xb3\xbf' sequence would trigger the error handler, if the UTF-8 > decoder is happy with it. Which, in my testing, it is. Rest assured that th

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 1:25 PM, came the following characters from the keyboard of Martin v. Löwis: The UTF-8b representation suffers from the same potential ambiguities as the PUA characters... Not at all the same ambiguities. Here, again, the two choices: A. use PUA characters to repres

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 6:01 AM, came the following characters from the keyboard of Lino Mastrodomenico: 2009/4/28 Glenn Linderman : The switch from PUA to half-surrogates does not resolve the issues with the encoding not being a 1-to-1 mapping, though. The very fact that you think you can

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 11:55 AM, came the following characters from the keyboard of MRAB: I've been thinking of "python-escape" only in terms of UTF-8, the only encoding mentioned in the PEP. In UTF-8, bytes 0x00 to 0x7F are decodable. UTF-8 is only mentioned in the sense of having special

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> The UTF-8b representation suffers from the same potential ambiguities as > the PUA characters... Not at all the same ambiguities. Here, again, the two choices: A. use PUA characters to represent undecodable bytes, in particular for UTF-8 (the PEP actually never proposed this to happen).

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Martin v. Löwis
MRAB wrote: > Martin v. Löwis wrote: >>> Furthermore, I don't believe that PEP 383 works consistently on Windows, >> >> What makes you say that? PEP 383 will have no effect on Windows, >> compared to the status quo, whatsoever. >> > You could argue that if Windows is actually returning UTF-16 with

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Martin v. Löwis
> Your proposal says that utf-8b would be used for file systems, but then > you also say that it might be used for command line arguments and > environment variables. So, which specific APIs will it be used with on > Windows and on POSIX systems? On Windows, the Wide APIs are already used through

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-28 Thread Zooko O'Whielacronx
On Apr 28, 2009, at 13:01 PM, Thomas Breuel wrote: (2) Should the default UTF-8 encoder for file system operations be allowed to generate illegal byte sequences? I think that's a definite no; if I set the encoding for a device to UTF-8, I never want Python to try to write illegal UTF-8 stri

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
On Tue, Apr 28, 2009 at 20:45, "Martin v. Löwis" wrote: > > Furthermore, I don't believe that PEP 383 works consistently on Windows, > > What makes you say that? PEP 383 will have no effect on Windows, > compared to the status quo, whatsoever. > That's what you believe, but it's not clear to me

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 10:53 AM, came the following characters from the keyboard of James Y Knight: On Apr 28, 2009, at 2:50 AM, Martin v. Löwis wrote: James Y Knight wrote: Hopefully it can be assumed that your locale encoding really is a non-overlapping superset of ASCII, as is required

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread MRAB
Martin v. Löwis wrote: Furthermore, I don't believe that PEP 383 works consistently on Windows, What makes you say that? PEP 383 will have no effect on Windows, compared to the status quo, whatsoever. You could argue that if Windows is actually returning UTF-16 with half surrogates that they

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Zooko O'Whielacronx
On Apr 28, 2009, at 6:46 AM, Hrvoje Niksic wrote: Are you proposing to unconditionally encode file names as iso8859-15, or to do so only when undecodeable bytes are encountered? For what it is worth, what we have previously planned to do for the Tahoe project is the second of these -- decod

[Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-28 Thread Thomas Breuel
I think we should break up this problem into several parts: (1) Should the default UTF-8 decoder fail if it gets an illegal byte sequence. It's probably OK for the default decoder to be lenient in some way (see below). (2) Should the default UTF-8 encoder for file system operations be allowed to

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread MRAB
James Y Knight wrote: On Apr 28, 2009, at 2:50 AM, Martin v. Löwis wrote: James Y Knight wrote: Hopefully it can be assumed that your locale encoding really is a non-overlapping superset of ASCII, as is required by POSIX... Can you please point to the part of the POSIX spec that says that s

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Glenn Linderman
On approximately 4/28/2009 10:00 AM, came the following characters from the keyboard of Martin v. Löwis: An alternative that doesn't suffer from the risk of not being able to store decoded strings would have been the use of PUA characters, but people rejected it because of the potential ambigui

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Martin v. Löwis
> Furthermore, I don't believe that PEP 383 works consistently on Windows, What makes you say that? PEP 383 will have no effect on Windows, compared to the status quo, whatsoever. Regards, Martin

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
> > However, it is "mission creep": Martin didn't volunteer to > write a PEP for it, he volunteered to write a PEP to solve the > "roundtrip the value of os.listdir()" problem. And he succeeded, up > to some minor details. Yes, it solves that problem. But that doesn't come without cost. Most i

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread James Y Knight
On Apr 28, 2009, at 2:50 AM, Martin v. Löwis wrote: James Y Knight wrote: Hopefully it can be assumed that your locale encoding really is a non-overlapping superset of ASCII, as is required by POSIX... Can you please point to the part of the POSIX spec that says that such overlapping is forb

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> If the PEP depends on this being changed, it should be mentioned in the > PEP. The PEP says that the utf-8b codec decodes invalid bytes into low surrogates. I have now clarified that a strict definition of UTF-8 is assumed for utf-8b. Regards, Martin

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> Since the serialization of the Unicode string is likely to use UTF-8, > and the string for such a file will include half surrogates, the > application may raise an exception when encoding the names for a > configuration file. These encoding exceptions will be as rare as the > unusual names (whic
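The concern quoted above, that serializing an escaped file name as UTF-8 may raise, can be illustrated as follows. A minimal sketch, assuming a current CPython 3 where the PEP's handler is available as "surrogateescape"; the variable names are illustrative.

```python
# A file name containing the undecodable byte 0xFF, decoded the PEP 383 way.
name = b"\xff".decode("utf-8", "surrogateescape")   # contains U+DCFF

try:
    # A strict UTF-8 encoder, as used when writing e.g. a configuration file.
    name.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    # Lone surrogates are not encodable in strict UTF-8.
    strict_encode_ok = False

# The same handler used for decoding restores the original byte.
restored = name.encode("utf-8", "surrogateescape")
```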

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Martin v. Löwis
> It does solve this issue, because (unlike e.g. U+F01FF) '\udcff' is > not a valid Unicode character (not a character at all, really) and the > only way you can put this in a POSIX filename is if you use a very > lenient UTF-8 encoder that gives you b'\xed\xb3\xbf'. > > Since this byte sequence

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Martin v. Löwis
> If we follow your approach, that ISO8859-15 string will get turned into > an escaped unicode string inside Python. If I understand your proposal > correctly, if it's a output file name and gets passed to Python's open > function, Python will then decode that string and end up with an > ISO8859-1

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Stephen J. Turnbull
Thomas Breuel writes: > PEP 383 doesn't make it any easier; it just turns one set of > problems into another. That's false. There is an interesting class of problems of the form "get a list of names from the OS and allow the user to select from it, and retrieve corresponding content." People

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Stephen J. Turnbull
Paul Moore writes: > But it seems to me that there is an assumption that problems will > arise when code gets a potentially funny-decoded string and doesn't > know where it came from. > > Is that a real concern? Yes, it's a real concern. I don't think it's possible to show a small piece of

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Michael Urman
On Mon, Apr 27, 2009 at 23:43, Stephen J. Turnbull wrote: > Nobody said we were at the stage of *saving* the [attachment]! But speaking of saving files, I think that's the biggest hole in this that has been nagging at the back of my mind. This PEP intends to allow easy access to filenames and oth

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Duncan Booth
Hrvoje Niksic wrote: > Assume a UTF-8 locale. A file named b'\xff', being an invalid UTF-8 > sequence, will be converted to the half-surrogate '\udcff'. However, > a file named b'\xed\xb3\xbf', a valid[1] UTF-8 sequence, will also be > converted to '\udcff'. Those are quite different POSIX p

Re: [Python-Dev] One more proposed formatting change for 3.1

2009-04-28 Thread Paul Moore
2009/4/28 Mark Dickinson : > Here's one more proposed change, this time for formatting > of floats using format() and the empty presentation type. > To avoid repeating myself, here's the text from the issue > I just opened: > > http://bugs.python.org/issue5864 > > """ > In all versions of Python fr

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Lino Mastrodomenico
2009/4/28 Hrvoje Niksic : > Lino Mastrodomenico wrote: >> >> Since this byte sequence [b'\xed\xb3\xbf'] doesn't represent a valid >> character when >> decoded with UTF-8, it should simply be considered an invalid UTF-8 >> sequence of three bytes and decoded to '\udced\udcb3\udcbf' (*not* >> '\udcff
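The behaviour Lino Mastrodomenico describes is how a strict UTF-8 decoder combined with the escape handler behaves. A minimal sketch, assuming a current CPython 3 (where the decoder rejects encoded surrogates, unlike the Python 3.0 decoder discussed in this thread) and the handler name "surrogateescape":

```python
# An undecodable single byte is escaped to one lone surrogate.
invalid_byte = b"\xff".decode("utf-8", "surrogateescape")

# The UTF-8-style encoding of U+DCFF is itself invalid for a strict decoder,
# so each of its three bytes is escaped separately.
encoded_surrogate = b"\xed\xb3\xbf".decode("utf-8", "surrogateescape")

# Because the two inputs decode to different strings, both round-trip.
back_1 = invalid_byte.encode("utf-8", "surrogateescape")
back_2 = encoded_surrogate.encode("utf-8", "surrogateescape")
```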

Re: [Python-Dev] lone surrogates in utf-8

2009-04-28 Thread Antoine Pitrou
Hrvoje Niksic writes: > > "Should be considered" or "will be considered"? Python 3.0's UTF-8 > decoder happily accepts it and returns u'\udcff': > > >>> b'\xed\xb3\xbf'.decode('utf-8') > '\udcff' Yes, there is already a bug entry for it: http://bugs.python.org/issue3672 I think we

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Hrvoje Niksic
Lino Mastrodomenico wrote: Since this byte sequence [b'\xed\xb3\xbf'] doesn't represent a valid character when decoded with UTF-8, it should simply be considered an invalid UTF-8 sequence of three bytes and decoded to '\udced\udcb3\udcbf' (*not* '\udcff'). "Should be considered" or "will be co

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Antoine Pitrou
Thomas Breuel writes: > > How can you bring up practical problems against something that hasn't been implemented? The PEP is simple enough that you can simulate its effect by manually computing the resulting unicode string for a hypothetical broken filename. Several people have alread

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Lino Mastrodomenico
2009/4/28 Glenn Linderman : > The switch from PUA to half-surrogates does not resolve the issues with the > encoding not being a 1-to-1 mapping, though.  The very fact that you  think > you can get away with use of lone surrogates means that other people might, > accidentally or intentionally, also

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread R. David Murray
On Tue, 28 Apr 2009 at 09:30, Thomas Breuel wrote: Therefore, when Python encounters path names on a file system that are not consistent with the (assumed) encoding for that file system, Python should raise an error. This is what happens currently, and users are quite unhappy about it. We nee

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Hrvoje Niksic
Thomas Breuel wrote: But the biggest problem with the proposal is that it isn't needed: if you want to be able to turn arbitrary byte sequences into unicode strings and back, just set your encoding to iso8859-15. That already works and it doesn't require any changes. Are you proposing to unc

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Hrvoje Niksic
Lino Mastrodomenico wrote: Let's suppose that I use Python 2.x or something else to create a file with name b'\xff'. My (Linux) system has a sane configuration and the filesystem encoding is UTF-8, so it's an invalid name but the kernel will blindly accept it anyway. With this PEP, Python 3.1 li

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Thomas Breuel
> > Yep, that's the problem. Lots of theoretical problems no one has ever > encountered > brought up against a PEP which resolves some actual problems people > encounter on > a regular basis. How can you bring up practical problems against something that hasn't been implemented? The fact that no

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Ronald Oussoren
For what it's worth, the OSX APIs seem to behave as follows: * If you create a file with a non-UTF-8 name on an HFS+ filesystem the system automatically encodes the name. That is, open(chr(255), 'w') will silently create a file named '%FF' instead of the name you'd expect on a unix system.

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Lino Mastrodomenico
2009/4/28 Thomas Breuel : > If we follow PEP 383, you will get lots of errors anyway because those > strings, when encoded in utf-8b, will result in an error when you try to > write them on a Windows file system or any other system that doesn't allow > the byte sequences that the utf-8b encodes. I

Re: [Python-Dev] Can not run under python 2.6

2009-04-28 Thread Jianchun Zhou
OK, Thanks a lot. On Tue, Apr 28, 2009 at 8:06 PM, Michael Foord wrote: > Jianchun Zhou wrote: > >> Hi, there: >> >> I am new to python, and now I got a trouble: >> >> I have an application named canola, it is written under python 2.5, and >> can run normally under python 2.5 >> >> But when it co

Re: [Python-Dev] Can not run under python 2.6

2009-04-28 Thread Michael Foord
Jianchun Zhou wrote: Hi, there: I am new to python, and now I got a trouble: I have an application named canola, it is written under python 2.5, and can run normally under python 2.5 But when it comes under python 2.6, problem up, it says: Traceback (most recent call last): File "/usr/li

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Michael Foord
Paul Moore wrote: 2009/4/28 Antoine Pitrou : Paul Moore writes: I've yet to hear anyone claim that they would have an actual problem with a specific piece of code they have written. Yep, that's the problem. Lots of theoretical problems no one has ever encountered brou

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Paul Moore
2009/4/28 Antoine Pitrou : > Paul Moore writes: >> >> I've yet to hear anyone claim that they would have an actual problem >> with a specific piece of code they have written. > > Yep, that's the problem. Lots of theoretical problems no one has ever > encountered > brought up against a P

[Python-Dev] One more proposed formatting change for 3.1

2009-04-28 Thread Mark Dickinson
Here's one more proposed change, this time for formatting of floats using format() and the empty presentation type. To avoid repeating myself, here's the text from the issue I just opened: http://bugs.python.org/issue5864 """ In all versions of Python from 2.6 up, I get the following behaviour:

[Python-Dev] Can not run under python 2.6

2009-04-28 Thread Jianchun Zhou
Hi, there: I am new to python, and now I got a trouble: I have an application named canola, it is written under python 2.5, and can run normally under python 2.5 But when it comes under python 2.6, problem up, it says: Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Antoine Pitrou
Paul Moore writes: > > I've yet to hear anyone claim that they would have an actual problem > with a specific piece of code they have written. Yep, that's the problem. Lots of theoretical problems no one has ever encountered brought up against a PEP which resolves some actual problems

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Oleg Broytmann
On Tue, Apr 28, 2009 at 11:32:26AM +0200, Thomas Breuel wrote: > On Tue, Apr 28, 2009 at 11:00, Oleg Broytmann wrote: > > I have an FTP server to which clients with different local encodings > > are connecting. FTP protocol doesn't have a notion of encoding so filenames > > on the filesystem are

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
On Tue, Apr 28, 2009 at 11:00, Oleg Broytmann wrote: > On Tue, Apr 28, 2009 at 10:37:45AM +0200, Thomas Breuel wrote: > > Returning an error for an incorrect encoding doesn't make > > internationalization harder, it makes it easier because it makes > debugging > > easier. > >What is a "correc

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Paul Moore
2009/4/28 Glenn Linderman : > So assume a non-decodable sequence in a name.  That puts us into Martin's > funny-decode scheme.  His funny-decode scheme produces a bare string, > indistinguishable from a bare string that would be produced by a str API > that happens to contain that same sequence.  D

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Oleg Broytmann
On Tue, Apr 28, 2009 at 10:37:45AM +0200, Thomas Breuel wrote: > Returning an error for an incorrect encoding doesn't make > internationalization harder, it makes it easier because it makes debugging > easier. What is a "correct encoding"? I have an FTP server to which clients with differen

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
> > >Until it's hard there will be no internationalization. A fact of life, > damn it. Programmers are lazy, and have many problems to solve. PEP 383 doesn't make it any easier; it just turns one set of problems into another. Actually, it makes it worse, since any problems that show up now s

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Oleg Broytmann
On Tue, Apr 28, 2009 at 09:30:01AM +0200, Thomas Breuel wrote: > Programmers may find it inconvenient that they have to spend time figuring > out and deal with platform-dependent file system encoding issues and > errors. But internationalization and unicode are hard, that's just a fact > of life.

Re: [Python-Dev] PEP 383 (again)

2009-04-28 Thread Thomas Breuel
> > Therefore, when Python encounters path names on a file system > > that are not consistent with the (assumed) encoding for that file > > system, Python should raise an error. > > This is what happens currently, and users are quite unhappy about it. We need to keep "users" and "programmers" dis