[Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Toshio Kuratomi
I opened up bug http://bugs.python.org/issue4006 a while ago and it was suggested in the report that it's not a bug but a feature and so I should come here to see about getting the feature changed :-) I have a specific problem with os.environ and a somewhat less important architectural issue with

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote: > I opened up bug http://bugs.python.org/issue4006 a while ago and it was > suggested in the report that it's not a bug but a feature and so I > should come here to see about getting the feature changed :-) > > I have a spec

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread André Malo
* Adam Olsen wrote: > On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote: > > I opened up bug http://bugs.python.org/issue4006 a while ago and it was > > suggested in the report that it's not a bug but a feature and so I > > should come here to see about getting the featu

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Toshio Kuratomi
Adam Olsen wrote: > On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote: >> I opened up bug http://bugs.python.org/issue4006 a while ago and it was >> suggested in the report that it's not a bug but a feature and so I >> should come here to see about getting the feature change

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Nick Coghlan
Toshio Kuratomi wrote: > The bug report I opened suggests creating a PEP to address this issue. > I think that's a good idea for whether os.listdir() and friends should > be changed to raise an exception but not having any way to get at some > environment variables seems like it's just a bug that n

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 2:09 PM, André Malo <[EMAIL PROTECTED]> wrote: > * Adam Olsen wrote: >> On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> > wrote: >> > I opened up bug http://bugs.python.org/issue4006 a while ago and it was >> > suggested in the report that it's not a bug b

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Terry Reedy
Toshio Kuratomi wrote: I opened up bug http://bugs.python.org/issue4006 a while ago and it was suggested in the report that it's not a bug but a feature and so I should come here to see about getting the feature changed :-) It does you no good (and will irritate others) to conflate 'design

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 2:19 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Toshio Kuratomi wrote: >> The bug report I opened suggests creating a PEP to address this issue. >> I think that's a good idea for whether os.listdir() and friends should >> be changed to raise an exception but not having any

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Toshio Kuratomi
Adam Olsen wrote: > On Thu, Dec 4, 2008 at 2:09 PM, André Malo <[EMAIL PROTECTED]> wrote: >> * Adam Olsen wrote: >>> On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> >> wrote: I opened up bug http://bugs.python.org/issue4006 a while ago and it was suggested in the repor

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread André Malo
* Adam Olsen wrote: > On Thu, Dec 4, 2008 at 2:09 PM, André Malo <[EMAIL PROTECTED]> wrote: > > Here's an example which will become popular soon, I guess: CGI scripts > > and, of course WSGI applications. All those get their environment in an > > unknown encoding. In the worst case one can blow

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Toshio Kuratomi
Terry Reedy wrote: > Toshio Kuratomi wrote: >> I opened up bug http://bugs.python.org/issue4006 a while ago and it was >> suggested in the report that it's not a bug but a feature and so I >> should come here to see about getting the feature changed :-) > > It does you no good (and will irrita

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 3:47 PM, André Malo <[EMAIL PROTECTED]> wrote: > * Adam Olsen wrote: > >> On Thu, Dec 4, 2008 at 2:09 PM, André Malo <[EMAIL PROTECTED]> wrote: > >> > Here's an example which will become popular soon, I guess: CGI scripts >> > and, of course WSGI applications. All those get t

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Toshio Kuratomi
Adam Olsen wrote: > On Thu, Dec 4, 2008 at 2:19 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: >> Toshio Kuratomi wrote: >>> The bug report I opened suggests creating a PEP to address this issue. >>> I think that's a good idea for whether os.listdir() and friends should >>> be changed to raise an exce

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Martin v. Löwis
> In the bug report I opened, I listed four ways to fix this along with > the pros and cons: I'm in favour of a different, fifth solution: 5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop
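A minimal sketch of what option 5 could look like in practice. The message does not say how undecodable bytes would be mapped into a Unicode string; the `surrogateescape` error handler used here is an assumption, borrowed from later Python 3 releases, and the sample value is made up.

```python
# Assumption: "surrogateescape" stands in for the mapping that the
# proposal leaves unspecified. Each undecodable byte becomes a lone
# surrogate code point, so the original bytes can be recovered.
raw_value = b"/home/user/\xff\xfe-data"  # made-up value, not valid UTF-8

as_text = raw_value.decode("utf-8", "surrogateescape")
round_tripped = as_text.encode("utf-8", "surrogateescape")

print(ascii(as_text))              # escaped form, safe to print anywhere
print(round_tripped == raw_value)  # the bytes survive the round trip
```

The point of the sketch is only that a str-only API does not have to lose data, as long as encoding and decoding use the same escape scheme.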

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Terry Reedy
Toshio Kuratomi wrote: I would think life would be ultimately easier if either the file server or the shell server automatically translated file names from jis and utf8 and back, so that the PATH on the *nix shell server is entirely utf8. This is not possible because no part of the computer k

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread James Y Knight
On Dec 4, 2008, at 6:39 PM, Martin v. Löwis wrote: I'm in favour of a different, fifth solution: 5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop the byte-oriented file operations again)

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Terry Reedy
James Y Knight wrote: On Dec 4, 2008, at 6:39 PM, Martin v. Löwis wrote: I'm in favour of a different, fifth solution: 5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to file names, then drop the byte-oriented

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 6:14 PM, James Y Knight <[EMAIL PROTECTED]> wrote: > On Dec 4, 2008, at 6:39 PM, Martin v. Löwis wrote: >> >> I'm in favour of a different, fifth solution: >> >> 5) represent all environment variables in Unicode strings, >> including the ones that currently fail to decode. >

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Dino Viehland

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 8:24 PM, Dino Viehland <[EMAIL PROTECTED]> wrote: > Does anyone know what Mono does here? Presumably they have the exact same > problem as all strings in .NET are Unicode, and filenames/env vars/etc... > are always strings. > > Maybe if it's gotta be broken at least it can b

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread glyph
On 02:08 am, [EMAIL PROTECTED] wrote: James Y Knight wrote: On Dec 4, 2008, at 6:39 PM, Martin v. Löwis wrote: I'm in favour of a different, fifth solution: 5) represent all environment variables in Unicode strings, including the ones that currently fail to decode. (then do the same to fil

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread glyph
On 02:32 am, [EMAIL PROTECTED] wrote: On Thu, Dec 4, 2008 at 6:14 PM, James Y Knight <[EMAIL PROTECTED]> wrote: FWIW, I still agree with Martin that that's the most reasonable solution. It died because nobody presented a viable solution, and I maintain no solution is possible. All suggesti

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 8:55 PM, <[EMAIL PROTECTED]> wrote: > > On 02:32 am, [EMAIL PROTECTED] wrote: >> >> On Thu, Dec 4, 2008 at 6:14 PM, James Y Knight <[EMAIL PROTECTED]> wrote: > >>> FWIW, I still agree with Martin that that's the most reasonable solution. >> >> It died because nobody presente

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Guido van Rossum
>> On Dec 4, 2008, at 6:39 PM, Martin v. Löwis wrote: >>> I'm in favour of a different, fifth solution: >>> >>> 5) represent all environment variables in Unicode strings, >>> including the ones that currently fail to decode. >>> (then do the same to file names, then drop the byte-oriented >>> f

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Adam Olsen
On Thu, Dec 4, 2008 at 10:14 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote: > At the risk of bringing up something that was already rejected, let me > propose something that follows the path taken in 3.0 for filenames, > rather than doubling back: > > For os.environ, os.getenv() and os.putenv(), I

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Guido van Rossum
On Thu, Dec 4, 2008 at 9:46 PM, Adam Olsen <[EMAIL PROTECTED]> wrote: > On Thu, Dec 4, 2008 at 10:14 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote: >> On Windows, the bytes APIs should probably not exist. > > -0. I'd prefer byte APIs return UTF-16 bytes and the unicode APIs > become validating.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Martin v. Löwis
> Let's bring out all the same arguments, come to no conclusion, and let > it taper off unresolved, yet again! :) This time, it will be different. I will write a PEP, and will request that anybody proposing an alternative solution also write a PEP (and no change is made to the code before the PEPs

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread Martin v. Löwis
> Please, if you have a *new* idea that doesn't have a failure mode, by > all means post it. But don't resurrect a pointless bikeshed. While I completely agree that it is pointless to reiterate the same arguments over and over, I disagree that the bikeshed metaphor applies. This metaphor (IIUC) d

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steve Holden
Martin v. Löwis wrote: >> Please, if you have a *new* idea that doesn't have a failure mode, by >> all means post it. But don't resurrect a pointless bikeshed. > > While I completely agree that it is pointless to reiterate the same > arguments over and over, I disagree that the bikeshed metaphor

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Adam Olsen
On Fri, Dec 5, 2008 at 12:00 AM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >> Please, if you have a *new* idea that doesn't have a failure mode, by >> all means post it. But don't resurrect a pointless bikeshed. > > While I completely agree that it is pointless to reiterate the same > arguments

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
[EMAIL PROTECTED] wrote: > At least this time I think I've encapsulated pretty much my entire > argument here, so if you don't buy it, we can probably just agree to > disagree :). Glyph, the only point I would add to your message is this one: Adding a "blessed" way to encode arbitrary binary data

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
On Friday 05 December 2008 00:39:24, Martin v. Löwis wrote: > 5) represent all environment variables in Unicode strings, > including the ones that currently fail to decode. > (then do the same to file names, then drop the byte-oriented > file operations again) Please, don't do

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt
On Friday 05 December 2008, [EMAIL PROTECTED] wrote: > Filenames and environment variables would all need to be encoded or > decoded according to this magic encoding. Those, and commandline arguments, too. Uli

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt
On Friday 05 December 2008, Adam Olsen wrote: > Many of the windows APIs use UTF-16 without validating it. They'll > pass through invalid strings until they hit something that does > validate, at which point it'll blow up. > > I suspect that it doesn't happen very often in practice, as having > on

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
Hi, On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote: > I opened up bug http://bugs.python.org/issue4006 a while ago and it was > suggested in the report that it's not a bug but a feature and so I > should come here to see about getting the feature changed :-) Yeah, I prefe

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Ulrich Eckhardt
On Friday 05 December 2008, Guido van Rossum wrote: > At the risk of bringing up something that was already rejected, let me > propose something that follows the path taken in 3.0 for filenames, > rather than doubling back: > > For os.environ, os.getenv() and os.putenv(), I think a similar > approa

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight
On Dec 5, 2008, at 5:27 AM, Ulrich Eckhardt wrote: Using the byte variant is equally fubar, because e.g. on MS Windows it is not supported, except through a very lossy roundtrip through the locale's codepage, limiting your functionality. Yeah, IMO whole mess could have been avoided by keepin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Terry Reedy wrote: > Toshio Kuratomi wrote: >> >>> I would think life would be ultimately easier if either the file server >>> or the shell server automatically translated file names from jis and >>> utf8 and back, so that the PATH on the *nix shell server is entirely >>> utf8. >> >> This is not po

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Victor Stinner wrote: > Hi, > > On Thursday 04 December 2008 21:02:19, Toshio Kuratomi wrote: > >> These mixed encodings can occur for a variety of reasons. Here's an >> example that isn't too contrived :-) >> (...) >> Furthermore, they don't want to suffer from the space loss of usin

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: > Seriously, what would you suggest to someone that > wants to handle paths in a portable way? Using the Unicode variants of > functions is fubar, because encoding/decoding is not universally possible. > Using the byte varia

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote: > On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: >> In 99% of all cases, using the default encoding will work and do what people >> expect, which is why I would make this conversion automatic. In all other >> cases, it will at least not fail silen

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: >>> In 99% of all cases, using the default encoding will work and do what people >>> expect, which is why I would make

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote: > Glob was just an example. Many use cases for directory traversal > couldn't care less if they see *all* files. > Okay. Makes it harder to prove correct or not if I don't know what the use case is :-) I can't think of a single use case off-hand. Even your example of a ?

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Guido van Rossum wrote: > At the risk of bringing up something that was already rejected, let me > propose something that follows the path taken in 3.0 for filenames, > rather than doubling back: > > For os.environ, os.getenv() and os.putenv(), I think a similar > approach as used for os.listdir()

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Victor Stinner
Hi, > > But they are open questions (already asked in the bug tracker): > > I answered these in the bug tracker. Here are the answers for the > mailing list: Oh, sorry. I didn't follow the end of the discussion on the bug tracker. > >os.environb['PATH'] = '\xff' > >=> os.environ['PATH']

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote: > Guido van Rossum wrote: >> Glob was just an example. Many use cases for directory traversal >> couldn't care less if they see *all* files. >> > Okay. Makes it harder to prove correct or not if I don't know what the > use case is :-) I can't think of a single use case off-

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Victor Stinner wrote: >>> It would be maybe easier if os.environ supports bytes and unicode keys. >>> But we have to keep these assertions: >>>os.environ[bytes] -> bytes >>>os.environ[str] -> str >> I think the same choices have to be made here. If LANG=C, we still have >> to decide what t
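The two assertions quoted above (bytes key gives bytes, str key gives str) can be sketched as a small mapping. The class name `DualKeyEnviron`, the UTF-8 default, and the sample data are all hypothetical; nothing here is the API that was eventually adopted, and the sketch deliberately leaves the open question from the thread visible: a str lookup of an undecodable value raises.

```python
class DualKeyEnviron:
    """Hypothetical mapping: bytes keys return bytes, str keys return str."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = dict(raw)      # underlying storage is bytes -> bytes
        self._encoding = encoding  # assumed encoding for the str view

    def __getitem__(self, key):
        if isinstance(key, bytes):
            return self._raw[key]  # raw access never fails on bad bytes
        if isinstance(key, str):
            value = self._raw[key.encode(self._encoding)]
            # May raise UnicodeDecodeError: the thread's open question.
            return value.decode(self._encoding)
        raise TypeError("key must be bytes or str")


env = DualKeyEnviron({b"HOME": b"/home/user", b"PATH": b"\xff"})
print(env[b"PATH"])  # bytes in, bytes out
print(env["HOME"])   # str in, str out
```

With this design a program that only needs well-formed variables can stay with str keys, while one that must see everything can use bytes keys throughout.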

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote: > Are most programs specific to one organization or are they distributed > to other people? The former. That's pretty well documented in assorted IT literature ('shrink-wrap' and open source commodity software are still relatively new players on the scene that started to shi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Nick Coghlan wrote: > Toshio Kuratomi wrote: >> Are most programs specific to one organization or are they distributed >> to other people? > > The former. That's pretty well documented in assorted IT literature > ('shrink-wrap' and open source commodity software are still relatively > new players

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Toshio Kuratomi
Nick Coghlan wrote: > Toshio Kuratomi wrote: >> Guido van Rossum wrote: >>> Glob was just an example. Many use cases for directory traversal >>> couldn't care less if they see *all* files. >>> >> Okay. Makes it harder to prove correct or not if I don't know what the >> use case is :-) I can't thi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray
On Fri, 5 Dec 2008 at 12:11, Guido van Rossum wrote: On Fri, Dec 5, 2008 at 12:05 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote: Guido van Rossum wrote: On Fri, Dec 5, 2008 at 2:27 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: In 99% of all cases, using the default encoding will work and do w

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote: > Nick Coghlan wrote: >> Toshio Kuratomi wrote: >>> Guido van Rossum wrote: Glob was just an example. Many use cases for directory traversal couldn't care less if they see *all* files. >>> Okay. Makes it harder to prove correct or not if I don't know what the

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 18:48, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Toshio Kuratomi wrote: >> Nick Coghlan wrote: >>> Toshio Kuratomi wrote: Guido van Rossum wrote: > Glob was just an example. Many use cases for directory traversal > couldn't care less if they see *all* files.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano
On Sat, 6 Dec 2008 09:18:47 am Nick Coghlan wrote: > Toshio Kuratomi wrote: > > Guido van Rossum wrote: > >> Glob was just an example. Many use cases for directory traversal > >> couldn't care less if they see *all* files. > > > > Okay. Makes it harder to prove correct or not if I don't know what

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Nick Coghlan
Toshio Kuratomi wrote: > Nick Coghlan wrote: >> Toshio Kuratomi wrote: >>> Are most programs specific to one organization or are they distributed >>> to other people? >> The former. That's pretty well documented in assorted IT literature >> ('shrink-wrap' and open source commodity software are stil

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis
>> 5) represent all environment variables in Unicode strings, >>including the ones that currently fail to decode. >>(then do the same to file names, then drop the byte-oriented >> file operations again) > > Please, don't do that! Bytes are not characters! And environment variables, co

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread James Y Knight
On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote: You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps have no business trying to deal with insane filenames. Sigh, same arguments, all over again. Again, *bot

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Michael Urman
On Fri, Dec 5, 2008 at 19:22, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >> Please, don't do that! Bytes are not characters! > > And environment variables, command line arguments, and file names > are not bytes, but characters. On Windows NT, sure. On Unix they're still bytes no matter how much

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Martin v. Löwis
>> And environment variables, command line arguments, and file names >> are not bytes, but characters. > > On Windows NT, sure. On Unix they're still bytes no matter how much we > want them to be characters. Only in the API of the OS itself. Treating them as bytes in the application is a mistake.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Steven D'Aprano
On Sat, 6 Dec 2008 11:48:27 am Nick Coghlan wrote: > Toshio Kuratomi wrote: > > Nick Coghlan wrote: ... > >> Why? Most programs won't be able to do anything with it. And if > >> the program *can* do something with it... that's what the bytes > >> version of the APIs are for. > > > > Nonsense. A pr

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Tres Seaver
Ulrich Eckhardt wrote: > On Friday 05 December 2008, Guido van Rossum wrote: >> At the risk of bringing up something that was already rejected, let me >> propose something that follows the path taken in 3.0 for filenames, >> rather than doubling back:

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread rdmurray
On Sat, 6 Dec 2008 at 13:06, Steven D'Aprano wrote: Applications can deal with such weird file names. KDE's file manager (konqueror) and file selection dialog both show the character as a small square, presumably the font's missing character glyph, and KDE apps can open and save the file. Still s

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull
Nick Coghlan writes: > True, but it's still a fairly important problem to have a solution to. > Even internally in large organisations there can be some pretty insane > environments as cruft accumulates over the years. M&A and globalization makes it inevitable. Toshio will remember the Mizuho

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull
"Martin v. Löwis" writes: > >> 5) represent all environment variables in Unicode strings, > >>including the ones that currently fail to decode. > >>(then do the same to file names, then drop the byte-oriented > >> file operations again) > > > > Please, don't do that! Bytes are no

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Stephen J. Turnbull
Guido van Rossum writes: > This sounds too pessimistic to me. I expect that in five years it will > be universally accepted that these variables must be encoded in a > standard encoding. Archival material will not catch up until the plastic rots. And I bet it takes ten years before the Japane

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-05 Thread Bugbee, Larry
There has been some discussion here that users should use the str or byte function variant based on what is relevant to their system, for example when getting a list of file names or opening a file. That thought process really doesn't do much for those of us that write code that needs to run on a

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Oleg Broytmann
On Fri, Dec 05, 2008 at 08:37:45PM -0500, James Y Knight wrote: > On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote: >> You can't display a non-decodable filename to the user, hence the user >> will have no idea what they're working on. Non-filesystem related apps >> have no business trying to deal wi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Oleg Broytmann
On Sat, Dec 06, 2008 at 12:03:55PM +1100, Steven D'Aprano wrote: > I'd rather have the Python API report errors than silence them, at least > by default. +1 for encoding errors by default. Oleg.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Oleg Broytmann
On Sat, Dec 06, 2008 at 02:22:29AM +0100, "Martin v. Löwis" wrote: > And environment variables, command line arguments, and file names > are not bytes, but characters. "There is no such thing as plain text!" If you say "these are characters" you must also name the encoding for them. LANG/LC_ALL

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Toshio Kuratomi
Nick Coghlan wrote: > Toshio Kuratomi wrote: >>> >> Nonsense. A program can do tons of things with a non-decodable >> filename. Where it's limited is non-decodable filedata. > > You can't display a non-decodable filename to the user, hence the user > will have no idea what they're working on. No

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 10:18 PM, Bugbee, Larry <[EMAIL PROTECTED]> wrote: > There has been some discussion here that users should use the str or > byte function variant based on what is relevant to their system, for > example when getting a list of file names or opening a file. That > thought proc

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Guido van Rossum
On Fri, Dec 5, 2008 at 8:57 PM, Tres Seaver <[EMAIL PROTECTED]> wrote: > Amen! the idea that paths, environment variables, and stuff pulled off > of sockets can be treated as text rather than strings is just wishful > thinking. Unfortunately most of the programmers of the world *do* think that w

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Toshio Kuratomi
Bugbee, Larry wrote: > There has been some discussion here that users should use the str or > byte function variant based on what is relevant to their system, for > example when getting a list of file names or opening a file. That > thought process really doesn't do much for those of us that write

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread glyph
On 02:34 pm, [EMAIL PROTECTED] wrote: On Fri, Dec 05, 2008 at 08:37:45PM -0500, James Y Knight wrote: On Dec 5, 2008, at 7:48 PM, Nick Coghlan wrote: You can't display a non-decodable filename to the user, hence the user will have no idea what they're working on. Non-filesystem related apps h

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Guido van Rossum
On Sat, Dec 6, 2008 at 10:53 AM, <[EMAIL PROTECTED]> wrote: > On 02:34 pm, [EMAIL PROTECTED] wrote: >> I agree 100%. Russian Unix users use at least 5 different encodings >> (koi8-r, cp1251 and utf-8 are the most frequent in use, cp866 and >> iso-8859-5 are less frequent). I have an FTP server wi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Nick Coghlan
Oleg Broytmann wrote: > My filemanager > (Midnight Commander, for the matter) shows these files and directories as > "?.???", but I can chdir to such directories, and I can open such > files. It would be a big bad blow for me if filemanagers (or other > programs) start to filter these filenames

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Nick Coghlan
Toshio Kuratomi wrote: > Note 2: If there isn't a parallel API on all platforms, for instance, > Guido's proposal to not have os.environb on Windows, then you'll still > have to have a platform specific check. (Likely you should try to access > os.environb in this instance and if it doesn't exist, u

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Antoine Pitrou
Nick Coghlan writes: > > If the binary APIs are missing from a major platform (i.e. Windows) then > the choice to use them brings with it a major cross-platform portability > problem that should really be handled by the standard library. +1 I might also add that providing binary APIs

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread André Malo
* Nick Coghlan wrote: > Toshio Kuratomi wrote: > > Note 2: If there isn't a parallel API on all platforms, for instance, > > Guido's proposal to not have os.environb on Windows, then you'll still > > have to have a platform specific check. (Likely you should try to > > access os.environb in this in

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Nick Coghlan
André Malo wrote: >> While on Windows: >> - underlying OS API uses Unicode >> - Unicode API just passes values straight through >> - binary API uses the system encoding to decode bytes names and values >> to be passed to the OS API and to encode Unicode names and values >> received from the OS API
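The Windows arrangement quoted above (Unicode underneath, with a binary API that decodes on the way in and encodes on the way out) can be sketched as a thin wrapper. The class name `BytesEnvironView`, the plain dict standing in for the OS environment, and the choice of cp1252 as the "system encoding" are all assumptions made for illustration.

```python
class BytesEnvironView:
    """Hypothetical bytes facade over a str-only environment."""

    def __init__(self, unicode_env, encoding):
        self._env = unicode_env    # stand-in for the Unicode OS API
        self._encoding = encoding  # assumed "system encoding"

    def __getitem__(self, name):
        # Decode the bytes name, look it up, encode the str result.
        value = self._env[name.decode(self._encoding)]
        return value.encode(self._encoding)

    def __setitem__(self, name, value):
        # Decode both name and value before handing them to the str API.
        self._env[name.decode(self._encoding)] = value.decode(self._encoding)


native = {"TEMP": "C:\\Temp"}  # pretend this is the Unicode environment
view = BytesEnvironView(native, "cp1252")
view[b"USERNAME"] = b"Andr\xe9"

print(native["USERNAME"])
print(view[b"TEMP"])
```

Note that this direction can fail too: a Unicode value with characters outside the chosen code page cannot be encoded, which is the lossy round trip mentioned elsewhere in the thread.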

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Aahz
On Sun, Dec 07, 2008, Nick Coghlan wrote: > > If the binary APIs are missing from a major platform (i.e. Windows) then > the choice to use them brings with it a major cross-platform portability > problem that should really be handled by the standard library. +1

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Adam Olsen
On Sat, Dec 6, 2008 at 6:51 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > André Malo wrote: >>> While on Windows: >>> - underlying OS API uses Unicode >>> - Unicode API just passes values straight through >>> - binary API uses the system encoding to decode bytes names and values >>> to be passed to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Toshio Kuratomi
Guido van Rossum wrote: > On Sat, Dec 6, 2008 at 10:53 AM, <[EMAIL PROTECTED]> wrote: >> I find it interesting to note that the only users in this discussion who >> actually have these problems in real life all have this attitude. It is >> expected that in an imperfect world we will have imperfe

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread glyph
On 06:07 am, [EMAIL PROTECTED] wrote: Guido van Rossum wrote: On Sat, Dec 6, 2008 at 10:53 AM, <[EMAIL PROTECTED]> wrote: I find it interesting to note that the only users in this discussion who actually have these problems in real life all have this attitude. For file managers and simi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Hagen Fürstenau
> If the Unicode APIs only have correct unicode, sure. If not you'll > get errors translating to UTF-8 (and the byte APIs are supposed to > pass bad names through unaltered.) Kinda ironic, no? As far as I can see all Python Unicode strings can be encoded to UTF-8, even things like lone surrogate

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 2:07 AM, Hagen Fürstenau <[EMAIL PROTECTED]> wrote: >> If the Unicode APIs only have correct unicode, sure. If not you'll >> get errors translating to UTF-8 (and the byte APIs are supposed to >> pass bad names through unaltered.) Kinda ironic, no? > > As far as I can see al

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Hagen Fürstenau
>> As far as I can see all Python Unicode strings can be encoded to UTF-8, >> even things like lone surrogates because Python doesn't care about them. >> So both the Unicode API and the binary API would be fail-safe on Windows. > > Python is broken and needs to be fixed. > > http://bugs.python.or
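
The claim about lone surrogates can be checked directly. The sketch below reflects the behaviour of current Python 3 releases as far as I know, not Python 3.0 as it stood when this thread was written: the strict UTF-8 codec rejects a lone surrogate, and the `surrogatepass` error handler is an explicit opt-in to pass it through.

```python
lone = "\ud800"  # a lone high surrogate, which is not valid Unicode text

# Strict encoding: record whether the codec accepts the lone surrogate.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Explicit opt-in: the surrogate is emitted as a three-byte sequence.
raw = lone.encode("utf-8", "surrogatepass")
round_tripped = raw.decode("utf-8", "surrogatepass")
```

Bytes produced with `surrogatepass` are not well-formed UTF-8, so a strict decoder on the receiving side will reject them.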

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 2:35 AM, Hagen Fürstenau <[EMAIL PROTECTED]> wrote: >>> As far as I can see all Python Unicode strings can be encoded to UTF-8, >>> even things like lone surrogates because Python doesn't care about them. >>> So both the Unicode API and the binary API would be fail-safe on Wi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Toshio Kuratomi
[EMAIL PROTECTED] wrote: > > On 06:07 am, [EMAIL PROTECTED] wrote: >> Most apps aren't file managers or ftp clients but when they interact >> with files (for instance, a file selection dialog) they need to be able >> to show the user all the relevant files. So on an app-by-app basis the >> need f

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Michael Urman
On Sun, Dec 7, 2008 at 11:35, Adam Olsen <[EMAIL PROTECTED]> wrote: >>> http://bugs.python.org/issue3672 >>> http://bugs.python.org/issue3297 > > No. Unicode *requires* them to be treated as errors. If you want to > pass them through then you're creating a custom encoding... which you > might arg

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 11:18 AM, Michael Urman <[EMAIL PROTECTED]> wrote: > On Sun, Dec 7, 2008 at 11:35, Adam Olsen <[EMAIL PROTECTED]> wrote: http://bugs.python.org/issue3672 http://bugs.python.org/issue3297 >> >> No. Unicode *requires* them to be treated as errors. If you want to >>

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Terry Reedy
Toshio Kuratomi wrote: - If this is true, a definition of os.listdir() that would better meet programmer expectation would be: "Give me all files in a directory with the output as str type". The definition of os.listdir() would be "Give me all files in a directory with the output as bytes typ
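
For reference, the two definitions of os.listdir() being contrasted here correspond to the type of the argument in Python 3: a str path yields str names and a bytes path yields bytes names. The sketch below uses a temporary directory and a made-up file name, `example.txt`, and assumes `os.fsencode` is available (it was added after this thread).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    open(os.path.join(tmp, "example.txt"), "w").close()

    as_str = os.listdir(tmp)                 # str in, str out
    as_bytes = os.listdir(os.fsencode(tmp))  # bytes in, bytes out
```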

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Guido van Rossum
On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > Toshio Kuratomi wrote: > >> - If this is true, a definition of os.listdir() that would >> better meet programmer expectation would be: "Give me all files in a >> directory with the output as str type". The definition of >> o

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Nick Coghlan
Terry Reedy wrote: > Toshio Kuratomi wrote: > >> - If this is true, a definition of os.listdir() that would >> better meet programmer expectation would be: "Give me all files in a >> directory with the output as str type". The definition of >> os.listdir() would be "Give me all files in a direc

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Greg Ewing
Nick Coghlan wrote: For binary wrappers around the Windows Unicode APIs, I was thinking specifically of using UTF-8, since that should be able to encode anything the Unicode APIs can handle. Why shouldn't the binary interface just expose the raw utf16 as bytes? -- Greg ___

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Terry Reedy
Guido van Rossum wrote: On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: Toshio Kuratomi wrote: - If this is true, a definition of os.listdir() that would better meet programmer expectation would be: "Give me all files in a directory with the output as str type". The de

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Glenn Linderman
On approximately 12/7/2008 10:56 AM, came the following characters from the keyboard of Adam Olsen: You might receive a UTF-8 encoded file name from a malicious user, check if it contains something dangerous (like "../../../../../etc/password"), then decode it. If your decoder isn't compliant

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Stephen J. Turnbull
Glenn Linderman writes: > But if you are interested in checking for security issues, shouldn't you > _first_ decode into some canonical form, Yes. That's all that is being asked for: that Python do strict decoding to a canonical form by default. That's a lot to ask, as it turns out, but th
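
The ordering argued for here, strict decoding first and the security check second, might be sketched as follows. The function name `is_safe_relative_path` and the specific traversal rule are hypothetical; the only point is that malformed input is rejected before any inspection happens.

```python
import posixpath


def is_safe_relative_path(raw: bytes) -> bool:
    """Hypothetical check: strict-decode first, then inspect the result."""
    try:
        text = raw.decode("utf-8")  # strict: malformed sequences are rejected
    except UnicodeDecodeError:
        return False
    # Collapse "a/../b" style segments before looking for traversal.
    normalized = posixpath.normpath(text)
    if posixpath.isabs(normalized):
        return False
    return normalized != ".." and not normalized.startswith("../")
```

With a lenient decoder, an overlong encoding such as b"\xc0\xae\xc0\xae/" could pass a byte-level check for "../" and turn into a traversal only after decoding. The strict decode above rejects it outright.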

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Glenn Linderman
On approximately 12/7/2008 8:13 PM, came the following characters from the keyboard of Stephen J. Turnbull: Glenn Linderman writes: > But if you are interested in checking for security issues, shouldn't you > _first_ decode into some canonical form, Yes. That's all that is being asked fo

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote: > On approximately 12/7/2008 8:13 PM, came the following characters from the > keyboard of Stephen J. Turnbull: >> >> Glenn Linderman writes: >> >> > But if you are interested in checking for security issues, shouldn't >> y
