[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: I think that issue9992.patch also fixes #4388 because it uses the same encoding (FS encoding, utf8) on OSX to encode and to decode command line arguments.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: issue9992.patch: - Remove PYTHONFSENCODING environment variable - Mac OS X: Use utf-8 to decode command line arguments - Fix issue #9992 (this issue): attached test, locale_fs_encoding.py, passes - Fix issue #9988 - Fix issue #10014 - Fix issue #10039 $ dif

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: > I guess LANG and LC_CTYPE can be used for other purposes > such as internationalization. That's why there are different environment variables: * LC_MESSAGES for i18n (messages) * LC_CTYPE for the encoding * LC_TIME for time and date * etc.
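
As a rough illustration of the separate locale categories mentioned in this comment, the sketch below queries the current setting of each one through Python's `locale` module. The names `CATEGORIES` and `current_settings` are made up for this example, and `LC_MESSAGES` is looked up defensively because it is not defined on every platform.

```python
import locale

# Map each LC_* variable named in the comment to the locale module constant.
# LC_MESSAGES is not available everywhere, hence the getattr fallback.
CATEGORIES = {
    "LC_MESSAGES": getattr(locale, "LC_MESSAGES", None),
    "LC_CTYPE": locale.LC_CTYPE,
    "LC_TIME": locale.LC_TIME,
}


def current_settings():
    """Return the current setting of each category without changing it."""
    settings = {}
    for name, category in CATEGORIES.items():
        if category is None:
            settings[name] = None
        else:
            # Calling setlocale() with only a category queries the setting.
            settings[name] = locale.setlocale(category)
    return settings


settings = current_settings()
print(settings)
```

Each category can hold a different value, which is why the encoding (LC_CTYPE) can be chosen independently of the message language.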

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: MAL> You can't just tell people to go with whatever encoding setup MAL> you prefer to make Python's guessing easier or more correct. Python doesn't really *guess* the encoding, it just reads the encoding from the locale. What do you mean by "more correct"? Ho
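
To make the two encodings under discussion concrete, here is a small sketch that reports the locale encoding and the filesystem encoding as seen by the running interpreter. The helper name `describe_encodings` is invented for this example, and the values printed depend on the platform and Python version, so they may not match the 2010 behaviour debated in this thread.

```python
import codecs
import locale
import sys


def describe_encodings():
    """Report the encodings Python derives from its environment."""
    return {
        # Encoding taken from the locale (LC_CTYPE); False means the call
        # does not modify the process locale while querying.
        "locale": locale.getpreferredencoding(False),
        # Encoding used for file names, and in this thread's proposal also
        # for command line arguments and environment variables.
        "filesystem": sys.getfilesystemencoding(),
    }


encodings = describe_encodings()
# Normalise to canonical codec names so the two values compare reliably.
same = (codecs.lookup(encodings["locale"]).name
        == codecs.lookup(encodings["filesystem"]).name)
print(encodings, "identical" if same else "different")
```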

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: > You can't possibly expect a user to switch to using UTF-8 for > all his/her applications just because Python needs this to > properly decode file names. If the user hasn't switched to UTF-8, why would Python need that to properly decode file names?

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: MAL> If you remove the PYTHONFSENCODING, then we have to reconsider MAL> removal of sys.setfilesystemencoding(). Please, Marc, read my comments. You never consider technical problems, you just propose to ensure that "Python just works", without answerin

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: MvL> > - Windows: unicode for command line/env, mbcs to decode filenames MvL> No: unicode for filenames also. Yes, I mean unicode for everything, but decode bytes data from the mbcs encoding.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Martin v. Löwis wrote: > > Martin v. Löwis added the comment: > >> Being pedantic about forcing some encoding onto things that don't >> have an encoding won't really work out in practice. Dealing with >> file names, OS environments, pipes and sockets is d

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Antoine Pitrou
Antoine Pitrou added the comment: > However, I completely fail to see the advantage that the > PYTHONFSENCODING variable has over the LANG variable. If it's > possible to set PYTHONFSENCODING in some application, it surely > is also possible to set LANG (or LC_CTYPE), no? Setting the > latter als

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: > Being pedantic about forcing some encoding onto things that don't > have an encoding won't really work out in practice. Dealing with > file names, OS environments, pipes and sockets is dirty work, so > I think we should go with the 80-20 approach in making 80

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Martin v. Löwis wrote: > > Martin v. Löwis added the comment: > >> If you remove both, Python will get very poor grades for OS >> interoperability on platforms that often deal with multiple >> different encodings for file names. > > Why that? It will wor

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: > There is one reason for not wanting to assume that the encoding is > always UTF-8: the user might access the system from a non-UTF8 > terminal (such as when logging in with an SSH session from a system > not using UTF-8, or using an alternate terminal applica

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Ronald Oussoren
Ronald Oussoren added the comment: On 09 Oct, 2010, at 02:07 PM, Antoine Pitrou wrote: Antoine Pitrou added the comment: > For the command line, it would mean that we > introduced a new encoding: "command line encoding", which will be utf-8 on > OSX. Or more generally "environment encoding

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: > If you remove both, Python will get very poor grades for OS > interoperability on platforms that often deal with multiple > different encodings for file names. Why that? It will work very well in such a setting, much better than, say, Java.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: > You mean that we should use the following encoding for the command line > arguments, environment variables and all filenames/paths: > - Mac OS X: utf-8 > - Windows: unicode for command line/env, mbcs to decode filenames No: unicode for filenames also. >

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > I like this solution because it doesn't change a lot of things. I agree to > drop PYTHONFSENCODING because it looks like PYTHONFSENCODING introduced more > inconsistencies than it solved. If you remove the PYTHONFSENCODING, then

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor added the comment: > > ... So Antoine and Martin: which encoding do you prefer? > > I still propose to drop the fsname encoding. Then this question goes away. You mean that we should use the following encoding for the command line arguments, environment variables and all filena

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: > I don't know what you mean by dropping, since OS X by construction needs > a filesystem encoding (utf-8) different from the locale encoding; See above. I propose to stop using the locale encoding for command line arguments and environment variables on OSX, a

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Antoine Pitrou
Antoine Pitrou added the comment: On Sunday 10 October 2010 at 18:23, Martin v. Löwis wrote: > Martin v. Löwis added the comment: > > > For the command line arguments and environment variables, we don't have a > > lot > > of choices: locale or filesystem encodings. So Antoine and M

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: > For the command line arguments and environment variables, we don't have a lot > of choices: locale or filesystem encodings. So Antoine and Martin: which > encoding do you prefer? I still propose to drop the fsname encoding. Then this question goes away.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread STINNER Victor
STINNER Victor added the comment: > > What? No. We have problems because we don't use the same encoding to > > decode and to encode the same data type. It's not a problem to use a > > different encoding for each data type (stdout, filenames, environment > > variables, ...). > > This is exactly

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: On 10.10.2010 17:51, STINNER Victor wrote: > > STINNER Victor added the comment: > >> We run into problems because we have two inconsistent encodings, >> ... > > What? No. We have problems because we don't use the same encoding to > decode and to encode t

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread STINNER Victor
STINNER Victor added the comment: > We run into problems because we have two inconsistent > encodings, ... What? No. We have problems because we don't use the same encoding to decode and to encode the same data type. It's not a problem to use a different encoding for each data type (stdout, f
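
The point about using the same encoding in both directions can be shown with a minimal sketch. The file name and the choice of ISO-8859-1 as the mismatched encoding are assumptions picked only for illustration.

```python
# Hypothetical non-ASCII command line argument.
original = "\u00e9.txt"

# Bytes as a UTF-8 environment would pass them to the process.
raw = original.encode("utf-8")

# Decoding with a different encoding than the one used to encode
# silently produces the wrong text (mojibake) instead of an error.
wrong = raw.decode("iso-8859-1")

# Decoding with the same encoding restores the original text.
right = raw.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

The mismatch is only harmful within one data type: decoding stdout with one encoding and file names with another is fine as long as each type is encoded and decoded consistently.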

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Please no. We run into problems because we have two inconsistent > encodings, and now you propose to introduce another one, allowing > for even more inconsistencies??? It would not really be a "third encoding", since it would replace the locale encoding for a

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Martin v . Löwis
Martin v. Löwis added the comment: On 09.10.2010 14:07, Antoine Pitrou wrote: > > Antoine Pitrou added the comment: > >> For the command line, it would mean that we >> introduced a new encoding: "command line encoding", which will be utf-8 on >> OSX. > > Or more generally "environment en

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread STINNER Victor
STINNER Victor added the comment: > So perhaps it would be best if Python had two external default encodings: > the IO one (command line arguments, environment variables, text files), > and the file name encoding (defaulting to the IO encoding if not set) Hum, I prefer to consider the FS encodi

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: > For the command line, it would mean that we > introduced a new encoding: "command line encoding", which will be utf-8 on > OSX. Or more generally "environment encoding", if it's also used for env vars. This could solve the subprocess issue neatly.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread STINNER Victor
STINNER Victor added the comment: > Perhaps. We could also declare that command line arguments and > environment variables are always UTF-8-encoded on OSX (which I think > would be fairly accurate) Python uses the filesystem encoding to encode/decode environment variables, and OSX, fs encoding
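
A minimal sketch of encoding and decoding with the filesystem encoding, using `os.fsencode` and `os.fsdecode` (available since Python 3.2). The sample bytes are an assumption chosen for illustration, and the decoded text shown by `print` depends on the filesystem encoding of the machine running it.

```python
import os
import sys

# Hypothetical raw value, as the OS might supply it for an environment
# variable or a command line argument (UTF-8 bytes for "caf\u00e9").
raw = b"caf\xc3\xa9"

# Decode with the filesystem encoding and its error handler.
text = os.fsdecode(raw)

# Encoding again with the same encoding gives back the original bytes,
# which is the round-trip property discussed in this thread.
round_tripped = os.fsencode(text)

print(sys.getfilesystemencoding(), repr(text), round_tripped == raw)
```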