[issue19847] Setting the default filesystem-encoding

2017-12-18 Thread STINNER Victor

STINNER Victor added the comment:

Follow-up: PEP 538 (bpo-28180) and PEP 540 (bpo-29240) have been accepted
and implemented in Python 3.7. Python 3.7 now uses UTF-8 by default for the
POSIX locale, and the encoding can be forced to UTF-8 using the -X utf8 option.
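
A minimal sketch of how the effect of -X utf8 can be observed, assuming
Python 3.7 or newer. The helper name probe_utf8_mode is hypothetical; it
starts a child interpreter with the option and reports the UTF-8 mode flag
and the filesystem encoding that the child sees:

```python
import subprocess
import sys


def probe_utf8_mode():
    """Run a child interpreter with -X utf8 (hypothetical helper)."""
    code = "import sys; print(sys.flags.utf8_mode, sys.getfilesystemencoding())"
    result = subprocess.run(
        [sys.executable, "-X", "utf8", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints two whitespace-separated fields: flag and encoding.
    return result.stdout.split()


print(probe_utf8_mode())
```

Running the check in a child process keeps the parent interpreter's own
settings untouched, since the option only takes effect at startup.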

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19847] Setting the default filesystem-encoding

2013-12-13 Thread STINNER Victor

STINNER Victor added the comment:

I'm closing this issue as invalid for the same reason that I closed issue
#19846:
http://bugs.python.org/issue19846#msg205675

--
resolution:  - invalid
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19847
___



[issue19847] Setting the default filesystem-encoding

2013-12-13 Thread STINNER Victor

STINNER Victor added the comment:

I created issue #19977 as a follow-up to this one: Use surrogateescape
error handler for sys.stdout on UNIX for the C locale.
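
A minimal sketch of what that error handler does, using an in-memory ASCII
text stream as a stand-in for sys.stdout under the C locale. The filename
and variable names are made up for illustration:

```python
import io

# A filename containing a byte that ASCII cannot decode. With the
# surrogateescape handler it becomes a lone surrogate (U+DCE9) instead
# of raising UnicodeDecodeError.
raw_name = b"caf\xe9.txt"
name = raw_name.decode("ascii", "surrogateescape")

# Stand-in for sys.stdout in the C locale: an ASCII-encoded text stream.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="ascii", errors="surrogateescape")
stream.write(name)
stream.flush()

# The escaped surrogate is turned back into the original byte on output.
written = buffer.getvalue()
print(written == raw_name)
```

With errors="strict" the same write would raise UnicodeEncodeError, which
is the failure the handler is meant to avoid.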

--




[issue19847] Setting the default filesystem-encoding

2013-12-12 Thread STINNER Victor

STINNER Victor added the comment:

See also issue #19846.

--




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread STINNER Victor

STINNER Victor added the comment:

> sys.getfilesystemencoding() says for Unix: "On Unix, the encoding is the
> user’s preference according to the result of nl_langinfo(CODESET), or
> 'utf-8' if nl_langinfo(CODESET) failed."

Oh, this documentation has been wrong since at least Python 3.2: if
nl_langinfo(CODESET) fails, Python exits immediately with a fatal error.

There is no such fallback to utf-8 (anymore?).
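
A minimal sketch for inspecting both values from Python. The helper name
describe_encodings is hypothetical, and locale.nl_langinfo is only
available on Unix-like platforms, so the sketch guards for its absence:

```python
import locale
import sys


def describe_encodings():
    """Report the locale codeset and the filesystem encoding."""
    try:
        # Use the locale configured in the environment (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # the environment names an unsupported locale; keep the current one
    codeset = None
    if hasattr(locale, "nl_langinfo"):  # missing on Windows
        codeset = locale.nl_langinfo(locale.CODESET)
    return {"codeset": codeset, "filesystem": sys.getfilesystemencoding()}


print(describe_encodings())
```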

--
nosy: +haypo




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread STINNER Victor

STINNER Victor added the comment:

I fixed the documentation, thanks for your report!

--
assignee:  - docs@python
components: +Documentation -IO
nosy: +docs@python
resolution:  - fixed
status: open - closed
versions: +Python 3.4




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread STINNER Victor

STINNER Victor added the comment:

Relevant code in Python 3.4:

initfsencoding():
http://hg.python.org/cpython/file/e3c48bddf621/Python/pythonrun.c#l965

get_locale_encoding():
http://hg.python.org/cpython/file/e3c48bddf621/Python/pythonrun.c#l250

--




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread Sworddragon

Sworddragon added the comment:

It is nice that you could fix the documentation thanks to this report, but
that was just a side effect, so closing this report and moving it to
Documentation was maybe wrong.

--




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread STINNER Victor

STINNER Victor added the comment:

(Oops, I specified the wrong issue number in my commits.)

New changeset b231e0c3fd26 by Victor Stinner in branch '3.3':
Issue #19728: Fix sys.getfilesystemencoding() documentation
http://hg.python.org/cpython/rev/b231e0c3fd26

New changeset e3c48bddf621 by Victor Stinner in branch 'default':
(Merge 3.3) Issue #19728: Fix sys.getfilesystemencoding() documentation
http://hg.python.org/cpython/rev/e3c48bddf621

--




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread STINNER Victor

STINNER Victor added the comment:

> It is nice that you could fix the documentation thanks to this report, but
> that was just a side effect, so closing this report and moving it to
> Documentation was maybe wrong.

Oh sorry, I read the issue too quickly and stopped at the first sentence. I am
reopening the issue to reply to the other points.


> In my opinion, relying on the locale environment is risky since
> filesystem-encoding != locale. This is especially the case when working on
> a filesystem on external media such as an external hard disk drive.
> Operating on multiple media can also result in different
> filesystem-encodings.

This issue is not specific to Python. If you mount a USB key formatted as VFAT
with the wrong encoding on Linux, you will get mojibake in your file explorer.
The same happens if you connect to a network share (e.g. NFS) using a different
encoding than the server. You can find many other examples (hint: Mac OS X and
Unicode normalization).
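
A minimal sketch of both effects in pure Python, with no filesystem
involved. The filename is made up; the first part shows UTF-8 bytes misread
as Latin-1, the second shows one visible name stored as two different code
point sequences:

```python
import unicodedata

name = "caf\u00e9.txt"  # "café.txt"

# Bytes written by a UTF-8 system, then misread as Latin-1: mojibake.
garbled = name.encode("utf-8").decode("latin-1")

# Reversible in this case, because Latin-1 maps every byte to a character.
repaired = garbled.encode("latin-1").decode("utf-8")

# Unicode normalization: composed (NFC) versus decomposed (NFD) forms look
# identical on screen but compare unequal.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

print(repr(garbled), repaired == name, nfc == nfd)
```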

There is no good compromise here. The only two safe options are:

(A) convert the filenames on your filesystem to the same encoding as your
computer (there are tools for that, like convmv)

(B) use raw bytes instead of Unicode; Python 3 should accept bytes anywhere
that OS data is expected (filenames, command line arguments, environment
variables)
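
A minimal sketch of option (B), assuming a temporary directory as the
filesystem under inspection. The helper name list_raw is hypothetical;
passing a bytes path to os.listdir makes it return bytes names, so no
decoding with the filesystem encoding takes place:

```python
import os
import tempfile


def list_raw(directory):
    """List a directory as bytes so that no filename is decoded."""
    return sorted(os.listdir(os.fsencode(directory)))


with tempfile.TemporaryDirectory() as tmp:
    # Create one file, then read its name back as raw bytes.
    with open(os.path.join(tmp, "report.txt"), "w"):
        pass
    raw_names = list_raw(tmp)

print(raw_names)
```

The bytes names can be passed back to functions such as open() or
os.rename() unchanged, and decoded for display only, with whatever encoding
the caller believes the medium uses.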

All operating systems (except Windows) now use UTF-8 by default for the
locale encoding, so mojibake issues with filenames should slowly become very
rare.


> It would be useful if the user could make his own checks and change the
> default filesystem-encoding if needed.

This idea was already proposed in issue #8622, but it was a big failure.
Please read the following email of mine for more information:
https://mail.python.org/pipermail/python-dev/2010-October/104509.html

--
resolution: fixed - 
status: closed - open




[issue19847] Setting the default filesystem-encoding

2013-12-02 Thread Sworddragon

Sworddragon added the comment:

> This idea was already proposed in issue #8622, but it was a big fail.

Not completely: if your locale is UTF-8 and you want to operate on a UTF-8
filesystem, all is fine. But what if you then want to operate on an NTFS
(non-UTF-8) partition? As far as I know, there is no way to apply Python
environment variables on the fly so that they affect the running interpreter.
In my opinion, this is the reason why a setter is needed here.

Otherwise the user has to make sure to use .encode() on all filesystem
operations. He must also ensure that .encode() doesn't throw any exception if
the code must be robust. And with issue http://bugs.python.org/issue19846 this
must likely be done with the content too. All this exception checking would
hugely increase the number of lines.
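
A minimal sketch of how that checking could be confined to one place,
assuming the caller knows the encoding of the target filesystem. The helper
name encode_filename is hypothetical and not part of any Python API:

```python
def encode_filename(name, encoding):
    """Encode name for a filesystem that uses the given encoding."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError as exc:
        # Report a single, clear error instead of letting the codec
        # exception surface at every call site.
        raise ValueError(
            f"{name!r} cannot be represented in {encoding}"
        ) from exc


print(encode_filename("caf\u00e9.txt", "latin-1"))
```

The resulting bytes can then be handed to the bytes-accepting filesystem
functions, so only this helper needs to deal with encoding failures.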

Is there a particular reason against such a setter? The immediate advantage
would be a huge increase in the maintainability of Python scripts that rely
on high stability.

--




[issue19847] Setting the default filesystem-encoding

2013-11-30 Thread Sworddragon

New submission from Sworddragon:

sys.getfilesystemencoding() says for Unix: "On Unix, the encoding is the user’s
preference according to the result of nl_langinfo(CODESET), or 'utf-8' if
nl_langinfo(CODESET) failed."

In my opinion, relying on the locale environment is risky since
filesystem-encoding != locale. This is especially the case when working on a
filesystem on external media such as an external hard disk drive. Operating
on multiple media can also result in different filesystem-encodings.

It would be useful if the user could make his own checks and change the
default filesystem-encoding if needed.

--
components: IO
messages: 204853
nosy: Sworddragon
priority: normal
severity: normal
status: open
title: Setting the default filesystem-encoding
type: enhancement
versions: Python 3.3
