Marc-Andre Lemburg <m...@egenix.com> added the comment:

STINNER Victor wrote:
> 
> STINNER Victor <victor.stin...@haypocalc.com> added the comment:
> 
>> The command line -h explanation is missing from the patch.
> 
> done
> 
>> The documentation should mention that the env var is only
>> read once; subsequent changes to the env var are not seen
>> by Python
> 
> I copied the PYTHONIOENCODING doc which doesn't mention that. Does Python 
> re-read other environment variables at runtime? Anyway, I changed the doc to:
> 
> +   If this is set before running the interpreter, it overrides the encoding used
> +   for the filesystem encoding (see :func:`sys.getfilesystemencoding`).
> 
> I also changed PYTHONIOENCODING doc. Is it better?

Yes, thanks.
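
A minimal sketch of the read-once behaviour discussed above. It assumes only that sys.getfilesystemencoding() reports a value fixed at interpreter startup; PYTHONFSENCODING is the variable proposed in this patch and may not be recognised by a given interpreter, in which case the check passes trivially.

```python
import os
import sys

# Value chosen once, while the interpreter was starting up.
encoding_at_startup = sys.getfilesystemencoding()

# Changing the environment of the running process comes too late:
# the variable (proposed in this issue) is only consulted at startup.
os.environ["PYTHONFSENCODING"] = "ascii"

encoding_after_change = sys.getfilesystemencoding()
unchanged = (encoding_at_startup == encoding_after_change)
print(encoding_at_startup, encoding_after_change, unchanged)
```

A child process started after the assignment would inherit the new value, which is why the documentation says "set before running the interpreter".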

>> If the codec lookup fails, Python should either issue a warning
> 
> Ok, done. I also patched get_codeset() and get_codec_name() to always set a 
> Python error.
> 
>> ... and then ignore the env var (using the get_codeset() API).
> 
> Good idea, done.
> 
>> Unrelated to the env var, but still important: if get_codeset()
>> does not return a known codec, Python should issue a warning
>> before falling back to the default setting. Otherwise, a
>> Python user will never know that there's an issue and this
>> makes debugging a lot harder.
> 
> It does already write a message to stderr, but it doesn't explain why it 
> failed.
> 
> I changed initfsencoding() to display two messages on a get_codeset() error: 
> the first explains why get_codeset() failed (with the Python error), and the 
> second says that we fall back to utf-8.
> 
> Full example (PYTHONFSENCODING error and simulated get_codeset() error):
> ---
> PYTHONFSENCODING is not a valid encoding:
> LookupError: unknown encoding: xxx
> Unable to get the locale encoding:
> ValueError: CODESET is not set or empty
> Unable to get the filesystem encoding: fallback to utf-8
> ---

Looks good !
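
The fallback order in the example output can be modelled roughly as follows. This is a hypothetical Python sketch, not the C code in the patch: choose_fs_encoding() and its parameters are names invented for illustration, and the locale codeset is passed in as a plain string instead of being read through get_codeset().

```python
import codecs
import io


def choose_fs_encoding(env_value, locale_codeset, err):
    """Hypothetical model: env var first, then locale, then utf-8."""
    if env_value:
        try:
            return codecs.lookup(env_value).name
        except LookupError as exc:
            # Warn, then ignore the env var and continue with the locale.
            err.write("PYTHONFSENCODING is not a valid encoding:\n")
            err.write("%s: %s\n" % (type(exc).__name__, exc))
    try:
        if not locale_codeset:
            raise ValueError("CODESET is not set or empty")
        return codecs.lookup(locale_codeset).name
    except (LookupError, ValueError) as exc:
        # Say why the locale encoding could not be used.
        err.write("Unable to get the locale encoding:\n")
        err.write("%s: %s\n" % (type(exc).__name__, exc))
    err.write("Unable to get the filesystem encoding: fallback to utf-8\n")
    return "utf-8"


# Simulate both failures from the example: bad env var, empty codeset.
log = io.StringIO()
result = choose_fs_encoding("xxx", "", log)
print(log.getvalue(), end="")
print(result)
```

Passing a writable stream instead of writing to sys.stderr directly is only a convenience for the sketch; it makes the messages easy to inspect.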

>> We should also add a new sys.setfilesystemencoding() ...
> 
> No, I plan to REMOVE this function. sys.setfilesystemencoding() is dangerous 
> because it introduces a lot of inconsistencies: this function is unable to 
> reencode all filenames in all objects (eg. Python is unable to find filenames 
> in user objects or 3rd party libraries). Eg. if you change the filesystem 
> encoding from utf8 to ascii, it will not be possible to use existing non-ascii 
> (unicode) filenames: they will raise UnicodeEncodeError. Like 
> sys.setdefaultencoding() in Python 2, I think that sys.setfilesystemencoding() 
> is the root of evil :-)

Sorry, I wasn't aware we had such a function (and was looking at the
wrong file so didn't find it).
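
The UnicodeEncodeError case mentioned above can be reproduced without touching the interpreter's real filesystem encoding. A minimal sketch, using an invented filename and plain str.encode() to stand in for the encoding step:

```python
# A non-ASCII filename that was perfectly usable while the encoding was utf-8.
name = "caf\u00e9.txt"
utf8_bytes = name.encode("utf-8")

# After a hypothetical switch to ascii, the same name can no longer be encoded.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(utf8_bytes, ascii_failed)
```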

> At startup, initfsencoding() sets the filesystem encoding using the locale 
> encoding. Even for the startup process (with very few objects), it's very 
> hard to find all filenames:
>  - sys.path
>  - sys.meta_path
>  - sys.modules
>  - sys.executable
>  - all code objects
>  - and I'm not sure that the list is complete
> 
> See #9630 for the details.
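
To give an idea of how scattered these filenames are, here is a rough sketch that gathers them from some of the places listed above. collect_known_filenames() is a hypothetical helper, it only looks at functions stored directly in module namespaces, and, as the list above already warns, it is not complete.

```python
import sys
import types


def collect_known_filenames():
    """Hypothetical helper: gather filenames from a few well-known places."""
    found = {
        "sys.path": [p for p in sys.path if isinstance(p, str)],
        "sys.executable": [sys.executable] if sys.executable else [],
        "sys.modules": [],
        "code objects": [],
    }
    for module in list(sys.modules.values()):
        filename = getattr(module, "__file__", None)
        if isinstance(filename, str):
            found["sys.modules"].append(filename)
        namespace = getattr(module, "__dict__", None)
        if not isinstance(namespace, dict):
            continue
        for value in list(namespace.values()):
            # Each function carries its own copy of the filename.
            if isinstance(value, types.FunctionType):
                found["code objects"].append(value.__code__.co_filename)
    return found


found = collect_known_filenames()
for place, names in found.items():
    print(place, len(names))
```

Even this partial scan misses methods, nested code objects, and anything held by user objects or third-party libraries, which is the difficulty described above.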
> 
> To remove sys.setfilesystemencoding(), I already patched PEP 383 tests 
> (r84170) and I will open a new issue. But it may be better to commit both 
> changes (remove the function and add PYTHONFSENCODING) at the same time.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8622>
_______________________________________