Re: Non-unicode file names

2018-08-09 Thread Thomas Jollans
On 09/08/18 05:13, INADA Naoki wrote:
> Please use Python 3.7.
>
> Python 3.7 has several improvements on this area.
Thanks! Darkly remembering something about UTF-8 mode, I suspected it might...
>
> * When PEP 538 or 540 is used, default error handler for stdio is
> surrogateescape
> * You can

Re: Non-unicode file names

2018-08-08 Thread Marko Rauhamaa
INADA Naoki :
> For Python 3.6, I think best way to allow arbitrary bytes on stdout is
> using `PYTHONIOENCODING=utf-8:surrogateescape` environment variable.
Good info!
Marko
--
https://mail.python.org/mailman/listinfo/python-list
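As a rough illustration of the environment-variable approach quoted above, the sketch below starts a child interpreter with `PYTHONIOENCODING=utf-8:surrogateescape` and captures what it writes to stdout. The file name `caf\xe9.txt` and the variable names are made up for the example, and the snippet assumes the child interpreter honours the variable (that is, it is not started with `-E` or `-I`).

```python
import os
import subprocess
import sys

# Hypothetical child program: it decodes a non-UTF-8 byte string the way
# Python decodes file names on POSIX, then prints the resulting str.
child_code = "print(b'caf\\xe9.txt'.decode('utf-8', 'surrogateescape'))"

# Copy the current environment and set the stdio encoding and error handler.
env = dict(os.environ, PYTHONIOENCODING="utf-8:surrogateescape")

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    check=True,
)

# The lone surrogate is turned back into the original byte on output.
print(result.stdout)
```

Without the variable, and with a strict error handler on stdout, the same `print` call in the child would raise `UnicodeEncodeError` instead.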

Re: Non-unicode file names

2018-08-08 Thread INADA Naoki
Please use Python 3.7. Python 3.7 has several improvements in this area.
* When PEP 538 or 540 is used, the default error handler for stdio is surrogateescape
* You can call sys.stdout.reconfigure(errors='surrogateescape')
For Python 3.6, I think the best way to allow arbitrary bytes on stdout is using `PYTH
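A minimal sketch of the `reconfigure` call mentioned in this message, assuming Python 3.7 or later. To keep the example self-contained it uses an in-memory text stream as a stand-in for `sys.stdout`; the file name is invented, and on a real program the call would be `sys.stdout.reconfigure(errors='surrogateescape')`.

```python
import io

# A file name as raw bytes that is not valid UTF-8 (Latin-1 encoded "e acute").
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte to a lone surrogate.
name = raw_name.decode("utf-8", errors="surrogateescape")

# Stand-in for sys.stdout: a text stream over a byte buffer, strict at first.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="utf-8", errors="strict")

# With the strict handler, the lone surrogate cannot be encoded.
try:
    stream.write(name)
    stream.flush()
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Switch the error handler in place (TextIOWrapper.reconfigure, Python 3.7+).
stream.reconfigure(errors="surrogateescape")
stream.write(name)
stream.flush()

# The buffer now holds the original bytes of the file name.
written = buffer.getvalue()
```

The point of the round trip is that the bytes written out are the same bytes the file system reported, even though they never formed valid UTF-8.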

Re: Non-unicode file names

2018-08-08 Thread Cameron Simpson
On 09Aug2018 03:14, MRAB wrote: [...] Is it true that Unix filenames can contain control characters, e.g. \x07? Yep. They're just byte strings. You can't have \0 (NUL) because the API uses NUL terminated strings, and you can't use slash '/' in the filename components because that is the comp
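The byte-string view of file names described here can be sketched as follows, assuming a POSIX system with an ASCII-compatible locale encoding such as UTF-8. The helper `is_valid_component` and the sample name are hypothetical; the helper only checks the two forbidden bytes named above and ignores length limits and the special entries `.` and `..`.

```python
import os


def is_valid_component(name: bytes) -> bool:
    """Return True if name has no NUL byte and no slash and is non-empty."""
    return bool(name) and b"\0" not in name and b"/" not in name


# A name containing a control character (BEL, \x07) and a byte that is
# not valid UTF-8 (\xff); both are acceptable to the kernel.
weird = b"bell\x07-\xff.txt"

# os.fsdecode() turns the bytes into str using the file system encoding
# with the surrogateescape handler on POSIX; os.fsencode() reverses that.
as_text = os.fsdecode(weird)
round_tripped = os.fsencode(as_text)

print(is_valid_component(weird))
print(is_valid_component(b"a/b"))
print(round_tripped == weird)
```

Functions such as `os.listdir()` also accept a `bytes` path and then return `bytes` names, which avoids the decoding step altogether.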

Re: Non-unicode file names

2018-08-08 Thread MRAB
On 2018-08-09 01:14, Thomas Jollans wrote: On 09/08/18 01:48, MRAB wrote: On 2018-08-08 23:16, Thomas Jollans wrote: On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems th

Re: Non-unicode file names

2018-08-08 Thread Thomas Jollans
On 09/08/18 01:48, MRAB wrote:
> On 2018-08-08 23:16, Thomas Jollans wrote:
>> On *nix, file names are bytes. In real life, we prefer to think of file
>> names as strings. How non-ASCII file names are created is determined by
>> the locale, and on most systems these days, every locale uses UTF-8 an

Re: Non-unicode file names

2018-08-08 Thread MRAB
On 2018-08-08 23:16, Thomas Jollans wrote: On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems these days, every locale uses UTF-8 and everybody's happy. Of course this does

Non-unicode file names

2018-08-08 Thread Thomas Jollans
On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems these days, every locale uses UTF-8 and everybody's happy. Of course this doesn't mean you'll never run into an old direct