On Wed, Mar 17, 2021 at 1:11 AM Michał Górny <mgo...@gentoo.org> wrote:
> On Wed, 2021-03-17 at 13:55 +0900, Inada Naoki wrote:
> > OK. setuptools doesn't specify encoding at all. So locale-specific
> > encoding is used.
> > We can not fix it in short term.
>
> How about writing paths as bytestrings in the long term? I think this
> should eliminate the necessity of knowing the correct encoding for
> the filesystem.
>

On Linux and many Unixes, there is no "correct" filesystem encoding.
ASCII and UTF-8 are probably the most common encodings for individual
filenames, maybe even for large collections of files, but nevertheless,
paths are bytestrings.

Treating paths as UTF-8 works fine for most files, but once in a while
there'll be a filename that fails to convert, and that's not the fault
of the filename.

For example, what happens if you need to handle a file created like this?

    touch "Ma$(echo | tr '\012' '\361')ana"

That name contains the single byte 0xF1 (octal 361), which is "ñ" in
Latin-1 but is not valid UTF-8 in that position.

For a presentation application, for example, assuming UTF-8 is probably
fine, maybe even a good thing. But for a filesystem backup tool, it's
important not to assume an encoding, so that you can back up and restore
all filenames irrespective of what encoding the files' creators intended.
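To make the point concrete, here is a minimal sketch in Python of what happens to the name from the touch example. It only manipulates the bytes in memory and never touches a real filesystem. The helper names (is_valid_utf8, roundtrip) are made up for illustration; the "surrogateescape" error handler is the one PEP 383 added, and as far as I know it is also what os.fsdecode() and os.fsencode() use on Unix.

```python
# The filename bytes produced by the touch example: "Ma", byte 0xF1, "ana".
raw_name = b"Ma\xf1ana"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the name decodes as strict UTF-8 (hypothetical helper)."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def roundtrip(name: bytes) -> bytes:
    """Decode to str and back without losing undecodable bytes (hypothetical helper)."""
    # surrogateescape maps each undecodable byte to a lone surrogate
    # code point, and maps it back to the same byte when encoding.
    text = name.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


utf8_ok = is_valid_utf8(raw_name)    # strict UTF-8 rejects this name
restored = roundtrip(raw_name)       # the original bytes survive the round trip
as_text = raw_name.decode("utf-8", errors="surrogateescape")
```

A backup tool that keeps names as bytes end to end (for instance by passing bytes to os.listdir() and open()) sidesteps the question entirely. The round trip above is only needed when a str has to exist somewhere in between, and the resulting str cannot be written out as strict UTF-8 text.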
Python-Dev mailing list -- python-dev@python.org
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HLTFATPMRA57UU3KQOXHIMELZZGXUUJJ/