Alexander Belopolsky writes:

> In fact, once the language moratorium is over, I will argue that
> str.encode() and byte.decode() should deprecate encoding argument and
> just do UTF-8 encoding/decoding. Hopefully by that time most people
> will forget that other encodings exist. (I can dream, right?)

It's just a dream. There's a pile of archival material, often on R/O media, out there that won't be transcoded any more quickly than the inscriptions on Tutankhamun's tomb. Remember, Python is a language used to implement such translations. It's not an application.

I think it would be reasonable to make UTF-8 the *default* encoding on all platforms, except for internal OS functions, where Windows will presumably continue to use UTF-16 and *nix distros will probably continue to agree to disagree about whether on-disk format is NFD or NFC (and the Python language as yet doesn't know about NFC v. NFD, although the library does).

In the discussion of PEP 263, I proposed that the external encoding of Python scripts themselves be fixed as UTF-8, and other encodings would have to be pretranslated by an appropriate codec. That was shouted down by the European contingent, who wanted to continue using Latin-1 and Latin-2 without codecs or a wrapper to call them transparently. However, this time around you might get a more sympathetic hearing.
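
To make the archival point concrete, here is a minimal sketch of the kind of translation I mean. The sample bytes are made up for illustration (the word "café" as a Latin-1 system would have stored it); the calls are the ordinary str/bytes methods plus unicodedata.normalize from the standard library.

```python
import unicodedata

# Hypothetical legacy payload: "café" as written by a Latin-1 system.
legacy_bytes = b"caf\xe9"

# A UTF-8-only decode() cannot read this archive at all.
try:
    legacy_bytes.decode("utf-8")
    utf8_only_works = True
except UnicodeDecodeError:
    utf8_only_works = False

# With an explicit encoding argument, the translation is trivial.
text = legacy_bytes.decode("latin-1")
utf8_bytes = text.encode("utf-8")

# Normalization lives in the library, not in the language:
# NFC keeps "é" as one code point, NFD splits it into "e" + combining accent.
nfc = unicodedata.normalize("NFC", text)
nfd = unicodedata.normalize("NFD", text)
```

Take away the encoding argument and the first step has nowhere to go, which is why I would settle for UTF-8 as a default rather than as the only choice.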