Re: [Python-Dev] transform() and untransform() methods, and the codec registry

Stephen J. Turnbull Sat, 04 Dec 2010 00:33:48 -0800

Alexander Belopolsky writes:

 > In fact, once the language moratorium is over, I will argue that
 > str.encode() and byte.decode() should deprecate encoding argument and
 > just do UTF-8 encoding/decoding.  Hopefully by that time most people
 > will forget that other encodings exist.  (I can dream, right?)


It's just a dream.  There's a pile of archival material out there,
often on read-only media, that won't be transcoded any more quickly
than the inscriptions on Tutankhamun's tomb.

Remember, Python is a language used to implement such translations.
It's not an application.  I think it would be reasonable to make UTF-8
the *default* encoding on all platforms, except for internal OS
functions, where Windows will presumably continue to use UTF-16 and
*nix distros will probably continue to agree to disagree about whether
the on-disk format is NFD or NFC (the Python language as yet doesn't
know about NFC vs. NFD, although the library does, via unicodedata).
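
To make the two points above concrete, here is a minimal sketch (an
illustration only, not something proposed in this thread).  The sample
bytes and the choice of Latin-1 as the legacy encoding are assumptions
made for the example.  It shows that legacy data has to be decoded with
its own codec before it can be re-encoded as UTF-8, and that the NFC/NFD
distinction is only visible through the library's unicodedata module.

```python
import unicodedata

# Assumed sample: bytes as they might sit in a Latin-1 archive.
legacy = b"caf\xe9"

# Transcoding needs the original codec named explicitly; decoding
# these bytes as UTF-8 would raise UnicodeDecodeError.
text = legacy.decode("latin-1")
utf8 = text.encode("utf-8")

# str itself does not normalize: the composed (NFC) and decomposed
# (NFD) spellings compare unequal until one is normalized to the other.
nfc = unicodedata.normalize("NFC", text)
nfd = unicodedata.normalize("NFD", text)

print(utf8)
print(len(nfc), len(nfd))
print(nfc == nfd)
print(unicodedata.normalize("NFC", nfd) == nfc)
```

A UTF-8-only decode() would leave no way to write the first half of
this script, which is the reason for keeping the encoding argument and
changing only the default.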

In the discussion of PEP 263, I proposed that the external encoding of
Python scripts themselves be fixed as UTF-8, and other encodings would
have to be pretranslated by an appropriate codec.  That was shouted
down by the European contingent, who wanted to continue using Latin-1
and Latin-2 without codecs or a wrapper to call them transparently.
However, this time around you might get a more sympathetic hearing.
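
For what "pretranslated by an appropriate codec" could look like in
practice, here is a rough sketch under stated assumptions: the sample
script, its Latin-1 encoding, and the variable names are all made up
for illustration.  It uses the standard library's
tokenize.detect_encoding() to read a PEP 263 declaration and then
re-encodes the source as UTF-8.

```python
import io
import tokenize

# Assumed sample: a script saved as Latin-1 with a PEP 263 declaration.
source = "# -*- coding: latin-1 -*-\nname = 'caf\u00e9'\n".encode("latin-1")

# detect_encoding() reads the declaration from the first two lines,
# the same way the tokenizer does, and returns a normalized codec name.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# Pretranslating then amounts to: decode with the declared codec,
# re-encode as UTF-8.  A real tool would also rewrite or drop the
# declaration line, which this sketch leaves untouched.
pretranslated = source.decode(encoding).encode("utf-8")

print(encoding)
print(pretranslated)
```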
