>> The ability to change the default encoding is a misfeature. There's
>> essentially no way to write correct Python code in the presence of
>> this feature.
>
> How so? If every single piece of text in your project is encoded in a
> superset of ascii (such as utf-8), why would this be a problem?

What is "every single piece of text"? Every string occurring in source
code? Or also every single string that may be read from a file, a
socket, out of a database, or from a user interface? How can you be
certain that any string is UTF-8 when doing any reasonable IO?

> Even if you were evil/stupid and mixed encodings, surely all you'd get
> is different unicode errors or maybe the odd strange character during
> display?

One specific problem is that dictionaries will stop working correctly
if you set the default encoding to anything but ASCII. The reason is
that with UTF-8 as the default encoding, you get

py> u"\u20ac" == u"\u20ac".encode("utf-8")
True
py> hash(u"\u20ac") == hash(u"\u20ac".encode("utf-8"))
False

So objects that compare equal will not hash equal. As a consequence,
you may end up with two different values for what should be the same
key in a dictionary.
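For concreteness, a session along these lines (Python 2; reload(sys)
is the usual hack to get setdefaultencoding back after site.py deletes
it at startup) shows the duplicated key:

py> import sys
py> reload(sys)                  # site.py removes setdefaultencoding
py> sys.setdefaultencoding("utf-8")
py> d = {}
py> d[u"\u20ac"] = 1
py> d[u"\u20ac".encode("utf-8")] = 2
py> u"\u20ac" == u"\u20ac".encode("utf-8")  # the keys compare equal...
True
py> len(d)                       # ...yet the dict holds both of them
2

Which of the two entries a lookup finds then depends on the type of
the key you happen to look up with.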
What is "every single piece of text"? Every string occurring in source code? or also every single string that may be read from a file, a socket, out of a database, or from a user interface? How can you be certain that any string is UTF-8 when doing any reasonable IO? > Even if you were evil/stupid and mixed encodings, surely all you'd get > is different unicode errors or mayvbe the odd strange character during > display? One specific problem is dictionaries will stop working correctly if you set the default encoding to anything but ASCII. The reason is that with UTF-8 as the default encoding, you get py> u"\u20ac" == u"\u20ac".encode("utf-8") True py> hash(u"\u20ac") == hash(u"\u20ac".encode("utf-8")) False So objects that compare equal will not hash equal. As a consequence, you may have two different values for what should be the same key in a dictionary. > Well, flipping that giant switch has worked in production for the past 5 > years, so I'm afraid I'll respectfully disagree. I'd suspect the > pragmatics of real world software are with that function even exists, > and it's extremely useful when used correctly... It has worked in your application. See my example above: it is very easy to create applications that stop working correctly if you use setdefaultencoding (at all - the only supported value is "latin-1", since Unicode strings hash the same as byte strings if all characters are in row 0). Regards, Martin _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com