On Fri, Feb 17, 2006, "Martin v. Löwis" wrote:
> Josiah Carlson wrote:
>>
>> How are users confused?
>
> Users do
>
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
>
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.

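For anyone puzzled by why an .encode() call raises a *decode* error, here is a rough sketch of what happens behind the quoted traceback, written for Python 3 where the step has to be spelled out. It assumes the name was stored as Latin-1 bytes (that is what byte 0xf6 suggests); the variable names are only illustrative.

```python
# Python 3 sketch of the implicit step behind the quoted Python 2 traceback.
# Assumption: the literal was stored as Latin-1 bytes, which matches byte 0xf6.
raw = "Martin v. L\u00f6wis".encode("latin-1")  # what a Python 2 str would hold

error = None
try:
    # Python 2 first decoded the byte string with the default 'ascii' codec,
    # and only then applied the requested .encode("utf-8").
    raw.decode("ascii").encode("utf-8")
except UnicodeDecodeError as exc:
    error = exc

# Naming the real source encoding yields Unicode text, which can be encoded.
text = raw.decode("latin-1")
utf8 = text.encode("utf-8")

print(error)
print(repr(text), utf8)
```

The failure comes from the hidden ASCII decode, not from the UTF-8 encode the user asked for. Decoding explicitly with the right codec first makes the later encode succeed.
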
The problem is that they don't understand that "Martin v. Löwis" is not
Unicode -- once all strings are Unicode, this is guaranteed to work.

While it's not absolutely true, my experience of watching Unicode
confusion is that the simplest approach for newbies is: encode FROM
Unicode, decode TO Unicode.  Most people, when they start playing with
Unicode, think of it as just another text encoding rather than as
suddenly replacing "the universe" as the most basic form of text.
-- 
Aahz ([EMAIL PROTECTED])           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev