On Tue, 10 Aug 2010 02:24:00 pm Dave Angel wrote:

> > repr() returns the string
> > representation, not the byte representation. Try this:
>
> That's what I was missing. Somehow I assumed it was converting to
> byte strings.
>
> I had assumed that it reverted to /uxxxx or /Uxxxxxxxx whenever a
> character was outside the ASCII range. That would be a direct analog
> to what seems to happen on byte strings. Does it only escape
> newlines and single quotes ?

It seems to escape control characters, such as \x00, \x11, \x88 or \x99,
but nothing else. There's probably something in the Unicode standard
that specifies which characters need to be escaped and which don't.

[...]

> Any suggestions how to fix the Windows console to interpret utf8?

I don't know about Windows, but under Linux there is a menu command for
most xterms that lets you set it. Googling led me to this page:

which, if I've read it right, suggests that you should be able to type:

chcp 65001

at the DOS prompt before launching Python, and it theoretically will use
UTF-8.

Good luck.

-- 
Steven D'Aprano
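
For what it's worth, the question of what repr() escapes can be checked directly. Below is a minimal sketch, assuming Python 3; escaped_by_repr() is a made-up helper for illustration, not something from the standard library. It compares each character with what repr() produces and also reports str.isprintable(), which appears to be the rule repr() follows for non-ASCII text.

```python
# Sketch (Python 3 assumed): which one-character strings does repr() escape?
# escaped_by_repr() is a hypothetical helper, not a standard-library function.

def escaped_by_repr(ch):
    """Return True if repr() writes ch as a backslash escape."""
    # repr() wraps the text in quotes; strip them and compare with the input.
    return repr(ch)[1:-1] != ch

samples = ["a", "\u00e9", "\u20ac", "\x00", "\x11", "\x88", "\x99", "\n", "'", "\\"]

for ch in samples:
    # ascii() is used for display so the output is safe on any console.
    print(ascii(ch), "escaped:", escaped_by_repr(ch), "printable:", ch.isprintable())
```

The control characters, the newline and the backslash should be reported as escaped, while accented letters and the euro sign come through untouched. If you want \xe9-style escapes for everything outside the ASCII range, ascii() does that rather than repr().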