Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Ben Finney
"Ian Sparks" <[EMAIL PROTECTED]> writes:

> This is probably stupid and/or misguided but supposing I'm passed a
> byte-string value that I want to be unicode, this is what I do. I'm
> sure I'm missing something very important.

Perhaps you need to read one of the good Python Unicode tutorials, such

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Kent Johnson
ianaré wrote:

> maybe a bit off topic, but how does one find the console's encoding
> from within python?

In [1]: import sys

In [3]: sys.stdout.encoding
Out[3]: 'cp437'

In [4]: sys.stdin.encoding
Out[4]: 'cp437'

Kent
--
http://mail.python.org/mailman/listinfo/python-list
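The same check can be wrapped in a small script. This is only a sketch for Python 3, not something posted in the thread: the helper name `console_encoding` is hypothetical, and the fallback to `locale.getpreferredencoding` is an assumption to cover streams that report no encoding (for example, when stdout has been replaced by another object).

```python
import locale
import sys


def console_encoding(stream=None):
    """Return the encoding a stream reports, or the locale default.

    Hypothetical helper for illustration; not part of the thread.
    """
    if stream is None:
        stream = sys.stdout
    # A replaced or redirected stream may lack the attribute or report None.
    encoding = getattr(stream, "encoding", None)
    return encoding or locale.getpreferredencoding(False)


print("stdout encoding:", console_encoding())
```

The value printed depends on the platform and terminal, so the 'cp437' shown in the session above should not be expected everywhere.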

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread John Machin
The most important thing that you are missing is that you need to know the encoding used for the 8-bit-character string. Let's guess that it's Latin1. Then all you have to do is use the unicode() builtin function, or the string decode method.

# >>> s = 'Jos\xe9'
# >>> s
# 'Jos\xe9'
# >>> u = unico
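For readers on Python 3, where unicode() no longer exists, the same idea might look like the sketch below. It assumes, as the post does, that the bytes are Latin-1; the variable names are made up for illustration, and `str(raw, "latin-1")` stands in for the Python 2 unicode() call.

```python
raw = b"Jos\xe9"  # bytes as they would arrive from a Latin-1 source (assumed)

# Two spellings of the same operation: decode with an explicitly named encoding.
text = raw.decode("latin-1")
same = str(raw, "latin-1")

assert text == same == "Jos\u00e9"

# A wrong guess can fail loudly: a lone 0xE9 byte is not valid UTF-8.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not UTF-8:", exc.reason)
```

The point carries over unchanged from the post: the decode step only gives the right answer when the named encoding matches the one the bytes were written in.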

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread ianaré
maybe a bit off topic, but how does one find the console's encoding from within python?

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread aurora
First of all, if you run this on the console, find out your console's encoding. In my case it is English Windows XP. It uses 'cp437'.

C:\>chcp
Active code page: 437

Then

>>> s = "José"
>>> u = u"Jos\u00e9" # same thing in unicode escape
>>> s.decode('cp437') == u # use encoding that
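Why the console's code page matters can be shown with a short Python 3 sketch. It is an illustration under assumptions rather than a reconstruction of the rest of the post: the text is encoded by hand so that the example does not depend on the terminal it runs in, and the variable names are hypothetical.

```python
expected = "Jos\u00e9"

# The same text yields different bytes depending on the encoding in use.
cp437_bytes = expected.encode("cp437")     # what a cp437 console would hand over
latin1_bytes = expected.encode("latin-1")  # what a Latin-1 source would hand over

print(cp437_bytes)
print(latin1_bytes)

# Decoding with the matching encoding recovers the original text.
assert cp437_bytes.decode("cp437") == expected
assert latin1_bytes.decode("latin-1") == expected

# Decoding with the wrong one succeeds silently but gives different text.
assert latin1_bytes.decode("cp437") != expected
```

The last assertion is the case to watch for: single-byte encodings such as cp437 rarely raise an error on a wrong guess, so the mistake only shows up as a wrong character later.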

Unicode question : turn "José" into u"José"

2006-04-05 Thread Ian Sparks
This is probably stupid and/or misguided but supposing I'm passed a byte-string value that I want to be unicode, this is what I do. I'm sure I'm missing something very important.

Short version :

>>> s = "José" #Start with non-unicode string
>>> unicoded = eval("u'%s'" % "José")

Long version :