First of all, if you run this in a console, find out which encoding the console uses. Mine is an English Windows XP console, which uses 'cp437'.

C:\>chcp
Active code page: 437

Then

>>> s = "José"
>>> u = u"Jos\u00e9"  # same thing in unicode escape
>>> s.decode('cp437') == u  # use the encoding that matches your console
True
>>>

wy

> This is probably stupid and/or misguided but supposing I'm passed a
> byte-string value that I want to be unicode, this is what I do. I'm sure
> I'm missing something very important.
>
> Short version :
>
> >>> s = "José" #Start with non-unicode string
> >>> unicoded = eval("u'%s'" % "José")
>
> Long version :
>
> >>> s = "José" #Start with non-unicode string
> >>> s #Lets look at it
> 'Jos\xe9'
> >>> escaped = s.encode('string_escape')
> >>> escaped
> 'Jos\\xe9'
> >>> unicoded = eval("u'%s'" % escaped)
> >>> unicoded
> u'Jos\xe9'
>
> >>> test = u"José" #What they should have passed me
> >>> test == unicoded #Am I really getting the same thing?
> True #Yay!
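
On the eval approach quoted above: as far as I can tell it only works because the byte string there is latin-1 encoded ('Jos\xe9'), so each \xNN escape happens to name the right code point. On a cp437 console the same é arrives as byte 0x82 and the trick gives the wrong character. eval is also risky on input you do not control, since a stray quote in the data breaks the expression or runs arbitrary code. Below is a minimal sketch of doing the same job with an explicit decode. The helper name to_unicode and the 'utf-8' fallback are my own assumptions for illustration, not anything from the original post, and the byte literals are written so it runs on both Python 2.6+ and Python 3.

```python
import sys

def to_unicode(raw, encoding=None):
    """Decode a byte string to text; return text objects unchanged.

    Hypothetical helper for illustration. If no encoding is given it
    falls back to the console's encoding, and then to 'utf-8' (an
    assumption; pick whatever your data source really uses).
    """
    if not isinstance(raw, bytes):
        return raw  # already a unicode/text object
    if encoding is None:
        # sys.stdout.encoding can be None, e.g. when output is piped.
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return raw.decode(encoding)

expected = u"Jos\u00e9"

# The same character is a different byte in each code page.
print(to_unicode(b"Jos\x82", "cp437") == expected)    # e-acute is 0x82 in cp437
print(to_unicode(b"Jos\xe9", "latin-1") == expected)  # e-acute is 0xE9 in latin-1
```

Both comparisons print True. The point is that the caller has to know (or be told) which encoding produced the bytes; there is no way to recover that from the byte string alone.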