2008/10/11 Damian Johnson <[EMAIL PROTECTED]>

> Hi, when getting text via the raw_input method it's always a string (even
> if it contains non-ASCII characters). The problem lies in that whenever I
> try to check equality against a Unicode string it fails. I've tried using
> the unicode method to 'cast' the string to the Unicode type but this throws
> an exception:
>

Python needs to know the encoding of the bytestring in order to convert it
to Unicode. If you don't specify an encoding, ASCII is assumed, which fails
for any bytestring that actually contains non-ASCII data. Since you are
reading the string from standard input, try using the encoding associated
with stdin:

>>> a = raw_input("text: ")
text: おはよう
>>> b = u"おはよう"
>>> import sys
>>> unicode(a,sys.stdin.encoding) == b
True
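
One caveat: sys.stdin.encoding can be None when stdin is not a terminal
(for example when input is piped in), so it may be worth falling back to a
default. A minimal sketch of that idea follows. The helper name
decode_input and the UTF-8 fallback are my own assumptions, not anything
from the standard library, and the bytes name assumes Python 2.6 or later:

```python
# -*- coding: utf-8 -*-
import sys


def decode_input(raw, encoding=None, fallback="utf-8"):
    """Return raw as Unicode text, decoding it first if it is a bytestring.

    decode_input is a hypothetical helper; the UTF-8 fallback is an assumption.
    """
    if not isinstance(raw, bytes):
        return raw  # already Unicode text, nothing to decode
    # sys.stdin.encoding may be None (e.g. piped input), so fall back.
    encoding = encoding or getattr(sys.stdin, "encoding", None) or fallback
    return raw.decode(encoding)


# Stand-in for the bytestring raw_input() would return on a UTF-8 terminal.
typed = u"おはよう".encode("utf-8")
print(decode_input(typed, "utf-8") == u"おはよう")
```

In your case you would call decode_input(raw_input("text: ")) and compare
the result against the Unicode string.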

Karen
--
http://mail.python.org/mailman/listinfo/python-list
