Keo Sophon wrote:
> Hi,
>
> Today i tested u=unicode(str,'utf-8') and u=str.decode('utf-8'). Then in both
> case I used:
>
> if isinstance(u,str):
>     print "just string"
> else:
>     print "unicode"
>
> the result of both case are "unicode". So it seems u=unicode(str,'utf-8') and
> u=str.decode('utf-8') are the same. How about the processing inside? is it
> same?

I don't know the details of how they are implemented, but they do have the same result. As far as I know, you can use whichever form you find more readable.

There are a few special-purpose encodings for which the result of decode() is a byte string rather than a unicode string; for these encodings, you have to use str.decode(). For example:

In [42]: 'abc'.decode('string_escape')
Out[42]: 'abc'

In [44]: unicode('abc', 'string_escape')
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in ?
TypeError: decoder did not return an unicode object (type=str)

Kent
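
For anyone who wants to check both points in one script, here is a minimal sketch. It rests on a few assumptions that are not in the thread: the text_type shim is only there so the script runs on Python 2 and Python 3, the sample bytes are made up, and 'hex_codec' stands in for 'string_escape' (which Python 3 does not have) as another codec whose decoder returns a byte string.

```python
import codecs
import sys

# Illustrative shim (not from the thread): the text type is `unicode`
# on Python 2 and `str` on Python 3.
if sys.version_info[0] == 2:
    text_type = unicode
else:
    text_type = str

# Made-up sample: the UTF-8 bytes for "cafe" with an e-acute.
raw = b'caf\xc3\xa9'

# The two spellings from the question.
via_constructor = text_type(raw, 'utf-8')
via_method = raw.decode('utf-8')
same_result = (via_constructor == via_method
               and type(via_constructor) is type(via_method))

# 'hex_codec' is a bytes-to-bytes codec, used here in place of
# 'string_escape'. codecs.decode() accepts it and returns a byte string.
hex_bytes = b'616263'
decoded = codecs.decode(hex_bytes, 'hex_codec')

# The constructor form refuses a decoder that does not produce text:
# TypeError on Python 2, LookupError on recent Python 3 releases.
try:
    text_type(hex_bytes, 'hex_codec')
    constructor_failed = False
except (TypeError, LookupError):
    constructor_failed = True

print(same_result)
print(decoded == b'abc')
print(constructor_failed)
```

If the assumptions hold, the first check confirms that both forms give equal objects of the same type, and the last one shows the constructor form rejecting a codec that returns bytes.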