[EMAIL PROTECTED] wrote:

uhm ... then there is a misprint in the discussion of the recipe;
BTW what's the difference between .encode and .decode ?
(yes, I have been living in happy ASCII-land until now ... ;)


# -*- coding: latin-1 -*-


# Here I make a unicode string
unicode_file = u'Some danish characters æøå' #.encode('hex')
print type(unicode_file)
print repr(unicode_file)
print ''


# I can convert this unicode string to an ordinary string.
# Because æøå are in the latin-1 charmap, it can be understood as
# a latin-1 string.
# The æøå characters even have the same values in both.
latin1_file = unicode_file.encode('latin-1')
print type(latin1_file)
print repr(latin1_file)
print latin1_file
print ''


## I can *not* convert it to ascii
#ascii_file = unicode_file.encode('ascii')
#print ''


# I can also convert it to utf-8
utf8_file = unicode_file.encode('utf-8')
print type(utf8_file)
print repr(utf8_file)
print utf8_file
print ''


# utf8_file is now an ordinary string. Again, it can help to think of it as a
# file format.
#
# I can convert this file/string back to unicode by using the decode method.
# It tells Python to decode this "file format" as utf-8 when it loads it into a
# unicode string. And we are back where we started.



unicode_file = utf8_file.decode('utf-8')
print type(unicode_file)
print repr(unicode_file)
print ''


# So basically you can encode a unicode string into a special string/file format
# and you can decode a string from a special string/file format back into unicode.
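
For anyone reading this on Python 3, here is a rough sketch of the same round trip. It is not part of the script above, which is Python 2. In Python 3 the `str` type plays the role of `unicode` and `bytes` plays the role of the old `str`. The variable names are only for illustration.

```python
# Python 3 sketch (assumption: the script in this post is Python 2).
# The text is written with escapes so the example does not depend on
# the encoding of the source file.
text = 'Some danish characters \xe6\xf8\xe5'  # ends in the letters ae, oe, aa

# encode: text -> bytes in a specific "file format"
latin1_bytes = text.encode('latin-1')
utf8_bytes = text.encode('utf-8')

# decode: bytes in a specific "file format" -> text
roundtrip_latin1 = latin1_bytes.decode('latin-1')
roundtrip_utf8 = utf8_bytes.decode('utf-8')

# ascii has no code for these letters, so encoding to it fails
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(type(text).__name__, type(utf8_bytes).__name__)
print(roundtrip_latin1 == text, roundtrip_utf8 == text)
print(ascii_ok)
```

The point is the same in both versions: encode goes from text to bytes, decode goes from bytes to text, and you have to name the same encoding in both directions to get back what you started with.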



###################################


<type 'unicode'>
u'Some danish characters \xe6\xf8\xe5'

<type 'str'>
'Some danish characters \xe6\xf8\xe5'
Some danish characters æøå

<type 'str'>
'Some danish characters \xc3\xa6\xc3\xb8\xc3\xa5'
Some danish characters æøå

<type 'unicode'>
u'Some danish characters \xe6\xf8\xe5'
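
The repr lines above show one byte per Danish letter in latin-1 and two bytes per letter in utf-8. A small Python 3 sketch (again not part of the original script, names made up for illustration) makes that comparison explicit:

```python
# Python 3 sketch; compares the bytes each encoding produces per letter.
for ch in '\xe6\xf8\xe5':  # the three Danish letters from the example
    as_latin1 = ch.encode('latin-1')
    as_utf8 = ch.encode('utf-8')
    # code point, latin-1 bytes and utf-8 bytes, all shown in hex
    print(hex(ord(ch)), as_latin1.hex(), as_utf8.hex())

text = 'Some danish characters \xe6\xf8\xe5'
latin1_len = len(text.encode('latin-1'))
utf8_len = len(text.encode('utf-8'))

# latin-1 uses one byte per character; utf-8 needs one extra byte
# for each of the three non-ASCII letters
print(len(text), latin1_len, utf8_len)
```

In latin-1 the byte value equals the Unicode code point for these letters, which is why the unicode repr and the latin-1 repr look alike.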





--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
--
http://mail.python.org/mailman/listinfo/python-list
