[EMAIL PROTECTED] wrote:

> Im totally new to Python so please bare with me.

That's no problem, really. I don't use a spellchecker, either, and it
wouldn't have protected you from that particular typo...

> Data is entered into my program using the folling code -
>
> str = raw_input(command)
> words = str.split()
>
> for word in words:
>     word = unicode(word,'latin-1')
>     word.encode('utf8')
>
> This gives an error:
>
>   File "C:\Python25\lib\encodings\cp850.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\x94' in
> position 0
> : character maps to <undefined>
>
> but the following works.
>
> str = raw_input(command)
> words = str.split()
>
> for word in words:
>     uni = u""
>     uni = unicode(word,'latin-1')
>     uni.encode('utf8')

Here you show us the same code twice, as the

uni = u""

assignment has no effect, and a traceback that is probably generated when
you try to print uni.

Here's my guess: The encoding you actually need is cp850, the same one that
your Python interpreter is trying to use, and in which unichr(0x94) is
undefined. In general, you are not free to use a random encoding; rather,
you have to use what your console expects.

import sys

s = raw_input(command)
s = unicode(s, sys.stdin.encoding)  # trust python to find out the proper
                                    # encoding. If that fails use a constant,
                                    # probably "cp850"
words = s.split()
for word in words:
    print word  # trust python, but if it doesn't work out:
    # word = word.encode("cp850")
    # print word

By the way, strings are immutable (cannot be altered once created), so the
following

> word.encode('utf8')
> print word

is actually spelt

word = word.encode("utf8")
print word

If your data is not read from the console and it contains characters that
cannot be printed, unicode.encode() accepts a second parameter to deal with
it, see

>>> help(u"".encode)

Peter
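
PS: To make the guess above concrete, here is a minimal sketch of what probably happened, followed by the second (errors) parameter of encode() in action. It assumes the console really is cp850 and that the character typed was "ö"; the traceback does not say either. The helper name encode_for_console is made up for the illustration. The sketch uses b"" literals, so it needs Python 2.6 or newer (it also runs on Python 3); on 2.5, drop the b prefix.

```python
# Sketch only. Assumption: the console delivers cp850 bytes and the user
# typed "ö", which cp850 stores as the single byte 0x94.
raw = b"\x94"

wrong = raw.decode("latin-1")  # U+0094, a control character
right = raw.decode("cp850")    # U+00F6, the "ö" that was meant


def encode_for_console(text, encoding="cp850", errors="replace"):
    """Hypothetical helper: encode text, passing the errors handler through."""
    return text.encode(encoding, errors)


# The default handler is "strict", which raises, as in the traceback.
try:
    wrong.encode("cp850")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

replaced = encode_for_console(wrong)                             # "?" stands in
ignored = encode_for_console(wrong, errors="ignore")             # character dropped
escaped = encode_for_console(wrong, errors="backslashreplace")   # escape sequence
round_trip = encode_for_console(right)                           # original byte again
```

Decoding with the console's own encoding makes the round trip lossless; the errors handlers only decide what happens to characters the target encoding cannot represent.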