>> raw = unicode("125° 15' 5.55''", 'utf-8')
>
> Again, I think this can be simplified to
> raw = u"125° 15' 5.55''"

It does, but it gets confusing when I compare the following:

>>> raw = u"125° 15' 5.55''"
125° 15' 5.55''
>>> print u"125° 15' 5.55''"
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3-4: ordinal not in range(128)
>>> print u"125° 15' 5.55''".encode('utf-8')
125° 15' 5.55''
>>> print unicode("125° 15' 5.55''")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3: ordinal not in range(128)
>>> print unicode("125° 15' 5.55''", 'utf-8')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 3: ordinal not in range(128)

So, apart from the errors all being slightly different, is there perhaps some difference between the str() and repr() functions (it looks like repr() uses backslash escapes)? Or does it simply have to do with my locale, which is set to the default "C" (terminal = standard Mac OS X terminal, with UTF-8 encoding)? That wouldn't explain to me why the third statement works, though.

Also, when checking the default encoding at the Python command line, I see that my sys module doesn't actually have a setdefaultencoding() method; is that something that should have been configured at compile time? The documentation mentions something about the site module, but I can't find it there either.

Any enlightenment on this is welcome.

Evert

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
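
For reference, the conversions in the session above can be reproduced without a terminal by naming the codec explicitly. This is only a sketch, assuming the failing print calls fall back to the ASCII codec (as the 'ascii' in the error messages suggests); the names text and raw_bytes are illustrative and not part of the original session. It uses escapes instead of a literal degree sign so that the result does not depend on the terminal or source encoding.

```python
# Sketch: make the implicit ASCII conversions explicit.
# The names text and raw_bytes are illustrative only.

text = u"125\u00b0 15' 5.55''"         # unicode string; \u00b0 is the degree sign
raw_bytes = b"125\xc2\xb0 15' 5.55''"  # the same text as UTF-8 encoded bytes

# Decoding the UTF-8 bytes gives the same value as the unicode literal.
assert raw_bytes.decode('utf-8') == text

# Encoding to UTF-8 succeeds and round-trips to the original bytes.
assert text.encode('utf-8') == raw_bytes

# Encoding to ASCII fails: the degree sign is outside range(128).
try:
    text.encode('ascii')
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# Decoding the UTF-8 bytes as ASCII fails at the first non-ASCII byte, 0xc2.
try:
    raw_bytes.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(repr(encode_error))
print(repr(decode_error))
```

Under that assumption, an explicit .encode('utf-8') avoids the ASCII step altogether, which would be consistent with the third statement in the session being the one that works.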