Peter Landgren <peter.tal...@telia.com> added the comment:

I am not sure I can follow you. I will try to be more specific.
The test string originally consists of a single character, the Czech Š.

1. On Linux with Python 2.6.4

1.1 If I keep the original code line order:

    label = obj.get()
    print type(label), repr(label)
    label = " ".join(label.split())
    print type(label), repr(label)
    label = unicode(label)
    if len(label) > 40:
        label = label[:40] + "..."

Both print type(label), repr(label) lines give:

    <type 'str'> '\xc5\xa0'

1.2 If I change the order and do the unicode conversion first:

    label = obj.get()
    label = unicode(label)
    print type(label), repr(label)
    label = " ".join(label.split())
    print type(label), repr(label)
    if len(label) > 40:
        label = label[:40] + "..."

Both print type(label), repr(label) lines give:

    <type 'unicode'> u'\u0160'

2. On Windows with Python 2.6.5

2.1 With the original code line order, the print type(label), repr(label) lines give:

    <type 'str'> '\xc5\xa0'
    <type 'str'> '\xc5'
    8217: ERROR: gramps.py: line 138: Unhandled exception ....

2.2 If I change the order and do the unicode conversion first, both print type(label), repr(label) lines give:

    <type 'unicode'> u'\u0160'

3. If I use this little code:

    # -*- coding: utf-8 -*-
    label = 'Š'
    print type(label), repr(label)
    label = " ".join(label.split())
    print type(label), repr(label)

I get

    <type 'str'> '\xc5\xa0'
    <type 'str'> '\xc5\xa0'

on both Linux and Windows.

The examples under 1. and 2. above come from an application, Gramps. There is still something I don't understand.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8859>
_______________________________________
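
One possible reading of case 2.1, offered as an assumption rather than a confirmed diagnosis: the UTF-8 encoding of Š (U+0160) is the two bytes 0xC5 0xA0, and 0xA0 is the no-break space in Latin-1. If a byte-string split() treats 0xA0 as whitespace, the second byte is dropped and only '\xc5' remains, which is no longer valid UTF-8. The sketch below is written for Python 3 (the thread uses Python 2.6) and does not call the Python 2 str.split() at all; split_join is a hypothetical helper that only imitates " ".join(label.split()) with a configurable whitespace set.

```python
# Python 3 sketch. split_join is a hypothetical helper, not a library function.
ASCII_WHITESPACE = b" \t\n\r\x0b\x0c"


def split_join(data, whitespace):
    """Imitate " ".join(data.split()) for bytes with a chosen whitespace set."""
    words = []
    current = bytearray()
    for byte in data:
        if byte in whitespace:
            # A whitespace byte ends the current word, if there is one.
            if current:
                words.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    if current:
        words.append(bytes(current))
    return b" ".join(words)


raw = "\u0160".encode("utf-8")  # the UTF-8 bytes of the Czech S with caron

# Only ASCII whitespace: both bytes survive (as reported in 1.1 and 3).
ascii_only = split_join(raw, ASCII_WHITESPACE)

# Assumption: 0xA0 also counts as whitespace, so the second byte is lost
# (this matches the '\xc5' reported in 2.1).
with_nbsp = split_join(raw, ASCII_WHITESPACE + b"\xa0")

# Decoding to text before splitting keeps the character intact (1.2 and 2.2).
decoded_first = " ".join(raw.decode("utf-8").split())

print(repr(raw), repr(ascii_only), repr(with_nbsp), repr(decoded_first))
```

Under that assumption, converting to unicode before calling split() avoids the problem because the split then operates on whole characters rather than on individual bytes.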