On Thu, 14 Oct 2010 12:13:50 am David Hutto wrote:
> I see it now. I knew that the u outside ' ' is the coding for the
> string, but I thought I had to strip it before using it since that
> was how it showed up. The bug of course would be that graphs that
> start with u would go to the second letter, but the u would still be
> used in alphabetization, because the alphabetizing is prior to
> stripping.

No, the u is not part of the string; it is part of the *syntax* for the string, just like the quotation marks. The first character of "abc" is a and not ", and the first character of u"abc" is also a and not u.

Another way to think of it: the u" " of Unicode strings is just a delimiter, like the [ ] of lists or the { } of dicts -- it's not part of the string/list/dict.

In Python 3 this confusion is lessened. Unicode strings (characters) are written using just " as the delimiter (or ' if you prefer), instead of the u" " used by Python 2. Byte strings are written using b" " instead. This makes the common case (text strings) simple, and the uncommon case (byte strings) slightly more complicated.

--
Steven D'Aprano
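
To make the point about the u prefix concrete, here is a minimal sketch. It assumes Python 3.3 or later, which accepts the u prefix again for compatibility with Python 2 code; the variable names and the sample words are made up for illustration and are not taken from the original program.

```python
# Assumes Python 3.3+, where u"..." is accepted and means the same as "...".
text = u"abc"
plain = "abc"

first = text[0]         # the first character is 'a'; the u prefix is not in the string
same = (text == plain)  # both literals produce the same three-character string
length = len(text)      # the prefix does not count toward the length

# Hypothetical sample data: sorting and indexing both see only the real
# characters, so there is nothing to strip beforehand.
words = [u"umbrella", u"apple"]
ordered = sorted(words)
first_letters = [w[0] for w in ordered]

# A byte string is a different type; in Python 3, indexing it yields an integer.
raw = b"abc"
raw_first = raw[0]

print(first, same, length, ordered, first_letters, raw_first)
```

A word that really does begin with the letter u, such as "umbrella", keeps that u as its first character; only the prefix that sits outside the quotation marks is syntax.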