Ron Adam wrote:
> Guido van Rossum wrote:
>> That would be great! This will automatically turn \u1234 into 6
>> characters, right?
>
> I'm not exactly clear when the '\uxxxx' characters get converted. There
> isn't any conversion done in tokenizer.c that I can see. It's primarily
> only concerned with finding the beginning and ending of the string at that
> point. It looks like everything between the beginning and end is just
> passed along "as is" and it's translated further later in the chain.

Look at Python/ast.c, which has functions parsestr() and decode_unicode().
The latter calls PyUnicode_DecodeRawUnicodeEscape() which I think is the
function you're looking for.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
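
For illustration only: as far as I know, the "raw_unicode_escape" codec is the Python-level counterpart of PyUnicode_DecodeRawUnicodeEscape(), so the difference between converting and not converting \u1234 can be sketched from Python itself. This is an assumption-laden sketch run on a modern Python 3, not the code path in Python/ast.c; the variable names are made up for the example.

```python
# Six ASCII bytes as they would appear in source text: \ u 1 2 3 4
source_bytes = b"\\u1234"

# Decoding with the raw-unicode-escape codec converts \uXXXX sequences
# into the corresponding code point and leaves other backslashes alone.
decoded = source_bytes.decode("raw_unicode_escape")

# Decoding as plain ASCII performs no escape processing at all,
# so the backslash and the five characters after it survive.
undecoded = source_bytes.decode("ascii")

print(len(decoded))    # 1
print(len(undecoded))  # 6

# Other escapes such as \n are not interpreted by this codec.
print(len(b"\\n".decode("raw_unicode_escape")))  # 2
```

If raw string literals skip this decoding step, \u1234 stays as the six characters Guido mentions; if the step runs, it collapses to a single character.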
