"Guido van Rossum" <[EMAIL PROTECTED]> writes:

> When a source file contains a string literal with an out-of-range \U
> escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> offending literal, I get this, without any indication of the file or
> line:
>
> UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> position 0-9: illegal Unicode character
>
> This is quite hard to track down. (Both the location of the bad
> literal in the source file, and the origin of the error in the parser.
> :-) Can someone come up with a fix?
>
> I note that raw escapes show a slightly different error. I also note
> that the same issue exists for u"..." literals in Python 2.5.

For what it's worth, I posted a patch to ast.c against the 2.6 trunk
which massages the unicode exception into a SyntaxError showing the
location. That approach lets unicodeobject.c handle the gory details
while ast.c handles the SyntaxError generation.

It might be a solution until something deeper along the lines of
Martin's thoughts is possibly developed.

I don't think that any reference adjustments are needed, but someone
should check the patch.

www.python.org/sf/1755885

-- 
KBK
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
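
For readers who want to see the shape of that approach without reading the C, here is a rough Python sketch. It is not the patch itself: decode_literal and its parameters are hypothetical names invented for illustration, and the real change lives in ast.c. The idea is only that the codec keeps doing the escape decoding, and the caller repackages any UnicodeDecodeError as a SyntaxError carrying the file name and line.

```python
def decode_literal(body, filename, lineno, source_line):
    """Hypothetical helper: decode the escapes in a string literal body.

    `body` is the literal's contents as bytes, without the quotes.
    """
    try:
        # The codec handles the escape details (the unicodeobject.c role).
        return body.decode("unicode_escape")
    except UnicodeDecodeError as exc:
        # Repackage the codec error with a location (the ast.c role).
        # The offset is left as None because this sketch does not track
        # where the literal starts within the line.
        message = "(unicode error) %s" % exc
        raise SyntaxError(message, (filename, lineno, None, source_line)) from None


# Usage: an out-of-range \U escape now reports where it came from.
try:
    decode_literal(b"\\U12345678", "example.py", 3, 's = "\\U12345678"')
except SyntaxError as err:
    print(err.filename, err.lineno, err.msg)
```

The caller already knows which file and line it is compiling, so it is the natural place to attach that information; the codec does not need to learn anything about source locations.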
