New submission from Baptiste Mispelon:
When a syntax error happens, the exception that gets printed has an extra line
with a caret that helps locate the error.
If the line also contains an identifier with non-ASCII characters, then this
caret is misaligned (too far to the right).
I've investigated briefly and it seems that the offset attribute on the
SyntaxError has a wrong value:
for varname in ['a', 'é', '蟒']:  # 1, 2 and 3 bytes
    try:
        exec("%s$" % varname)  # SyntaxError
    except SyntaxError as e:
        print(e.offset)  # should be 2
The example above prints 2, 3, and 4 when it should be printing 2 every time.
It seems that the calculation of the offset takes into account the size in
bytes instead of the size in characters.
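To illustrate the suspected bytes-versus-characters mismatch without depending
on any particular interpreter version, here is a minimal sketch. It assumes the
reported offset is a 1-based index into the UTF-8 encoding of the line;
char_offset is a hypothetical helper written for this illustration, not
something that exists in CPython.

```python
def char_offset(line, byte_offset, encoding="utf-8"):
    """Convert a 1-based byte offset into a 1-based character offset.

    Hypothetical helper for illustration only; assumes the offset was
    computed against the encoded bytes of the line.
    """
    # Bytes that precede the reported position.
    prefix = line.encode(encoding)[:byte_offset - 1]
    # errors="ignore" drops a trailing partial character if the byte
    # offset happens to land in the middle of a multi-byte sequence.
    return len(prefix.decode(encoding, errors="ignore")) + 1


for varname in ['a', 'é', '蟒']:  # 1, 2 and 3 bytes in UTF-8
    line = "%s$" % varname
    # Simulate the byte-based offset of the '$' (assumed, not read from
    # a real SyntaxError): 2, 3 and 4 respectively.
    byte_based = len(varname.encode("utf-8")) + 1
    print(byte_based, char_offset(line, byte_based))
```

Under that assumption, the second number printed is 2 for all three
identifiers, which is the value the caret needs in order to line up.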
I've tested and reproduced the issue on 3.2.2 and on a recent clone of the
mercurial repository (dd5e98ddcd39).
----------
components: Interpreter Core
messages: 172470
nosy: bmispelon
priority: normal
severity: normal
status: open
title: Wrong offset on SyntaxError when identifier contains non-ascii characters
type: behavior
versions: Python 3.2, Python 3.4
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue16173>
_______________________________________