New submission from Baptiste Mispelon:

When a syntax error happens, the exception that gets printed has an extra line 
with a caret that helps locate the error.

If the line also contains an identifier with non-ASCII characters, then this 
caret is misaligned (too far to the right).

I've investigated briefly and it seems that the offset attribute on the 
SyntaxError has the wrong value:

    for varname in ['a', 'é', '蟒']: # 1, 2 and 3 bytes in UTF-8
        try:
            exec("%s$" % varname) # SyntaxError
        except SyntaxError as e:
            print(e.offset) # should be 2

The example above prints 2, 3, and 4 when it should be printing 2 every time.

It seems that the offset is calculated from the size of the preceding text 
in bytes (UTF-8 encoded) instead of its size in characters.
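
To illustrate the suspected cause, here is a minimal sketch of how a byte 
offset maps back to a character offset. It assumes the offset is 1-based and 
counted in UTF-8 bytes; the helper name is hypothetical and is not part of 
CPython, and the byte offsets are computed directly rather than taken from a 
real SyntaxError:

```python
def byte_offset_to_char_offset(line, byte_offset):
    """Convert a 1-based UTF-8 byte offset into a 1-based character offset.

    Hypothetical helper for illustration only; not a CPython API.
    """
    encoded = line.encode("utf-8")
    # Bytes that precede the offending position on the line.
    prefix = encoded[:byte_offset - 1]
    # Decoding the prefix counts characters instead of bytes.
    return len(prefix.decode("utf-8", errors="replace")) + 1


for varname in ["a", "é", "蟒"]:
    source = "%s$" % varname
    # Assumed byte-based offset of the "$": bytes of the identifier, plus one.
    byte_offset = len(varname.encode("utf-8")) + 1
    print(byte_offset, byte_offset_to_char_offset(source, byte_offset))
```

Under these assumptions the byte offsets match the 2, 3 and 4 observed 
above, while the character offset is 2 in every case.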

I've tested and reproduced the issue on 3.2.2 and on a recent clone of the 
Mercurial repository (dd5e98ddcd39).

----------
components: Interpreter Core
messages: 172470
nosy: bmispelon
priority: normal
severity: normal
status: open
title: Wrong offset on SyntaxError when identifier contains non-ascii characters
type: behavior
versions: Python 3.2, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16173>
_______________________________________