New submission from Hirokazu Yamamoto <[EMAIL PROTECTED]>:

Hello. I found another problem related to issue2301.
The SyntaxError cursor "^" is shifted to the right when multibyte
characters appear in the line before the "^".

I think this is because err->text is stored as UTF-8,
which requires 3 bytes for each of these multibyte characters,
whereas cp932 (my console encoding) requires only 2 bytes for each.

So the "^" is shifted 5 bytes to the right because there are 5 multibyte characters.

C:\Documents and Settings\WhiteRabbit>py3k x.py
push any key....

  File "x.py", line 3
    print "あいうえお"
                          ^
SyntaxError: invalid syntax
[22567 refs]
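
The 5-byte shift can be checked with a short sketch. It assumes the string
literal from x.py above and only compares encoded lengths; it does not
touch the tokenizer itself.

```python
# Compare the encoded length of the same text in UTF-8 and in cp932.
text = "あいうえお"  # the five multibyte characters from x.py

utf8_len = len(text.encode("utf-8"))    # 3 bytes per character here
cp932_len = len(text.encode("cp932"))   # 2 bytes per character here

# The difference is how far a UTF-8 byte offset overshoots on a cp932 console.
print(utf8_len, cp932_len, utf8_len - cp932_len)  # prints: 15 10 5
```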

Sorry, I didn't know what PyTokenizer_RestoreEncoding was really doing.
That function adjusts err_ret->offset to account for this encoding conversion,
so Python 2.5 prints the cursor in the right place. (Of course, if the source
encoding is not compatible with the console encoding, a broken string is printed,
but the cursor is still in the right place.)

C:\Documents and Settings\WhiteRabbit>py a.py
  File "a.py", line 2
    x "、「、、、ヲ、ィ、ェ"
                 ^
SyntaxError: invalid syntax
[8728 refs]
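
The kind of offset adjustment described above might look roughly like the
following. This is only an illustration in Python, not the C code of
PyTokenizer_RestoreEncoding; the helper name console_offset and its
parameters are hypothetical.

```python
def console_offset(line_utf8: bytes, utf8_offset: int, console_encoding: str) -> int:
    """Translate a byte offset into a UTF-8 line into the matching byte
    offset after the line is re-encoded in console_encoding.

    Hypothetical helper for illustration only.
    """
    # Decode only the part of the line that precedes the error position.
    prefix = line_utf8[:utf8_offset].decode("utf-8", errors="replace")
    # Its length in the console encoding is the adjusted offset.
    return len(prefix.encode(console_encoding, errors="replace"))


line = '    print "あいうえお"'.encode("utf-8")
# An offset at the end of the line: 27 bytes in UTF-8, 22 bytes in cp932.
print(len(line), console_offset(line, len(line), "cp932"))  # prints: 27 22
```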

I tried to fix this problem, but I'm not sure how to fix it.

----------
components: None
messages: 63895
nosy: ocean-city
severity: normal
status: open
title: [Py3k] SyntaxError cursor shifted if multibyte character is in line.
versions: Python 3.0

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2382>
__________________________________