[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2018-08-18 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

IDLE avoids the problem of calculating a location for a '^' below the bad line 
by instead asking tk to give the marked character (and maybe more) an 'ERROR' 
tag, which shows as a red background.  So it marks the '$' of 'A_I_U_E_O$' and 
the 'alid' slice of 'inv\u200balid' (from duplicate #10384).  When the marked 
character is '\n', the space following the line is tagged.  Is it possible to 
do something similar with any of the major system consoles?
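
One possibility, on terminals that honour ANSI escape sequences (an assumption; 
not every console does), is to wrap the offending character in a red-background 
sequence instead of printing a '^' line. A minimal sketch, where 
highlight_error is a hypothetical helper and not something IDLE or the 
interpreter provides:

```python
def highlight_error(line: str, offset: int) -> str:
    """Return `line` with the character at 0-based `offset` on a red background."""
    red_bg = "\x1b[41m"   # ANSI SGR code: red background
    reset = "\x1b[0m"     # ANSI SGR code: reset all attributes
    if offset >= len(line):
        # Error at end of line: tag one trailing space, as IDLE does for '\n'.
        return line + red_bg + " " + reset
    return line[:offset] + red_bg + line[offset] + reset + line[offset + 1:]


# The '$' sits at index 9 of this line.
print(highlight_error("A_I_U_E_O$ = None", 9))
```

Because the terminal itself draws the background, no column arithmetic is 
needed, so wide and zero-width characters stop mattering.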

--
nosy: +terry.reedy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2014-09-30 Thread STINNER Victor

STINNER Victor added the comment:

Issue #10384 has been marked as a duplicate of this issue: it's a similar 
issue, an identifier that contains an invisible character.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2382
___



[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2014-09-30 Thread Alexander Belopolsky

Alexander Belopolsky added the comment:

The original problem is still present


Python 3.5.0a0 (default:5313b4c0bb6c, Sep 30 2014, 18:55:45)
>>> A_I_U_E_O$ = None
  File "<stdin>", line 1
A_I_U_E_O$ = None
 ^
SyntaxError: invalid syntax

Replace A_I_U_E_O above with the Japanese script.  I get codec error from the 
server when I try to paste my session as is.

(Note that invalid character is $ above and not the Japanese AIUEO.)

Another outstanding issue is with zero-width characters.  See #10384.
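
To make the zero-width case visible when reporting it, the line can be printed 
with non-printable characters escaped. A minimal sketch (reveal_invisible is a 
hypothetical helper name, not part of any patch here):

```python
def reveal_invisible(line: str) -> str:
    """Replace non-printable characters (e.g. U+200B) with backslash escapes."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in line
    )


# U+200B ZERO WIDTH SPACE takes no column on screen but is one character.
print(reveal_invisible("inv\u200balid"))
```

With the escape shown, the caret column and the visible text agree again, at 
the cost of not printing the line exactly as written.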

--




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2014-01-21 Thread Roundup Robot

Roundup Robot added the comment:

New changeset eb7565c212f1 by Serhiy Storchaka in branch '3.3':
Issue #2382: SyntaxError cursor ^ now is written at correct position in most
http://hg.python.org/cpython/rev/eb7565c212f1

New changeset ea34b2b0b8ae by Serhiy Storchaka in branch 'default':
Issue #2382: SyntaxError cursor ^ now is written at correct position in most
http://hg.python.org/cpython/rev/ea34b2b0b8ae

--
nosy: +python-dev




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2014-01-21 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
assignee: serhiy.storchaka -> 
stage: patch review -> needs patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2014-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

If no one complains, I'll commit the last patch tomorrow.

--
assignee:  -> serhiy.storchaka
stage:  -> patch review
type:  -> behavior
versions: +Python 3.4 -Python 3.2




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2013-09-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Added tests. I think it is worth applying this patch, which fixes the issue 
for most Europeans, and then continuing to work on the issue of wide characters.

--
Added file: http://bugs.python.org/file31874/adjust_offset_2.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2013-09-25 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Removed file: http://bugs.python.org/file27506/adjust_offset-3.3.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2013-06-10 Thread Alexander Belopolsky

Alexander Belopolsky added the comment:

haypo> The purpose of this issue is to handle CJK characters taking 2 
haypo> columns instead of 1 in a terminal, or did I misunderstand it?

That's the other half of the problem, but the more common issue is a misplaced 
caret when non-ASCII characters are present:

>>> ¡™£¢∞§¶•ªº
  File "<stdin>", line 1
¡™£¢∞§¶•ªº
  ^
SyntaxError: invalid character in identifier

With Serhiy's patch:

>>> ¡™£¢∞§¶•ªº
  File "<stdin>", line 1
¡™£¢∞§¶•ªº
 ^
SyntaxError: invalid character in identifier

--
nosy: +belopolsky




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2013-06-10 Thread Alexander Belopolsky

Alexander Belopolsky added the comment:

Serhiy's patch is lacking tests, but it passes the test I proposed at #10382, 
which I am attaching here.

--
Added file: http://bugs.python.org/file30534/test.py




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2012-10-09 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2012-10-09 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
nosy: +serhiy.storchaka




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2012-10-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch upgraded to Python 3.3. It uses a slightly different approach 
and works with invalidly encoded data. unicode_utf8size.patch is not needed.

This patch fixes a half of the issue - working with non-ascii non-wide 
characters. It's enough for many people. Let's commit it and go further.

--
Added file: http://bugs.python.org/file27506/adjust_offset-3.3.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2012-10-09 Thread STINNER Victor

STINNER Victor added the comment:

> This patch fixes a half of the issue - working with non-ascii
> non-wide characters.

The purpose of this issue is to handle CJK characters taking 2 columns instead 
of 1 in a terminal, or did I misunderstand it?

--




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2011-12-20 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

What's the status of this issue?

FWIW, this is not only a problem with East Asian characters:

>>> ä äää
  File "<stdin>", line 1
ä äää
^
SyntaxError: invalid syntax

--
nosy: +petri.lehtinen
versions: +Python 3.2, Python 3.3 -Python 3.0




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2011-07-14 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

I just created the issue #12568 for unicode_width.patch.

--




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2010-07-09 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Proof of concept of patch fixing this issue:
 - parse_syntax_error() reads the text line into a PyUnicodeObject* 
instead of a const char**
 - create utf8_to_unicode_offset(): convert byte offset to a number of 
characters. The Python version should be something like:

   def utf8_to_unicode_offset(text, byte_offset):
       # Encode, cut at the byte offset, then count the characters left.
       utf8 = text.encode("utf-8")
       utf8 = utf8[:byte_offset]
       text = str(utf8, "utf-8")
       return len(text)

 - reuse adjust_offset() from 
py3k_adjust_cursor_at_syntax_error_v2.patch, but force the use of 
wcswidth() because HAVE_WCSWIDTH is not defined by configure
 - print_error_text() works on unicode characters and not on bytes!

The patch should be refactored:
 - move adjust_offset(), utf8_to_unicode_offset(), utf8_len() in 
unicodeobject.c. You might create a new method width() for the 
unicode type. This method can be used to fix center(), ljust() and 
rjust() unicode methods (see issue #3446).
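
One caveat with the Python version above: if byte_offset falls in the middle 
of a multibyte sequence, str(utf8, "utf-8") raises UnicodeDecodeError. A 
hedged variant, assuming it is acceptable to count only the complete 
characters before the offset (this is my assumption, not what the C patch 
does):

```python
def utf8_to_unicode_offset(text: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset into a character offset."""
    utf8 = text.encode("utf-8")[:byte_offset]
    # errors="ignore" drops a truncated trailing sequence instead of raising,
    # so only complete characters before the offset are counted.
    return len(utf8.decode("utf-8", errors="ignore"))


line = "\u3042\u3044\u3046\u3048\u304a x"   # five kana (3 bytes each), " x"
print(utf8_to_unicode_offset(line, 15))     # offset just after the kana
print(utf8_to_unicode_offset(line, 4))      # offset inside the second kana
```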

--
Added file: http://bugs.python.org/file13354/issue2382.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

For an easier review, I split my patch into multiple small patches:
 - unicode_utf8size.patch: create _PyUnicode_UTF8Size() function: 
number of bytes needed to encode the unicode character as UTF-8
 - unicode_width.patch: create PyUnicode_Width(): number of columns 
needed to represent the string in the current locale. -1 is returned 
in case of an error.
 - adjust_offset.patch: change the unit of SyntaxError.offset, convert 
utf8 offset to unicode offset
 - print_exception.patch: process the error text as a unicode string 
(instead of a byte string), convert offset from characters 
to columns

Dependencies:
 - adjust_offset.patch depends on unicode_utf8size.patch
 - print_exception.patch depends on unicode_width.patch

Changes since issue2382.patch:
 - PyUnicode_Width() doesn't change the locale
 - PyUnicode_Width() uses WideCharToMultiByte() on MS_WINDOWS, and 
wcswidth() otherwise (before: do nothing if HAVE_WCSWIDTH is not 
defined)
 - the offset was converted from utf8 index to unicode index only in 
print_error_text(), not on SyntaxError creation
 - _PyUnicode_UTF8Size() and PyUnicode_Width() are public

--
Added file: http://bugs.python.org/file13356/unicode_utf8size.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Added file: http://bugs.python.org/file13357/unicode_width.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Added file: http://bugs.python.org/file13358/adjust_offset.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


Added file: http://bugs.python.org/file13359/print_exception.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-17 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Comments about my own patches.

unicode_width.patch: 

* error messages should be improved:
  ValueError("Unable to compute string width") for Windows
  IOError(strerror(errno)) otherwise

adjust_offset.patch: 

* format_exception_only() from Lib/traceback.py may need a fix
* about the documentation: it looks like the unit of SyntaxError.offset is 
not documented in exceptions.rst (should it be documented, or 
left unchanged?)

print_exception.patch:

* I'm not sure of the reference counts (ref leak?)
* in case of PyUnicode_FromUnicode(text, textlen) error, 
PyFile_WriteObject(textobj, f, Py_PRINT_RAW); 
PyFile_WriteString("\n", f); is used to display the line but textobj 
may already end with "\n".
* format_exception_only() from Lib/traceback.py should do the same job 
as the fixed print_exception(): get the string width (to fix this 
issue!)

--




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-15 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

This issue is a problem of units. The error text is a utf8 *byte* 
string and offset is a number of *bytes*. The goal is to get the text 
*width* of a *character* string. We have to:
 1- convert offset from bytes number to character number
 2- get the error message as (unicode) characters
 3- get the width of text[:offset]

It's already possible to get (2) from the utf8 string, and code from 
ocean-city's patch (py3k_adjust_cursor_at_syntax_error_v2.patch) can 
be used for (3). The most difficult point is (1).

I will try to implement that.

--




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2009-03-15 Thread David W. Lambert

David W. Lambert lamber...@corning.com added the comment:

Resolution of this may be applicable to Issue3446 as well.
center, ljust and rjust are inconsistent with unicode parameters

--
nosy: +LambertDW




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-06 Thread Hirokazu Yamamoto

Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> At least my "one unicode char is one space" suggestion corrects the case 
> of Western languages, and all messages with single-width characters.

I'm not happy with this solution. ;-(

> Doesn't the exact width depend on 
> the terminal capabilities? and fonts, and combining diacritics...

I have to admit you are right. 

Nevertheless, I got coLinux (Debian), which has a locale-aware wcswidth(3), 
so I created another experimental patch.
(py3k_adjust_cursor_at_syntax_error_v2.patch)

The strategy is ...
1. Try to convert to unicode. If that fails, the offset is left unchanged.
2. If the system has wcswidth(3), try that function.
3. If the system is Windows, try WideCharToMultiByte with CP_ACP.
4. If steps 2/3 fail or the system is something else, use the unicode length 
as the offset (Amaury's suggestion).

This patch ignores file encoding. Again, this patch is experimental,
best effort, but maybe better than current state.
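
The fallback order can be illustrated in Python. This is only a sketch of the 
control flow, not the C code in the patch: width_fn is a hypothetical 
stand-in for wcswidth(3) or the WideCharToMultiByte call, and a negative 
result is assumed to mean failure.

```python
def adjusted_offset(raw: bytes, byte_offset: int, width_fn=None) -> int:
    """Best-effort column for the caret, following the four steps above."""
    try:
        prefix = raw[:byte_offset].decode("utf-8")
    except UnicodeDecodeError:
        return byte_offset        # step 1 failed: leave the offset unchanged
    if width_fn is not None:
        try:
            width = width_fn(prefix)
        except Exception:
            width = -1            # treat any platform error as a failure
        if width >= 0:
            return width          # step 2 or 3: the platform function worked
    return len(prefix)            # step 4: one column per unicode character


raw = "\u00e4 x".encode("utf-8")                 # bytes: c3 a4 20 78
print(adjusted_offset(raw, 3))                   # no platform width function
print(adjusted_offset(raw, 3, lambda s: -1))     # platform function fails
```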

P.S.
I tested this patch on coLinux with ja_JP.UTF-8 locale and manual
#define HAVE_WCSWIDTH 1
because I don't know how to change configure script.

Added file: 
http://bugs.python.org/file11707/py3k_adjust_cursor_at_syntax_error_v2.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-06 Thread Hirokazu Yamamoto

Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

The experimental patch really was experimental: wcswidth(3) returns 1 for 
East Asian Ambiguous characters.

debian:~/python-dev/py3k# ./python /mnt/windows/a.py
  File "/mnt/windows/a.py", line 3
♪xÅx abc
 ^
The caret should point at 'c'. And another one:

debian:~/python-dev/py3k# export LANG=C
debian:~/python-dev/py3k# ./python /mnt/windows/a.py
  File "/mnt/windows/a.py", line 3
\u266ax\u212bx abc
 ^
SyntaxError: invalid syntax

Please forget my patch. :-(
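
For reference, the East Asian Width class of each character in that test line 
can be checked with unicodedata. A small sketch (ambiguous_chars is a 
hypothetical helper; this only reads the Unicode database and says nothing 
about how a given terminal or font actually draws the character):

```python
import unicodedata


def ambiguous_chars(text: str) -> list:
    """Return the characters whose East Asian Width class is 'A' (ambiguous)."""
    return [ch for ch in text if unicodedata.east_asian_width(ch) == "A"]


line = "\u266ax\u212bx abc"    # EIGHTH NOTE, x, ANGSTROM SIGN, x, " abc"
for ch in ambiguous_chars(line):
    print("U+%04X" % ord(ch), unicodedata.name(ch))
```

Both non-ASCII characters fall in the ambiguous class, which is why a width 
of 1 from wcswidth(3) and a width of 2 on a Japanese console can disagree.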




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-01 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

For the moment, I'd suggest assuming that one unicode character has the same
width as the space character, and that stdout.encoding correctly
matches the terminal.

Then the C implementation could do something similar to the statements I
added in traceback.py:
offset = len(line.encode('utf-8')[:offset].decode('utf-8'))




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-01 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

This seems to be a difficult problem. Doesn't the exact width depend on 
the terminal capabilities? and fonts, and combining diacritics...

An easy way to put the caret at the same exact position is to repeat the 
beginning of the line up to the offending offset:
   print あいうえお
   print あいうえお^--
But I don't know how to make it look less ugly.

At least my "one unicode char is one space" suggestion corrects the case 
of Western languages, and all messages with single-width characters.
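
A sketch of the "repeat the beginning of the line" idea, for illustration 
only (caret_by_prefix is a hypothetical name, and offset is assumed to be a 
character offset, not a byte offset):

```python
def caret_by_prefix(line: str, offset: int) -> str:
    """Return the source line, then its own prefix up to `offset` plus '^--'."""
    return line + "\n" + line[:offset] + "^--"


print(caret_by_prefix("x = a $ b", 6))
```

Since the second line repeats the very same characters, the terminal lays 
them out identically and the caret lands in the right column whatever their 
width.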




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-01 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

See also a related issue: issue3975.

--
nosy: +haypo




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-09-30 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

I think that your patch works only for terminals where one byte of the 
encoded text is displayed as one character on the terminal. This is not 
true for utf-8 terminals, for example.

In the attached patch, I tried to write some unit tests, (I had to adapt 
the traceback module as well), and one test still fails because the 
captured stderr has a utf-8 encoding.
I think that it's better to count unicode characters.

--
nosy: +amaury.forgeotdarc
Added file: http://bugs.python.org/file11670/traceback_adjust_cursor.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-09-21 Thread Hirokazu Yamamoto

Changes by Hirokazu Yamamoto [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file9786/fix.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-09-21 Thread Hirokazu Yamamoto

Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

Patch revised.

--
components: +Interpreter Core -None
Added file: 
http://bugs.python.org/file11548/py3k_adjust_cursor_at_syntax_error.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-03-19 Thread Hirokazu Yamamoto

Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> (I assumed get_length_in_bytes(f, " ", 1) == 1 but I'm not sure
>  this is always true in other platforms. Probably nicer and more
>  general solution may exist)

This assumption still stands, but I cannot find a better solution.
I now think the attached patch is good enough.

Added file: http://bugs.python.org/file9786/fix.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-03-19 Thread Hirokazu Yamamoto

Changes by Hirokazu Yamamoto [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file9723/experimental.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-03-18 Thread Hirokazu Yamamoto

Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:

> I tried to fix this problem, but I'm not sure how to fix this.

Quick observation...

///
// Possible Solution

1. Convert err->text to a console-compatible encoding (not to the source
encoding as in python2.x) at the place where PyTokenizer_RestoreEncoding is.

2. err->text is UTF-8, and the actual output is done in
Python/pythonrun.c(print_error_text), so adjust the offset there.

///
// Solution requires...
1.
  - PyUnicode_DecodeUTF8 in Python/pythonrun.c(err_input) should
be changed to some kind of bytes API.

  - The way to write bytes to File object directly is needed.

2.
  - The way to know actual byte length of given unicode + encoding.


// Experimental patch

Attached is an experimental patch for solution 2. It looks ugly, but
seems to work in my environment.
 (I assumed get_length_in_bytes(f, " ", 1) == 1 but I'm not sure
  this is always true in other platforms. Probably nicer and more
  general solution may exist)

--
keywords: +patch
Added file: http://bugs.python.org/file9723/experimental.patch




[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-03-17 Thread Hirokazu Yamamoto

New submission from Hirokazu Yamamoto [EMAIL PROTECTED]:

Hello. I found another problem related to issue2301.
SyntaxError cursor ^ is shifted when multibyte
characters are in line (before ^).

I think this is because err->text is stored as UTF-8,
which requires 3 bytes for each multibyte character here,
but cp932 (my console encoding) actually requires only 2 bytes for it.

So ^ is shifted 5 bytes to the right because there are 5 multibyte chars.
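
The arithmetic can be checked directly. A small illustration, using the five 
hiragana from the example and Python's standard "cp932" codec:

```python
text = "\u3042\u3044\u3046\u3048\u304a"    # the five hiragana a, i, u, e, o

utf8_len = len(text.encode("utf-8"))       # 3 bytes per character
cp932_len = len(text.encode("cp932"))      # 2 bytes per character

# The difference is how far a byte-based caret drifts to the right.
print(utf8_len, cp932_len, utf8_len - cp932_len)   # 15 10 5
```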

C:\Documents and Settings\WhiteRabbit>py3k x.py
push any key

  File "x.py", line 3
print あいうえお
  ^
SyntaxError: invalid syntax
[22567 refs]

Sorry, I didn't know what PyTokenizer_RestoreEncoding was really doing.
That function adjusted err_ret->offset for this encoding conversion.
So Python 2.5 can output the cursor in the right place. (Of course, if the
source encoding is not compatible with the console encoding, a broken string
is printed. Anyway, the cursor is right.)

C:\Documents and Settings\WhiteRabbit>py a.py
  File "a.py", line 2
x 、「、、、ヲ、ィ、ェ
 ^
SyntaxError: invalid syntax
[8728 refs]

I tried to fix this problem, but I'm not sure how to fix this.

--
components: None
messages: 63895
nosy: ocean-city
severity: normal
status: open
title: [Py3k] SyntaxError cursor shifted if multibyte character is in line.
versions: Python 3.0
