[issue18118] curses utf8 output broken in Python2

2013-06-03 Thread STINNER Victor

STINNER Victor added the comment:

"Sounds sensible. Are you aware of a workaround for this issue? I.e.
is there any way to force Python2.7 to use the wide mode for
outputting characters?"

I don't think that it is possible to work around this issue; it is a
bug in the design of curses, related to Unicode. I suppose that
libncursesw uses an array of wchar_t characters when the *_wch() and
*wstr() functions are used, whereas your version appears to use an array
of char and so is unable to understand that a character is
composed of two bytes (e.g. b"\xc3\xa4" for u"ä").
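
To make the byte-versus-character distinction concrete, here is a small sketch in plain Python (no curses involved). It only compares the length of the text with the length of its UTF-8 encoding; the variable names are illustrative and not part of any curses API.

```python
# -*- coding: utf-8 -*-
text = u"\u00e4"                # the single character u"ä"
encoded = text.encode("utf-8")  # what a char-based API receives

print(len(text))      # number of characters (code points)
print(len(encoded))   # number of bytes
print(repr(encoded))  # the raw UTF-8 byte sequence
```

A char-based interface counts `len(encoded)` units where the user sees `len(text)` characters, which is the mismatch described above.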

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18118] curses utf8 output broken in Python2

2013-06-02 Thread helmut

helmut added the comment:

> I suppose that screen.addstr(0, 0, u"äöü".encode("utf-8")) works.

It works in the sense that the output looks as expected. Long lines with UTF-8 
characters will make it break again, though.

screen.addstr(0, 0, "äöü" * 20) # assuming COLUMNS=80

This gives two rows of characters, of which the first row is 40 characters long.
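
The 40-character first row is consistent with a simple model in which every byte occupies one screen cell. The sketch below is only that model, assuming an 80-column terminal; it is not how ncurses is actually implemented, and `COLUMNS`, `rows` and `chars_per_row` are names chosen for illustration.

```python
# -*- coding: utf-8 -*-
COLUMNS = 80  # assumed terminal width, as in the comment above

# "äöü" * 20 as the UTF-8 byte string that Python 2 passes to addstr()
line = (u"\u00e4\u00f6\u00fc" * 20).encode("utf-8")

# Hypothetical model: one screen cell per byte, wrapping after COLUMNS cells.
rows = [line[i:i + COLUMNS] for i in range(0, len(line), COLUMNS)]

# Count how many visible characters end up in each row.
chars_per_row = [len(row.decode("utf-8")) for row in rows]
print(len(line), chars_per_row)
```

Under this model the 60 characters occupy 120 cells, so the line wraps after 80 bytes, i.e. after 40 two-byte characters.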

> If "_cursessomething.so" is already linked against libncursesw.so.5, the fix 
> is to use waddwstr(), but such change cannot be done in a minor release like 
> Python 2.7.6. So I'm closing this issue as wont fix => you have to move to 
> Python 3.3.

Sounds sensible. Are you aware of a workaround for this issue? I.e. is there 
any way to force Python2.7 to use the wide mode for outputting characters?

--




[issue18118] curses utf8 output broken in Python2

2013-06-02 Thread STINNER Victor

STINNER Victor added the comment:

u"äöü" encoded to "utf-8" gives '\xc3\xa4\xc3\xb6\xc3\xbc'

"\303\303\303\274" is '\xc3\xc3\xc3\xbc'.

I guess that curses considers that '\xc3\xa4' is a string of 2 characters: 
screen.addstr(0, 1, "ö") replaces the second "character", '\xa4'.
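
This guess can be checked against the strace output with a small simulation. The sketch below models the screen row as one cell per byte, which is an assumption made for illustration and not the real ncurses data structure; the `addstr` helper here is hypothetical and merely mimics the overwrite behaviour.

```python
# -*- coding: utf-8 -*-
def addstr(cells, x, data):
    """Hypothetical char-based addstr(): each byte overwrites one cell."""
    cells[x:x + len(data)] = data

cells = bytearray(b" " * 8)  # one blank screen row, one byte per cell

# The three calls from the test case, with the strings as UTF-8 bytes.
addstr(cells, 0, u"\u00e4".encode("utf-8"))  # "ä" -> \xc3\xa4
addstr(cells, 1, u"\u00f6".encode("utf-8"))  # "ö" -> \xc3\xb6
addstr(cells, 2, u"\u00fc".encode("utf-8"))  # "ü" -> \xc3\xbc

result = bytes(cells).rstrip()
print(repr(result))
```

Each call overwrites the second byte of the previous character, which leaves two stray lead bytes followed by one intact "ü".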

I suppose that screen.addstr(0, 0, u"äöü".encode("utf-8")) works.

If "_cursessomething.so" is already linked against libncursesw.so.5, the fix is 
to use waddwstr(), but such change cannot be done in a minor release like 
Python 2.7.6. So I'm closing this issue as wont fix => you have to move to 
Python 3.3.

--
resolution:  -> wont fix
status: open -> closed




[issue18118] curses utf8 output broken in Python2

2013-06-02 Thread helmut

helmut added the comment:

All reproducers confirmed that their _cursessomething.so is linked against 
libncursesw.so.5.

--




[issue18118] curses utf8 output broken in Python2

2013-06-02 Thread STINNER Victor

STINNER Victor added the comment:

Is your Python curses module linked to libncurses.so.5 or libncursesw.so.5? 
Example:

$ ldd /usr/lib/python2.7/lib-dynload/_cursesmodule.so |grep curses
libncursesw.so.5 => /lib/libncursesw.so.5 (0x00375000)

libncursesw has much better Unicode support than libncurses.

Since Python 3.3, the Python curses.window.addstr() method uses waddwstr() when 
the module is linked to libncursesw, which also improves the Unicode support.

--




[issue18118] curses utf8 output broken in Python2

2013-06-02 Thread R. David Murray

R. David Murray added the comment:

I believe this is one of a class of bugs that are fixed in Python3, and that 
are unlikely to be fixed in Python2.  I'll defer to Victor, though, who made a 
number of curses unicode fixes in Python3.

--
nosy: +haypo, r.david.murray
title: curses utf8 output broken -> curses utf8 output broken in Python2




[issue18118] curses utf8 output broken

2013-06-02 Thread helmut

New submission from helmut:

Consider the test case below.

<<<
#!/usr/bin/python
# -*- encoding: utf8 -*-

import curses

def wrapped(screen):
    screen.addstr(0, 0, "ä")
    screen.addstr(0, 1, "ö")
    screen.addstr(0, 2, "ü")
    screen.getch()

if __name__ == "__main__":
    curses.wrapper(wrapped)
>>>

Expected output: "äöü"
Output on py3.3: as expected
Output on py2.7.3: "?ü"
The actual bytes (as determined by strace) were "\303\303\303\274". Observe the 
inclusion of broken utf8 sequences.

This issue was initially discovered on Debian sid, but was independently confirmed 
on Arch Linux and on two more, unknown systems.

--
components: Library (Lib)
messages: 190479
nosy: helmut
priority: normal
severity: normal
status: open
title: curses utf8 output broken
type: behavior
versions: Python 2.7
