Hi Martijn.

On Tue, Sep 23, 2008 at 12:57:12PM +0200, Martijn van Oosterhout wrote:
> On Tue, Sep 23, 2008 at 12:46:50PM +0200, Gerfried Fuchs wrote:
> >  When you then press tab to go to the next entry, type e for edit, you
> > have the New description: line in front of you, and the current text. Add
> > an 8bit character, and the cursor will move not one character but two to
> > the right. When you move the cursor back with the arrow key it jumps two
> > characters instead of only one.
>
> Aha. Interactive mode, that makes a difference. The problem is then in
> pal_rl_ncurses_hack which does indeed seem to place the cursor wrongly
> when using multibyte characters. Now all we need is a way to determine
> the correct location and it can be fixed...

I took a *quick* look at the code before I read your mail again, and my
first guess was that the line "readline_x = col + strlen(locale_prompt);"
in pal_rl_get_raw_line() gets the length wrong, since strlen() counts
bytes rather than display columns, and thus indirectly causes this bug.
I might be wrong, though.

When fixing this bug we should keep in mind that a single UTF-8
character might need two columns to be displayed properly, e.g. some
Chinese characters. wcwidth(3) and wcswidth(3) look suitable for this:
they return the display width of one wide character and of one
wide-character string, respectively.

Since we don't use wchar_t*, we probably need to convert between
multibyte strings (what we have now) and wide-character strings before
we can call the above-mentioned functions. mbsrtowcs(), wcsrtombs() and
mbtowc() can be used for this.
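
A minimal sketch of that conversion, not taken from pal's source: the
helper name display_width() is made up for illustration, and it uses
mbstowcs() for brevity instead of the restartable mbsrtowcs(). It
assumes the caller has already called setlocale(LC_ALL, "") so that the
multibyte functions follow the user's locale.

```c
#define _XOPEN_SOURCE 700   /* needed for wcswidth() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not part of pal: returns the number of terminal
 * columns needed to display the multibyte string s in the current
 * locale. Returns -1 if s is not a valid multibyte string, if it
 * contains a non-printable character, or if memory runs out. */
int display_width(const char *s)
{
    assert(s != NULL);

    /* With a NULL destination, mbstowcs() only counts the wide
     * characters (terminator excluded). */
    size_t n = mbstowcs(NULL, s, 0);
    if (n == (size_t)-1)
        return -1;

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;

    mbstowcs(w, s, n + 1);          /* convert, including terminator */
    int width = wcswidth(w, n);     /* columns, or -1 if non-printable */
    free(w);
    return width;
}
```

With something like this, the strlen(locale_prompt) call could
presumably be replaced by display_width(locale_prompt), but I have not
checked what else depends on readline_x.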

Alternatively, mbstowcs(NULL, s, 0) could be used to count the number
of characters in a string. This would be simpler, but it would fix the
bug only for languages that do not use characters needing more than one
column to be displayed correctly.
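
The simpler alternative could look roughly like this. Again only a
sketch: char_count() is a hypothetical name, and the locale is assumed
to have been set with setlocale(LC_ALL, "") beforehand.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of pal: returns the number of
 * characters (not bytes) in the multibyte string s, or -1 if s is not
 * valid in the current locale. */
long char_count(const char *s)
{
    assert(s != NULL);

    /* A NULL destination makes mbstowcs() count without converting. */
    size_t n = mbstowcs(NULL, s, 0);
    return (n == (size_t)-1) ? -1 : (long)n;
}
```

In a UTF-8 locale this counts a two-byte character once where strlen()
reports 2, but it still counts a double-width character as one column.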

I found http://www.cl.cam.ac.uk/~mgk25/unicode.html and
http://www.chemie.fu-berlin.de/chemnet/use/info/libc/libc_18.html
very helpful for gaining some deeper knowledge about Unicode.

Grepping for strlen in pal's source code and rethinking what we want to
achieve in each place (the number of columns needed to display a
character, the number of bytes, the number of characters ...) might
point us to some unreported bugs.


Regards
Carsten


