Re: Bug in libiconv?

simrw Wed, 26 Jan 2011 04:15:43 -0800

> Here's what happens on Cygwin:
>
> $ gcc -g -o ic ic.c -liconv
> $ ./ic
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> in = <Liian pitkä sana>, inbuf = <>, inbytesleft = 0, outbytesleft = 480
>
> So, AFAICS, there are two problems:
>
>   - Even though iconv_open has been called explicitly with "UTF-8" as
>     the input codeset, the conversion still depends on the current
>     application codeset.  That doesn't make sense.
>
>   - Even though the last parameter to iconv is defined in bytes, the
>     value of outbytesleft after the conversion is the number of remaining
>     wchar_t's, not the number of remaining bytes.  That's contrary to
>     what POSIX defines, see
>     http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html


IMHO, the count is correct.
On Windows/Cygwin, wchar_t is 2 bytes; on Linux it is 4 bytes.
So the buffer is 512 bytes.
In the first 3 cases, 10 input bytes were consumed and written as 10
wchar_t's, so there remain in the buffer (512 - 20) = 492 bytes.
In the last case all 17 input bytes (16 characters, since the ä takes
2 bytes in UTF-8) are consumed, so there remain in the buffer
(512 - 32) = 480 bytes.

Roger


