Edit report at https://bugs.php.net/bug.php?id=48147&edit=1

 ID:                 48147
 Updated by:         ezy...@php.net
 Reported by:        kulakov74 at yandex dot ru
 Summary:            iconv with //IGNORE cuts the string
-Status:             Bogus
+Status:             Re-Opened
 Type:               Bug
 Package:            ICONV related
 Operating System:   Linux
 PHP Version:        5.*, 6CVS (2009-05-05)
 Block user comment: N
 Private report:     N

 New Comment:

I think I understand how to fix this bug without modifying glibc. We need to 
modify our invocation of iconv to mirror the behavior of 
iconv_prog.c:process_block() when the '-c' flag is set. If we mimic that code 
closely enough, we also get sensible block-by-block processing as a bonus, 
which is better than the horrible over-allocation iconv does right now. In 
particular, we need to handle the EILSEQ error code correctly.
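
To illustrate the kind of EILSEQ handling meant here, a minimal C sketch 
follows. It is not the actual patch and not a copy of process_block(): the 
helper name convert_skip_invalid() is hypothetical, the output buffer is 
fixed-size to keep the example short, and instead of relying on //IGNORE it 
simply steps over one input byte each time iconv() reports EILSEQ, then 
resumes the conversion.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not PHP's code):
 * convert `in` from `from` to `to`, dropping bytes that iconv() rejects
 * with EILSEQ. Returns the number of bytes written to `out`, or -1 on
 * failure (unsupported conversion, output buffer too small, other error).
 */
long convert_skip_invalid(const char *from, const char *to,
                          const char *in, size_t in_len,
                          char *out, size_t out_cap)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;      /* iconv() wants char **, input is not modified */
    size_t in_left = in_len;
    char *outp = out;
    size_t out_left = out_cap;
    int failed = 0;

    while (in_left > 0) {
        size_t n = iconv(cd, &inp, &in_left, &outp, &out_left);
        if (n != (size_t)-1)
            continue;            /* success: all remaining input was consumed */

        if (errno == EILSEQ) {
            /* Offending sequence: step over one byte and resume. */
            if (in_left > 0) {
                inp++;
                in_left--;
            }
            continue;
        }
        if (errno == EINVAL)
            break;               /* truncated multibyte sequence at the end: drop it */

        failed = 1;              /* E2BIG or an unexpected error */
        break;
    }

    /* Flush any pending shift state for stateful target encodings. */
    if (!failed && iconv(cd, NULL, NULL, &outp, &out_left) == (size_t)-1)
        failed = 1;

    iconv_close(cd);
    return failed ? -1 : (long)(outp - out);
}
```

A real fix would grow the output buffer on E2BIG rather than giving up, which 
is where the block processing from process_block() would come in.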


Previous Comments:
------------------------------------------------------------------------
[2011-12-18 22:34:38] ezy...@php.net

Upstream bugs:

http://sources.redhat.com/bugzilla/show_bug.cgi?id=13517
http://sources.redhat.com/bugzilla/show_bug.cgi?id=13518

------------------------------------------------------------------------
[2011-12-18 19:37:53] ezy...@php.net

Not broken in the latest version of libiconv:

ezyang@javelin:~/Desktop/libiconv-1.14/src$ ./iconv_no_i18n --version
iconv (GNU libiconv 1.14)
Copyright (C) 2000-2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Bruno Haible.
ezyang@javelin:~/Desktop/libiconv-1.14/src$ ./iconv_no_i18n -f utf-8 -t iso-8859-1//IGNORE ~/iconv.html | wc -c
15312
ezyang@javelin:~/Desktop/libiconv-1.14/src$ iconv -f utf-8 -t iso-8859-1//IGNORE ~/iconv.html | wc -c
iconv: illegal input sequence at position 8168
8157

------------------------------------------------------------------------
[2009-05-07 13:58:21] j...@php.net

We still can't fix bugs in the glibc iconv implementation. Try this on the 
command line and you get the same results:

# iconv -f utf-8 -t iso-8859-1 iconv.html > /dev/null
iconv: illegal input sequence at position 3589

# iconv -f utf-8 -t iso-8859-1//IGNORE iconv.html > /dev/null
iconv: illegal input sequence at position 8168


------------------------------------------------------------------------
[2009-05-07 07:50:52] lbarn...@php.net

Marked it as verified as I got exactly the same results:

The first iconv() call (the one without //IGNORE) fails on the ellipsis 
character "…" (value="Search…"), which can't be represented in ISO-8859-1.

The second iconv() call (the one with //IGNORE) fails later (so the ellipsis is 
ignored, which may mean that the //IGNORE flag is supported), and there is no 
apparent reason for it to fail at offset 8157 (only regular ASCII chars around).

------------------------------------------------------------------------
[2009-05-06 18:36:10] j...@php.net

Arnaud: Please don't reopen bogus bugs without explanation. 

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=48147

