ID: 48147
Updated by: j...@php.net
Reported By: kulakov74 at yandex dot ru
-Status: Verified
+Status: Bogus
Bug Type: ICONV related
Operating System: Linux
PHP Version: 5.*, 6CVS (2009-05-05)

New Comment:

We still can't fix bugs in the glibc iconv implementation. Try this on the command line and you get the same results:

# iconv -f utf-8 -t iso-8859-1 iconv.html > /dev/null
iconv: illegal input sequence at position 3589
# iconv -f utf-8 -t iso-8859-1//IGNORE iconv.html > /dev/null
iconv: illegal input sequence at position 8168

Previous Comments:
------------------------------------------------------------------------

[2009-05-07 07:50:52] lbarn...@php.net

Marked it as verified as I got exactly the same results:

The first iconv() call (the one without //IGNORE) fails on the ellipsis character "…" (value="Search…"), which can't be represented in ISO-8859-1.

The second iconv() call (the one with //IGNORE) fails later (so the ellipsis is ignored, which may mean that the //IGNORE flag is supported), and there is no apparent reason for it to fail at offset 8157 (only regular ASCII chars around).

------------------------------------------------------------------------

[2009-05-06 18:36:10] j...@php.net

Arnaud: Please don't reopen bogus bugs without explanation.

------------------------------------------------------------------------

[2009-05-06 18:18:07] kulakov74 at yandex dot ru

No. The fact that the script displays the notice "iconv(): Detected an illegal character ..." in both cases has nothing to do with whether the option is implemented: the notice is controlled by error_reporting(E_ALL). The IGNORE option only controls whether iconv stops at the character or not.

Also, the length of the resulting string is different (greater) with IGNORE. Without it the string ends exactly where the illegal character is, while with IGNORE it ends at a random point where no such characters occur.

Also, I did not mention that this is not the only file I converted. Many others were converted correctly with the option, and their length only decreased a little. But there were 2 files which were truncated, and 1 of them (the smaller) is used for the test case.

Can you run the test with the latest PHP releases?

Actually this is why I reported the bug. I tried it on other servers with PHP 4.3.3, 5.1.4, 5.1.6, 5.2.4 and 5.2.6 and yep! - I finally found one with 5.2.9 (built Feb 27 2009) and it displayed the same results everywhere. I repeat, the TRANSLIT option works fine, even though it does the same job and more.

------------------------------------------------------------------------

[2009-05-06 14:38:39] j...@php.net

It just means you're using the glibc iconv implementation, which does not have the IGNORE parameter implemented.

------------------------------------------------------------------------

[2009-05-06 05:13:10] kulakov74 at yandex dot ru

Here goes the script. I'm not sure about the limit on external resources. I have the file to convert online, so the script downloads it.

<?php
error_reporting(E_ALL);
$Body1=file_get_contents("http://www.oppcharts.com/iconv.html");
echo(strlen($Body1)."\n");
$Body2=iconv('UTF-8', 'ISO-8859-1', $Body1);
echo(strlen($Body2)."\n");
$Body2=iconv('UTF-8', 'ISO-8859-1//IGNORE', $Body1);
echo(strlen($Body2)."\n");
?>

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at http://bugs.php.net/48147
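
One possible way to sidestep the truncation discussed in this report is to remove the characters that ISO-8859-1 cannot represent before iconv() is called, so that //IGNORE is not needed at all. The following is only a minimal sketch and is not something proposed in the thread: the helper name utf8_to_latin1_lossy() is hypothetical, and it assumes the input is valid UTF-8 and that PCRE is built with UTF-8 support (the /u modifier).

```php
<?php
// Hypothetical helper, not part of the bug report.
// ISO-8859-1 maps exactly the code points U+0000..U+00FF, so once every
// other code point is removed, a plain iconv() call (without //IGNORE)
// has nothing left to reject.
function utf8_to_latin1_lossy($utf8)
{
    // With /u, PCRE treats the subject as UTF-8 and \x{..} as code points.
    // preg_replace() returns null when $utf8 is not valid UTF-8.
    $filtered = preg_replace('/[^\x{00}-\x{FF}]/u', '', $utf8);
    if ($filtered === null) {
        return false;
    }
    return iconv('UTF-8', 'ISO-8859-1', $filtered);
}

// "Search" + U+2026 (ellipsis) + " caf" + U+00E9, written as UTF-8 bytes.
$input  = "Search\xE2\x80\xA6 caf\xC3\xA9";
$latin1 = utf8_to_latin1_lossy($input);

echo strlen($input), "\n";   // byte length of the UTF-8 input
echo strlen($latin1), "\n";  // byte length after filtering and conversion
```

Unlike //TRANSLIT, this drops unrepresentable characters instead of approximating them, which matches what //IGNORE is meant to do. Comparing the two lengths printed above, as the original test script does, is a quick check that the output was not cut short.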