ID:               48147
 Updated by:       j...@php.net
 Reported By:      kulakov74 at yandex dot ru
-Status:           Verified
+Status:           Bogus
 Bug Type:         ICONV related
 Operating System: Linux
 PHP Version:      5.*, 6CVS (2009-05-05)
 New Comment:

We still can't fix bugs in the glibc iconv implementation. Try this on
the command line and you get the same results:

# iconv -f utf-8 -t iso-8859-1 iconv.html > /dev/null
iconv: illegal input sequence at position 3589

# iconv -f utf-8 -t iso-8859-1//IGNORE iconv.html > /dev/null
iconv: illegal input sequence at position 8168
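
For anyone without the original iconv.html, the comparison can be
approximated with a short inline sample. This is only a sketch: the
sample string and the convert_len helper are made up for illustration,
and the byte counts depend on the local iconv implementation, so none
are shown here.

```shell
#!/bin/sh
# Sketch only: convert_len and the sample text are illustrative assumptions,
# not part of the original report.

# convert_len TARGET: read UTF-8 on stdin, print the converted byte count.
convert_len() {
    # iconv's error messages are silenced; a conversion that stops early
    # simply shows up as a smaller byte count.
    iconv -f utf-8 -t "$1" 2>/dev/null | wc -c | tr -d ' '
}

# 'value="Search…"' with the ellipsis (U+2026) written as octal escapes.
# U+2026 has no ISO-8859-1 equivalent.
sample='value="Search\342\200\246"'

printf "$sample" | wc -c | tr -d ' '               # input size in bytes
printf "$sample" | convert_len iso-8859-1           # plain conversion
printf "$sample" | convert_len iso-8859-1//IGNORE   # drop unconvertible chars
printf "$sample" | convert_len iso-8859-1//TRANSLIT # approximate them
```

If the //IGNORE count is far smaller than the input on a larger file,
the output was truncated rather than merely stripped of the
unconvertible characters.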



Previous Comments:
------------------------------------------------------------------------

[2009-05-07 07:50:52] lbarn...@php.net

Marked it as verified as I got exactly the same results:

The first iconv() call (the one without //IGNORE) fails on the ellipsis
character "…" (value="Search…"), which can't be represented in
ISO-8859-1.

The second iconv() call (the one with //IGNORE) fails later (so the
ellipsis is ignored, which may mean that the //IGNORE flag is
supported), and there is no apparent reason for failing at offset 8157
(only regular ASCII chars around).

------------------------------------------------------------------------

[2009-05-06 18:36:10] j...@php.net

Arnaud: Please don't reopen bogus bugs without explanation. 

------------------------------------------------------------------------

[2009-05-06 18:18:07] kulakov74 at yandex dot ru

No. The fact that the script displays the notice "iconv(): Detected an
illegal character ..." in both cases has nothing to do with whether
the option is implemented: this is controlled by error_reporting(E_ALL).
The IGNORE option only controls whether iconv will stop at the character
or not. 

Also, the length of the resulting string is different (greater) with
IGNORE, and while without it the string ends at exactly where the
illegal character is, with IGNORE it ends at a random point where no
such characters occur. 

Also, I did not mention - this is not the only file I converted, many
others were converted correctly with the option, and their length only
decreased a little. But there were 2 files which were truncated, 1 of
them (the smaller) is used for the test case. 

Can you run the test with the latest PHP releases? Actually this is why
I reported the bug. I tried it on other servers with PHP 4.3.3, 5.1.4,
5.1.6, 5.2.4 and 5.2.6 and yep! - I finally found one with 5.2.9 (built
Feb 27 2009) and it displayed the same results everywhere. 

I repeat, the TRANSLIT option works fine, while it does the same and
even more.

------------------------------------------------------------------------

[2009-05-06 14:38:39] j...@php.net

It just means you're using the glibc iconv implementation, which does
not have the IGNORE parameter implemented.

------------------------------------------------------------------------

[2009-05-06 05:13:10] kulakov74 at yandex dot ru

Here goes the script. I'm not sure about the limit on external
resources - I have the file to convert, so it is downloaded. 

<?php

error_reporting(E_ALL); 

$Body1=file_get_contents("http://www.oppcharts.com/iconv.html");

echo(strlen($Body1)."\n");
$Body2=iconv('UTF-8', 'ISO-8859-1', $Body1);
echo(strlen($Body2)."\n");

$Body2=iconv('UTF-8', 'ISO-8859-1//IGNORE', $Body1);
echo(strlen($Body2)."\n");

?>

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/48147

-- 
Edit this bug report at http://bugs.php.net/?id=48147&edit=1
