ID: 38425
User updated by: stronk7 at moodle dot org
Reported By: stronk7 at moodle dot org
Status: Bogus
Bug Type: ICONV related
Operating System: Irrelevant
PHP Version: 5.1.4
New Comment:
Thanks, I agree!
But shouldn't the //TRANSLIT mode change that behaviour and
allow the conversion to continue? If not, perhaps a minor
change to the manual page would help, because it currently
suggests that both the //TRANSLIT and //IGNORE modes continue
with the conversion, and that's not true for //TRANSLIT.
Previous Comments:
------------------------------------------------------------------------
[2006-08-11 12:58:07] [EMAIL PROTECTED]
Notice: iconv(): Detected an illegal character in input string
Even though the character might be legal, it's up to libiconv to decide
if it's legal or not and PHP can't change that.
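
Since the decision is left to libiconv, a script can at least detect when
the conversion was cut short. The following is only a rough sketch:
convert_with_diagnostics() is a hypothetical helper name, not part of PHP,
and the closure syntax assumes PHP 5.3 or later (newer than the 5.1.4 this
report was filed against). The exact message, and whether iconv() returns a
truncated string or false, depends on the PHP version and the underlying
iconv implementation.

```php
<?php
// Hypothetical helper (not part of PHP): run iconv() and capture any
// notice or warning it raises instead of letting PHP print it.
function convert_with_diagnostics($from, $to, $input)
{
    $message = null;
    set_error_handler(function ($errno, $errstr) use (&$message) {
        $message = $errstr;   // remember the last diagnostic raised
        return true;          // suppress PHP's default error output
    });
    $result = iconv($from, $to, $input);
    restore_error_handler();
    return array('result' => $result, 'error' => $message);
}

// Same input as the example script in this report: a stray 0xA0 byte.
$outcome = convert_with_diagnostics(
    'EUC-JP',
    'UTF-8//TRANSLIT',
    'Hello' . chr(0xA0) . 'World'
);
if ($outcome['error'] !== null) {
    echo "Conversion problem: " . $outcome['error'] . "\n";
}
```

Checking 'error' rather than only the returned string matters here, because
a truncated result such as "Hello" is otherwise indistinguishable from a
valid conversion.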
------------------------------------------------------------------------
[2006-08-11 12:51:41] stronk7 at moodle dot org
Here is an example script:
<?php
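// Print the PHP version, then attempt the conversion.
echo phpversion() . "\n";
// The stray byte from the description is 0xA0 (decimal 160, octal 240).
$original = 'Hello' . chr(0xA0) . 'World';
$result = iconv('EUC-JP', 'UTF-8//TRANSLIT', $original);
echo $result;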
?>
IMO, it should return:
"Hello?World" (or, if I'm not wrong, "Hello World").
Instead it returns "Hello" and stops, whereas the //TRANSLIT
mode should let it reach the end of the string.
------------------------------------------------------------------------
[2006-08-11 12:19:52] [EMAIL PROTECTED]
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.
A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc. If the script requires a
database to demonstrate the issue, please make sure it creates
all necessary tables, stored procedures etc.
Please avoid embedding huge scripts into the report.
------------------------------------------------------------------------
[2006-08-11 11:19:58] stronk7 at moodle dot org
Description:
------------
I'm using iconv (with the //TRANSLIT option enabled) in order
to convert from a lot of different encodings to UTF-8.
Everything seems to be working fine, but I've found one
situation where, perhaps, //TRANSLIT isn't working properly.
It happens when I'm trying to convert from euc-jp to utf-8
and the string contains some 0xA0 chars.
The whole string is English text, but some of the spaces
between words use the 0xA0 char (instead of the correct 0x20
char).
I'm not an expert on euc-jp, but it seems that the 0xA0
char has no meaning at all in that encoding.
http://lfw.org/text/jp.html#euc
Perhaps it shouldn't break the conversion process, at least
when running under //TRANSLIT mode?
The //IGNORE mode seems to work OK, ignoring such characters.
If I read the documentation correctly, it seems that //TRANSLIT
should transform such a character (hopefully to a space, if
I'm not wrong).
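
Until that is settled, one possible workaround is to replace the stray
bytes before calling iconv(). This is only a sketch under assumptions:
euc_jp_to_utf8_lenient() is a made-up name, and the blanket replacement
relies on 0xA0 never being part of a valid EUC-JP sequence (as far as I can
tell, multibyte characters there use bytes 0xA1-0xFE plus the 0x8E and 0x8F
prefixes), so it should not be reused for other source encodings.

```php
<?php
// Hypothetical workaround (not part of PHP): turn stray 0xA0 bytes into
// plain 0x20 spaces, then convert as usual.
function euc_jp_to_utf8_lenient($input)
{
    // Assumption: 0xA0 has no meaning in EUC-JP, so replacing every
    // occurrence cannot damage a valid multibyte character.
    $cleaned = str_replace("\xA0", ' ', $input);
    return iconv('EUC-JP', 'UTF-8//TRANSLIT', $cleaned);
}

echo euc_jp_to_utf8_lenient('Hello' . chr(0xA0) . 'World');
```

With the 0xA0 byte gone the input is plain ASCII, so the conversion no
longer depends on how //TRANSLIT treats illegal input.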
------------------------------------------------------------------------