From:             Diomedes_01 at yahoo dot com
Operating system: Solaris 9
PHP version:      5.0.4
PHP Bug Type:     Strings related
Bug description:  Unable to properly convert from ISO-8859-1 to UTF-8

Description:
------------
I am unable to properly encode certain strings from ISO-8859-1 to UTF-8. I
have tried using utf8_encode, mb_convert_encoding and iconv with no
success. The code I am attempting this on is as follows:

Reproduce code:
---------------
<?php
$main_test_string = "référendum sur la Constitution européenne";
$string_test = mb_detect_encoding($main_test_string, 'UTF-8, ISO-8859-1');
echo "Encoding used: $string_test<br>"; // Properly displays ISO-8859-1

// First try converting with iconv
$iconv_test = iconv("ISO-8859-1", "UTF-8", $main_test_string);
echo "Iconv test: $iconv_test<br>"; // Displays nothing. No data whatsoever

// Now try converting with mb_convert_encoding
$mb_test = mb_convert_encoding($main_test_string, "UTF-8", "ISO-8859-1");
$string_test2 = mb_detect_encoding($mb_test, 'UTF-8, ISO-8859-1'); 
echo "Encoding used: $string_test2<br>"; // Indicates string is now UTF-8 encoded (which is wrong)
echo "MB Test convert value: $mb_test<br>"; // Displays: référendum sur la Constitution européenne; doesn't look like UTF-8 to me

// Finally try utf8_encode
$utf8_encode_test = utf8_encode($main_test_string);
$string_test3 = mb_detect_encoding($utf8_encode_test, 'UTF-8, ISO-8859-1');
echo "Encoding used: $string_test3<br>"; // Indicates string is now UTF-8 encoded (which is wrong)
echo "Abstract post conversion: $utf8_encode_test<br>"; // Same as before, displays: référendum sur la Constitution européenne
?>

Expected result:
----------------
I should be seeing UTF-8 (Unicode) translated text of the style:
'&#917;&#955;&#955;&#951;&#957;&#953;'

Note that the above does work for non-Latin-based character sets such as
Chinese, Japanese, Russian, Greek, etc.
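
A byte-level check may be a more reliable way to see whether a conversion
took place than judging by what the browser renders. The following is only a
sketch and not part of the original reproduce code: it assumes the mbstring
extension is loaded, the variable names are hypothetical, and the input is
written with \xE9 escapes so the result does not depend on how the script
file itself is saved.

```php
<?php
// Hypothetical sample: "référendum" as ISO-8859-1 bytes (é is the single byte 0xE9).
$latin1 = "r\xE9f\xE9rendum";

// Convert to UTF-8; each é should become the two bytes 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Compare byte lengths and dump the converted bytes as hex.
echo strlen($latin1), "\n";
echo strlen($utf8), "\n";
echo bin2hex($utf8), "\n";

// Numeric character references (the '&#917;...' style) are a separate step:
// map every code point from 0x80 upward to a decimal entity.
$map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
$entities = mb_encode_numericentity($utf8, $map, 'UTF-8');
echo $entities, "\n";
```

If the hex dump shows c3a9 where the input had e9, the string has been
converted to UTF-8, and any odd rendering would then more likely come from
how the output is displayed (for example, the charset the page declares)
than from the conversion itself.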

Actual result:
--------------
What I am seeing is the following string:

référendum sur la Constitution européenne

Definitely not UTF-8. Could be Klingon. :-)

I will admit I am not a Unicode master but this is certainly quite
puzzling. According to the documentation, iconv is supposed to work in
this case, but it is not displaying any data. I am running PHP 5.0.4 with
iconv enabled (I see it in my phpinfo output).
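
To narrow down why iconv shows no data, it may help to test its return value
explicitly, since iconv() returns false when a conversion fails and the
accompanying notice can be hidden by the error settings. This is a minimal
sketch rather than a confirmed diagnosis; the sample string and variable
names are hypothetical.

```php
<?php
// Show notices and warnings so an iconv failure is not silently hidden.
error_reporting(E_ALL);

// Hypothetical sample: "européenne" as ISO-8859-1 bytes (é = 0xE9).
$input = "europ\xE9enne";

$converted = iconv('ISO-8859-1', 'UTF-8', $input);

if ($converted === false) {
    // Conversion failed; an empty echo would otherwise be the only symptom.
    echo "iconv failed\n";
} else {
    // Conversion succeeded; dump the bytes to inspect them directly.
    echo bin2hex($converted), "\n";
}
```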

Please advise.

-- 
Edit bug report at http://bugs.php.net/?id=32880&edit=1