 ID:               47366
 User updated by:  max at injapan dot ru
 Reported By:      max at injapan dot ru
-Status:           Open
+Status:           Closed
 Bug Type:         mbstring related
 Operating System: CentOS 5.2
 PHP Version:      5.3CVS-2009-02-12 (snap)
 New Comment:

Problem solved by using the encoding EUCJP-WIN instead of EUC-JP.


Previous Comments:
------------------------------------------------------------------------

[2009-02-12 10:06:17] max at injapan dot ru

The text in the "Expected result" field is slightly garbled: the
expected output is, of course, just the single character U+2161.

------------------------------------------------------------------------

[2009-02-12 10:04:11] max at injapan dot ru

Description:
------------
mb_convert_encoding converts the byte sequences \xAD\xB5-\xAD\xBF
incorrectly from EUC-JP to UTF-8. Other characters may be converted
incorrectly as well, but I have not been able to check this
exhaustively. Unicode has corresponding code points, e.g. U+2161 for Ⅱ.
The majority of EUC-JP text is converted normally.

Reproduce code:
---------------
echo mb_convert_encoding("\xAD\xB6", "UTF-8", "EUC-JP");

Expected result:
----------------
string «Ⅱ» (U+2161) printed to STDOUT

Actual result:
--------------
string «?» printed to STDOUT

------------------------------------------------------------------------
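
For reference, the workaround from the new comment can be sketched as
below. This is only a minimal sketch, assuming the mbstring extension is
loaded. The spelling "eucJP-win" is the name mbstring lists for the
encoding the reporter calls EUCJP-WIN (encoding names are, as far as I
know, matched case-insensitively), and the variable names are purely
illustrative. The output is printed as hex so the result does not depend
on the terminal encoding.

```php
<?php
// Same input bytes as in the reproduce code above.
$bytes = "\xAD\xB6";

// Workaround reported in this bug: use eucJP-win as the source
// encoding instead of EUC-JP.
$utf8 = mb_convert_encoding($bytes, "UTF-8", "eucJP-win");

// U+2161 is encoded in UTF-8 as the bytes E2 85 A1, so a correct
// conversion should print "e285a1".
echo bin2hex($utf8), "\n";
```

If the hex output differs, check that "eucJP-win" appears in the output
of mb_list_encodings() for the PHP build in use.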