ID:               48210
Updated by:       j...@php.net
Reported By:      nilon at kartio dot org
-Status:          Open
+Status:          Bogus
Bug Type:         mbstring related
Operating System: Debian Lenny
PHP Version:      5.2.9
New Comment:

Please note that encoding detection is not always perfect. In particular,
when the string is very short, detection may return the wrong encoding.

Previous Comments:
------------------------------------------------------------------------

[2009-05-09 17:58:09] nilon at kartio dot org

With the strict option the result is:

string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(5) "UTF-8"

The last one should still return false.

------------------------------------------------------------------------

[2009-05-09 17:53:42] nilon at kartio dot org

Description:
------------
mb_detect_encoding detects the Latin-1 'ä' (the single byte 0xE4) as UTF-8,
although a lone 0xE4 byte is clearly not a valid multibyte character.

Reproduce code:
---------------
<?php
var_dump(mb_detect_encoding("\xe4", 'UTF-8, ISO-8859-1'));
var_dump(mb_detect_encoding("\xe4", 'ISO-8859-1, UTF-8'));
var_dump(mb_detect_encoding("\xe4", 'ISO-8859-1'));
var_dump(mb_detect_encoding("\xe4", 'UTF-8'));
?>

Expected result:
----------------
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
bool(false)

Actual result:
--------------
string(5) "UTF-8"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(5) "UTF-8"

------------------------------------------------------------------------
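
For short strings like the one in this report, validating the bytes against each candidate encoding may be more predictable than detection. The following is a minimal sketch, not a fix for mb_detect_encoding itself: first_valid_encoding is a hypothetical helper name, the type hints assume PHP 7 or later, and the behaviour of mb_check_encoding on invalid UTF-8 is assumed to match current PHP releases rather than 5.2.9.

```php
<?php
// Hypothetical helper (not part of mbstring): return the first candidate
// encoding in which $bytes is a valid byte sequence, or false if none is.
function first_valid_encoding(string $bytes, array $candidates)
{
    foreach ($candidates as $encoding) {
        // mb_check_encoding() validates the bytes; it does not guess.
        if (mb_check_encoding($bytes, $encoding)) {
            return $encoding;
        }
    }
    return false;
}

// The four cases from the report, expressed as validation checks.
var_dump(first_valid_encoding("\xe4", ['UTF-8', 'ISO-8859-1']));
var_dump(first_valid_encoding("\xe4", ['ISO-8859-1', 'UTF-8']));
var_dump(first_valid_encoding("\xe4", ['ISO-8859-1']));
var_dump(first_valid_encoding("\xe4", ['UTF-8']));
```

Candidate order matters here: every byte sequence is valid ISO-8859-1, so listing it before UTF-8 means it always wins. Putting the stricter encoding (UTF-8) first is usually the safer choice.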