ID:               48210
 Updated by:       j...@php.net
 Reported By:      nilon at kartio dot org
-Status:           Open
+Status:           Bogus
 Bug Type:         mbstring related
 Operating System: Debian Lenny
 PHP Version:      5.2.9
 New Comment:

Please note that encoding detection is not always perfect.
In particular, when the string is too short, the detection may return
the wrong encoding.
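
When the question is whether a string is valid in one particular
encoding, rather than which encoding it most likely uses,
mb_check_encoding() may be a better fit than detection. The following is
only a minimal sketch; the helper name is_valid_utf8 is hypothetical and
not part of mbstring.

```php
<?php
// Hypothetical helper (not part of mbstring): validate instead of guess.
function is_valid_utf8($bytes)
{
    // mb_check_encoding() reports whether the byte stream is valid
    // in the named encoding.
    return mb_check_encoding($bytes, 'UTF-8');
}

var_dump(is_valid_utf8("\xe4"));      // the single Latin-1 byte from the report
var_dump(is_valid_utf8("\xc3\xa4"));  // the same character encoded as UTF-8
```

A lone 0xE4 byte is the start of a three-byte UTF-8 sequence with no
continuation bytes, so validation is expected to reject it.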


Previous Comments:
------------------------------------------------------------------------

[2009-05-09 17:58:09] nilon at kartio dot org

With the strict option, the result is:
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(5) "UTF-8"

The last one should still return false.
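
For reference, strict mode is the third argument of
mb_detect_encoding(). The exact calls behind the results above are not
shown in this comment, so the sketch below assumes they were the four
calls from the reproduce code with that argument set to true.

```php
<?php
// Assumption: same inputs as the reproduce code, with strict mode enabled.
$lists = array('UTF-8, ISO-8859-1', 'ISO-8859-1, UTF-8', 'ISO-8859-1', 'UTF-8');
$results = array();
foreach ($lists as $list) {
    // The third argument (true) requests strict detection.
    $result = mb_detect_encoding("\xe4", $list, true);
    $results[] = $result;
    var_dump($result);
}
```

The return value is either the name of the detected encoding or false,
so callers should compare against false with === before using it.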

------------------------------------------------------------------------

[2009-05-09 17:53:42] nilon at kartio dot org

Description:
------------
mb_detect_encoding detects the latin1 character 'ä' as UTF-8 when it
clearly isn't a multibyte character.

Reproduce code:
---------------
<?php
var_dump(mb_detect_encoding("\xe4", 'UTF-8, ISO-8859-1'));
var_dump(mb_detect_encoding("\xe4", 'ISO-8859-1, UTF-8'));
var_dump(mb_detect_encoding("\xe4", 'ISO-8859-1'));
var_dump(mb_detect_encoding("\xe4", 'UTF-8'));
?>

Expected result:
----------------
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
bool(false)

Actual result:
--------------
string(5) "UTF-8"
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
string(5) "UTF-8"


------------------------------------------------------------------------

