Edit report at https://bugs.php.net/bug.php?id=60412&edit=1
 ID:                 60412
 Updated by:         [email protected]
 Reported by:        mike dot squire at gmail dot com
-Summary:            UTF-8 functions doesn't respect unicode equivalence
+Summary:            UTF-8 functions doesn't respect unicode equivalence - Need Normalization
-Status:             Open
+Status:             Analyzed
 Type:               Bug
-Package:            Unicode Engine related
+Package:            mbstring related
-Operating System:   OSX (though probably all)
+Operating System:   all
-PHP Version:        5.3.8
+PHP Version:        5.4SVN-2011-11-04 (SVN)
 Block user comment: N
 Private report:     N

 New Comment:

What you are looking for is normalization.
Intl module has it, but mbstring does not.

I changed bug type to feature request.

Previous Comments:
------------------------------------------------------------------------
[2011-11-29 22:17:42] mike dot squire at gmail dot com

Description:
------------
Quote from http://en.wikipedia.org/wiki/Unicode_equivalence:

"...the code point U+006E (the Latin lowercase 'n') followed by U+0303
(the combining tilde '◌̃') is defined by Unicode to be canonically
equivalent to the single code point U+00F1 (the lowercase letter 'ñ' of
the Spanish alphabet). Therefore, those sequences should be displayed in
the same manner, should be treated in the same way by applications such
as alphabetizing names or searching, and may be substituted for each
other."

It might be this is more a case of just documenting that the unicode
functions don't support unicode equivalence (for completeness).
Test script:
---------------
echo "Output recorded from a terminal interpreting UTF-8\n\n";
var_dump("\x6e\xcc\x83");
var_dump(utf8_encode("\xf1"));
var_dump(utf8_decode("\x6e\xcc\x83") == "\xf1");
var_dump(mb_convert_encoding("\x6e\xcc\x83", "ISO-8859-1", "UTF-8") == "\xf1");

Expected result:
----------------
Output recorded from a terminal interpreting UTF-8

string(3) "ñ"
string(2) "ñ"
bool(true)
bool(true)

Actual result:
--------------
Output recorded from a terminal interpreting UTF-8

string(3) "ñ"
string(2) "ñ"
bool(false)
bool(false)

------------------------------------------------------------------------
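
A possible workaround along the lines of the comment above ("Intl module
has it"): normalize the string to NFC with the intl extension before
converting it. This is only a sketch. It assumes the intl and mbstring
extensions are both loaded, and nfc_to_latin1() is a hypothetical helper
name that appears neither in the report nor in PHP itself.

```php
<?php
// Hypothetical helper (not from the report): compose first, then convert.
// Assumes ext/intl (Normalizer) and ext/mbstring are available.
function nfc_to_latin1($utf8)
{
    // NFC composes "n" + U+0303 (combining tilde) into the single
    // code point U+00F1 wherever a precomposed form exists.
    $composed = Normalizer::normalize($utf8, Normalizer::FORM_C);
    if ($composed === false) {
        return false; // normalization failed, e.g. invalid UTF-8 input
    }
    // With the sequence composed, the ISO-8859-1 conversion can map it.
    return mb_convert_encoding($composed, "ISO-8859-1", "UTF-8");
}

// Same comparison as in the test script, with normalization applied first.
var_dump(nfc_to_latin1("\x6e\xcc\x83") === "\xf1");
```

Sequences that have no precomposed form in ISO-8859-1 will still not
convert cleanly, so this covers only the case described in the report.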
