From: sageptr at gmail dot com
Operating system: Any
PHP version: 5.4.4
Package: mbstring related
Bug Type: Bug
Bug description: Wrong Unicode mapping in some charsets
Description:
------------
ext/mbstring/libmbfl/filters/unicode_table_cp1251.h:
static const unsigned short cp1251_ucs_table[] = {
0x0402, 0x0403, 0x201a, 0x0453, 0x201e, 0x2026, 0x2020, 0x2021,
0x20ac, 0x2030, 0x0409, 0x2039, 0x040a, 0x040c, 0x040b, 0x040f,
0x0452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
0x003f, 0x2122, 0x0459, 0x203a, 0x045a, 0x045c, 0x045b, 0x045f,
...
Byte 0x98 is mapped to 0x003f (question mark), but it is actually unmapped
in the cp1251 charset. It should be mapped to 0xfffd (the Unicode
replacement character), not to 0x003f.
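The proposed change can be sketched as follows. This is only an illustration, assuming the quoted table starts at byte 0x80 (which matches 0x98 landing on the 0x003f entry). The names cp1251_c1_sketch and cp1251_sketch_to_ucs are hypothetical and are not part of libmbfl, and bytes 0xa0..0xff are left out because the rest of the table is elided above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: the 32 entries quoted in the report (bytes
 * 0x80..0x9f), with the entry for byte 0x98 changed from 0x003f to
 * 0xfffd as the report proposes. */
static const unsigned short cp1251_c1_sketch[32] = {
    0x0402, 0x0403, 0x201a, 0x0453, 0x201e, 0x2026, 0x2020, 0x2021,
    0x20ac, 0x2030, 0x0409, 0x2039, 0x040a, 0x040c, 0x040b, 0x040f,
    0x0452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
    0xfffd, 0x2122, 0x0459, 0x203a, 0x045a, 0x045c, 0x045b, 0x045f
};

/* Illustrative lookup covering bytes 0x00..0x9f only.
 * Returns 1 and stores the code point in *out when the byte is covered,
 * 0 when the byte falls in the range this sketch does not include. */
static int cp1251_sketch_to_ucs(unsigned char byte, unsigned short *out)
{
    assert(out != NULL);
    if (byte < 0x80) {          /* ASCII range maps to itself */
        *out = byte;
        return 1;
    }
    if (byte < 0xa0) {          /* range quoted in the report */
        *out = cp1251_c1_sketch[byte - 0x80];
        return 1;
    }
    return 0;                   /* 0xa0..0xff: elided in the report */
}
```

With this table, a caller can tell an unmapped byte (result 0xfffd) apart from a real question mark (byte 0x3f, result 0x003f), which is not possible with the current mapping.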
ext/mbstring/libmbfl/filters/unicode_table_cp1252.h:
static const unsigned short cp1252_ucs_table[] = {
0x20ac,0xfffe,0x201a,0x0192,0x201e,0x2026,0x2020,0x2021,
0x02c6,0x2030,0x0160,0x2039,0x0152,0xfffe,0x017d,0xfffe,
0xfffe,0x2018,0x2019,0x201c,0x201d,0x2022,0x2013,0x2014,
0x02dc,0x2122,0x0161,0x203a,0x0153,0xfffe,0x017e,0x0178
};
Missing characters are mapped to 0xfffe. However, U+FFFE is a noncharacter
(the byte-swapped form of the BOM, U+FEFF), not the replacement character
U+FFFD that is expected here.
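A small check along these lines could locate the affected entries. It is a sketch only: cp1252_quoted copies the table from the report, while count_fffe and patch_unmapped are hypothetical helpers that do not exist in libmbfl. Replacing 0xfffe with 0xfffd is the change this report suggests, not a confirmed fix.

```c
#include <assert.h>
#include <stddef.h>

/* The 32 entries quoted in the report, unchanged. */
static const unsigned short cp1252_quoted[32] = {
    0x20ac, 0xfffe, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
    0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0xfffe, 0x017d, 0xfffe,
    0xfffe, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
    0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0xfffe, 0x017e, 0x0178
};

/* Hypothetical helper: count entries equal to the noncharacter U+FFFE. */
static size_t count_fffe(const unsigned short *table, size_t n)
{
    size_t i, count = 0;
    assert(table != NULL);
    for (i = 0; i < n; i++) {
        if (table[i] == 0xfffe) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: copy a table, writing U+FFFD wherever the input
 * holds U+FFFE. All other entries are copied unchanged. */
static void patch_unmapped(const unsigned short *in, unsigned short *out,
                           size_t n)
{
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        out[i] = (in[i] == 0xfffe) ? 0xfffd : in[i];
    }
}
```

Assuming the table starts at byte 0x80, as in the cp1251 case, the 0xfffe entries sit at bytes 0x81, 0x8d, 0x8f, 0x90 and 0x9d.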
--
Edit bug report at https://bugs.php.net/bug.php?id=62545&edit=1