 ID:               46478
 Updated by:       moriyo...@php.net
 Reported By:      for-bugs at hnw dot jp
-Status:           Assigned
+Status:           Closed
 Bug Type:         Feature/Change Request
 Operating System: *
 PHP Version:      5.2.6
 Assigned To:      moriyoshi
 New Comment:

This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

Thank you for the report, and for helping us make PHP better.


Previous Comments:
------------------------------------------------------------------------

[2009-12-22 05:50:35] s...@php.net

Automatic comment from SVN on behalf of moriyoshi
Revision: http://svn.php.net/viewvc/?view=revision&revision=292467
Log: - Fix bug #46478 (htmlentities() uses obsolete mapping table for character entity references)

------------------------------------------------------------------------

[2008-11-09 16:39:06] moriyo...@php.net

I think this is a bug, but correcting the table would break BC too.

------------------------------------------------------------------------

[2008-11-04 12:56:40] for-bugs at hnw dot jp

Description:
------------
ext/standard/html.c has an incorrect mapping table, which htmlentities()
uses. html.c is based on
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/UNI2SGML.TXT, but this
mapping table is obsolete and not compatible with HTML 4.0 or XHTML 1.0.

For example, U+2235 (encoded as "\xe2\x88\xb5" in UTF-8) is not in
http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent, but htmlentities()
returns "&becaus;" for it. U+226A (≪) and U+226B (≫) are similar cases.

Reproduce code:
---------------
<?php
var_dump(htmlentities("\xe2\x88\xb5", ENT_QUOTES, "utf-8"));

Expected result:
----------------
string(3) "∵"

Actual result:
--------------
string(8) "&becaus;"

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=46478&edit=1
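
The reproduce code in the report can be extended into a small check of whether a given character has a named entity in the table htmlentities() uses. This is only a sketch under stated assumptions: it assumes PHP 7 or later (for the scalar type declarations) and the ENT_HTML401 flag, neither of which existed in the PHP 5.2.6 the report was filed against. The helper name has_html401_entity() is hypothetical and not part of PHP. U+2234 is used as a control because xhtml-symbol.ent defines it as "there4".

```php
<?php
// Hypothetical helper (not part of PHP): reports whether the HTML 4.01
// translation table used by htmlentities() has a named entity for $char.
function has_html401_entity(string $char): bool
{
    $table = get_html_translation_table(
        HTML_ENTITIES,
        ENT_QUOTES | ENT_HTML401,
        'UTF-8'
    );
    return isset($table[$char]);
}

$because   = "\xe2\x88\xb5"; // U+2235, not defined in xhtml-symbol.ent
$therefore = "\xe2\x88\xb4"; // U+2234, defined there as "there4"

// Is each character present in the translation table?
var_dump(has_html401_entity($because));
var_dump(has_html401_entity($therefore));

// Does htmlentities() leave U+2235 untouched, as the report expects?
var_dump(htmlentities($because, ENT_QUOTES | ENT_HTML401, 'UTF-8') === $because);
```

On a build that includes the fix, the first check should report false and the other two true; a build that still uses the obsolete table would be expected to differ on U+2235.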