 ID:               29119
 Updated by:       [EMAIL PROTECTED]
 Reported By:      peter at desk dot nl
-Status:           Assigned
+Status:           Closed
 Bug Type:         Strings related
 Operating System: Linux 2.6.5
 PHP Version:      5.0.0RC3
 Assigned To:      moriyoshi
 New Comment:

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

Thank you for the report, and for helping us make PHP better.


Previous Comments:
------------------------------------------------------------------------

[2004-07-19 15:17:57] peter at desk dot nl

nb : the above-mentioned online tests at http://fire.desk.nl/ are now no
longer relevant as that machine runs the patched version.

------------------------------------------------------------------------

[2004-07-19 14:26:57] peter at desk dot nl

after looking at the sources, this patch seems to fix it :

fire:/home/peter/php-5.0.0/ext/standard# diff -u html.c.org html.c
--- html.c.org  2004-07-19 14:24:08.000000000 +0200
+++ html.c      2004-07-19 14:13:50.000000000 +0200
@@ -158,7 +158,7 @@
        "thinsp", NULL, NULL, "zwnj", "zwj", "lrm", "rlm",
        NULL, NULL, NULL, "ndash", "mdash", NULL, NULL, NULL,
        "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
-       "dagger", "Dagger", "bull", NULL, NULL, NULL, "hellip",
+       NULL,"dagger", "Dagger", "bull", NULL, NULL, NULL, "hellip",
        NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, "permil", NULL,
        "prime", "Prime", NULL, NULL, NULL, NULL, NULL, "lsaquo", "rsaquo", NULL,
        NULL, NULL, "oline", NULL, NULL, NULL, NULL, NULL,

------------------------------------------------------------------------

[2004-07-13 16:53:49] peter at desk dot nl

Description:
------------
when decoding named entities with html_entity_decode, the resulting
characters are sometimes incorrect. e.g., "&euro;" becomes the UTF-8
representation of the &frasl; entity (fraction slash).

not all entities are incorrectly translated, though. numerical entities
work correctly.

tested with both php-5.0.0RC3 and the proposed -RC4 from
snaps.php.net/~andi/

config :
./configure --with-apxs=/usr/bin/apxs --with-mysql=/usr/local/mysql
--with-gd --enable-safe-mode --with-dom=/usr --enable-ftp --with-zlib
--with-xsl --with-xmlrpc --enable-cli --enable-bcmath --with-iconv
--with-jpeg-dir=/usr --with-png-dir=/usr --with-xpm-dir=/usr/X11R6

Reproduce code:
---------------
// all tests are viewed with browser encoding forced on UTF-8.
<?php
print html_entity_decode("&euro;",ENT_NOQUOTES,"UTF-8");
print "<br />";
print htmlentities(html_entity_decode("&euro;",ENT_NOQUOTES,"UTF-8"),ENT_NOQUOTES,"UTF-8");
?>

// other test also available on
// http://fire.desk.nl/decode/index.php
// (just enter &euro; and submit)
// and http://fire.desk.nl/decode/index.phps (source)

Expected result:
----------------
a euro-sign
a euro-sign

Actual result:
--------------
a fractional slash
a euro-sign

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=29119&edit=1
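
Note on the patch in the [2004-07-19 14:26:57] comment: the strings in the diff appear to form a table indexed by code point offset, with NULL marking code points that have no named entity. Under that assumption, the missing NULL for U+201F would shift "dagger" and every later entry down by one slot. The following is only a minimal sketch of that effect, not the actual html.c code: the names TABLE_BASE, unpatched_excerpt, patched_excerpt and entity_to_code_point are hypothetical, and the tables are short excerpts starting at "lsquo" (U+2018).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical excerpt: slot i stands for code point TABLE_BASE + i. */
#define TABLE_BASE 0x2018u /* U+2018, "lsquo" */
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* As in the '-' line of the diff: no placeholder for U+201F. */
const char *const unpatched_excerpt[] = {
    "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
    "dagger", "Dagger", "bull"
};

/* As in the '+' line of the diff: NULL inserted before "dagger". */
const char *const patched_excerpt[] = {
    "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
    NULL, "dagger", "Dagger", "bull"
};

/*
 * Linear search for an entity name; returns the code point implied by
 * the slot where the name was found, or 0 if the name is not present.
 */
unsigned entity_to_code_point(const char *const *table, size_t len,
                              const char *name)
{
    for (size_t i = 0; i < len; i++) {
        if (table[i] != NULL && strcmp(table[i], name) == 0) {
            return TABLE_BASE + (unsigned)i;
        }
    }
    return 0;
}
```

With these excerpts, looking up "dagger" in unpatched_excerpt yields 0x201F, one below the correct U+2020, while patched_excerpt yields 0x2020. Entries before the gap, such as "bdquo" (U+201E), come out the same from both tables, which fits the observation that not all entities are translated incorrectly.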