ID:               29119
 User updated by:  peter at desk dot nl
 Reported By:      peter at desk dot nl
 Status:           Open
 Bug Type:         Strings related
 Operating System: Linux 2.6.5
 PHP Version:      5.0.0RC3
 New Comment:

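after looking at the sources, this patch seems to fix it: the entity
table in ext/standard/html.c appears to be missing a NULL entry before
"dagger", which shifts the names after it by one position.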

fire:/home/peter/php-5.0.0/ext/standard# diff -u html.c.org html.c
--- html.c.org  2004-07-19 14:24:08.000000000 +0200
+++ html.c      2004-07-19 14:13:50.000000000 +0200
@@ -158,7 +158,7 @@
        "thinsp", NULL, NULL, "zwnj", "zwj", "lrm", "rlm",
        NULL, NULL, NULL, "ndash", "mdash", NULL, NULL, NULL,
        "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
-       "dagger", "Dagger",     "bull", NULL, NULL, NULL, "hellip",
+       NULL,"dagger", "Dagger",        "bull", NULL, NULL, NULL, "hellip",
        NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, "permil", NULL,
        "prime", "Prime", NULL, NULL, NULL, NULL, NULL, "lsaquo", "rsaquo",
        NULL, NULL, NULL, "oline", NULL, NULL, NULL, NULL, NULL,
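
To illustrate why a single missing NULL matters, here is a minimal sketch in C. It assumes the table is indexed by code point minus a base value, so that every unnamed code point needs its own NULL slot. The names BASE, before_patch, after_patch, entity_name and check_dagger_slot are hypothetical and only model the slice of the table touched by the patch; this is not the actual code in html.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: index i holds the entity name for code point BASE + i,
   or NULL when that code point has no named entity.
   The slice modelled here starts at U+2018 (lsquo). */
#define BASE 0x2018u
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Slice as it reads before the patch: no slot for U+201F. */
static const char *const before_patch[] = {
    "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
    "dagger", "Dagger", "bull", NULL, NULL, NULL, "hellip",
};

/* Slice as it reads after the patch: NULL inserted for U+201F. */
static const char *const after_patch[] = {
    "lsquo", "rsquo", "sbquo", NULL, "ldquo", "rdquo", "bdquo",
    NULL, "dagger", "Dagger", "bull", NULL, NULL, NULL, "hellip",
};

/* Look up the entity name for a code point; NULL if none or out of range. */
static const char *entity_name(const char *const *table, size_t len,
                               unsigned cp)
{
    if (cp < BASE || cp - BASE >= len)
        return NULL;
    return table[cp - BASE];
}

/* U+2020 is the dagger; without the extra slot it resolves to its neighbour. */
static void check_dagger_slot(void)
{
    assert(strcmp(entity_name(before_patch, COUNT(before_patch), 0x2020),
                  "Dagger") == 0);
    assert(strcmp(entity_name(after_patch, COUNT(after_patch), 0x2020),
                  "dagger") == 0);
}
```

Under this model every name after the gap is found one code point too early before the patch, which would be consistent with only some entities decoding incorrectly.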


Previous Comments:
------------------------------------------------------------------------

[2004-07-13 16:53:49] peter at desk dot nl

Description:
------------
when decoding named entities with html_entity_decode, the resulting
characters are sometimes incorrect. e.g., "&euro;" becomes the UTF-8
representation of the &frasl; entity (fraction slash). not all entities
are incorrectly translated, though. numeric entities work correctly.
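
To tell the two results apart without relying on how a browser renders them, the output bytes can be compared directly. The sketch below encodes a code point as UTF-8 in C; utf8_encode_bmp is a hypothetical helper written for this illustration (it is not a PHP function) and, for brevity, it only covers U+0000 to U+FFFF and does not reject surrogates.

```c
#include <stddef.h>

/* Encode one code point from the Basic Multilingual Plane as UTF-8.
   Writes up to 3 bytes into out and returns the byte count,
   or 0 if cp is above U+FFFF (not handled in this sketch). */
static size_t utf8_encode_bmp(unsigned cp, unsigned char out[3])
{
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}
```

With this helper the euro sign (U+20AC) comes out as the bytes E2 82 AC, while the fraction slash (U+2044) comes out as E2 81 84, so a hex dump of the script output shows which character was actually produced.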

tested with both php-5.0.0RC3 and the proposed -RC4 from
snaps.php.net/~andi/

config:

./configure  --with-apxs=/usr/bin/apxs --with-mysql=/usr/local/mysql
--with-gd --enable-safe-mode --with-dom=/usr --enable-ftp --with-zlib
--with-xsl --with-xmlrpc --enable-cli --enable-bcmath --with-iconv
--with-jpeg-dir=/usr --with-png-dir=/usr --with-xpm-dir=/usr/X11R6



Reproduce code:
---------------
// all tests are viewed with the browser encoding forced to UTF-8.
<?php
  print html_entity_decode("&euro;",ENT_NOQUOTES,"UTF-8");
  print "<br />";
  print htmlentities(html_entity_decode("&euro;",ENT_NOQUOTES,"UTF-8"),
                     ENT_NOQUOTES,"UTF-8");
?>

// other test also available on 
// http://fire.desk.nl/decode/index.php
// (just enter &euro; and submit)
// and http://fire.desk.nl/decode/index.phps (source)

Expected result:
----------------
a euro-sign
a euro-sign


Actual result:
--------------
a fraction slash
a euro-sign



------------------------------------------------------------------------

