Edit report at http://bugs.php.net/bug.php?id=52712&edit=1
ID:                 52712
Updated by:         ahar...@php.net
Reported by:        matias dot perrone at gmail dot com
Summary:            html_entity_decode does not support all standard entities
-Status:            Open
+Status:            Bogus
Type:               Bug
Package:            Strings related
Operating System:   Windows 7
PHP Version:        5.2.14
Block user comment: N

New Comment:

html_entity_decode() can only decode entities that exist in the given
character set. None of your example entities occur in ISO-8859-1,
therefore they have to be left as entities.

To see this in action: if you change the character set to ISO-8859-15,
the &euro; entity does get correctly decoded, since ISO-8859-15 added
the € character to ISO-8859-1.

You'd be much better off using a Unicode character set like UTF-8,
since that can represent all of the characters defined by HTML entities.

Not a bug; closing.

Previous Comments:
------------------------------------------------------------------------
[2010-08-27 06:01:15] matias dot perrone at gmail dot com

Description:
------------
The function "html_entity_decode" does not support all HTML entities as
documented in http://www.w3.org/TR/html4/sgml/entities.html

Test script:
---------------
$sEntities = '&rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;';
echo "Start: ".$sEntities."\n";
$sEntities = html_entity_decode(($sEntities), ENT_QUOTES, "ISO-8859-1");
echo "Result: ".$sEntities;

Expected result:
----------------
Start: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;
Result: ’ ‘ “ ” € ˆ

Actual result:
--------------
Start: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;
Result: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;

------------------------------------------------------------------------
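
For anyone wanting to check the character set explanation in the comment above, the following is a minimal sketch and not part of the original report. It assumes the six entities in the test script were &rsquo; &lsquo; &ldquo; &rdquo; &euro; and &circ;, and it decodes the same string under three character sets. The variable names are illustrative only.

```php
<?php
// Entity list assumed to match the test script in the report.
$entities = '&rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;';

// ISO-8859-1 contains none of these characters, so nothing can be decoded.
$latin1 = html_entity_decode($entities, ENT_QUOTES, 'ISO-8859-1');

// ISO-8859-15 contains the euro sign (single byte 0xA4) but not the others.
$latin9 = html_entity_decode($entities, ENT_QUOTES, 'ISO-8859-15');

// UTF-8 can represent every character named by an HTML entity.
$utf8 = html_entity_decode($entities, ENT_QUOTES, 'UTF-8');

// Was the ISO-8859-1 string left completely unchanged?
var_dump($latin1 === $entities);

// Does the ISO-8859-15 result contain the euro byte?
var_dump(strpos($latin9, "\xA4") !== false);

// Did the UTF-8 result lose every remaining entity marker?
var_dump(strpos($utf8, '&') === false);
```

The checks compare bytes rather than echoing the decoded strings, since what a terminal or browser displays depends on its own encoding and can hide what html_entity_decode() actually returned.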