From:             arnaud dot lb at gmail dot com
Operating system: Any
PHP version:      5.2.5
PHP Bug Type:     Unicode Function Upgrades related
Bug description:  htmlspecialchars returns empty string on invalid unicode sequence

Description:
------------
htmlspecialchars/htmlentities return an empty string when the input
contains an invalid Unicode sequence. I think these functions should
just skip the invalid sequences or encode them byte by byte
(e.g. 0xE9 => &#xE9;), instead of discarding the whole string.

Sometimes you have to display arbitrary strings of unknown encoding.
So you make them safer using htmlspecialchars($string, ENT_COMPAT,
"site_encoding, utf-8 in my case"), but if there is at least one
invalid sequence in the string, it returns an empty string :/

Reproduce code:
---------------
$string = "Voil\xE0"; // "Voilà", in ISO-8859-15
var_dump(htmlspecialchars($string, ENT_COMPAT, "utf-8"));

Expected result:
----------------
string(4) "Voil"
OR
string(10) "Voil&#xE0;"

Actual result:
--------------
string(0) ""

-- 
Edit bug report at http://bugs.php.net/?id=43896&edit=1
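The "encode them byte by byte" behaviour requested above can be approximated in userland. The following is only a minimal sketch, not the fix applied in PHP itself. The function name escape_lenient is hypothetical, it assumes PHP 5.3 or later (for the closure), and it assumes that each stray byte should be emitted as the numeric character reference of the same value, which amounts to reading that byte as ISO-8859-1.

```php
<?php
// Hypothetical helper: escape a string for HTML output, keeping well-formed
// UTF-8 intact and turning every byte that is not part of a well-formed
// UTF-8 sequence into a numeric character reference (0xE0 => &#xE0;).
function escape_lenient($string)
{
    // Byte-level description of one well-formed UTF-8 character.
    // The /u modifier is deliberately NOT used: it would reject the whole
    // subject as soon as it contains an invalid sequence.
    $validUtf8 = '[\x00-\x7F]'
        . '|[\xC2-\xDF][\x80-\xBF]'
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
        . '|\xED[\x80-\x9F][\x80-\xBF]'
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
        . '|[\xF1-\xF3][\x80-\xBF]{3}'
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2}';

    // Group 1: a run of well-formed characters. Group 2: one stray byte.
    $pattern = '/((?:' . $validUtf8 . ')+)|(.)/s';

    $result = preg_replace_callback(
        $pattern,
        function ($m) {
            if (isset($m[2]) && $m[2] !== '') {
                // Stray byte: emit it as a hexadecimal character reference.
                return sprintf('&#x%02X;', ord($m[2]));
            }
            // Well-formed run: the normal escaping is safe here.
            return htmlspecialchars($m[1], ENT_COMPAT, 'UTF-8');
        },
        $string
    );

    if ($result === null) {
        // preg_replace_callback() returns null on a PCRE error.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result;
}

var_dump(escape_lenient("Voil\xE0"));
```

With the reproduce string from this report, the sketch yields the second expected result, string(10) "Voil&#xE0;". Note that the byte-to-reference mapping is an assumption: it is only right when the stray bytes really are ISO-8859-1, and ISO-8859-15 differs from it at a few positions. Later PHP releases also added built-in flags for this situation: ENT_IGNORE (PHP 5.3) drops invalid sequences, and ENT_SUBSTITUTE (PHP 5.4) replaces them with U+FFFD.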
