Edit report at https://bugs.php.net/bug.php?id=60884&edit=1

 ID:                 60884
 Updated by:         johan...@php.net
 Reported by:        t dot nickl at exse dot de
 Summary:            htmlentities() behaves differently and thus breaks existing code
-Status:             Open
+Status:             Bogus
 Type:               Bug
 Package:            *General Issues
 Operating System:   CentOS 4.4
 PHP Version:        5.4.0RC6
 Block user comment: N
 Private report:     N

 New Comment:

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

In PHP 5.4 the default_charset php.ini option was set to utf-8. You can
override this in php.ini or .htaccess or such.


Previous Comments:
------------------------------------------------------------------------
[2012-01-25 15:29:09] t dot nickl at exse dot de

Description:
------------
// This code must be run via the web server:

// This is a string, e.g. from some database, containing the German umlaut 'ä'.
// Note that the encoding really is ISO-8859-1. It is only assigned literally here to be concise.
$a = "Rechnungsadresse ändern";

// This works (an empty string activates some autodetection):
var_dump(htmlentities($a, ENT_COMPAT | ENT_HTML401, ''));

// This works too (the same output is generated):
var_dump(htmlentities($a, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1'));

// This does NOT work (outputs an empty string):
var_dump(htmlentities($a));

// Reason: PHP changed the charset htmlentities() uses when you do NOT pass one
// (which is the case in 90% of the code out there):

// determine_charset():

///////////////////////////////////////////////////////
// php-5.2.1/ext/standard/html.c :
//     /* Guarantee default behaviour for backwards compatibility */
//     if (charset_hint == NULL)
//         return cs_8859_1;

///////////////////////////////////////////////////////
// php-5.4.0RC4/ext/standard/html.c :
//     /* Default is now UTF-8 */
//     if (charset_hint == NULL)
//         return cs_utf_8;

// This breaks the meaning of existing German code. For example, TYPO3 outputs an
// empty string if end users entered German umlauts in the rich text editor in the backend.

// Please change determine_charset() back to using cs_8859_1 if the third
// parameter of htmlentities() is omitted.

Test script:
---------------
See description.

Expected result:
----------------
See description.

Actual result:
--------------
See description.

------------------------------------------------------------------------
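
For code that has to behave the same on both sides of this change, one option is to stop relying on the default and always pass the charset explicitly, as the second working call in the description does. The following is only a minimal sketch under that assumption: the wrapper name legacy_htmlentities() is hypothetical and not part of PHP, and the input is written with the byte escape \xE4 so that it really is ISO-8859-1 regardless of how the script file itself is saved.

```php
<?php
// Hypothetical helper (not a PHP built-in): always encode as ISO-8859-1,
// so the result does not depend on which default the running PHP version uses.
function legacy_htmlentities($string, $flags = ENT_COMPAT)
{
    // ENT_COMPAT is used alone here because it exists in old and new PHP
    // versions; ENT_HTML401 was only introduced with PHP 5.4.
    return htmlentities($string, $flags, 'ISO-8859-1');
}

// "\xE4" is the single byte for 'ä' in ISO-8859-1.
$a = "Rechnungsadresse \xE4ndern";

echo legacy_htmlentities($a), "\n"; // prints: Rechnungsadresse &auml;ndern

// For comparison: the same bytes are not valid UTF-8, so asking for UTF-8
// explicitly yields the empty string described in the report.
var_dump(htmlentities($a, ENT_COMPAT, 'UTF-8'));
```

Replacing bare htmlentities($a) calls with such a wrapper keeps the charset decision in one place, which may be easier to audit than changing ini settings per host.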