Package: php-htmlpurifier
Version: 4.3.0+dfsg1-1
Severity: important

We use HTML Purifier to clean up HTML mails from customers before displaying them. Under certain circumstances an ISO-8859-1 HTML string is cut off in the middle. The following script reproduces the problem:

    <?php
    require_once "HTMLPurifier.auto.php";

    // One Euro sign followed by 50000 dots
    $in = "€".str_repeat(".", 50000);

    // Tell HTML Purifier that input and output are ISO-8859-1
    $cfg = HTMLPurifier_Config::createDefault();
    $cfg->set("Core.Encoding", "iso-8859-1");
    $purifier = new HTMLPurifier($cfg);

    $out = $purifier->purify($in);

    // Compare byte lengths before and after purification
    echo "in: ".strlen($in)."<br>";
    echo "out: ".strlen($out)."<br>";
    echo $out;

Output:

    in: 50007
    out: 8159
    ................... [...]

Expected output:

    in: 50007
    out: 50007
    [Euro symbol]............ [...]

The problem does not occur with the encoding set to UTF-8. Unfortunately, we cannot simply convert the input to UTF-8, because the encoding is also declared in the HTML header of the input string.

-- System Information:
Debian Release: 6.0.1
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-xen-amd64 (SMP w/2 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages php-htmlpurifier depends on:
ii  php5      5.3.3-7+squeeze1  server-side, HTML-embedded scripti

Versions of packages php-htmlpurifier recommends:
ii  php5-cli  5.3.3-7+squeeze1  command-line interpreter for the p

php-htmlpurifier suggests no packages.

-- no debconf information