 ID:               33198
 User updated by:  Bjorn dot Wiberg at its dot uu dot se
 Reported By:      Bjorn dot Wiberg at its dot uu dot se
-Status:           Feedback
+Status:           Open
 Bug Type:         Unknown/Other Function
 Operating System: IBM AIX 5.2.0.0 ML5
 PHP Version:      5.0.4
 New Comment:

Hi!

Please try the following:

http://www.anst.uu.se/bwiberg/php/utf8_bug.phtml
...exhibits the bug randomly, just like *.php. When in error, the output
looks like this:
http://www.anst.uu.se/bwiberg/php/utf8_bug_result3.txt

http://www.anst.uu.se/bwiberg/php/utf8_bug.html
...does not exhibit the bug; the output does not get re-encoded in any
way, i.e. UTF-8 characters remain UTF-8:
http://www.anst.uu.se/bwiberg/php/utf8_bug_result4.txt

It appears that something happens to the output (sometimes) when PHP
parses the file.

Best regards,
Björn


Previous Comments:
------------------------------------------------------------------------

[2005-06-15 21:10:50] [EMAIL PROTECTED]

And if you change *.php to *.<something_else> it still b0rks?
Why do you think it's PHP's fault? From what I can see, there are no PHP
functions involved at all.

------------------------------------------------------------------------

[2005-05-31 09:42:51] Bjorn dot Wiberg at its dot uu dot se

Description:
------------
When a PHP/PHTML file (a file that gets parsed by PHP) contains a leading
UTF-8 Byte Order Mark (BOM), EF BB BF in hex, the character encoding of
the output varies.
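
To make "the character encoding of the output varies" concrete, here is a minimal Python sketch that labels a dumped response body by checking for the EF BB BF prefix and for UTF-8 validity. The function name `describe_encoding` and the returned labels are hypothetical, chosen only for illustration; they are not part of PHP or Apache, and the check is a heuristic rather than a definitive encoding detector.

```python
# Illustrative helper; the names are hypothetical, not part of PHP or Apache.
UTF8_BOM = b"\xef\xbb\xbf"  # EF BB BF in hex, as given in the report


def describe_encoding(body: bytes) -> str:
    """Give a rough label for the encoding of a dumped response body."""
    has_bom = body.startswith(UTF8_BOM)
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte such as 0xF6 ("ö" in ISO-8859-1) is not valid UTF-8.
        return "not valid UTF-8 (possibly ISO-8859-1)"
    if has_bom:
        return "valid UTF-8 with BOM"
    if body.isascii():
        # Pure ASCII looks the same in UTF-8 and ISO-8859-1.
        return "pure ASCII (encoding cannot be told apart)"
    return "valid UTF-8 without BOM"
```

Running this over several dumps of the same page would show whether the labels differ between requests, which is the symptom described above.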

Possibly relevant PHP configuration snippets from httpd.conf:

php_value default_charset none
php_value default_mimetype "text/html"
php_admin_value output_buffering 4096
php_admin_value output_handler none

PHP configuration flags (from config.nice):

CPPFLAGS='-I/usr/local/include' \
LDFLAGS='-L/lib -L/opt/freeware/lib -L/usr/local/lib' \
CC='/usr/local/bin/gcc' \
'./configure' \
'--enable-bcmath' \
'--enable-calendar' \
'--enable-dba' \
'--enable-dbase' \
'--enable-dbx' \
'--enable-debug' \
'--enable-dio' \
'--enable-exif' \
'--enable-embedded-mysqli' \
'--enable-filepro' \
'--enable-ftp' \
'--enable-gd-jis-conv' \
'--enable-gd-native-ttf' \
'--enable-mbstring' \
'--enable-memory-limit' \
'--enable-shmop' \
'--enable-soap' \
'--enable-sockets' \
'--enable-sysvmsg' \
'--enable-sysvsem' \
'--enable-sysvshm' \
'--enable-yp' \
'--enable-zend-multibyte' \
'--prefix=/apache/php' \
'--with-apxs2=/apache/bin/apxs' \
'--with-bz2' \
'--with-freetype-dir' \
'--with-gd' \
'--with-gdbm' \
'--with-gettext' \
'--with-inifile' \
'--with-jpeg-dir' \
'--with-ldap' \
'--with-libxml-dir' \
'--with-mime-magic' \
'--with-mysql=/usr/local/mysql' \
'--with-openssl=/opt/freeware' \
'--with-png-dir' \
'--with-tiff-dir' \
'--with-ttf' \
'--with-xpm-dir' \
'--with-zlib' \
'--with-zlib-dir' \
"$@"


Reproduce code:
---------------
Source code with leading BOM:
http://www.anst.uu.se/bwiberg/php/utf8_bug.php.txt

Corresponding page yielding varying result:
http://www.anst.uu.se/bwiberg/php/utf8_bug.php

(Use telnet or other "dumb" tool to dump the results; it appears that
the result varies by server process that handles the request. We're
using the Apache 2 prefork MPM.)
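
The telnet-style dump suggested above could also be scripted. The following Python sketch opens a fresh TCP connection per request and sends `Connection: close`, so that a different server process may handle each request. The helper names (`build_request`, `split_response`, `fetch_raw`) are hypothetical; the host and path in the usage comment are the ones from this report and may no longer be reachable.

```python
import socket
from typing import Tuple


def build_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET that asks the server to close afterwards."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes) -> Tuple[bytes, bytes]:
    """Split a raw HTTP response into (header bytes, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


def fetch_raw(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Fetch the undecoded response bytes over a new connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Example usage (performs network access, so it is left commented out):
# for _ in range(5):
#     head, body = split_response(
#         fetch_raw("www.anst.uu.se", "/bwiberg/php/utf8_bug.php"))
#     print(body[:16].hex())
```

Printing the first bytes of the body as hex avoids any re-encoding by the terminal, which is the reason for using a "dumb" tool in the first place.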

Example of incorrect result, here we get ISO-8859-1 encoded output:
http://www.anst.uu.se/bwiberg/php/utf8_bug_result1.txt

Expected result:
----------------
Example of correct result, here we get UTF-8 encoded output:
http://www.anst.uu.se/bwiberg/php/utf8_bug_result2.txt

Actual result:
--------------
Varies between the two examples. Make sure to close the connection in
between (or the same server process will serve your multiple requests).

Restarting the web server does not help.

Changing the PHP settings for default character set (default_charset)
and default MIME type (default_mimetype) does not help; I have tried all
combinations.

Turning off output buffering (php_admin_flag output_buffering off) does
not help.

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=33198&edit=1