 ID:               22108
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
 Status:           Open
 Bug Type:         Feature/Change Request
 Operating System: Any
 PHP Version:      All (as of the current implementation)
-Assigned To:
+Assigned To:      moriyoshi
 New Comment:
reassigning

Previous Comments:
------------------------------------------------------------------------

[2003-02-08 06:10:51] [EMAIL PROTECTED]

Ok, the UTF-8 BOM was new to me. If I find the time I'll have a look at it over the weekend. I think the solution would be somewhere in Zend's multibyte support, since I fear adding that BOM to mbstring alone does not do the trick.

------------------------------------------------------------------------

[2003-02-08 05:43:14] [EMAIL PROTECTED]

derick, assuming that you wanted to create a version of the example at http://www.php.net/manual/en/introduction.php#intro-whatis which displayed the text "Hi, I'm a PHP script" in multiple languages, how would you propose doing it? The only way is to use a form of Unicode encoding. The least intrusive of these is UTF-8, because it encodes the text in such a way that ASCII characters (7-bit characters) are still plain ASCII characters, and every byte of an encoded non-ASCII character is always >= 128 and will never be mistaken for ASCII.

I haven't seen any documentation which states that PHP can only handle ASCII text; please direct me to it if it exists. If there is some known problem with PHP parsing UTF-8 scripts, I haven't found it yet in a multitude of different files in different languages, which PHP is parsing happily.

The only problem that I have had is that for any file which has a UTF-8 BOM, PHP mistakenly treats the BOM as ordinary content and outputs it. This is a bug in PHP. The solution is easy: on loading a file, strip the BOM if it exists. Make it optional processing via a php.ini config argument if necessary.

Don't be US-centric in your thinking; there is far more world outside those borders.

Regards,
Brodie.

------------------------------------------------------------------------

[2003-02-08 04:24:12] [EMAIL PROTECTED]

PHP doesn't want UNICODE scripts, but just ASCII ones. Not a bug -> bogus.

------------------------------------------------------------------------

[2003-02-08 02:01:11] [EMAIL PROTECTED]

And assigning this task to me.

------------------------------------------------------------------------

[2003-02-08 01:48:15] [EMAIL PROTECTED]

Yes, I suppose this might be a bug, but most of the developers involved in PHP are just not as aware of this issue as you expected (and as I had expected). So I thought that changing the category was a better choice than marking it bogus.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at http://bugs.php.net/22108

-- 
Edit this bug report at http://bugs.php.net/?id=22108&edit=1
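
For illustration only, the fix proposed in the 05:43:14 comment (strip a leading UTF-8 BOM when a script file is loaded) might look roughly like the sketch below. It is written in Python rather than PHP or C, and it is not how the Zend engine handles script input; the names `strip_utf8_bom` and `non_ascii_bytes_are_high` are hypothetical helpers invented for this example. The second helper checks the property the same comment relies on, namely that every byte of a non-ASCII character in UTF-8 is 0x80 or higher.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present.

    Hypothetical helper: only a BOM at the very start is removed,
    and input without a BOM is returned unchanged.
    """
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def non_ascii_bytes_are_high(text: str) -> bool:
    """Check that every byte of each non-ASCII character is >= 0x80."""
    return all(
        byte >= 0x80
        for char in text
        if ord(char) > 127
        for byte in char.encode("utf-8")
    )


# A script file as an editor might save it: BOM first, then the source.
script = UTF8_BOM + "<?php echo \"Hi, I'm a PHP script\"; ?>".encode("utf-8")
cleaned = strip_utf8_bom(script)
```

With the BOM stripped, `cleaned` begins directly with `<?php`, so nothing precedes the opening tag that could be echoed as output. Whether such stripping should be unconditional or controlled by a php.ini setting is the open question in the thread; the sketch takes no position on it.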