ID:               22108
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Bogus
+Status:           Open
-Bug Type:         Output Control
+Bug Type:         Feature/Change Request
 Operating System: windows 2000
 PHP Version:      4.2.3
 New Comment:

Because the BOM issue has been reported repeatedly as something that
prevents header output, and we should be more aware of it, I don't see
any reason to mark this report as bogus.

Changing category from "output control" to a kind of "feature
request".



Previous Comments:
------------------------------------------------------------------------

[2003-02-07 13:57:22] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

BOM = Byte Order Mark for UCS-2 encoding.
This value should not be used in UTF-8, since its only use
besides detecting the byte order of UCS-2 was as a
special non-breaking space, and newer Unicode versions
have another representation for the same thing.

Anyhow, BOM = FE FF.
Depending on the byte order, that makes:
UCS-2BE <-> "\xFE\xFF"
UCS-2LE <-> "\xFF\xFE"

Therefore a sequence of "EF BB" is another character and 
must not be ignored.
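
For reference, the byte sequences mentioned in this thread can be
reproduced with a short script. This is only a sketch: the function name
bom_encodings() is made up for illustration, and the "\u{...}" escape
assumes PHP 7.0 or later (it is not available in 4.2.3). Note that the
UTF-8 serialization of U+FEFF comes out as EF BB BF, the three bytes
named in the original report.

```php
<?php
// Hypothetical helper: returns the hex bytes of U+FEFF (the BOM code
// point) in the three encodings discussed in this thread.
function bom_encodings(): array
{
    return [
        // 'n' packs an unsigned 16-bit value in big-endian order.
        'UCS-2BE' => bin2hex(pack('n', 0xFEFF)),
        // 'v' packs an unsigned 16-bit value in little-endian order.
        'UCS-2LE' => bin2hex(pack('v', 0xFEFF)),
        // "\u{FEFF}" (PHP 7.0+) yields the UTF-8 bytes of the code point.
        'UTF-8'   => bin2hex("\u{FEFF}"),
    ];
}
```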


------------------------------------------------------------------------

[2003-02-07 10:42:16] [EMAIL PROTECTED]

sniper,

imagine someone wanted to echo some text in, e.g., French.
In that case, if you saved the file as ASCII, you would get corrupted
output. So instead you'd have to save it as UTF-8, which seems to cause
problems (or so [EMAIL PROTECTED] tells us).

------------------------------------------------------------------------

[2003-02-07 08:58:21] [EMAIL PROTECTED]

And why on earth would you save PHP files in any other
format than ASCII?


------------------------------------------------------------------------

[2003-02-07 08:53:10] [EMAIL PROTECTED]

What is a BOM ?

Derick

------------------------------------------------------------------------

[2003-02-07 08:46:36] [EMAIL PROTECTED]

Problem:
When a php file is saved in utf-8 format with the UTF-8 BOM as the
first three bytes of the file (EF BB BF), PHP doesn't ignore these
bytes when loading and compiling the file, but instead considers them
output coming prior to the <?php. This causes incorrect display of the
page and failure of any http header output.

It does this even when the internal character format is set in php.ini
to be utf-8. 

Desired outcome:
PHP recognizes the utf-8 bom and disregards it.
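
Until PHP itself handles this, the check can be sketched in userland,
for example to scan source files before deployment or to clean content
read from disk. The names UTF8_BOM, has_utf8_bom() and strip_utf8_bom()
below are hypothetical helpers, not part of PHP, and the type
declarations assume PHP 7.0 or later rather than the 4.2.3 of this
report.

```php
<?php
// The three bytes an editor may write at the start of a UTF-8 file.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if $bytes starts with the UTF-8 BOM.
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: returns $bytes without a leading UTF-8 BOM;
// input that does not start with the BOM is returned unchanged.
function strip_utf8_bom(string $bytes): string
{
    return has_utf8_bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}
```

A possible use is strip_utf8_bom(file_get_contents($path)), where $path
is whatever file is being checked. This only helps for content read as
data; a script file that itself begins with the BOM still emits those
bytes before <?php unless it is re-saved without the BOM.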


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=22108&edit=1
