ID:               49350
 Updated by:       j...@php.net
 Reported By:      soapergem at gmail dot com
-Status:           Open
+Status:           Bogus
 Bug Type:         Filesystem function related
 Operating System: Windows XP
 PHP Version:      5.3.0
 New Comment:

Of course it does. If it didn't, it would be broken.
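
fgets and fgetcsv return the bytes of the file as they are stored, and
the BOM is part of those bytes, so a script that does not want it has
to skip it itself. A minimal sketch of one way to do that follows. The
helper name fopen_skip_utf8_bom is hypothetical (it is not a PHP
function), and the sketch assumes a regular, seekable file.

```php
<?php
// Hypothetical helper, not part of PHP: open a file for reading and
// position the stream just past a leading UTF-8 BOM, if there is one.
function fopen_skip_utf8_bom($path)
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return false;
    }

    // The UTF-8 BOM is the three bytes EF BB BF.
    if (fread($fp, 3) !== "\xEF\xBB\xBF") {
        // No BOM: go back so the first bytes are not lost.
        rewind($fp);
    }

    return $fp;
}
```

Calling this helper in place of fopen in the reproduce code below should
make fgets return the first line without the three leading bytes. The
same handle can be passed to fgetcsv.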


Previous Comments:
------------------------------------------------------------------------

[2009-08-24 22:31:38] soapergem at gmail dot com

Description:
------------
When text files are saved with UTF-8 encoding, a few characters are
saved at the front called the "Byte Order Mark" (read more about it on
Wikipedia). They are supposed to remain hidden and just be used as
meta-data to indicate that the file is saved with UTF-8 formatting.
Their hex values are EF BB BF, which show up as "ï»¿" when the bytes
are displayed as Windows-1252 text.

The trouble is that when you read in a UTF-8 text file with either
fgets or fgetcsv, PHP misinterprets the BOM as literal text and includes
it with all the other text.

Reproduce code:
---------------
<?php

if ( $fp = fopen('ut8_text_file.txt', 'r') ) {

    echo fgets($fp);
    fclose($fp);

}

?>

Expected result:
----------------
Whatever text is saved on the first line of the text file.

Actual result:
--------------
ï»¿Whatever text is saved on the first line of the text file.


------------------------------------------------------------------------

