ID:               49350
Updated by:       j...@php.net
Reported By:      soapergem at gmail dot com
-Status:          Open
+Status:          Bogus
Bug Type:         Filesystem function related
Operating System: Windows XP
PHP Version:      5.3.0
New Comment:

Of course it does. If it didn't, it would be broken.


Previous Comments:
------------------------------------------------------------------------

[2009-08-24 22:31:38] soapergem at gmail dot com

Description:
------------
When text files are saved with UTF-8 encoding, a few bytes called the
"Byte Order Mark" (read more about it on Wikipedia) are saved at the
front. They are supposed to remain hidden and just be used as meta-data
to indicate that the file is saved with UTF-8 formatting. Their hex
values are EF BB BF, which display as "ï»¿" when the bytes are
interpreted as ISO-8859-1. The trouble is that when you read in a UTF-8
text file with either fgets or fgetcsv, PHP misinterprets the BOM as
literal text and includes it with all the other text.

Reproduce code:
---------------
<?php
if ( $fp = fopen('ut8_text_file.txt', 'r') ) {
    echo fgets($fp);
    fclose($fp);
}
?>

Expected result:
----------------
Whatever text is saved on the first line of the text file.

Actual result:
--------------
ï»¿Whatever text is saved on the first line of the text file.

------------------------------------------------------------------------
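
Since fgets returns the BOM bytes as part of the data, a script that does not want them has to drop them itself. The following is only a minimal sketch, assuming the file is UTF-8 and that a BOM, if present, is exactly the first three bytes of the first line. The name strip_utf8_bom is a hypothetical helper, not a PHP built-in, and the temporary file merely stands in for the reporter's text file.

```php
<?php
// Hypothetical helper (not part of PHP): remove a leading UTF-8 BOM, if any.
function strip_utf8_bom(string $line): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes EF BB BF from the report
    if (strncmp($line, $bom, 3) === 0) {
        return substr($line, 3);
    }
    return $line;
}

// Demo: a temporary file that starts with a BOM, standing in for the
// reporter's UTF-8 text file.
$path = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($path, "\xEF\xBB\xBFfirst line\nsecond line\n");

$first = '';
if ($fp = fopen($path, 'r')) {
    // Only the first line can carry the BOM, so strip it there.
    $first = strip_utf8_bom(fgets($fp));
    fclose($fp);
}
unlink($path);

echo $first;
```

Only the first read needs this treatment; later lines can be passed through unchanged. The same check could be applied to the first field returned by fgetcsv.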