Edit report at https://bugs.php.net/bug.php?id=49874&edit=1

 ID:                 49874
 Updated by:         yohg...@php.net
 Reported by:        jketterl at chipxonio dot de
 Summary:            ftell() and fseek() inconsistency when using stream
                     filters
 Status:             Open
 Type:               Bug
 Package:            Filesystem function related
 Operating System:   linux (ubuntu)
-PHP Version:        5.2.11
+PHP Version:        5.5.4
 Block user comment: N
 Private report:     N

 New Comment:

It seems 5.5 still has this problem:

string(8) "Line 01
"
string(8) "Line 02
"
string(8) "Line 01
"
string(5) "e 01
"
[yohgaki@dev php-5.4]$ php -v
PHP 5.5.4 (cli) (built: Sep 19 2013 13:06:40)


Previous Comments:
------------------------------------------------------------------------
[2009-10-15 06:54:31] jketterl at chipxonio dot de

thanks for having a look

i tried both with and without a BOM. the challenge is to get it working
without one, because that's the worst case my app has to deal with, but adding
the BOM doesn't seem to solve this either.

$ hexdump test-with-bom.csv
0000000 feff 004c 0069 006e 0065 0020 0030 0031
0000010 000a 004c 0069 006e 0065 0020 0030 0032
0000020 000a 004c 0069 006e 0065 0020 0030 0033
0000030 000a 004c 0069 006e 0065 0020 0030 0034
0000040 000a
0000042

$ php test.php
string(8) "Line 01
"
string(8) "Line 02
"
string(8) "Line 01
"
string(5) "e 01
"

i also tried opening the file including the BOM without a stream filter, but 
that just resulted in php reading two extra chars (the BOM, converted in some 
way i guess) at the beginning of the first line.

i thought i'd attach the sample files to this bug, but it seems like i can't. 
i've uploaded them here instead: http://www.djmacgyver.net/tmp/php-ftell/

------------------------------------------------------------------------
[2009-10-14 16:40:00] sjo...@php.net

Thank you for your bug report. Does your test.csv file start with a BOM? You 
can determine this by viewing the file in a hex editor. If it starts with fffe 
or feff, it has a BOM (byte order mark).

------------------------------------------------------------------------
[2009-10-14 11:39:39] jketterl at chipxonio dot de

Description:
------------
exact php version: PHP 5.2.11-0.dotdeb.1 with Suhosin-Patch 0.9.7 (cli) (built: 
Sep 20 2009 09:41:43)
this bug may also be filter-/stream-related. i just believe it might be easier 
to fix on the filesystem side, that's why i chose that category.

when using a php stream filter to convert input from utf-16 into iso8859-15 (or 
most probably from any 2-byte-encoded charset into any single-byte-encoded 
charset) the ftell() and fseek() functions start to behave inconsistently.

more precisely: fseek() jumps to the exact byte offset in the underlying file, 
ignoring the 2-byte encoding, whereas ftell() seems to return the number of 
bytes read *after* the filter has been applied. thus it is not possible to 
fseek() back to an offset that was previously stored with ftell().
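
The offset arithmetic behind this mismatch can be sketched without any stream filter at all. The snippet below is only an illustration, not the code path PHP takes internally: it builds the test data in memory, assumes UTF-16BE as the byte order (the offsets come out the same for little-endian data), and uses two hypothetical ASCII-only helpers, ascii_to_utf16be() and utf16be_to_ascii(), that are not part of PHP. It then compares what follows the offset counted in filtered bytes with what follows the offset counted in raw bytes.

```php
<?php
// illustration only: both helpers are hypothetical and handle ASCII text only

// encode an ASCII string as UTF-16BE (assumed byte order), without a BOM
function ascii_to_utf16be($ascii)
{
    $out = '';
    for ($i = 0; $i < strlen($ascii); $i++) {
        $out .= "\0" . $ascii[$i];
    }
    return $out;
}

// decode UTF-16BE data that holds only ASCII code points
function utf16be_to_ascii($raw)
{
    $out = '';
    for ($i = 1; $i < strlen($raw); $i += 2) {
        $out .= $raw[$i];
    }
    return $out;
}

$raw = ascii_to_utf16be("Line 01\nLine 02\nLine 03\nLine 04\n");

// 8: length of the first line after filtering, the value ftell() seems to return
$filteredOffset = strlen("Line 01\n");
// 16: where the second line really starts in the raw 2-byte data
$rawOffset = strlen(ascii_to_utf16be("Line 01\n"));

// remainder of the current line when reading resumes at each offset
$atFiltered = utf16be_to_ascii(substr($raw, $filteredOffset, $rawOffset - $filteredOffset));
$atRaw      = utf16be_to_ascii(substr($raw, $rawOffset, $rawOffset));

// same experiment with a 2-byte BOM in front of the data
$withBom       = "\xFE\xFF" . $raw;
$atFilteredBom = utf16be_to_ascii(substr($withBom, $filteredOffset, $rawOffset + 2 - $filteredOffset));

var_dump($atFiltered);    // the 4 bytes " 01\n"
var_dump($atRaw);         // the 8 bytes "Line 02\n"
var_dump($atFilteredBom); // the 5 bytes "e 01\n"
```

Under these assumptions, resuming at the filtered offset yields the 4-byte and 5-byte strings reported in this bug for the file without and with a BOM, while resuming at the raw offset yields the expected second line.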

the content of the testfile used in the code examples is as follows:
Line 01
Line 02
Line 03
Line 04

Reproduce code:
---------------
<?php
$file = 'test.csv';

// first pass: read two lines through the filter without seeking
$fp = fopen($file, 'r');
stream_filter_append($fp, 'convert.iconv.utf16/iso8859-15');
$line = fgets($fp);
var_dump($line);
$line = fgets($fp);
var_dump($line);
fclose($fp);

// second pass: same reads, but seek to the current position in between
$fp = fopen($file, 'r');
stream_filter_append($fp, 'convert.iconv.utf16/iso8859-15');
$line = fgets($fp);
var_dump($line);
fseek($fp, ftell($fp)); // this shouldn't move anything - but it does...
$line = fgets($fp);
var_dump($line);
fclose($fp);

Expected result:
----------------
string(8) "Line 01
"
string(8) "Line 02
"
string(8) "Line 01
"
string(8) "Line 02
"

Actual result:
--------------
string(8) "Line 01
"
string(8) "Line 02
"
string(8) "Line 01
"
string(4) " 01
"


------------------------------------------------------------------------


