ID:               50686
 User updated by:  thuejk at gmail dot com
 Reported By:      thuejk at gmail dot com
 Status:           Assigned
 Bug Type:         Filesystem function related
 Operating System: Ubuntu
 PHP Version:      5.3.1
 Assigned To:      iliaa
 New Comment:

I have written a test suite for my own local replacement functions for
fgetcsv()/fputcsv(). If you reimplement the PHP functions, you might
want to send me your new implementation so I can run it through the
test suite.


Previous Comments:
------------------------------------------------------------------------

[2010-01-08 13:05:37] thuejk at gmail dot com

Also, according to RFC 4180, CSV lines should be terminated by \r\n,
not \n.

Since CSV is a data exchange format, the line termination should not be
dependent on the system generating the CSV file.

fputcsv() currently terminates CSV-lines with \n on my Linux server.
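
A minimal sketch of a writer that always ends the record with \r\n,
whatever the platform. write_csv_line_crlf() is a hypothetical helper
name, not a PHP function. To keep it short, it quotes every field (which
RFC 4180 permits) and assumes a comma delimiter:

```php
<?php
// Hypothetical helper, not part of PHP: write one CSV record that is
// terminated by CRLF regardless of the operating system.
function write_csv_line_crlf($handle, array $fields)
{
    $out = array();
    foreach ($fields as $field) {
        // Quote every field; a literal " is written as "".
        $out[] = '"' . str_replace('"', '""', (string) $field) . '"';
    }
    // Returns the number of bytes written, or false on failure.
    return fwrite($handle, implode(',', $out) . "\r\n");
}

// Usage: write one record to a temporary file and read it back.
$h = tmpfile();
write_csv_line_crlf($h, array('a', 'b"c'));
rewind($h);
$written = stream_get_contents($h);
fclose($h);
var_dump($written === "\"a\",\"b\"\"c\"\r\n");
```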

----

PS: I just noticed that RFC 4180 is from 2005, so you have some excuse,
since the PHP function predates the RFC. However, the RFC cites 4
previous non-RFC definitions, which all agree with the RFC on the
important points, so it is not clear that the PHP implementation ever
had any right to claim it had anything to do with the CSV format. And of
course at least OpenOffice's CSV-support, and probably many other
programs, can't work with the input/output of the PHP functions.

------------------------------------------------------------------------

[2010-01-07 18:42:44] j...@php.net

Ilia, maybe you want to check this out? :)

------------------------------------------------------------------------

[2010-01-07 16:05:44] thuejk at gmail dot com

Description:
------------
According to http://en.wikipedia.org/wiki/Comma-separated_values (which
I assume is quoting RFC 4180), CSV does not have an escape character,
but instead quotes fields with ", and escapes '"' with '""'.
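
The quoting rule above can be sketched as follows. rfc4180_quote_field()
is a hypothetical helper name, not a PHP function, and the sketch assumes
a comma delimiter:

```php
<?php
// Hypothetical helper, not part of PHP: quote one field the way the
// rule above describes. There is no escape character, so a backslash
// is ordinary data.
function rfc4180_quote_field($field)
{
    $field = (string) $field;
    // Quote only when the field contains a comma, a quote or a line break.
    if (strpbrk($field, ",\"\r\n") === false) {
        return $field;
    }
    // A literal " inside a quoted field is written as "".
    return '"' . str_replace('"', '""', $field) . '"';
}

echo rfc4180_quote_field('a\\"b'), "\n"; // prints "a\""b"
echo rfc4180_quote_field(2), "\n";       // prints 2
```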

But fputcsv() escapes '"' with '\"', and fgetcsv() incorrectly chokes
on the fragment '\""' (which should be unescaped to '\"').
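
For the reading side, a minimal sketch of a parser that treats the
backslash as ordinary data. rfc4180_parse_record() is a hypothetical
helper name, not a PHP function. It assumes a comma delimiter, a single
well-formed record, and that the trailing line terminator has already
been removed:

```php
<?php
// Hypothetical helper, not part of PHP: split one CSV record into
// fields without any escape character.
function rfc4180_parse_record($line)
{
    $fields = array();
    $field = '';
    $inQuotes = false;
    $len = strlen($line);
    for ($i = 0; $i < $len; $i++) {
        $c = $line[$i];
        if ($inQuotes) {
            if ($c !== '"') {
                $field .= $c;       // includes backslashes, commas, newlines
            } elseif ($i + 1 < $len && $line[$i + 1] === '"') {
                $field .= '"';      // "" inside quotes is one literal quote
                $i++;
            } else {
                $inQuotes = false;  // closing quote
            }
        } elseif ($c === '"' && $field === '') {
            $inQuotes = true;       // opening quote at the start of a field
        } elseif ($c === ',') {
            $fields[] = $field;     // delimiter ends the current field
            $field = '';
        } else {
            $field .= $c;
        }
    }
    $fields[] = $field;             // last field has no trailing delimiter
    return $fields;
}

$row = rfc4180_parse_record('"a\\""", 2, 3');
echo $row[0], "\n"; // prints a\"
```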

This is a problem for programs such as StarOffice, which actually
implements the standard and (understandably) chokes on fputcsv() output like
"\"",2,3

I haven't tested whether this is also a problem in MS Office, but it
will be if MS Office implements the CSV standard correctly.

Note that the fputcsv() manual at
http://dk.php.net/manual/en/function.fputcsv.php says: "Format line as
CSV", and similarly with fgetcsv(), so there is no doubt that the PHP
functions should follow the RFC.

This bug seems to be the result of bug report
http://bugs.php.net/bug.php?id=22382 , which is obviously bogus, but was
"fixed" by breaking PHP's CSV support.

Reproduce code:
---------------
<?php
echo "<pre>fputcsv:\n";

$matrix = Array(Array('a\\"b',2,3));
$temp_handle = tmpfile();
foreach ($matrix as $row) {
  fputcsv($temp_handle, $row);
}
fseek($temp_handle, 0); // rewind before reading back what was written
$str = fread($temp_handle, 1024*1024*100);
echo $str;
fclose($temp_handle); // this removes the tmpfile() file

echo "\nfgetcsv:\n";
$temp_handle = tmpfile();
fwrite($temp_handle, '"a\\""", 2, 3');
fseek($temp_handle, 0);
$a = fgetcsv($temp_handle);
echo $a[0];
fclose($temp_handle); // this removes the tmpfile() file
?>


Expected result:
----------------
fputcsv:
"a\""b",2,3

fgetcsv:
a\"


Actual result:
--------------
fputcsv:
"a\"b",2,3

fgetcsv:



------------------------------------------------------------------------

