ID:               50686
User updated by:  thuejk at gmail dot com
Reported By:      thuejk at gmail dot com
Status:           Assigned
Bug Type:         Filesystem function related
Operating System: Ubuntu
PHP Version:      5.3.1
Assigned To:      iliaa
New Comment:

I have written a test suite (for my own local replacement functions for fgetcsv()/fputcsv()). If you reimplement the PHP functions, you might want to send me your new implementation so that I can run it through the test suite.


Previous Comments:
------------------------------------------------------------------------

[2010-01-08 13:05:37] thuejk at gmail dot com

Also, according to RFC 4180, CSV lines should be terminated by \r\n, not \n. Since CSV is a data exchange format, the line termination should not depend on the system generating the CSV file. fputcsv() currently terminates CSV lines with \n on my Linux server.

----

PS: I just noticed that RFC 4180 is from 2005, so you have some excuse, since the PHP function predates the RFC. However, the RFC cites 4 previous non-RFC definitions, which all agree with the RFC on the important points, so it is not clear that the PHP implementation ever had any right to claim it had anything to do with the CSV format. And of course at least OpenOffice's CSV support, and probably many other programs, can't work with the input/output of the PHP functions.

------------------------------------------------------------------------

[2010-01-07 18:42:44] j...@php.net

Ilia, maybe you want to check this out? :)

------------------------------------------------------------------------

[2010-01-07 16:05:44] thuejk at gmail dot com

Description:
------------
According to http://en.wikipedia.org/wiki/Comma-separated_values (which I assume is quoting RFC 4180), CSV does not have an escape character, but instead quotes fields with ", and escapes '"' with '""'. But fputcsv() escapes '"' with '\"', and fgetcsv() incorrectly chokes on the fragment '\""' (which should be unescaped to '\"').

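To make the quoting rule concrete, here is a minimal sketch of a writer that follows it: a field is wrapped in double quotes only when it contains a comma, a double quote, or a line break, an embedded '"' is doubled, the backslash gets no special treatment, and each record ends with \r\n. The function names rfc4180_encode_field() and rfc4180_encode_line() are hypothetical helpers made up for this illustration; they are not PHP built-ins and not the replacement functions mentioned in the comments.

```php
<?php
// Hypothetical helper (not part of PHP): encode one field per RFC 4180.
function rfc4180_encode_field($value)
{
    $value = (string) $value;
    // Quote only fields that contain a double quote, comma, CR, or LF.
    if (strpbrk($value, "\",\r\n") !== false) {
        // The only escaping rule: double every embedded double quote.
        // A backslash is ordinary data and is copied through unchanged.
        return '"' . str_replace('"', '""', $value) . '"';
    }
    return $value;
}

// Hypothetical helper (not part of PHP): encode one record.
function rfc4180_encode_line($fields)
{
    // Records end with CRLF regardless of the platform writing the file.
    return implode(',', array_map('rfc4180_encode_field', $fields)) . "\r\n";
}

// Same row as in the reproduce code of this report.
echo rfc4180_encode_line(array('a\\"b', 2, 3));
```

For the row from the reproduce code this prints "a\""b",2,3 followed by \r\n, which is the expected result given in this report. According to the actual result reported here, fputcsv() writes "a\"b",2,3 for the same row instead.
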
This is a problem for, for example, StarOffice, which actually implements the standard and (understandably) chokes on fputcsv() output like

"\"",2,3

I haven't tested whether this is also a problem in MS Office, but it is if MS Office implements the CSV standard correctly.

Note that the fputcsv() manual at http://dk.php.net/manual/en/function.fputcsv.php says: "Format line as CSV", and similarly with fgetcsv(), so there is no doubt that the PHP functions should follow the RFC.

This bug seems to be the result of bug report http://bugs.php.net/bug.php?id=22382 , which is obviously bogus, but was "fixed" by breaking PHP's CSV support.

Reproduce code:
---------------
<?php

echo "<pre>fputcsv:\n";
$matrix = Array(Array('a\\"b',2,3));
$temp_handle = tmpfile();
foreach ($matrix as $row) {
    fputcsv($temp_handle, $row);
}
fseek($temp_handle, 0);
$str = fread($temp_handle, 1024*1024*100);
echo $str;
fclose($temp_handle); // this removes the tmpfile() file

echo "\nfgetcsv:\n";
$temp_handle = tmpfile();
fwrite($temp_handle, '"a\\""", 2, 3');
fseek($temp_handle, 0);
$a = fgetcsv($temp_handle);
echo $a[0];
fclose($temp_handle); // this removes the tmpfile() file

?>

Expected result:
----------------
fputcsv:
"a\""b",2,3

fgetcsv:
a\"

Actual result:
--------------
fputcsv:
"a\"b",2,3

fgetcsv:

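For comparison with the expected fgetcsv() result in this report, the following is a minimal sketch of a reader that applies only the RFC 4180 rules: a field may be wrapped in double quotes, '""' inside a quoted field stands for one '"', and the backslash is ordinary data. The function name rfc4180_parse_line() is a hypothetical helper invented for this illustration, not a PHP built-in. It assumes a single record with the line terminator already stripped, and it does no validation of malformed input.

```php
<?php
// Hypothetical helper (not part of PHP): split one RFC 4180 record into fields.
// Assumes the trailing line terminator has already been removed.
function rfc4180_parse_line($line)
{
    $fields = array();
    $field = '';
    $inQuotes = false;
    $len = strlen($line);
    for ($i = 0; $i < $len; $i++) {
        $ch = $line[$i];
        if ($inQuotes) {
            if ($ch === '"') {
                if ($i + 1 < $len && $line[$i + 1] === '"') {
                    // Doubled quote inside a quoted field: one literal quote.
                    $field .= '"';
                    $i++;
                } else {
                    // Single quote: end of the quoted section.
                    $inQuotes = false;
                }
            } else {
                // Everything else, including backslash and comma, is data.
                $field .= $ch;
            }
        } elseif ($ch === '"' && $field === '') {
            // A quote at the start of a field opens a quoted section.
            $inQuotes = true;
        } elseif ($ch === ',') {
            $fields[] = $field;
            $field = '';
        } else {
            $field .= $ch;
        }
    }
    $fields[] = $field;
    return $fields;
}

// Same input as the fgetcsv() part of the reproduce code.
$a = rfc4180_parse_line('"a\\""", 2, 3');
echo $a[0];
```

With the input from the reproduce code, the first field comes out as a\", matching the expected result above. The spaces after the commas in that input are kept as part of the second and third fields, since this sketch does not trim whitespace.
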