#22382 [Com]: fgetcsv does not allow escaped quotes
 ID:               22382
 Comment by:       mr dot heat at gmx dot de
 Reported By:      Stevenv at operamail dot com
 Status:           Closed
 Bug Type:         Filesystem function related
 Operating System: FreeBSD 4.7
 PHP Version:      4.3.2-dev
 New Comment:

This never was a bug. Instead, 4.3.3's fGetCSV() is now inconsistent and contradictory.

In short: in 4.3.3, fGetCSV() mixes Microsoft-standard CSV with Unix delimiter-separated values. The first uses "" to escape double quotes; the second uses addslashes(). (See http://www.catb.org/~esr/writings/taoup/html/ch05s02.html for a detailed explanation.)

Two examples:

prior433.csv:

    In PHP, write as \ to escape quotes.

prior433.php:

    <?php
    $fp = fopen("prior433.csv", "rb");
    $array = fgetcsv($fp, 1000);
    echo $array[0];
    //Output: In PHP, write as \ to escape quotes.
    ?>

since433.csv:

    In PHP, write as \\ to escape quotes.

since433.php:

    <?php
    $fp = fopen("since433.csv", "rb");
    $array = fgetcsv($fp, 100);
    echo stripslashes($array[0]);
    //Same output as above,
    //but note the differences in both .csv and .php files
    ?>

This makes fGetCSV() an entirely different function in 4.3.3. I don't think it's a good idea to break every script written before 4.3.3 that uses fGetCSV(). Furthermore, there was nothing wrong in 4.3.2. The old Windows/Excel quotation style was already binary-safe (nearly binary-safe, see below).

Suggestion #1: Introduce a new function fGetDSV() and restore the previous behaviour of fGetCSV().

Suggestion #2: Introduce a fifth parameter

    fGetCSV(resource handle, int length [, string delimiter [, string enclosure [, bool unix_escape_style]]])

which is

  false for Windows style (default): values are surrounded by enclosure characters; enclosure characters are escaped by another enclosure character;

  true for Unix style: enclosure may be empty; any critical character is escaped by a backslash; values are returned without the backslashes; double enclosure characters are not stripped.

Additional bug: fGetCSV() breaks if there is a \x00 character anywhere. This means it is not binary-safe (this concerns every version).


Previous Comments:

[2003-02-23 21:17:39] [EMAIL PROTECTED]

This bug has been fixed in CVS.

In case this was a PHP problem, snapshots of the sources are packaged every three hours; this change will be in the next snapshot. You can grab the snapshot at http://snaps.php.net/.

In case this was a documentation problem, the fix will show up soon at http://www.php.net/manual/.

In case this was a PHP.net website problem, the change will show up on the PHP.net site and on the mirror sites in short time.

Thank you for the report, and for helping us make PHP better.

[2003-02-22 21:59:44] [EMAIL PROTECTED]

I take that back, it isn't fixed in that snapshot, so don't bother testing. Verified within 4.3.2-dev.

[2003-02-22 21:18:38] [EMAIL PROTECTED]

Please try using this CVS snapshot:

  http://snaps.php.net/php4-STABLE-latest.tar.gz

For Windows:

  http://snaps.php.net/win32/php4-win32-STABLE-latest.zip

AFAIK, this is already fixed.

[2003-02-22 19:45:00] Stevenv at operamail dot com

As the summary says, fgetcsv does not allow escaped quotes. When csv fields come from user input, it is often the case that addslashes() is run on them and they are then enclosed in quotes. However, fgetcsv() removes anything after the escaped quote.

Code:

    <?php
    /* make a csv file */
    $fp = fopen('csv_file', 'w+');
    $fields = array();
    $fields[0] = '"' . addslashes('This is "Field One"') . '"';
    $fields[1] = 'field two';
    $fields[2] = 'field three';
    fwrite($fp, implode(',', $fields));
    /* start all over */
    fseek($fp, 0);
    var_dump(fgetcsv($fp, 4096));
    ?>

Outputs:

    array(3) {
      [0]=>
      string(9) "This is \"
      [1]=>
      string(9) "field two"
      [2]=>
      string(11) "field three"
    }

The behavior I expected would have been for the first field to read:

    This is \"Field One\"

Much like the functionality described on http://rath.ca/Misc/Perl_CSV/CSV-2.0.html#csv specification.

Thanks
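
To make Suggestion #2 above concrete, here is a minimal sketch of a line parser with such a switch. It is not PHP's implementation: parse_dsv_line() and its $unixEscapeStyle flag are hypothetical names invented for this illustration, it handles a single line only, and in Unix mode an unescaped enclosure simply toggles quoting (how doubled enclosures should behave there is left open).

```php
<?php
// Hypothetical helper, not part of PHP: splits ONE line into fields.
// $unixEscapeStyle = false: Windows style, "" inside quotes is a literal ".
// $unixEscapeStyle = true:  Unix style, backslash makes the next byte literal.
function parse_dsv_line(
    string $line,
    string $delimiter = ',',
    string $enclosure = '"',
    bool $unixEscapeStyle = false
): array {
    $fields = [];
    $field = '';
    $inQuotes = false;
    $length = strlen($line);

    for ($i = 0; $i < $length; $i++) {
        $char = $line[$i];

        if ($unixEscapeStyle && $char === '\\' && $i + 1 < $length) {
            // Unix style: drop the backslash, keep the escaped byte as-is.
            $field .= $line[++$i];
        } elseif ($enclosure !== '' && $char === $enclosure) {
            if (!$unixEscapeStyle && $inQuotes
                && $i + 1 < $length && $line[$i + 1] === $enclosure) {
                // Windows style: a doubled enclosure is one literal enclosure.
                $field .= $enclosure;
                $i++;
            } else {
                // Otherwise the enclosure opens or closes a quoted section.
                $inQuotes = !$inQuotes;
            }
        } elseif ($char === $delimiter && !$inQuotes) {
            // An unquoted delimiter ends the current field.
            $fields[] = $field;
            $field = '';
        } else {
            $field .= $char;
        }
    }
    $fields[] = $field;

    return $fields;
}

// Windows style (default): doubled quotes.
var_dump(parse_dsv_line('"This is ""Field One""",field two'));
// Unix style: backslash escapes, enclosure may be empty.
var_dump(parse_dsv_line('a\,b,c', ',', '', true));
```

Keeping both styles behind one explicit flag means existing callers get the doubled-quote behaviour unchanged, while backslash handling has to be requested.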
#22382 [Com]: fgetcsv does not allow escaped quotes
 ID:               22382
 Comment by:       mr dot heat at gmx dot de
 Reported By:      Stevenv at operamail dot com
 Status:           Closed
 Bug Type:         Filesystem function related
 Operating System: FreeBSD 4.7
 PHP Version:      4.3.2-dev
 New Comment:

Don't you think making fgetcsv() in 4.3.2+ incompatible with fgetcsv() in previous versions is critical? (Re-post because the comment was deleted for unknown reasons.)

Example prior432.csv:

    Write as \ to escape quotes.

Example prior432.php:

    <?php
    $fp = fopen("prior432.csv", "rb");
    $array = fgetcsv($fp, 1000);
    echo $array[0];
    //Output: Write as \ to escape quotes.
    ?>

Example since432.csv:

    Write \ as \\\ to escape quotes.

Example since432.php:

    <?php
    $fp = fopen("since432.csv", "rb");
    $array = fgetcsv($fp, 100);
    echo stripslashes($array[0]);
    //Same output as above
    ?>

This makes fgetcsv() an entirely different function. Don't you think it's a bad idea to break every script written before 4.3.2 that uses fgetcsv()?

Suggestion #1: Introduce a new function fgetdsv() and restore the previous behaviour of fgetcsv().

Suggestion #2: Introduce a fifth parameter

    fgetcsv(resource handle, int length [, string delimiter [, string enclosure [, bool unix_escape_style]]])

which is

  false for Windows style (default): values are surrounded by enclosure characters; enclosure characters are escaped by another enclosure character;

  true for Unix style: enclosure may be empty; any critical character is escaped by a backslash; values are returned without the backslashes; double enclosure characters are not stripped.


Previous Comments:

[2003-02-23 21:17:39] [EMAIL PROTECTED]

This bug has been fixed in CVS.

In case this was a PHP problem, snapshots of the sources are packaged every three hours; this change will be in the next snapshot. You can grab the snapshot at http://snaps.php.net/.

In case this was a documentation problem, the fix will show up soon at http://www.php.net/manual/.

In case this was a PHP.net website problem, the change will show up on the PHP.net site and on the mirror sites in short time.

Thank you for the report, and for helping us make PHP better.

[2003-02-22 19:45:00] Stevenv at operamail dot com

As the summary says, fgetcsv does not allow escaped quotes. When csv fields come from user input, it is often the case that addslashes() is run on them and they are then enclosed in quotes. However, fgetcsv() removes anything after the escaped quote.

Code:

    <?php
    /* make a csv file */
    $fp = fopen('csv_file', 'w+');
    $fields = array();
    $fields[0] = '"' . addslashes('This is "Field One"') . '"';
    $fields[1] = 'field two';
    $fields[2] = 'field three';
    fwrite($fp, implode(',', $fields));
    /* start all over */
    fseek($fp, 0);
    var_dump(fgetcsv($fp, 4096));
    ?>

Outputs:

    array(3) {
      [0]=>
      string(9) "This is \"
      [1]=>
      string(9) "field two"
      [2]=>
      string(11) "field three"
    }

The behavior I expected would have been for the first field to read:

    This is \"Field One\"

Much like the functionality described on http://rath.ca/Misc/Perl_CSV/CSV-2.0.html#csv specification.

Thanks
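
For comparison with the original report, the following sketch writes the same three fields using the doubled-enclosure convention instead of addslashes() and reads them back with fgetcsv(). csv_quote_field() and csv_round_trip() are hypothetical helpers made up for this example, php://memory stands in for the report's csv_file, and the code assumes a PHP version in which fgetcsv() accepts the escape parameter. Fields that contain a backslash directly before a quote are not covered here and may still need separate care.

```php
<?php
// Hypothetical helper: enclose one field and double any embedded enclosure,
// i.e. the Windows/Excel convention rather than addslashes().
function csv_quote_field(string $value): string
{
    return '"' . str_replace('"', '""', $value) . '"';
}

// Hypothetical helper: write one row to an in-memory stream, read it back.
function csv_round_trip(array $fields): array
{
    $fp = fopen('php://memory', 'w+');
    fwrite($fp, implode(',', array_map('csv_quote_field', $fields)) . "\n");

    /* start all over */
    rewind($fp);
    // Delimiter, enclosure and escape are passed explicitly so the call
    // does not depend on version-specific defaults.
    $row = fgetcsv($fp, 4096, ',', '"', '\\');
    fclose($fp);

    return $row;
}

var_dump(csv_round_trip(['This is "Field One"', 'field two', 'field three']));
```

With this quoting, the first field comes back as This is "Field One" with no backslashes to strip afterwards.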