Edit report at https://bugs.php.net/bug.php?id=55763&edit=1
ID: 55763
Comment by: darren at dcook dot org
Reported by: talk at alexmingoia dot com
Summary: str_getcsv incorrectly handles line-breaks inside fields
Status: Open
Type: Bug
Package: Strings related
Operating System: OS X 10.6
PHP Version: 5.3.8
Block user comment: N
Private report: N
New Comment:
The problem can also be shown with the example from the Wikipedia page
(http://en.wikipedia.org/wiki/Comma-separated_values):
$s2=<<<EOD
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
EOD;
// Use "\n" as the delimiter, expecting one array element per CSV record.
$lines = str_getcsv($s2, "\n");
print_r($lines);
It outputs:
Array
(
[0] => Year,Make,Model,Description,Price
[1] => 1997,Ford,E350,"ac, abs, moon",3000.00
[2] => 1999,Chevy,"Venture ""Extended Edition""","",4900.00
[3] => 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
[4] => 1996,Jeep,Grand Cherokee,"MUST SELL!
[5] => air, moon roof, loaded",4799.00
)
But it should output:
Array
(
[0] => Year,Make,Model,Description,Price
[1] => 1997,Ford,E350,"ac, abs, moon",3000.00
[2] => 1999,Chevy,"Venture ""Extended Edition""","",4900.00
[3] => 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
[4] => 1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
)
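For illustration, the expected splitting above can be approximated in userland. This is only a minimal sketch, not the behaviour of any PHP built-in: split_csv_records() is a hypothetical helper name, and it assumes double quotes are the only enclosure character and that quotes are escaped by doubling them.

```php
<?php
// Hypothetical helper, not a PHP built-in: split CSV text into raw records.
// A line break ends a record only when it falls outside double quotes.
function split_csv_records($text)
{
    $records = array();
    $current = '';
    $inQuotes = false;
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $ch = $text[$i];
        if ($ch === '"') {
            // An escaped quote ("") toggles the flag twice, leaving it unchanged.
            $inQuotes = !$inQuotes;
        }
        if ($ch === "\n" && !$inQuotes) {
            $records[] = rtrim($current, "\r"); // tolerate "\r\n" endings
            $current = '';
            continue;
        }
        $current .= $ch;
    }
    if ($current !== '') {
        $records[] = $current; // last record without a trailing line break
    }
    return $records;
}

// Shortened version of the Wikipedia sample used above.
$sample = "Year,Make,Model,Description,Price\n"
    . "1996,Jeep,Grand Cherokee,\"MUST SELL!\nair, moon roof, loaded\",4799.00";
$records = split_csv_records($sample);
print_r($records);
```

Each element of $records still contains the raw quoted text, so a record would then need to be passed to str_getcsv() on its own to obtain its fields.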
Previous Comments:
------------------------------------------------------------------------
[2011-09-22 16:45:02] talk at alexmingoia dot com
Sorry... expected output should be
array(4) {
[0]=>
string(15) "Name,Desc,Email"
[1]=>
string(4) "Alex"
[2]=>
string(18) "Is a PHP
developer
"
[3]=>
string(16) "[email protected]"
}
------------------------------------------------------------------------
[2011-09-22 16:41:15] talk at alexmingoia dot com
Description:
------------
RFC 4180 states that fields can contain line breaks as long as they are
enclosed in double quotes.
str_getcsv() treats line breaks inside enclosed fields as new records in the
CSV.
Setting 'auto_detect_line_endings' to TRUE or using "\r\n" instead of "\n" still
produces incorrect results.
Test script:
---------------
$csv = file_get_contents('test.csv');
$csvArray = str_getcsv($csv, "\n");
var_dump($csvArray);
Expected result:
----------------
array(4) {
[0]=>
string(15) "Name,Desc,Email"
[1]=>
string(4) "Alex"
[2]=>
string(18) "Is a PHP developer"
[3]=>
string(16) "[email protected]"
}
Actual result:
--------------
array(4) {
[0]=>
string(15) "Name,Desc,Email"
[1]=>
string(14) "Alex,"Is a PHP"
[2]=>
string(9) "developer"
[3]=>
string(17) ",[email protected]"
}
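As a possible workaround for the multi-line fields described in this report, the whole string can be read through a temporary stream with fgetcsv(), which reads one record at a time and keeps line breaks that sit inside quoted fields. This is a sketch under assumptions: parse_csv_string() is a hypothetical helper name, the sample data and the example.com address are made up for illustration, and the delimiter, enclosure and escape arguments are simply the conventional values passed explicitly.

```php
<?php
// Hypothetical helper: parse a complete CSV document held in a string by
// writing it to a temporary stream and reading it back record by record.
function parse_csv_string($text)
{
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $text);
    rewind($handle);

    $rows = array();
    // Length 0 means "no line length limit"; the remaining arguments are
    // delimiter, enclosure and escape character.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

// Made-up sample with a line break inside the quoted second field.
$csv = "Name,Desc,Email\nAlex,\"Is a PHP\ndeveloper\",alex@example.com\n";
$rows = parse_csv_string($csv);
var_dump($rows);
```

Unlike the flat array in the report, this yields one array per record, each holding that record's fields.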
------------------------------------------------------------------------