[ 
https://issues.apache.org/jira/browse/ARROW-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891078#comment-16891078
 ] 

Antoine Pitrou commented on ARROW-6004:
---------------------------------------

Pandas does this:
{code:python}
>>> pd.read_csv(io.BytesIO(b"""ab,cd\n12,34\n\r\n56,78\n"""))
   ab  cd
0  12  34
1  56  78
>>> pd.read_csv(io.BytesIO(b"""ab,cd\n12,34\n\r\n56,78\n"""), skip_blank_lines=False)
     ab    cd
0  12.0  34.0
1   NaN   NaN
2  56.0  78.0
{code}
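
For comparison, the same "blank line becomes a row of missing values" behaviour can be sketched with only the Python standard library. This is an illustration of the semantics, not Arrow's or Pandas' implementation; the helper name {{read_rows_keep_blank}} and its {{keep_blank_lines}} parameter are hypothetical, and values are left as strings rather than being type-inferred.

```python
import csv
import io


def read_rows_keep_blank(data, keep_blank_lines=True):
    """Parse CSV bytes into (header, rows).

    Hypothetical helper: when keep_blank_lines is True, each blank line
    becomes a row with None in every column; otherwise it is skipped.
    """
    # newline="" leaves line endings untranslated, as the csv docs recommend.
    text = io.StringIO(data.decode("utf-8"), newline="")
    reader = csv.reader(text)
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            # csv.reader yields an empty list for a blank line.
            if keep_blank_lines:
                rows.append([None] * len(header))
            continue
        rows.append(fields)
    return header, rows


header, rows = read_rows_keep_blank(b"ab,cd\n12,34\n\r\n56,78\n")
print(header)
print(rows)
```

Passing {{keep_blank_lines=False}} drops the blank line instead, which mirrors the default Pandas result above.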


> [C++] CSV reader ignore_empty_lines option doesn't handle empty lines
> ---------------------------------------------------------------------
>
>                 Key: ARROW-6004
>                 URL: https://issues.apache.org/jira/browse/ARROW-6004
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Neal Richardson
>            Priority: Minor
>              Labels: csv
>
> Followup to https://issues.apache.org/jira/browse/ARROW-5747. If 
> {{ignore_empty_lines}} is false and there are empty lines, it fails to parse 
> (again, with {{Invalid: Empty CSV file}}).
> Correct behavior should be to fill those empty lines with missing data for 
> all columns.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
