[jira] [Commented] (ARROW-12661) [C++] CSV add skip rows after column names

Nate Clark (Jira) Mon, 10 May 2021 07:21:06 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341933#comment-17341933
 ]


Nate Clark commented on ARROW-12661:
------------------------------------

Sorry, I wrote this in the PR first, so repeating it here.

Is the preferred solution the current implementation of 
{{ReadOptions::skip_rows_after_header}}, or the more flexible approach of turning 
skip_rows into a vector of row indexes to skip? While the first does fulfill the 
original request of this ticket, I do like the flexibility offered by the row 
ranges to skip. If the ranges are preferred, I think it makes sense for me to 
work on ARROW-12675 so that the absolute row is known throughout the parser 
layers.
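
To make the difference between the two options concrete, here is a rough, 
self-contained sketch. It is a toy line-based reader, not the Arrow CSV 
reader: ToyReadOptions, ToyResult, SplitLines, ReadWithSkipAfterHeader and 
ReadWithSkipIndexes are all hypothetical names made up for illustration, and 
the whole input is assumed to fit in memory as a single string.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical options struct for this sketch only (not arrow::csv::ReadOptions).
struct ToyReadOptions {
  int32_t skip_rows = 0;               // rows dropped before the header row
  int32_t skip_rows_after_header = 0;  // rows dropped right after the header row
};

// Hypothetical result type: the raw header line plus the raw data lines.
struct ToyResult {
  std::string header;
  std::vector<std::string> data_rows;
};

// Split the input into lines. No quoting or embedded newlines are handled.
static std::vector<std::string> SplitLines(const std::string& text) {
  std::vector<std::string> lines;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) lines.push_back(line);
  return lines;
}

// Option 1: a second counter that applies after the column names are read.
// Both counters are assumed to be non-negative.
ToyResult ReadWithSkipAfterHeader(const std::string& text,
                                  const ToyReadOptions& opts) {
  ToyResult result;
  const std::vector<std::string> lines = SplitLines(text);
  std::size_t pos = static_cast<std::size_t>(opts.skip_rows);
  if (pos >= lines.size()) return result;  // nothing left to use as a header
  result.header = lines[pos++];
  pos += static_cast<std::size_t>(opts.skip_rows_after_header);
  for (; pos < lines.size(); ++pos) result.data_rows.push_back(lines[pos]);
  return result;
}

// Option 2: skip an arbitrary set of absolute (0-based) row indexes.
// The first row that is not skipped is treated as the header.
ToyResult ReadWithSkipIndexes(const std::string& text,
                              const std::vector<int64_t>& skip) {
  const std::unordered_set<int64_t> skip_set(skip.begin(), skip.end());
  ToyResult result;
  bool have_header = false;
  const std::vector<std::string> lines = SplitLines(text);
  for (std::size_t i = 0; i < lines.size(); ++i) {
    // This lookup needs the absolute row number of every row.
    if (skip_set.count(static_cast<int64_t>(i)) != 0) continue;
    if (!have_header) {
      result.header = lines[i];
      have_header = true;
    } else {
      result.data_rows.push_back(lines[i]);
    }
  }
  return result;
}
```

For a file whose second row describes the column types, option 1 with 
skip_rows_after_header = 1 and option 2 with the index list {1} give the same 
header and data rows. In the toy version the index lookup is trivial because 
every row number is at hand. In a reader that parses the input in blocks, 
each layer would need to know the absolute row, which is why I bring up 
ARROW-12675.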

> [C++] CSV add skip rows after column names
> ------------------------------------------
>
>                 Key: ARROW-12661
>                 URL: https://issues.apache.org/jira/browse/ARROW-12661
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nate Clark
>            Priority: Major
>              Labels: csv, pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some programs generate CSV files with additional descriptive information 
> about the columns in rows after the names. For files like this it would be 
> nice to have an option which reads the first row as column names and can then 
> skip those rows after the names.
> This could probably be implemented easily as either another option parallel 
> to ReadOptions::skip_rows or a boolean which indicates whether skipping should 
> occur before or after the column names are read.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
