[jira] [Created] (ARROW-8532) [C++][CSV] Add support for sentinel values.
Ravil Bikbulatov created ARROW-8532:
---

Summary: [C++][CSV] Add support for sentinel values.
Key: ARROW-8532
URL: https://issues.apache.org/jira/browse/ARROW-8532
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Ravil Bikbulatov

Some systems still use sentinel values to represent nulls. It would be good if read_csv could write sentinel values into the null slots itself, so that the user would not need to convert null bitmaps to sentinel values. Adding this support does not contradict the Arrow specification, because the values stored in null slots are undefined. It also would not add any overhead to read_csv. Since Arrow is a general-purpose framework, I think we can spare users the pain of converting bitmaps to sentinel values.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
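For readers unfamiliar with the conversion this issue wants to avoid, here is a minimal sketch in plain Python of turning a validity bitmap into sentinel values. It does not use the Arrow library: the function name bitmap_to_sentinel, the sample data, and the choice of INT64_MIN as the sentinel are all assumptions made for illustration. The only Arrow detail it relies on is that validity bitmaps use least-significant-bit numbering, with a set bit meaning "valid".

```python
from typing import List


def bitmap_to_sentinel(values: List[int], validity: bytes, sentinel: int) -> List[int]:
    """Return a copy of `values` with every null slot replaced by `sentinel`.

    Hypothetical helper, not part of Arrow. `validity` is an Arrow-style
    validity bitmap: bit i lives in byte i // 8 at position i % 8
    (least-significant bit first), and 1 means the slot is valid.
    """
    out = []
    for i, value in enumerate(values):
        is_valid = (validity[i // 8] >> (i % 8)) & 1
        out.append(value if is_valid else sentinel)
    return out


# Sample data (assumed): slots 0 and 2 are valid, slots 1 and 3 are null.
# The payload stored at null slots is unspecified, so the zeros carry no meaning.
values = [10, 0, 30, 0]
validity = bytes([0b0101])
INT64_MIN = -(2 ** 63)  # one possible sentinel; the issue does not name one

result = bitmap_to_sentinel(values, validity, INT64_MIN)
```

This is the extra pass over the data that a consumer has to run today after read_csv returns; the proposal is for the reader to fill those slots while it is already writing the column.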
[jira] [Created] (ARROW-8527) [C++][CSV] Add support for ReadOptions::skip_rows >= block_size
Ravil Bikbulatov created ARROW-8527:
---

Summary: [C++][CSV] Add support for ReadOptions::skip_rows >= block_size
Key: ARROW-8527
URL: https://issues.apache.org/jira/browse/ARROW-8527
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Ravil Bikbulatov

The current implementation throws an error at reader.cc:286 when skip_rows > header. However, in some workloads skip_rows is used not only to skip the header but also simply to skip the first n rows. In that case the block-size constraint gets in the way considerably. I think this constraint could be removed without reducing performance.
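To illustrate why the skipped rows need not fit inside a single block, here is a minimal sketch in plain Python that carries the skip count across block boundaries. It is not the Arrow reader's code: the function name skip_rows_across_blocks and the sample data are hypothetical, and the sketch assumes rows end with "\n" and that no quoted field contains a newline.

```python
from typing import Iterable, Iterator


def skip_rows_across_blocks(blocks: Iterable[bytes], skip_rows: int) -> Iterator[bytes]:
    """Yield the bytes that follow the first `skip_rows` rows of a block stream.

    Hypothetical helper, not part of Arrow. Assumes "\n" row terminators and
    no newlines inside quoted fields.
    """
    remaining = skip_rows
    for block in blocks:
        if remaining == 0:
            # Skipping is finished; pass later blocks through untouched.
            yield block
            continue
        pos = 0
        while remaining > 0:
            newline = block.find(b"\n", pos)
            if newline == -1:
                # The row being skipped continues in the next block.
                break
            pos = newline + 1
            remaining -= 1
        if remaining == 0 and pos < len(block):
            # The last skipped row ended inside this block; keep the tail.
            yield block[pos:]


# Sample input (assumed): three rows to skip, split into 4-byte blocks so that
# the skipped region spans several blocks.
data = b"skip1\nskip2\nskip3\na,b\n1,2\n"
blocks = [data[i:i + 4] for i in range(0, len(data), 4)]
remaining_bytes = b"".join(skip_rows_across_blocks(blocks, skip_rows=3))
```

The only state kept between blocks is the count of rows still to skip, which is consistent with the expectation above that lifting the constraint should not cost performance. A real reader would also have to handle quoted newlines, which this sketch ignores.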