Ted-Jiang commented on issue #2197:
URL: https://github.com/apache/arrow-rs/issues/2197#issuecomment-1197617137

   Yes, I agree this needs improvement before making the API public.
   
   > Much like RecordReader we need to separate read_records from consuming the resulting data, i.e. replace ArrayReader::next_batch with ArrayReader::read_records and ArrayReader::consume_batch.
   
   I think you mean: we can call `read_records` multiple times until there are enough values in the buffer, and only then call `consume_batch`. That would avoid producing small batches (currently, if `selection_len` is less than `batch_size`, we return a batch with only `selection_len` rows).
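   
   To make sure we are talking about the same thing, here is a minimal sketch of how I read that split. The trait name matches `ArrayReader`, but the placeholder types and exact signatures are just my assumption, not the actual crate code:
   
   ```rust
   use std::sync::Arc;
   
   // Placeholder types so the sketch stands on its own; in arrow-rs these
   // would be `arrow::array::ArrayRef` and `parquet::errors::Result`.
   type ArrayRef = Arc<dyn std::any::Any>;
   type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;
   
   /// Sketch of the proposed split: `read_records` only buffers decoded
   /// values, `consume_batch` materialises everything buffered so far.
   trait ArrayReader {
       /// Buffer up to `batch_size` more records, returning how many were
       /// actually read (0 signals the column is exhausted).
       fn read_records(&mut self, batch_size: usize) -> Result<usize>;
   
       /// Build an array from all buffered records and clear the buffer.
       fn consume_batch(&mut self) -> Result<ArrayRef>;
   }
   ```
   
   With this split, a caller can buffer several small selected runs before paying the cost of building an array.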
   
   How about putting this combining logic in `impl Iterator for ParquetRecordBatchReader`? If we call `read_records` multiple times, the calls should be driven by the `selections`, so why not add a loop in the `Iterator` impl that keeps feeding rows until the result batch has enough of them 🤔 (see the sketch below).
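   
   Something like the following loop is what I have in mind for `next`, using the hypothetical `ArrayReader` trait from the sketch above. This is only a sketch under those same assumptions; modelling `selections` as a queue of selected run lengths is simplified, since the real selection also carries skip ranges:
   
   ```rust
   use std::collections::VecDeque;
   
   /// Keep consuming selected row runs until `batch_size` rows are buffered,
   /// then emit a single batch, so a fragmented selection no longer produces
   /// many undersized batches.
   fn next_sketch(
       reader: &mut dyn ArrayReader,
       selections: &mut VecDeque<usize>, // lengths of remaining selected runs
       batch_size: usize,
   ) -> Result<Option<ArrayRef>> {
       let mut buffered = 0;
       while buffered < batch_size {
           let run = match selections.pop_front() {
               Some(run) => run,
               None => break, // no more selected rows
           };
           let want = run.min(batch_size - buffered);
           if run > want {
               // Put the unread tail of this run back for the next batch.
               selections.push_front(run - want);
           }
           let read = reader.read_records(want)?;
           if read == 0 {
               break; // column exhausted
           }
           buffered += read;
       }
       if buffered == 0 {
           return Ok(None); // nothing left to read
       }
       Ok(Some(reader.consume_batch()?))
   }
   ```
   
   This way each call to `consume_batch` sees up to `batch_size` buffered rows, regardless of how fragmented the selection is.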

