Weston Pace created ARROW-12371:
-----------------------------------

             Summary: [C++] Allow EnumeratingGenerator to be async-reentrant
                 Key: ARROW-12371
                 URL: https://issues.apache.org/jira/browse/ARROW-12371
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace


The combination of EnumeratingGenerator and ResequencingGenerator can be used
to process items in a "first available" fashion.  This is currently used in the
scanner to compensate for fragments that are intermittently slow.
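
To make the pattern concrete, here is a minimal standalone sketch of the
enumerate-then-resequence idea.  It does not use the Arrow generators
themselves: the names Enumerated, Enumerate and Resequencer are hypothetical
stand-ins invented for illustration, and the real classes are asynchronous
while this sketch is synchronous.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an item tagged by an enumerating step.
template <typename T>
struct Enumerated {
  T value;
  std::size_t index;  // position in the original stream
  bool last;          // true only for the final item
};

// Tags each item with its position so the order can be restored later.
template <typename T>
std::vector<Enumerated<T>> Enumerate(std::vector<T> items) {
  std::vector<Enumerated<T>> out;
  out.reserve(items.size());
  for (std::size_t i = 0; i < items.size(); ++i) {
    const bool last = (i + 1 == items.size());
    out.push_back({std::move(items[i]), i, last});
  }
  return out;
}

// Hypothetical resequencing step: buffers items that arrive early and
// releases them once every earlier index has been seen.
template <typename T>
class Resequencer {
 public:
  // Accepts an item in any order.  Returns the items that can now be
  // emitted in original order (possibly none).
  std::vector<T> Push(Enumerated<T> item) {
    // Each index must be pushed exactly once.
    assert(item.index >= next_ && pending_.count(item.index) == 0);
    pending_.emplace(item.index, std::move(item.value));
    std::vector<T> ready;
    for (auto it = pending_.find(next_); it != pending_.end();
         it = pending_.find(next_)) {
      ready.push_back(std::move(it->second));
      pending_.erase(it);
      ++next_;
    }
    return ready;
  }

  // Number of items currently held back waiting for an earlier index.
  std::size_t buffered() const { return pending_.size(); }

 private:
  std::map<std::size_t, T> pending_;
  std::size_t next_ = 0;
};
```

Any work placed between Enumerate and Resequencer::Push can handle items in
whatever order they become available; the index tag is what lets the original
order be recovered afterwards.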

A potential further improvement would be to use this same pattern for
out-of-order readahead.  For example, when reading a Parquet file or an IPC
file from S3, the reader may request multiple batches in parallel.  If the next
batch is slow but the later batches are fast, we could start processing the
later batches while we wait for the next batch.
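
As a rough illustration of the potential gain, the following sketch models a
single consumer that spends a fixed time on each batch.  The function names,
the arrival times and the per-batch cost are all assumptions made up for this
example; nothing here is measured from Arrow.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Finish time (ms) when one consumer handles batches strictly in index
// order.  ready_ms[i] is the time at which batch i finishes downloading.
long FinishInOrder(const std::vector<long>& ready_ms, long cost_ms) {
  assert(cost_ms >= 0);
  long t = 0;
  for (long ready : ready_ms) {
    // Wait for this batch if it has not arrived yet, then process it.
    t = std::max(t, ready) + cost_ms;
  }
  return t;
}

// Finish time (ms) when the consumer takes whichever batch is ready first.
long FinishFirstAvailable(std::vector<long> ready_ms, long cost_ms) {
  // Consuming in arrival order is the same as in-order consumption of the
  // batches sorted by the time they become ready.
  std::sort(ready_ms.begin(), ready_ms.end());
  return FinishInOrder(ready_ms, cost_ms);
}
```

With hypothetical arrival times of {300, 100, 120, 140} ms and a cost of 50 ms
per batch, FinishInOrder gives 500 ms and FinishFirstAvailable gives 350 ms,
because the three fast batches are processed while the slow first batch is
still in flight.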

This would be a fairly minor improvement to latency (and probably would not
affect throughput much), so I don't know that it is a very high priority fix.
It may be best to wait until profiling shows this is an issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
