[ 
https://issues.apache.org/jira/browse/IMPALA-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat reassigned IMPALA-9874:
--------------------------------------

    Assignee: Abhishek Rawat

> Reduce or avoid I/O for pruned columns
> --------------------------------------
>
>                 Key: IMPALA-9874
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9874
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Abhishek Rawat
>            Priority: Critical
>              Labels: parquet
>
> Skipping decoding of values may not reduce I/O in some cases, because we 
> start the I/O in StartScans() and only wait for it when we read the first 
> data page from the column reader. Whether the I/O for a pruned column 
> actually happens is therefore decided by a race.
> There are a couple of things we can do here:
> * The basic thing is to issue reads for the column readers in the order in 
> which they are needed. We may be able to get this for free by ordering the 
> column readers based on materialisation order.
> * We also want to avoid issuing I/O for columns that are not needed, if 
> predicates are highly selective. This is maybe a bit harder and involves 
> more trade-offs, since delaying issuing of the reads may impact scan latency.
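
The two options above can be sketched roughly as follows. This is only an illustrative model in Python, not Impala's backend code: ColumnReader, IoLog, issue_reads_in_order, issue_reads_lazily and the any_row_survives callback are all hypothetical names invented for the example, and "issuing a read" is simulated by appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnReader:
    """Hypothetical stand-in for a Parquet column reader (not Impala's class)."""
    name: str
    materialization_order: int  # lower values are materialised first
    is_predicate_column: bool = False


@dataclass
class IoLog:
    """Records the names of the columns whose reads were issued, in order."""
    issued: list = field(default_factory=list)

    def issue_read(self, reader: ColumnReader) -> None:
        self.issued.append(reader.name)


def issue_reads_in_order(readers, io):
    """Option 1: issue every read up front, but in materialisation order."""
    for reader in sorted(readers, key=lambda r: r.materialization_order):
        io.issue_read(reader)


def issue_reads_lazily(readers, io, any_row_survives):
    """Option 2: read predicate columns first and defer the rest.

    `any_row_survives` is a hypothetical callback that stands for evaluating
    the predicates once the predicate columns have been read.
    """
    ordered = sorted(readers, key=lambda r: r.materialization_order)
    for reader in ordered:
        if reader.is_predicate_column:
            io.issue_read(reader)
    if not any_row_survives():
        return  # everything was filtered out: no I/O for the other columns
    for reader in ordered:
        if not reader.is_predicate_column:
            io.issue_read(reader)
```

In this model the second option saves I/O only when the predicates filter out everything, and it pays for that by issuing the remaining reads later, which is the scan-latency trade-off mentioned in the description.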



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

