[ https://issues.apache.org/jira/browse/IMPALA-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhishek Rawat reassigned IMPALA-9874:
--------------------------------------

    Assignee: Abhishek Rawat

> Reduce or avoid I/O for pruned columns
> --------------------------------------
>
>                 Key: IMPALA-9874
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9874
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Abhishek Rawat
>            Priority: Critical
>              Labels: parquet
>
> Skipping decoding of values may not be effective at reducing I/O in some
> cases, because we start the I/O in StartScans(). We don't wait for the I/O
> until we actually read the first data page from the column reader. So there
> is a race to determine whether the I/O happens in some cases.
> There are a couple of things we can do here.
> * The basic thing is to issue reads for the column readers in the order in
> which they are needed. We may be able to get this for free by ordering the
> column readers based on materialisation order.
> * We also want to avoid issuing I/O for columns that are not needed, if
> predicates are highly selective. This is maybe a bit harder and involves more
> trade-offs, since delaying issuing of the reads may impact scan latency.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
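
The two ideas in the issue description can be illustrated with a minimal sketch. This is not Impala code: `ColumnReader`, `issue_reads`, `issue_deferred_reads`, and the `issue_log` list are hypothetical names invented for illustration, and the "I/O" is only recorded in a list. The sketch assumes each column reader knows its materialisation order and whether a predicate is evaluated on it.

```python
from dataclasses import dataclass


@dataclass
class ColumnReader:
    """Hypothetical stand-in for a Parquet column reader."""
    name: str
    materialisation_order: int  # lower value = needed earlier
    has_predicate: bool         # True if a predicate is evaluated on this column
    io_issued: bool = False


def issue_reads(readers, issue_log, defer_non_predicate=False):
    """Issue reads in materialisation order.

    If defer_non_predicate is True, readers without a predicate are not
    issued here; they are returned so the caller can issue them later.
    """
    deferred = []
    for reader in sorted(readers, key=lambda r: r.materialisation_order):
        if defer_non_predicate and not reader.has_predicate:
            deferred.append(reader)
            continue
        reader.io_issued = True
        issue_log.append(reader.name)  # stands in for starting the real I/O
    return deferred


def issue_deferred_reads(deferred, any_row_survived, issue_log):
    """Issue the deferred reads only if some row passed the predicates."""
    if not any_row_survived:
        return  # highly selective predicate: the I/O is skipped entirely
    for reader in deferred:
        reader.io_issued = True
        issue_log.append(reader.name)
```

With `defer_non_predicate=False` the sketch only reorders the reads, which corresponds to the first bullet. Deferring corresponds to the second bullet and shows the trade-off mentioned there: the non-predicate columns start their I/O later, which may add scan latency when rows do survive the predicates.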