Mostafa Mokhtar created IMPALA-5621:
---------------------------------------

             Summary: Apply Parquet stats optimizations in conjunction with 
predicates against Parquet stats
                 Key: IMPALA-5621
                 URL: https://issues.apache.org/jira/browse/IMPALA-5621
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Mostafa Mokhtar
            Assignee: Taras Bobrovytsky


Impala can skip processing blocks based on predicates against Parquet 
statistics. For RowGroups that fully satisfy the predicates, the scanner 
should use the data stored in the Parquet statistics to speed up the query. 

{code}
select count(*), max(ss_item_sk) from store_sales where ss_item_sk > 10 
and ss_item_sk < 9999999999; 
{code}

For RowGroups that have min(ss_item_sk) > 10 and max(ss_item_sk) < 9999999999, 
the scanner should use the count stored in the stats as opposed to evaluating 
each row in the RowGroup. The same applies to min/max values. 
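
The decision described above can be sketched roughly as follows. This is only 
an illustration in Python, not Impala's backend code: the RowGroupStats class, 
its field names, and the classify / aggregate_from_stats helpers are all 
hypothetical. It also assumes that rows with a NULL ss_item_sk must be excluded 
from the count, since a NULL never satisfies the range predicate. 

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RowGroupStats:
    """Hypothetical per-RowGroup column statistics (not Impala's real types)."""
    num_rows: int
    null_count: int
    min_value: int
    max_value: int


def classify(stats: RowGroupStats, lo: int, hi: int) -> str:
    """Decide how to handle a RowGroup for the predicate lo < col < hi.

    Returns "skip" when no row can match, "stats" when every non-NULL row
    matches (so the aggregates can come from the statistics), and "scan"
    when rows must be evaluated individually.
    """
    if stats.null_count == stats.num_rows:
        # All values are NULL, so nothing satisfies the predicate.
        return "skip"
    if stats.max_value <= lo or stats.min_value >= hi:
        # The value range lies entirely outside the predicate range.
        return "skip"
    if stats.min_value > lo and stats.max_value < hi:
        # The value range lies entirely inside the predicate range.
        return "stats"
    return "scan"


def aggregate_from_stats(stats: RowGroupStats) -> Tuple[int, int]:
    """Return (matching row count, max) for a RowGroup classified as "stats".

    Assumption: NULL rows are subtracted because they fail the predicate.
    """
    return stats.num_rows - stats.null_count, stats.max_value


rg = RowGroupStats(num_rows=1000, null_count=5, min_value=11, max_value=500)
if classify(rg, 10, 9999999999) == "stats":
    count, max_sk = aggregate_from_stats(rg)
    print(count, max_sk)
```

A RowGroup whose range only partially overlaps the predicate still has to be 
scanned row by row, so the shortcut applies only to the fully covered case. 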




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
