[ https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Damien Profeta updated DRILL-5795:
----------------------------------
    Labels: doc-impacting  (was: )

> Filter pushdown for parquet handles multi rowgroup file
> -------------------------------------------------------
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Reporter: Damien Profeta
> Labels: doc-impacting
>
> DRILL-1950 implemented filter pushdown for Parquet files, but only for the
> case of one rowgroup per file. When a file contains multiple rowgroups, Drill
> detects that a rowgroup can be pruned but then tells the drillbit to read the
> whole file, which leads to a performance issue.
> Having multiple rowgroups per file helps to handle partitioned datasets while
> still reading only the relevant subset of data, without ending up with more
> files than really needed.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
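
To illustrate the behaviour the issue asks for, here is a minimal sketch of rowgroup-level pruning based on min/max column statistics. It is not Drill's implementation: the names RowGroupStats and row_groups_to_read are hypothetical, the statistics are made-up sample values, and the filter is assumed to be a simple inclusive range on one integer column.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical per-rowgroup min/max statistics for one column."""
    index: int
    min_value: int
    max_value: int


def row_groups_to_read(stats: List[RowGroupStats], low: int, high: int) -> List[int]:
    """Return indexes of rowgroups whose [min, max] range overlaps [low, high].

    Rowgroups that cannot contain a matching row are skipped, so the reader
    scans only the returned rowgroups instead of the whole file.
    """
    return [rg.index for rg in stats if rg.max_value >= low and rg.min_value <= high]


# Made-up statistics for a single file holding three rowgroups.
file_stats = [
    RowGroupStats(index=0, min_value=1, max_value=100),
    RowGroupStats(index=1, min_value=101, max_value=200),
    RowGroupStats(index=2, min_value=201, max_value=300),
]

# Filter equivalent to: WHERE col BETWEEN 150 AND 160
selected = row_groups_to_read(file_stats, 150, 160)
print(selected)  # [1]
```

With the behaviour described in the issue, the equivalent of `selected` is computed correctly but the whole file (rowgroups 0, 1 and 2) is still read; the improvement is to read only the selected rowgroups.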