[ https://issues.apache.org/jira/browse/HIVE-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adesh Kumar Rao updated HIVE-15131:
-----------------------------------
    Status: Open  (was: Patch Available)

> Change Parquet reader to read metadata on the task side
> -------------------------------------------------------
>
>                 Key: HIVE-15131
>                 URL: https://issues.apache.org/jira/browse/HIVE-15131
>             Project: Hive
>          Issue Type: Bug
>          Components: Reader
>            Reporter: Chao Sun
>            Assignee: Adesh Kumar Rao
>            Priority: Major
>         Attachments: HIVE-15131.1.patch, HIVE-15131.2.patch,
>                      HIVE-15131.3.patch, HIVE-15131.4.patch
>
>
> Currently the {{ParquetRecordReaderWrapper}} still uses the {{readFooter}}
> API without filtering, which means it has to read the metadata for all row
> groups every time. This could cause issues when the input dataset is
> particularly large and has many columns.
> [Parquet-84|https://issues.apache.org/jira/browse/PARQUET-84] introduced
> another API that allows row group filtering on the task side. Hive
> should adopt this API.
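
To illustrate what row group filtering on the task side means, here is a minimal sketch. It is a simplified model, not the Parquet API or the attached patch: the class {{RowGroupFilterSketch}}, the {{RowGroup}} type, and {{filterByRange}} are hypothetical names made up for this example. It assumes that a task keeps only the row groups whose midpoint byte offset falls inside its own split, so each row group is handled by exactly one task and the others can be skipped. In the Parquet library itself, this would presumably be done by passing a range-based metadata filter to {{readFooter}} rather than by hand-written code like this.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of task-side row group filtering.
// It does not use the Parquet library; every name here is illustrative.
final class RowGroupFilterSketch {

    // Stand-in for the footer metadata of a single row group.
    static final class RowGroup {
        final long startOffset; // byte offset where the row group begins
        final long totalSize;   // size of the row group in bytes

        RowGroup(long startOffset, long totalSize) {
            this.startOffset = startOffset;
            this.totalSize = totalSize;
        }

        // Byte offset of the middle of the row group.
        long midpoint() {
            return startOffset + totalSize / 2;
        }
    }

    // Keeps only the row groups whose midpoint lies in [splitStart, splitEnd).
    // Assumption: splits do not overlap, so each row group matches one split.
    static List<RowGroup> filterByRange(List<RowGroup> all, long splitStart, long splitEnd) {
        List<RowGroup> kept = new ArrayList<>();
        for (RowGroup rg : all) {
            long mid = rg.midpoint();
            if (mid >= splitStart && mid < splitEnd) {
                kept.add(rg);
            }
        }
        return kept;
    }
}
```

For example, with three 100-byte row groups starting at offsets 0, 100, and 200, a task whose split covers bytes 100 to 200 would keep only the second row group and would not need the metadata of the other two.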