Chao Sun created HIVE-15131:
-------------------------------

             Summary: Change Parquet reader to read metadata on the task side
                 Key: HIVE-15131
                 URL: https://issues.apache.org/jira/browse/HIVE-15131
             Project: Hive
          Issue Type: Bug
          Components: Reader
            Reporter: Chao Sun
            Assignee: Chao Sun

Currently the {{ParquetRecordReaderWrapper}} still uses the {{readFooter}} API without filtering, which means it has to read the metadata for all row groups every time. This can cause problems when the input dataset is particularly large and has many columns, since the footer holds metadata for every column of every row group. [PARQUET-84|https://issues.apache.org/jira/browse/PARQUET-84] introduced another API that allows row group filtering to be done on the task side. Hive should adopt this API.
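
To illustrate what task-side row group filtering amounts to, here is a minimal, self-contained sketch. It does not call Parquet at all: the class {{RowGroupFilterSketch}} and its method are hypothetical, and the rule it uses (keep a row group when the midpoint of its byte range falls inside the task's split) is an assumption about how the range-based metadata filter behaves. In parquet-mr the equivalent call is believed to be {{ParquetFileReader.readFooter(conf, path, ParquetMetadataConverter.range(start, end))}}, but that should be checked against the Parquet version Hive builds with.

```java
import java.util.ArrayList;
import java.util.List;

public class RowGroupFilterSketch {

    /**
     * Returns the indexes of the row groups a task should read for the split
     * [splitStart, splitEnd). A row group is kept when the midpoint of its
     * byte range lies inside the split (assumed rule, see the text above).
     *
     * starts[i] is the byte offset of row group i, sizes[i] its size in bytes.
     */
    public static List<Integer> selectRowGroups(long[] starts, long[] sizes,
                                                long splitStart, long splitEnd) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            long midpoint = starts[i] + sizes[i] / 2;
            if (midpoint >= splitStart && midpoint < splitEnd) {
                selected.add(i);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Three hypothetical row groups of 100 bytes each, starting after a 4-byte header.
        long[] starts = {4L, 104L, 204L};
        long[] sizes = {100L, 100L, 100L};

        // Two hypothetical splits of 128 bytes each.
        System.out.println(selectRowGroups(starts, sizes, 0L, 128L));   // [0]
        System.out.println(selectRowGroups(starts, sizes, 128L, 256L)); // [1, 2]
    }
}
```

With a filter of this kind each task only materializes the metadata for the row groups that belong to its own split, instead of every row group in the file.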