Chao Sun created HIVE-15131:
-------------------------------

             Summary: Change Parquet reader to read metadata on the task side
                 Key: HIVE-15131
                 URL: https://issues.apache.org/jira/browse/HIVE-15131
             Project: Hive
          Issue Type: Bug
          Components: Reader
            Reporter: Chao Sun
            Assignee: Chao Sun


Currently the {{ParquetRecordReaderWrapper}} still uses the {{readFooter}} API 
without filtering, which means it needs to read metadata about all row groups 
every time. This could cause issues when the input dataset is particularly big 
and has many columns.

[PARQUET-84|https://issues.apache.org/jira/browse/PARQUET-84] introduced 
another API that allows row group filtering on the task side. Hive 
should adopt this API.
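
To illustrate what task-side row group filtering means, here is a minimal, 
self-contained sketch. It does not use the Parquet or Hive classes: 
{{RowGroup}}, {{filterByRange}} and the offsets are hypothetical stand-ins, 
and the midpoint rule (a task keeps a row group only if the group's midpoint 
falls inside the task's split) is an assumption about how a range-based 
filter could select row groups, not a statement of the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class RowGroupRangeFilterSketch {

    /** Hypothetical stand-in for one row group's entry in a file footer. */
    static final class RowGroup {
        final long startOffset;    // byte offset where the row group begins
        final long compressedSize; // total compressed bytes in the row group

        RowGroup(long startOffset, long compressedSize) {
            this.startOffset = startOffset;
            this.compressedSize = compressedSize;
        }

        long midpoint() {
            return startOffset + compressedSize / 2;
        }
    }

    /**
     * Keeps only the row groups whose midpoint lies in [splitStart, splitEnd).
     * Each row group has exactly one midpoint, so with non-overlapping splits
     * it is assigned to at most one task (assumed rule, see lead-in).
     */
    static List<RowGroup> filterByRange(List<RowGroup> all, long splitStart, long splitEnd) {
        List<RowGroup> kept = new ArrayList<>();
        for (RowGroup rg : all) {
            long mid = rg.midpoint();
            if (mid >= splitStart && mid < splitEnd) {
                kept.add(rg);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up footer with three row groups of 100 bytes each.
        List<RowGroup> footer = new ArrayList<>();
        footer.add(new RowGroup(4, 100));   // midpoint 54
        footer.add(new RowGroup(104, 100)); // midpoint 154
        footer.add(new RowGroup(204, 100)); // midpoint 254

        // A task whose split covers bytes [100, 200) keeps only the second group.
        List<RowGroup> mine = filterByRange(footer, 100, 200);
        System.out.println(mine.size() + " row group(s) kept, starting at offset "
                + mine.get(0).startOffset);
    }
}
```

With this kind of filtering, each task only materializes metadata for the row 
groups it will actually read, rather than for every row group in the file.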



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
