[ 
https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6578:
-----------------------------

    Description: ORC provides file-level statistics that can be used in the 
analyze partialscan and noscan cases to compute basic statistics such as number 
of rows, number of files, total file size, and raw data size. On the writer 
side, a new interface (StatsProvidingRecordWriter) was added earlier to expose 
stats when writing a table. Similarly, a new interface, 
StatsProvidingRecordReader, can be added which, when implemented, should 
provide the stats gathered by the underlying file format.  (was: ORC provides 
file level statistics which can be 
used in analyze partialscan and noscan cases to compute basic statistics like 
number of rows, number of files, total file size and raw data size.)

> Use ORC file footer statistics through StatsProvidingRecordReader interface 
> for analyze command
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6578
>                 URL: https://issues.apache.org/jira/browse/HIVE-6578
>             Project: Hive
>          Issue Type: New Feature
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: orcfile
>         Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch
>
>
> ORC provides file-level statistics that can be used in the analyze 
> partialscan and noscan cases to compute basic statistics such as number of 
> rows, number of files, total file size, and raw data size. On the writer 
> side, a new interface (StatsProvidingRecordWriter) was added earlier to 
> expose stats when writing a table. Similarly, a new interface, 
> StatsProvidingRecordReader, can be added which, when implemented, should 
> provide the stats gathered by the underlying file format.
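
For illustration only, here is a minimal sketch of how a reader-side stats
interface could feed the analyze path without scanning rows. Apart from the
name StatsProvidingRecordReader, everything below is an assumption and not
taken from the attached patches: BasicStats, FooterBackedReader, AnalyzeSketch
and the getStats() signature are hypothetical stand-ins, and the footer values
are hard-coded rather than read from a real ORC file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stats holder; the class used in Hive may differ.
final class BasicStats {
    final long rowCount;
    final long rawDataSize;

    BasicStats(long rowCount, long rawDataSize) {
        this.rowCount = rowCount;
        this.rawDataSize = rawDataSize;
    }
}

// Sketch of the reader-side interface named in the issue.
// The method signature is an assumption.
interface StatsProvidingRecordReader {
    BasicStats getStats();
}

// Stand-in for a format reader that already knows its stats,
// for example from a file footer. Values are supplied by the caller here.
final class FooterBackedReader implements StatsProvidingRecordReader {
    private final BasicStats footerStats;

    FooterBackedReader(long rowCount, long rawDataSize) {
        this.footerStats = new BasicStats(rowCount, rawDataSize);
    }

    @Override
    public BasicStats getStats() {
        return footerStats;
    }
}

final class AnalyzeSketch {
    // Returns {numFiles, numRows, rawDataSize}, one reader per file.
    // Readers that do not implement the interface still count as files
    // but contribute no row or raw-size stats.
    static long[] aggregate(List<?> readers) {
        long numFiles = 0;
        long numRows = 0;
        long rawDataSize = 0;
        for (Object reader : readers) {
            numFiles++;
            if (reader instanceof StatsProvidingRecordReader) {
                BasicStats s = ((StatsProvidingRecordReader) reader).getStats();
                numRows += s.rowCount;
                rawDataSize += s.rawDataSize;
            }
        }
        return new long[] {numFiles, numRows, rawDataSize};
    }

    public static void main(String[] args) {
        List<?> readers = Arrays.asList(
                new FooterBackedReader(10, 100),
                new FooterBackedReader(5, 50));
        long[] totals = aggregate(readers);
        System.out.println("files=" + totals[0]
                + " rows=" + totals[1]
                + " rawDataSize=" + totals[2]);
    }
}
```

Total file size is left out of the sketch because it would normally come from
the file system rather than from the file format's own statistics.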



--
This message was sent by Atlassian JIRA
(v6.2#6252)