You have to explicitly specify the column list in the analyze command to 
gather column stats.

This command collects only basic stats: number of rows, total file size, 
raw data size, and number of files.
analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics;

To collect column statistics, add the column list as shown here:
analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics for columns a, b, c;
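
To check that the column stats were picked up, a sequence along these lines might help. It is only a sketch: the table, partition, and column names come from the original message, and the where clause restricting the query to the analyzed partition is my own addition, on the assumption that partitions without column stats could otherwise keep the plan from reporting them. After this, the Column stats field in the explain output should no longer read NONE.

```sql
-- Sketch only: names are taken from the thread; nothing here was run against a real cluster.

-- Let the planner fetch column stats from the metastore (already set in the original message).
set hive.stats.fetch.column.stats=true;

-- Gather column-level stats for the one partition, listing the columns explicitly.
analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics for columns a, b, c;

-- Re-check the plan. The partition filter is an assumption added for illustration,
-- so that only the partition analyzed above is scanned.
explain select min(a), max(b), min(c) from user_table where dt='2014-06-01' and hour='00';
```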

Thanks
Prasanth Jayachandran

On Jul 24, 2014, at 5:13 AM, Sandeep Samudrala <sandeep.samudr...@inmobi.com> 
wrote:

> I am trying to enable column statistics usage with Parquet tables. This is 
> the query I am executing. However, in the explain output, even though Basic 
> stats shows COMPLETE, Column stats shows NONE.
> Can someone please explain what else I need to do to debug/fix this?
> 
> set hive.compute.query.using.stats=true;
> set hive.stats.reliable=true;
> set hive.stats.fetch.column.stats=true;
> set hive.stats.fetch.partition.stats=true;
> set hive.cbo.enable=true;
> 
> analyze table user_table partition(dt='2014-06-01',hour='00') compute 
> statistics;
> 
> explain select min(a), max(b), min(c) from user_table;
> 
> hive> explain select min(a), max(b), min(c) from user_table;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 is a root stage
> 
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: user_table
>             Statistics: Num rows: 55490383 Data size: 1831182639 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: a (type: double), b (type: double), c (type: int)
>               outputColumnNames: a, b, c
>               Statistics: Num rows: 55490383 Data size: 1831182639 Basic stats: COMPLETE Column stats: NONE
>               Group By Operator
>                 aggregations: min(a), max(b), min(c)
>                 mode: hash
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   sort order:
>                   Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: double), _col1 (type: double), _col2 (type: int)
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations: min(VALUE._col0), max(VALUE._col1), min(VALUE._col2)
>           mode: mergepartial
>           outputColumnNames: _col0, _col1, _col2
>           Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: _col0 (type: double), _col1 (type: double), _col2 (type: int)
>             outputColumnNames: _col0, _col1, _col2
>             Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>               table:
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> 
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
> 
> 
> Thanks,
> -sandeep
> 