You have to explicitly specify the column list in the analyze command to gather column stats.
The following command collects only basic stats: number of rows, total file size, raw data size, and number of files.

analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics;

To collect column statistics, add the column list:

analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics for columns a, b, c;

Thanks
Prasanth Jayachandran

On Jul 24, 2014, at 5:13 AM, Sandeep Samudrala <sandeep.samudr...@inmobi.com> wrote:

> I am trying to enable Column statistics usage with Parquet tables. This is
> the query I am executing. However on explain, I see that even though Basic
> stats: COMPLETE is seen Column stats is seen as NONE.
> Can someone please explain what else I need to debug/fix this.
>
> set hive.compute.query.using.stats=true;
> set hive.stats.reliable=true;
> set hive.stats.fetch.column.stats=true;
> set hive.stats.fetch.partition.stats=true;
> set hive.cbo.enable=true;
>
> analyze table user_table partition(dt='2014-06-01',hour='00') compute statistics;
>
> explain select min(a), max(b), min(c) from user_table;
>
> hive> explain select min(a), max(b), min(c) from usertable;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 is a root stage
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: user_table
>             Statistics: Num rows: 55490383 Data size: 1831182639 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: a (type: double), b (type: double), c (type: int)
>               outputColumnNames: a, b, c
>               Statistics: Num rows: 55490383 Data size: 1831182639 Basic stats: COMPLETE Column stats: NONE
>               Group By Operator
>                 aggregations: min(a), max(b), min(c)
>                 mode: hash
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   sort order:
>                   Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: double), _col1 (type: double), _col2 (type: int)
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations: min(VALUE._col0), max(VALUE._col1), min(VALUE._col2)
>           mode: mergepartial
>           outputColumnNames: _col0, _col1, _col2
>           Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: _col0 (type: double), _col1 (type: double), _col2 (type: int)
>             outputColumnNames: _col0, _col1, _col2
>             Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
>               table:
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>
> Thanks,
> -sandeep
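If the same analyze statement has to be issued for more than one partition, the two forms discussed in this thread could be generated rather than typed by hand. What follows is only a rough sketch in Python: `analyze_statement` is a hypothetical helper name, not part of Hive, and it assumes partition values are simple strings that need no escaping.

```python
def analyze_statement(table, partition, columns=None):
    """Build a Hive ANALYZE statement for a single partition.

    Without columns, the statement gathers only basic stats; with columns,
    the FOR COLUMNS clause is appended so column stats are gathered too.
    Hypothetical helper: values are inserted as-is, with no escaping.
    """
    # Partition spec in the same form used in the thread: dt='...',hour='...'
    spec = ",".join(f"{key}='{value}'" for key, value in partition.items())
    stmt = f"analyze table {table} partition({spec}) compute statistics"
    if columns:
        stmt += " for columns " + ", ".join(columns)
    return stmt + ";"


if __name__ == "__main__":
    part = {"dt": "2014-06-01", "hour": "00"}
    # Basic stats only
    print(analyze_statement("user_table", part))
    # Basic stats plus column stats for a, b, c
    print(analyze_statement("user_table", part, ["a", "b", "c"]))
```

The printed statements can then be pasted into the Hive CLI or written to a script file; re-running the explain afterwards is the way to check whether Column stats still shows NONE.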