> On Jan. 24, 2017, 5:58 a.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3650
> > <https://reviews.apache.org/r/55731/diff/3/?file=1613301#file1613301line3650>
> >
> >     Thanks Chaoyu for the patch and comments. I still think we do not need 
> > the function hasStatsInParameters. We only care about row_count and 
> > raw_data_size, especially row_count. We use row_count to answer some query 
> > directly from metastore. The other parameters, totalSize etc do not matter. 
> > The other part of the patch looks good to me. Thanks.

Thanks, PengCheng. In current code, command "alter table .. set tblproperty" 
can also set the stats including row_count, raw_data_size, without setting the 
flag STATS_GENERATED to USER  like it does for "alter table .. update 
statistics" in DDLSemanticAnalyzer. So in this case, we need a way in DDLTask 
alterTableOrSinglePartition to determine whether this property change is 
related to the stats. I used hasStatsInParameters, though it also include the 
change of numFiles & totalSize into consideration.
I quite do not understand, if we only care about row_count/raw_data_size, why 
do we recalcualte the fastStats (numFiles/totalSize) in HMS in a lot of cases?
Also I want to confirm that currently COLUMN_STATS_ACCURATE and BASIC_STATS 
only reflect the stats accuracy of row_count and raw_data_size? If so, is there 
any use of numFiles/totalSize?


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55731/#review162771
-----------------------------------------------------------


On Jan. 24, 2017, 5:01 a.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55731/
> -----------------------------------------------------------
> 
> (Updated Jan. 24, 2017, 5:01 a.m.)
> 
> 
> Review request for hive and pengcheng xiong.
> 
> 
> Bugs: HIVE-15653
>     https://issues.apache.org/jira/browse/HIVE-15653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For most of alter table operations like table rename, add columns, change 
> column type etc (besides the set table properties), the table stats status 
> should not change. But for some other operations like update statistics, 
> change location, the basic stats status should change.
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 4aea152 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a1fb874 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 0f472e7 
>   ql/src/test/queries/clientpositive/alter_table_stats_status.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_table_stats_status.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55731/diff/
> 
> 
> Testing
> -------
> 
> 1. Manual tests
> 2. new unit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>

Reply via email to