> On Jan. 24, 2017, 5:58 a.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3650
> > <https://reviews.apache.org/r/55731/diff/3/?file=1613301#file1613301line3650>
> >
> >     Thanks Chaoyu for the patch and comments. I still think we do not need
> >     the function hasStatsInParameters. We only care about row_count and
> >     raw_data_size, especially row_count. We use row_count to answer some query
> >     directly from metastore. The other parameters, totalSize etc do not matter.
> >     The other part of the patch looks good to me. Thanks.

Thanks, PengCheng. In the current code, the command "alter table .. set tblproperties" can also set the stats, including row_count and raw_data_size, without setting the flag STATS_GENERATED to USER the way "alter table .. update statistics" does in DDLSemanticAnalyzer. So in this case we need a way in DDLTask alterTableOrSinglePartition to determine whether the property change is related to the stats. I used hasStatsInParameters for that, though it also takes changes to numFiles and totalSize into consideration.

I do not quite understand: if we only care about row_count/raw_data_size, why do we recalculate the fastStats (numFiles/totalSize) in HMS in so many cases? Also, I want to confirm that COLUMN_STATS_ACCURATE and BASIC_STATS currently reflect only the accuracy of row_count and raw_data_size. If so, is there any use for numFiles/totalSize?


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55731/#review162771
-----------------------------------------------------------


On Jan. 24, 2017, 5:01 a.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55731/
> -----------------------------------------------------------
>
> (Updated Jan. 24, 2017, 5:01 a.m.)
>
>
> Review request for hive and pengcheng xiong.
>
>
> Bugs: HIVE-15653
>     https://issues.apache.org/jira/browse/HIVE-15653
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> For most of alter table operations like table rename, add columns, change
> column type etc (besides the set table properties), the table stats status
> should not change. But for some other operations like update statistics,
> change location, the basic stats status should change.
>
>
> Diffs
> -----
>
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 4aea152
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a1fb874
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 0f472e7
>   ql/src/test/queries/clientpositive/alter_table_stats_status.q PRE-CREATION
>   ql/src/test/results/clientpositive/alter_table_stats_status.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/55731/diff/
>
>
> Testing
> -------
>
> 1. Manual tests
> 2. new unit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
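
To make the two positions in this thread concrete, here is a minimal standalone sketch, not the code from the patch. It contrasts a broad check (any basic stat key in the changed properties, roughly what the reply describes for hasStatsInParameters) with a narrow check (only row_count and raw_data_size, as the reviewer suggests). The class name, method names, and the literal key strings ("numRows", "rawDataSize", "numFiles", "totalSize") are assumptions made for illustration; the real key names should be taken from the Hive constants rather than from this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the patch.
class StatsParamCheck {
    // Assumed property keys for row_count and raw_data_size.
    static final Set<String> ROW_LEVEL_STATS =
        new HashSet<>(Arrays.asList("numRows", "rawDataSize"));
    // Assumed property keys for the fast stats (numFiles/totalSize).
    static final Set<String> FAST_STATS =
        new HashSet<>(Arrays.asList("numFiles", "totalSize"));

    // True if the property map contains at least one of the given keys.
    static boolean touchesAny(Map<String, String> props, Set<String> keys) {
        if (props == null) {
            return false;
        }
        for (String key : keys) {
            if (props.containsKey(key)) {
                return true;
            }
        }
        return false;
    }

    // Broad variant: any basic stat key counts as a stats change.
    static boolean hasStatsBroad(Map<String, String> props) {
        return touchesAny(props, ROW_LEVEL_STATS) || touchesAny(props, FAST_STATS);
    }

    // Narrow variant: only row_count / raw_data_size count as a stats change.
    static boolean hasStatsNarrow(Map<String, String> props) {
        return touchesAny(props, ROW_LEVEL_STATS);
    }

    public static void main(String[] args) {
        // Properties as they might arrive from "alter table .. set tblproperties".
        Map<String, String> props = new HashMap<>();
        props.put("totalSize", "1024");
        props.put("comment", "not a stat");

        System.out.println("broad:  " + hasStatsBroad(props));
        System.out.println("narrow: " + hasStatsNarrow(props));
    }
}
```

The two variants disagree only when a property change touches numFiles or totalSize without touching row_count or raw_data_size, which is exactly the case the open question in the reply is about.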