> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote: > >
XiongPeng, I updated the 2nd patches and also replied you comments, please take a look to see if they make sense. Thanks > On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3645 > > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3645> > > > > If it is for user update stats, you will have > > StatsSetupConst.STATS_GENERATED = StatsSetupConst.USER automatically. Thus > > it is not necessary to have/call hasStatsInParameters function. > > Chaoyu Tang wrote: > Thanks PengCheng for reviewing the patch. I looked through the code > related to this STATS status and feel a little confusing, here are some > questions I need help from you. To my understanding: > 1. For all operations which set STATS_GENERATED to TASK, we should set > BASIC_STATS as TRUE, right? > 2. But for the stats row_count, raw_data_size updated by the command > alter table .. update statistics, the STATS_GENERATED is USER, should we set > BASIC_STATS as TRUE or FALSE? > 3. These stats could sometimes be set as parameters using command alter > table .. settblproperties, in these cases, the BASIC_STATS should be set > FALSE? > > pengcheng xiong wrote: > 1. yes. what i mean here is that, you can just check if > STATS_GENERATED==StatsSetupConst.USER to determine whether it > hasStatsInParameters. > 2. set it to FALSE. But note that when we call > "StatsSetupConst.setBasicStatsState(params, StatsSetupConst.FALSE);" we just > remove the entry for "COLUMN_STATS_ACCURATE". Thus, set it to false is equal > to remove the entry. > 3. yes. it should be set to FALSE. And as a result, there is no entry for > "COLUMN_STATS_ACCURATE" In item 3, these stats might include row_count, raw_data_size, numFiles, totalSize etc. > On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3657 > > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3657> > > > > And, could u just remove the property rather than set it to false? We > > treat the missing of the property the same as set it to false. > > Chaoyu Tang wrote: > alterTableOrSinglePartition could be called in a loop in the method > alterTable(Hive db, AlterTableDesc alterTbl) (see DDLTask line 3408). Setting > it to false could avoid it to be reset to true in the beginning of the > method, I wonder if it makes sense. > > pengcheng xiong wrote: > Yes, i understand and I agree with the logic that you want to do. My > point is that, rather than put <DO_NOT_UPDATE_STATS, FALSE> entry, we can > just remove the key <DO_NOT_UPDATE_STATS>. We will check whether the property > contains that key later. See Line 219 in MetaStoreUtils.java. removing the key <DO_NOT_UPDATE_STATS> instead of setting it to FALSE. > On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote: > > ql/src/test/queries/clientpositive/alter_table_stats_status.q, line 1 > > <https://reviews.apache.org/r/55731/diff/1/?file=1609804#file1609804line1> > > > > Could u add some more test cases for partition stats? Thanks. > > Chaoyu Tang wrote: > Will provide soon in next patch after testing. Add the tests for partition stats. > On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3655 > > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3655> > > > > The same as above See comments in code: // STATS_GENERATED==StatsSetupConst.USER is set only when the stats row_count, raw_data_size // are updated using alter table .. update statistics .. But the stats parameters including // totalSize etc could also be set using command like alter table .. set tblproperties, in this // case, STATS_GENERATED will not be set, so we check hasStatsInParameters(alterTbl.getProps()) // here which also covered the case were STATS_GENERATED==StatsSetupConst.USER To see if it makes sense to use hasStatsInParameters instead of only checking STATS_GENERATED==StatsSetupConst.USER - Chaoyu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55731/#review162357 ----------------------------------------------------------- On Jan. 19, 2017, 10:29 p.m., Chaoyu Tang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/55731/ > ----------------------------------------------------------- > > (Updated Jan. 19, 2017, 10:29 p.m.) > > > Review request for hive and pengcheng xiong. > > > Bugs: HIVE-15653 > https://issues.apache.org/jira/browse/HIVE-15653 > > > Repository: hive-git > > > Description > ------- > > For most of alter table operations like table rename, add columns, change > column type etc (besides the set table properties), the table stats status > should not change. But for some other operations like update statistics, > change location, the basic stats status should change. > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a1fb874 > ql/src/test/queries/clientpositive/alter_table_stats_status.q PRE-CREATION > ql/src/test/results/clientpositive/alter_table_stats_status.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/55731/diff/ > > > Testing > ------- > > 1. Manual tests > 2. new unit tests > > > Thanks, > > Chaoyu Tang > >