> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote:
> >

XiongPeng, I updated the 2nd patches and also replied you comments, please take 
a look to see if they make sense. Thanks


> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3645
> > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3645>
> >
> >     If it is for user update stats, you will have     
> > StatsSetupConst.STATS_GENERATED = StatsSetupConst.USER automatically. Thus 
> > it is not necessary to have/call hasStatsInParameters function.
> 
> Chaoyu Tang wrote:
>     Thanks PengCheng for reviewing the patch. I looked through the code 
> related to this STATS status and feel a little confusing, here are some 
> questions I need help from you. To my understanding:
>     1. For all operations which set STATS_GENERATED to TASK, we should set 
> BASIC_STATS as TRUE, right?
>     2. But for the stats row_count, raw_data_size updated by the command 
> alter table .. update statistics, the STATS_GENERATED is USER, should we set 
> BASIC_STATS as TRUE or FALSE?
>     3. These stats could sometimes be set as parameters using command alter 
> table .. settblproperties, in these cases, the BASIC_STATS should be set 
> FALSE?
> 
> pengcheng xiong wrote:
>     1. yes. what i mean here is that, you can just check if 
> STATS_GENERATED==StatsSetupConst.USER to determine whether it 
> hasStatsInParameters. 
>     2. set it to FALSE. But note that when we call 
> "StatsSetupConst.setBasicStatsState(params, StatsSetupConst.FALSE);" we just 
> remove the entry for "COLUMN_STATS_ACCURATE". Thus, set it to false is equal 
> to remove the entry.
>     3. yes. it should be set to FALSE. And as a result, there is no entry for 
> "COLUMN_STATS_ACCURATE"

In item 3, these stats might include row_count, raw_data_size, numFiles, 
totalSize etc.


> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3657
> > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3657>
> >
> >     And, could u just remove the property rather than set it to false? We 
> > treat the missing of the property the same as set it to false.
> 
> Chaoyu Tang wrote:
>     alterTableOrSinglePartition could be called in a loop in the method 
> alterTable(Hive db, AlterTableDesc alterTbl) (see DDLTask line 3408). Setting 
> it to false could avoid it to be reset to true in the beginning of the 
> method, I wonder if it makes sense.
> 
> pengcheng xiong wrote:
>     Yes, i understand and I agree with the logic that you want to do. My 
> point is that, rather than put <DO_NOT_UPDATE_STATS, FALSE> entry, we can 
> just remove the key <DO_NOT_UPDATE_STATS>. We will check whether the property 
> contains that key later. See Line 219 in MetaStoreUtils.java.

removing the key <DO_NOT_UPDATE_STATS> instead of setting it to FALSE.


> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote:
> > ql/src/test/queries/clientpositive/alter_table_stats_status.q, line 1
> > <https://reviews.apache.org/r/55731/diff/1/?file=1609804#file1609804line1>
> >
> >     Could u add some more test cases for partition stats? Thanks.
> 
> Chaoyu Tang wrote:
>     Will provide soon in next patch after testing.

Add the tests for partition stats.


> On Jan. 19, 2017, 11:23 p.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3655
> > <https://reviews.apache.org/r/55731/diff/1/?file=1609803#file1609803line3655>
> >
> >     The same as above

See comments in code:
      // STATS_GENERATED==StatsSetupConst.USER is set only when the stats 
row_count, raw_data_size
      // are updated using alter table .. update statistics .. But the stats 
parameters including
      // totalSize etc could also be set using command like alter table .. set 
tblproperties, in this
      // case, STATS_GENERATED will not be set, so we check 
hasStatsInParameters(alterTbl.getProps())
      // here which also covered the case were 
STATS_GENERATED==StatsSetupConst.USER
To see if it makes sense to use hasStatsInParameters instead of only checking 
STATS_GENERATED==StatsSetupConst.USER


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55731/#review162357
-----------------------------------------------------------


On Jan. 19, 2017, 10:29 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55731/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2017, 10:29 p.m.)
> 
> 
> Review request for hive and pengcheng xiong.
> 
> 
> Bugs: HIVE-15653
>     https://issues.apache.org/jira/browse/HIVE-15653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For most of alter table operations like table rename, add columns, change 
> column type etc (besides the set table properties), the table stats status 
> should not change. But for some other operations like update statistics, 
> change location, the basic stats status should change.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a1fb874 
>   ql/src/test/queries/clientpositive/alter_table_stats_status.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_table_stats_status.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55731/diff/
> 
> 
> Testing
> -------
> 
> 1. Manual tests
> 2. new unit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>

Reply via email to