[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990334#comment-15990334
 ] 

Pengcheng Xiong commented on HIVE-16147:


+1

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
> Issue Type: Bug
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to
> sample_pt_rename), describing its partition shows that the partition column
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS
> for all columns are true
> {code}
> ...
> # Detailed Partition Information
> Partition Value:        [3]
> Database:               default
> Table:                  sample_pt
> CreateTime:             Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:         UNKNOWN
> Location:               file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by        ctang
>   last_modified_time      1485217063
>   numFiles                1
>   numRows                 100
>   rawDataSize             5143
>   totalSize               5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3. describe formatted default.sample_pt partition (dummy = 3) salary: column
> stats exist
> {code}
> # col_name  data_type  min  max     num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
>
> salary      int        1    151370  0          94                                                               from deserializer
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3):
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS
> for all columns are still true.
> {code}
> # Detailed Partition Information
> Partition Value:        [3]
> Database:               default
> Table:                  sample_pt_rename
> CreateTime:             Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:         UNKNOWN
> Location:               file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by        ctang
>   last_modified_time      1485217063
>   numFiles                1
>   numRows                 100
>   rawDataSize             5143
>   totalSize               5243
>   transient_lastDdlTime   1488842358
> {code}
> 6. describe formatted default.sample_pt_rename partition (dummy = 3) salary: the
> column stats have been dropped.
> {code}
> # col_name  data_type  comment
>
> salary      int        from deserializer
>
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}
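
The inconsistency described above (COLUMN_STATS_ACCURATE still reports "true" for columns whose stats no longer exist) can be checked mechanically. The following is only a minimal sketch, not part of the patch: it assumes the parameter value is plain JSON once the backslash escaping shown in the output is removed, and the helper names (columns_flagged_accurate, stale_flags) and the list of columns that actually have stats are hypothetical inputs that a user would collect by describing each column.

```python
import json


def columns_flagged_accurate(column_stats_accurate: str) -> set:
    """Return the column names whose COLUMN_STATS flag is "true".

    The argument is the COLUMN_STATS_ACCURATE partition parameter as a
    JSON string (assumption: backslash escaping already removed).
    """
    state = json.loads(column_stats_accurate)
    flags = state.get("COLUMN_STATS", {})
    return {name for name, flag in flags.items() if flag == "true"}


def stale_flags(column_stats_accurate: str, columns_with_stats) -> set:
    """Columns flagged accurate for which no column stats were found.

    columns_with_stats is a hypothetical input: the columns for which
    'describe formatted ... <column>' still returns stats.
    """
    return columns_flagged_accurate(column_stats_accurate) - set(columns_with_stats)


# Parameter value taken from the describe output in the issue description.
PARAM = (
    '{"BASIC_STATS":"true","COLUMN_STATS":{"code":"true",'
    '"description":"true","salary":"true","total_emp":"true"}}'
)

# Before the rename: assume stats exist for every flagged column.
before = stale_flags(PARAM, ["code", "description", "salary", "total_emp"])
# After the rename: the issue reports that all column stats were dropped.
after = stale_flags(PARAM, [])

print(sorted(before))  # []
print(sorted(after))   # ['code', 'description', 'salary', 'total_emp']
```

A non-empty result after the rename is the symptom reported here: the flag and the stored stats disagree.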



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989269#comment-15989269
 ] 

Chaoyu Tang commented on HIVE-16147:


[~pxiong] Thanks for looking into this. Yes, I made some changes to fix the 
test failures and also optimized the code a little. I have uploaded the 2nd 
patch to RB and requested a review.



[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989211#comment-15989211
 ] 

Pengcheng Xiong commented on HIVE-16147:


[~ctang.ma], may I ask what you changed from the 1st patch? Thanks.



[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989168#comment-15989168
 ] 

Chaoyu Tang commented on HIVE-16147:


The only test failure is not related to this patch. [~pxiong], could you 
review the patch? Thanks



[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988389#comment-15988389
 ] 

Hive QA commented on HIVE-16147:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865390/HIVE-16147.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10635 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4912/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4912/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4912/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865390 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987783#comment-15987783
 ] 

Chaoyu Tang commented on HIVE-16147:


The test failures are not related to the patch. [~pxiong], could you 
review it again? Thanks



[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987696#comment-15987696
 ] 

Hive QA commented on HIVE-16147:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865390/HIVE-16147.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10640 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[wrong_distinct2] (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4894/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4894/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4894/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865390 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985430#comment-15985430
 ] 

Hive QA commented on HIVE-16147:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865159/HIVE-16147.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10636 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_stats_status] (batchId=51)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=236)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testAlterViewParititon (batchId=200)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testAlterViewParititon (batchId=203)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testAlterViewParititon (batchId=199)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testAlterViewParititon (batchId=197)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testAlterViewParititon (batchId=208)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4886/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4886/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4886/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865159 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982331#comment-15982331
 ] 

Pengcheng Xiong commented on HIVE-16147:


LGTM. +1 pending tests.



[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982021#comment-15982021
 ] 

Chaoyu Tang commented on HIVE-16147:


The patch has been uploaded to RB. [~pxiong], could you help review it? Thanks.
