[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961662#comment-14961662
 ] 

Sushanth Sowmyan commented on HIVE-12083:
-

Thanks, Thejas!

Committed to branch-1, branch-1.2 and master, where HIVE-10965 exists.

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on 
> the client-side for colNames.isEmpty() to not call the metastore call at all, 
> but there is no guard for partNames being empty, and would still cause an 
> error on the metastore side if the thrift call were called directly, as would 
> happen if the client is from an older version before this was patched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961432#comment-14961432
 ] 

Thejas M Nair commented on HIVE-12083:
--

+1

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on 
> the client-side for colNames.isEmpty() to not call the metastore call at all, 
> but there is no guard for partNames being empty, and would still cause an 
> error on the metastore side if the thrift call were called directly, as would 
> happen if the client is from an older version before this was patched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957767#comment-14957767
 ] 

Sushanth Sowmyan commented on HIVE-12083:
-

[~thejas]/[~ashutoshc], can I bug either of you for a review for the updated 
patch?

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on 
> the client-side for colNames.isEmpty() to not call the metastore call at all, 
> but there is no guard for partNames being empty, and would still cause an 
> error on the metastore side if the thrift call were called directly, as would 
> happen if the client is from an older version before this was patched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955751#comment-14955751
 ] 

Hive QA commented on HIVE-12083:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766245/HIVE-12083.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9682 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5631/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766245 - PreCommit-HIVE-TRUNK-Build

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
>

[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-12 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953504#comment-14953504
 ] 

Sushanth Sowmyan commented on HIVE-12083:
-

Spoke to ashutosh about this - going to make one more change - in addition to 
the short-circuit on the client side, the desired behaviour on the client side 
would also be to return an empty AggrStats rather than returning null.

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on 
> the client-side for colNames.isEmpty() to not call the metastore call at all, 
> but there is no guard for partNames being empty, and would still cause an 
> error on the metastore side if the thrift call were called directly, as would 
> happen if the client is from an older version before this was patched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-12 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953421#comment-14953421
 ] 

Sushanth Sowmyan commented on HIVE-12083:
-

> Should we short circuit for empty partitions case as well in the client side ?

I think that makes sense and we should. I didn't initially because I hadn't 
evaluated the calling codepath to see if there was a difference between a null 
return and an empty return for AggrStats from the HMSC for the empty partNames 
case. Now that I've looked through that in some detail, I am for it. I will 
update the patch.

> Does the case where table has not partition columns also use the 
> getAggrColStatsFor method ? If that is the case we should not be 
> shortcircuting this way.

I thought of that, but irrespective of whether the client short-circuits, the 
metastore server will short circuit anyway, it's only a matter of a difference 
between returning null and an empty object.

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not oc

[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-11 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952481#comment-14952481
 ] 

Thejas M Nair commented on HIVE-12083:
--

The patch looks good to me. Thanks for adding the tests! +1

I have two follow up questions, which could be addressed in separate jiras -
1. Should we short circuit for empty partitions case as well in the client side 
?
2. Does the case where table has not partition columns also use the 
getAggrColStatsFor method ? If that is the case we should not be shortcircuting 
this way. [~ashutoshc]

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on 
> the client-side for colNames.isEmpty() to not call the metastore call at all, 
> but there is no guard for partNames being empty, and would still cause an 
> error on the metastore side if the thrift call were called directly, as would 
> happen if the client is from an older version before this was patched.



--
This message was sent by 

[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952260#comment-14952260
 ] 

Hive QA commented on HIVE-12083:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765951/HIVE-12083.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9667 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5608/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5608/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5608/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765951 - PreCommit-HIVE-TRUNK-Build

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>List partNames, List colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>  List colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(H