[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961662#comment-14961662 ]

Sushanth Sowmyan commented on HIVE-12083:
-----------------------------------------

Thanks, Thejas! Committed to branch-1, branch-1.2 and master, where HIVE-10965 exists.

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> ---------------------------------------------------------------------
>
>                 Key: HIVE-12083
>                 URL: https://issues.apache.org/jira/browse/HIVE-12083
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 1.2.1, 1.0.2
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>    public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>        List<String> partNames, List<String> colNames, boolean useDensityFunctionForNDVEstimation)
>        throws MetaException {
> +    if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); // Nothing to aggregate.
>      long partsFound = partsFoundForPartitions(dbName, tableName, partNames, colNames);
>      List<ColumnStatisticsObj> colStatsList;
>      // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list<ColumnStatisticsObj> colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is unset! Struct:AggrStats(colStats:null, partsFound:0)
>   at org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Normally, this would not occur since HIVE-10965 does also include a guard on
> the client-side for colNames.isEmpty() to not call the metastore call at all,
> but there is no guard for partNames being empty, and would still cause an
> error on the metastore side if the thrift call were called directly, as would
> happen if the client is from an older version before this was patched.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
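The failure described in the issue can be reproduced in isolation. What follows is a minimal Java sketch, not Hive's Thrift-generated code: `AggrStatsSketch` is a hypothetical stand-in for the generated `AggrStats` class, its element type is simplified to `String`, and its `validate()` only imitates the "required field is unset" check visible in the stack trace. It assumes that a no-argument constructor leaves `colStats` null, as the logged `AggrStats(colStats:null, partsFound:0)` suggests.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Thrift-generated AggrStats; not the real Hive class. */
class AggrStatsSketch {
    private List<String> colStats;   // required field; element type simplified for the sketch
    private long partsFound;         // required field
    private boolean partsFoundSet;   // tracks whether the primitive field was ever assigned

    /** Like a bare no-argument construction: both required fields stay unset. */
    AggrStatsSketch() {}

    /** Populates both required fields, even if the list is empty. */
    AggrStatsSketch(List<String> colStats, long partsFound) {
        this.colStats = colStats;
        this.partsFound = partsFound;
        this.partsFoundSet = true;
    }

    /** Imitates the required-field check that runs before a struct is serialized. */
    void validate() {
        if (colStats == null) {
            throw new IllegalStateException("Required field 'colStats' is unset!");
        }
        if (!partsFoundSet) {
            throw new IllegalStateException("Required field 'partsFound' is unset!");
        }
    }

    /** Convenience wrapper: true if validate() does not throw. */
    boolean isValid() {
        try {
            validate();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Mirrors the short-circuit in the diff: nothing is set, so validation fails.
        System.out.println("bare struct valid: " + new AggrStatsSketch().isValid());
        // An empty list plus a zero count satisfies both required fields.
        System.out.println("empty-but-populated struct valid: "
                + new AggrStatsSketch(new ArrayList<String>(), 0L).isValid());
    }
}
```

The point of the sketch is that "empty" and "unset" are different states: an empty list with a zero count passes the required-field check, while a struct whose fields were never assigned does not. Whether the committed patch constructs the object exactly this way is not shown in this thread.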
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961432#comment-14961432 ]

Thejas M Nair commented on HIVE-12083:
--------------------------------------

+1
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957767#comment-14957767 ]

Sushanth Sowmyan commented on HIVE-12083:
-----------------------------------------

[~thejas]/[~ashutoshc], can I bug either of you for a review for the updated patch?
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955751#comment-14955751 ]

Hive QA commented on HIVE-12083:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766245/HIVE-12083.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9682 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5631/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12766245 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953504#comment-14953504 ]

Sushanth Sowmyan commented on HIVE-12083:
-----------------------------------------

Spoke to ashutosh about this - going to make one more change - in addition to the short-circuit on the client side, the desired behaviour on the client side would also be to return an empty AggrStats rather than returning null.
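The client-side behaviour discussed in this comment (skip the metastore call when there is nothing to aggregate, and hand back an empty result instead of null) can be sketched as follows. This is an illustration only, not the Hive client code: `ClientGuardSketch`, its nested `Stats` type, and the `Supplier` standing in for the remote Thrift call are all hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical client-side guard; names and types are illustrative, not Hive's API. */
class ClientGuardSketch {

    /** Simplified result type standing in for AggrStats. */
    static final class Stats {
        final List<String> colStats;
        final long partsFound;

        Stats(List<String> colStats, long partsFound) {
            this.colStats = colStats;
            this.partsFound = partsFound;
        }
    }

    /**
     * Returns an empty, fully populated result when either name list is empty,
     * and invokes the (hypothetical) remote call only otherwise.
     */
    static Stats getAggrColStats(List<String> partNames, List<String> colNames,
                                 Supplier<Stats> remoteCall) {
        if (partNames.isEmpty() || colNames.isEmpty()) {
            // Nothing to aggregate: answer locally with an empty result, never null.
            return new Stats(new ArrayList<String>(), 0L);
        }
        return remoteCall.get();
    }

    public static void main(String[] args) {
        final int[] remoteCalls = {0};
        // Stand-in for the remote call; counts how often it is actually reached.
        Supplier<Stats> remote = () -> {
            remoteCalls[0]++;
            return new Stats(Arrays.asList("stats-for-c1"), 1L);
        };

        Stats empty = getAggrColStats(new ArrayList<String>(), Arrays.asList("c1"), remote);
        System.out.println("empty partNames -> partsFound=" + empty.partsFound
                + ", remote calls=" + remoteCalls[0]);

        Stats real = getAggrColStats(Arrays.asList("ds=1"), Arrays.asList("c1"), remote);
        System.out.println("non-empty input -> partsFound=" + real.partsFound
                + ", remote calls=" + remoteCalls[0]);
    }
}
```

Returning an empty object rather than null means callers can read `colStats` and `partsFound` without a null check, and the result looks the same whether the short-circuit happened on the client or on the server.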
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953421#comment-14953421 ]

Sushanth Sowmyan commented on HIVE-12083:
-----------------------------------------

> Should we short circuit for empty partitions case as well in the client side ?

I think that makes sense and we should. I didn't initially because I hadn't evaluated the calling codepath to see if there was a difference between a null return and an empty return for AggrStats from the HMSC for the empty partNames case. Now that I've looked through that in some detail, I am for it. I will update the patch.

> Does the case where table has no partition columns also use the getAggrColStatsFor method ? If that is the case we should not be short-circuiting this way.

I thought of that, but irrespective of whether the client short-circuits, the metastore server will short-circuit anyway; it is only a matter of the difference between returning null and returning an empty object.
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952481#comment-14952481 ]

Thejas M Nair commented on HIVE-12083:
--------------------------------------

The patch looks good to me. Thanks for adding the tests!
+1

I have two follow-up questions, which could be addressed in separate jiras -
1. Should we short-circuit for the empty partitions case on the client side as well?
2. Does the case where the table has no partition columns also use the getAggrColStatsFor method? If that is the case we should not be short-circuiting this way.

[~ashutoshc]
[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
[ https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952260#comment-14952260 ]

Hive QA commented on HIVE-12083:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765951/HIVE-12083.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9667 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5608/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5608/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5608/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12765951 - PreCommit-HIVE-TRUNK-Build

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> ---------------------------------------------------------------------
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 1.2.1, 1.0.2
> Reporter: Sushanth Sowmyan
> Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>    public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>        List partNames, List colNames, boolean useDensityFunctionForNDVEstimation)
>        throws MetaException {
> +    if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); // Nothing to aggregate.
>      long partsFound = partsFoundForPartitions(dbName, tableName, partNames, colNames);
>      List colStatsList;
>      // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is unset! Struct:AggrStats(colStats:null, partsFound:0)
> at org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(H