[jira] [Resolved] (HIVE-24344) [cache store] Add valid flag in table wrapper for all constraint
[ https://issues.apache.org/jira/browse/HIVE-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma resolved HIVE-24344. -- Resolution: Duplicate > [cache store] Add valid flag in table wrapper for all constraint > - > > Key: HIVE-24344 > URL: https://issues.apache.org/jira/browse/HIVE-24344 > Project: Hive > Issue Type: Sub-task > Reporter: Ashish Sharma > Assignee: Ashish Sharma > Priority: Major > > Description > Currently, if we get null for a constraint value, we fall back to the raw store to validate whether NULL is correct or not. We can add a valid flag which states that the NULL constraint value is correct, and thus reduce raw store calls. > DOD > Add a valid flag for all 6 constraints in the cachedstore -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work stopped] (HIVE-24344) [cache store] Add valid flag in table wrapper for all constraint
[ https://issues.apache.org/jira/browse/HIVE-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24344 stopped by Ashish Sharma. > [cache store] Add valid flag in table wrapper for all constraint > - > > Key: HIVE-24344 > URL: https://issues.apache.org/jira/browse/HIVE-24344 > Project: Hive > Issue Type: Sub-task > Reporter: Ashish Sharma > Assignee: Ashish Sharma > Priority: Major -- This message was sent by Atlassian Jira (v8.3.4#803005)
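The valid-flag idea from the description above can be sketched as follows. This is an illustrative stand-in, not Hive's actual CachedStore code: the class and field names are invented and real constraint objects are reduced to strings. The point is that a boolean stored alongside the cached value lets an empty result be distinguished from a never-loaded one, so a NULL answer no longer forces a raw-store round trip.

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch of the "valid flag" from HIVE-24344 (names invented,
// not Hive's CachedStore classes). Without the flag, a null cached
// constraint list is ambiguous: never loaded, or genuinely absent? With the
// flag, an empty-but-valid entry can be served without a raw-store call.
class ConstraintCacheSketch {
    private List<String> primaryKeys;           // cached constraint names (simplified)
    private boolean primaryKeysValid = false;   // true once the cached value is authoritative

    void populateFromRawStore(List<String> values) {
        primaryKeys = values;                   // may legitimately be empty
        primaryKeysValid = true;
    }

    // Present -> trust the cache (even if empty); empty Optional -> the
    // caller must fall back to the raw store.
    Optional<List<String>> getPrimaryKeys() {
        return primaryKeysValid ? Optional.of(primaryKeys) : Optional.empty();
    }
}
```

With one such flag per constraint type (six in total, per the DOD), a cached empty answer is authoritative and the raw store is consulted only before the first load.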
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517244 ] ASF GitHub Bot logged work on HIVE-24436: - Author: ASF GitHub Bot Created on: 27/Nov/20 05:38 Start Date: 27/Nov/20 05:38 Worklog Time Spent: 10m Work Description: wangyum commented on pull request #1715: URL: https://github.com/apache/hive/pull/1715#issuecomment-734654148 cc @sunchao @iemejia @viirya @dongjoon-hyun This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517244) Time Spent: 0.5h (was: 20m) > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro > Affects Versions: 2.3.8 > Reporter: Yuming Wang > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Exception1: > {noformat} > - create hive serde table with Catalog > *** RUN ABORTED *** > java.lang.NoSuchMethodError: 'void > org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, > java.lang.String, org.codehaus.jackson.JsonNode)' > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76) > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) > at >
org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437) > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281) > at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263) > {noformat} > Exception2: > {noformat} > - alter hive serde table add columns -- partitioned - AVRO *** FAILED *** > org.apache.spark.sql.AnalysisException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.avro.AvroRuntimeException: Unknown datum class: class > org.codehaus.jackson.node.NullNode; > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112) > at > org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) > at > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) > at > org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346) > at > org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228) > at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517243&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517243 ] ASF GitHub Bot logged work on HIVE-24436: - Author: ASF GitHub Bot Created on: 27/Nov/20 05:36 Start Date: 27/Nov/20 05:36 Worklog Time Spent: 10m Work Description: wangyum commented on a change in pull request #1715: URL: https://github.com/apache/hive/pull/1715#discussion_r531391483 ## File path: serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java ## @@ -235,14 +236,14 @@ private Schema createAvroArray(TypeInfo typeInfo) { private List<Schema.Field> getFields(Schema.Field schemaField) { List<Schema.Field> fields = new ArrayList<>(); -JsonNode nullDefault = JsonNodeFactory.instance.nullNode(); +JsonProperties.Null nullDefault = JsonProperties.NULL_VALUE; if (schemaField.schema().getType() == Schema.Type.RECORD) { for (Schema.Field field : schemaField.schema().getFields()) { fields.add(new Schema.Field(field.name(), field.schema(), field.doc(), nullDefault)); } } else { fields.add(new Schema.Field(schemaField.name(), schemaField.schema(), schemaField.doc(), - nullDefault)); +nullDefault)); Review comment: Related code: https://github.com/apache/avro/blob/release-1.8.2/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L421-L424 https://github.com/apache/avro/blob/release-1.8.2/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L57 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517243) Time Spent: 20m (was: 10m) > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro > Affects Versions: 2.3.8 > Reporter: Yuming Wang > Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517241&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517241 ] ASF GitHub Bot logged work on HIVE-24436: - Author: ASF GitHub Bot Created on: 27/Nov/20 05:32 Start Date: 27/Nov/20 05:32 Worklog Time Spent: 10m Work Description: wangyum opened a new pull request #1715: URL: https://github.com/apache/hive/pull/1715 ### What changes were proposed in this pull request? This PR replaces `null` with `JsonProperties.NULL_VALUE` to fix compatibility issues: 1. java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)' ``` - create hive serde table with Catalog *** RUN ABORTED *** java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)' at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76) at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263) ``` 2. 
org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode ``` - alter hive serde table add columns -- partitioned - AVRO *** FAILED *** org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode; at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112) at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346) at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228) at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680) ``` ### Why are the changes needed? For compatibility with Avro 1.9.x and Avro 1.10.0. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Build and run Spark test: ``` mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.execution.HiveDDLSuite test ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517241) Remaining Estimate: 0h Time Spent: 10m > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro > Affects Versions: 2.3.8 > Reporter: Yuming Wang > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Updated] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24436: -- Labels: pull-request-available (was: ) > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro > Affects Versions: 2.3.8 > Reporter: Yuming Wang > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24411) Make ThreadPoolExecutorWithOomHook more awareness of OutOfMemoryError
[ https://issues.apache.org/jira/browse/HIVE-24411?focusedWorklogId=517216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517216 ] ASF GitHub Bot logged work on HIVE-24411: - Author: ASF GitHub Bot Created on: 27/Nov/20 02:24 Start Date: 27/Nov/20 02:24 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1695: URL: https://github.com/apache/hive/pull/1695#issuecomment-734548436 @kgyrtkirk could you please take a look? thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517216) Time Spent: 40m (was: 0.5h) > Make ThreadPoolExecutorWithOomHook more awareness of OutOfMemoryError > - > > Key: HIVE-24411 > URL: https://issues.apache.org/jira/browse/HIVE-24411 > Project: Hive > Issue Type: Bug > Components: HiveServer2 > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently the ThreadPoolExecutorWithOomHook invokes the OOM hooks and stops HiveServer2 in case of an OutOfMemoryError when executing tasks. The exception is obtained by calling _future.get()_; however, it will never be an instance of OutOfMemoryError, because the task's throwable is wrapped in an ExecutionException (see the report method in class FutureTask). -- This message was sent by Atlassian Jira (v8.3.4#803005)
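The wrapping behaviour described in the issue can be reproduced with plain java.util.concurrent. The sketch below is illustrative (it is not Hive's ThreadPoolExecutorWithOomHook); it shows why an `instanceof OutOfMemoryError` test on the throwable caught from `future.get()` never matches until the cause chain is unwrapped.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Demonstrates the failure mode from HIVE-24411: an OutOfMemoryError thrown
// inside a task surfaces from future.get() wrapped in an ExecutionException,
// so a hook that tests the caught throwable directly never sees the OOM.
class OomUnwrapDemo {
    // Walk the cause chain instead of testing only the wrapper.
    static boolean isOom(Throwable t) {
        while (t != null) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    // Runs a task that throws a simulated OOM and returns what get() threw.
    static Throwable runAndCatch() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> f = pool.submit((Runnable) () -> {
                // A constructed Error standing in for real heap exhaustion.
                throw new OutOfMemoryError("simulated");
            });
            f.get();
            return null;
        } catch (ExecutionException e) {
            return e;   // the OOM is e.getCause(), not e itself
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The fix direction suggested by the issue is exactly this: inspect `getCause()` of the `ExecutionException` before deciding whether to run the OOM hooks.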
[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517215 ] ASF GitHub Bot logged work on HIVE-24424: - Author: ASF GitHub Bot Created on: 27/Nov/20 02:18 Start Date: 27/Nov/20 02:18 Worklog Time Spent: 10m Work Description: belugabehr commented on pull request #1704: URL: https://github.com/apache/hive/pull/1704#issuecomment-734539984 Thanks @miklosgergely for the review. Can you please take a look once more? I believe I've addressed your comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517215) Time Spent: 1h 10m (was: 1h) > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement > Reporter: David Mollitor > Assignee: David Mollitor > Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517214 ] ASF GitHub Bot logged work on HIVE-24424: - Author: ASF GitHub Bot Created on: 27/Nov/20 02:17 Start Date: 27/Nov/20 02:17 Worklog Time Spent: 10m Work Description: belugabehr commented on a change in pull request #1704: URL: https://github.com/apache/hive/pull/1704#discussion_r531349263 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -970,28 +971,44 @@ private static void close(ResultSet rs) { } } - private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String sequence) + /** + * Get the next notification log ID. + * + * @return The next ID to use for a notification log message + * @throws SQLException if a database access error occurs or this method is + * called on a closed connection + * @throws MetaException if the sequence table is not properly initialized + */ + private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String sequence) throws SQLException, MetaException { -String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " + -"\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + quoteString(sequence)); -LOG.debug("Going to execute query <" + s + ">"); -ResultSet rs = null; -try { - rs = stmt.executeQuery(s); - if (!rs.next()) { -throw new MetaException("Transaction database not properly configured, can't find next NL id."); +final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?"; Review comment: Fixed. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517214) Time Spent: 1h (was: 50m) > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement > Reporter: David Mollitor > Assignee: David Mollitor > Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
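The refactoring under review in this thread can be sketched as follows. The names here are hypothetical simplifications: the real method also applies SQLGenerator's FOR UPDATE clause and Hive's MetaException handling, both omitted. The essence is that the sequence name becomes a bound parameter rather than a value quoted into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch of a parameterized sequence lookup in the spirit of
// HIVE-24424. The sequence name is bound with setString rather than
// concatenated via quoteString(), and try-with-resources replaces the
// manual close() bookkeeping of the Statement-based version.
class NextSequenceIdSketch {
    static final String SEQ_SQL =
        "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?";

    static long getNextId(Connection con, String sequence) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SEQ_SQL)) {
            ps.setString(1, sequence);   // bound parameter, no string concatenation
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("Sequence not initialized: " + sequence);
                }
                return rs.getLong(1);
            }
        }
    }
}
```

Besides readability, binding the parameter lets the driver handle quoting and allows statement caching, which is the usual motivation for this kind of change.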
[jira] [Work logged] (HIVE-23891) Using UNION sql clause and speculative execution can cause file duplication in Tez
[ https://issues.apache.org/jira/browse/HIVE-23891?focusedWorklogId=517204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517204 ] ASF GitHub Bot logged work on HIVE-23891: - Author: ASF GitHub Bot Created on: 27/Nov/20 00:44 Start Date: 27/Nov/20 00:44 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1294: URL: https://github.com/apache/hive/pull/1294#issuecomment-734518903 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517204) Time Spent: 2h 10m (was: 2h) > Using UNION sql clause and speculative execution can cause file duplication > in Tez > -- > > Key: HIVE-23891 > URL: https://issues.apache.org/jira/browse/HIVE-23891 > Project: Hive > Issue Type: Bug >Reporter: George Pachitariu >Assignee: George Pachitariu >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23891.1.patch > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Hello, > the specific scenario when this can happen: > - the execution engine is Tez; > - speculative execution is on; > - the query inserts into a table and the last step is a UNION sql clause; > The problem is that Tez creates an extra layer of subdirectories when there > is a UNION. Later, when deduplicating, Hive doesn't take that into account > and only deduplicates folders but not the files inside. 
> So for a query like this: > {code:sql} > insert overwrite table union_all > select * from union_first_part > union all > select * from union_second_part; > {code} > The folder structure afterwards will be like this (a possible example): > {code:java} > .../union_all/HIVE_UNION_SUBDIR_1/00_0 > .../union_all/HIVE_UNION_SUBDIR_1/00_1 > .../union_all/HIVE_UNION_SUBDIR_2/00_1 > {code} > The attached patch increases the number of folder levels that Hive will check > recursively for duplicates when we have a UNION in Tez. > Feel free to reach out if you have any questions :). -- This message was sent by Atlassian Jira (v8.3.4#803005)
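The depth problem described above can be illustrated with a small stdlib-only sketch (this is not Hive's removeTempOrDuplicateFiles): speculative attempts of the same task share the file-name prefix before the underscore, and a dedup pass that stops one directory level too early never even sees the files inside the HIVE_UNION_SUBDIR_* layer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch of the HIVE-23891 depth issue (not Hive's actual dedup
// code). Duplicate speculative attempts share the task prefix before '_'
// (e.g. 00_0 and 00_1); if the walk does not recurse into the extra
// HIVE_UNION_SUBDIR_* level, it finds no files and removes nothing.
class UnionDedupSketch {
    // Returns the paths that would be removed as duplicate attempts: for each
    // (directory, task-prefix) pair, every file after the first is a dupe.
    static List<Path> findDuplicates(Path root, int maxDepth) throws IOException {
        Map<String, Path> keep = new HashMap<>();
        List<Path> dupes = new ArrayList<>();
        try (Stream<Path> s = Files.walk(root, maxDepth)) {
            List<Path> files =
                s.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
            for (Path p : files) {
                String taskId = p.getFileName().toString().split("_")[0];
                String key = p.getParent() + "/" + taskId;
                if (keep.putIfAbsent(key, p) != null) {
                    dupes.add(p);   // a later attempt for a task we already kept
                }
            }
        }
        return dupes;
    }
}
```

With a walk depth that stops at the table directory the subdirectories themselves are the only entries seen, so no file-level duplicates are detected; increasing the recursion depth, as the patch does, exposes the attempt files to the grouping.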
[jira] [Work logged] (HIVE-23965) Improve plan regression tests using TPCDS30TB metastore dump and custom configs
[ https://issues.apache.org/jira/browse/HIVE-23965?focusedWorklogId=517196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517196 ] ASF GitHub Bot logged work on HIVE-23965: - Author: ASF GitHub Bot Created on: 26/Nov/20 22:24 Start Date: 26/Nov/20 22:24 Worklog Time Spent: 10m Work Description: zabetak commented on pull request #1714: URL: https://github.com/apache/hive/pull/1714#issuecomment-734496312 This is the same PR with https://github.com/apache/hive/pull/1347 plus an extra commit https://github.com/apache/hive/pull/1714/commits/df6e610c7f7b11b0bf06b500b25613c1a811c055 to handle metastore upgrades without the need to rebuild and publish the docker image. The initial PR (https://github.com/apache/hive/pull/1347) was reverted from master since tests were failing. Between the pre-commit runs and the post-commit runs some commits affected the schema of the metastore thus leading to these failures. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517196) Time Spent: 5h 40m (was: 5.5h) > Improve plan regression tests using TPCDS30TB metastore dump and custom > configs > --- > > Key: HIVE-23965 > URL: https://issues.apache.org/jira/browse/HIVE-23965 > Project: Hive > Issue Type: Improvement > Reporter: Stamatis Zampetakis > Assignee: Stamatis Zampetakis > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: master355.tgz > > Time Spent: 5h 40m > Remaining Estimate: 0h > > The existing regression tests (HIVE-12586) based on TPC-DS have certain > shortcomings: > The table statistics do not reflect cardinalities from a specific TPC-DS > scale factor (SF). Some tables are from a 30TB dataset, others from a 200GB > dataset, and others from a 3GB dataset. This mix leads to plans that may > never appear when using an actual TPC-DS dataset. > The existing statistics do not contain information about partitions, something > that can have a big impact on the resulting plans. > The existing regression tests rely more or less on the default > configuration (hive-site.xml). In real-life scenarios, though, some of the > configurations differ and may impact the choices of the optimizer. > This issue aims to address the above shortcomings by using a curated > TPCDS30TB metastore dump along with some custom hive configurations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23965) Improve plan regression tests using TPCDS30TB metastore dump and custom configs
[ https://issues.apache.org/jira/browse/HIVE-23965?focusedWorklogId=517195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517195 ] ASF GitHub Bot logged work on HIVE-23965: - Author: ASF GitHub Bot Created on: 26/Nov/20 22:19 Start Date: 26/Nov/20 22:19 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request #1714: URL: https://github.com/apache/hive/pull/1714 ### What changes were proposed in this pull request and why? 1. Add new perf driver, TestTezTPCDS30TBCliDriver, relying on a dockerized metastore. 2. Use Dockerized postgres metastore with TPC-DS 30TB dump 3. Remove old drivers (with and without constraints), related classes (e.g., MetastoreDumpUtility), and resources. 4. Use Hive config properties obtained and curated from real-life usages 5. Allow AbstractCliConfig to override metastore DB type 6. Rework CorePerfCliDriver to allow pre-initialized metastores 7. Remove redundant logs in System.err. Logging and throwing an exception is an anti-pattern. 8. Replace assertions with exceptions and improve the messages. 9. Upgrade postgres JDBC driver to version 42.2.14 to be compatible with the docker image used 10. Disable queries 14 (HIVE-24167), 30 (HIVE-23964) 11. Re-enable CBO plan tests for queries 44, 45, 67, 70, 86 The queries were disabled as part of HIVE-20718. They were supposed to be fixed in Calcite 1.18.0 and currently Hive is in 1.21.0 so it is not surprising that they pass. 12. Add missing queries: cbo_query41, cbo_query62, query62 ### Does this PR introduce _any_ user-facing change? No, except the fact that old TestTezPerf drivers no longer exist. ### How was this patch tested? `mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517195) Time Spent: 5.5h (was: 5h 20m) > Improve plan regression tests using TPCDS30TB metastore dump and custom > configs > --- > > Key: HIVE-23965 > URL: https://issues.apache.org/jira/browse/HIVE-23965 > Project: Hive > Issue Type: Improvement > Reporter: Stamatis Zampetakis > Assignee: Stamatis Zampetakis > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: master355.tgz > > Time Spent: 5.5h > Remaining Estimate: 0h -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check
[ https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517128 ] ASF GitHub Bot logged work on HIVE-24259: - Author: ASF GitHub Bot Created on: 26/Nov/20 17:20 Start Date: 26/Nov/20 17:20 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1610: URL: https://github.com/apache/hive/pull/1610#discussion_r531158744 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java ## @@ -2844,15 +2814,10 @@ public SQLAllTableConstraints getAllTableConstraints(String catName, String dbNa return rawStore.getAllTableConstraints(catName, dbName, tblName); } -Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName); -if (tbl == null) { - // The table containing the constraints is not yet loaded in cache - return rawStore.getAllTableConstraints(catName, dbName, tblName); -} SQLAllTableConstraints constraints = sharedCache.listCachedAllTableConstraints(catName, dbName, tblName); -// if any of the constraint value is missing then there might be the case of partial constraints are stored in cached. -// So fall back to raw store for correct values +/* If constraint value is missing then there might be the case that table is not stored in cached or Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517128) Time Spent: 1h 40m (was: 1.5h) > [CachedStore] Optimise get constraints call by removing redundant table check > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Description - > Problem - > 1. Redundant check of whether the table is present > 2. Currently, in order to get all constraints from the cached store, 6 different > calls are made within the cached store, which leads to 6 different calls to the raw > store > > DOD > 1. Check only once whether the table exists in the cached store. > 2. Instead of fetching individual constraints from the cached store, add a method > that returns all constraints at once and, if the data is not consistent, falls > back to the raw store. -- This message was sent by Atlassian Jira (v8.3.4#803005)
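The consolidation discussed in this thread (one cache lookup plus a single raw-store fallback, instead of six per-constraint calls) can be sketched with plain collections. The class and method names below are simplified stand-ins for Hive's CachedStore/SharedCache types, not the actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for SQLAllTableConstraints: one holder for all six constraint kinds. */
class AllConstraints {
    final boolean complete;  // true when the cache holds a full, trustworthy copy
    AllConstraints(boolean complete) { this.complete = complete; }
}

/** Sketch of a cache-first store with a single consolidated fallback decision. */
class ConstraintStore {
    private final Map<String, AllConstraints> cache = new HashMap<>();
    int rawStoreCalls = 0;  // instrumentation for this example only

    void prime(String table, AllConstraints c) { cache.put(table, c); }

    AllConstraints getAllTableConstraints(String table) {
        AllConstraints cached = cache.get(table);
        // One consolidated check: fall back to the raw store only when the cache
        // has no complete copy, instead of deciding per constraint kind.
        if (cached != null && cached.complete) {
            return cached;
        }
        return loadFromRawStore(table);
    }

    private AllConstraints loadFromRawStore(String table) {
        rawStoreCalls++;
        AllConstraints fresh = new AllConstraints(true);
        cache.put(table, fresh);  // repopulate so later lookups are cache hits
        return fresh;
    }

    public static void main(String[] args) {
        ConstraintStore store = new ConstraintStore();
        store.getAllTableConstraints("db.tbl");  // miss: one raw-store call
        store.getAllTableConstraints("db.tbl");  // hit: served from cache
        System.out.println("raw store calls: " + store.rawStoreCalls);  // prints 1
    }
}
```

The `complete` flag here plays the role of the "valid" flag proposed in HIVE-24344: it lets the cache distinguish "constraints are genuinely absent" from "constraints were never loaded".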
[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check
[ https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517126 ] ASF GitHub Bot logged work on HIVE-24259: - Author: ASF GitHub Bot Created on: 26/Nov/20 17:20 Start Date: 26/Nov/20 17:20 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1610: URL: https://github.com/apache/hive/pull/1610#discussion_r531158652 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java ## @@ -2397,7 +2397,7 @@ public SQLAllTableConstraints listCachedAllTableConstraints(String catName, Stri public List listCachedForeignKeys(String catName, String foreignDbName, String foreignTblName, String parentDbName, String parentTblName) { -List keys = new ArrayList<>(); +List keys = null; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517126) Time Spent: 1.5h (was: 1h 20m) > [CachedStore] Optimise get constraints call by removing redundant table check > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Description - > Problem - > 1. Redundant check of whether the table is present > 2. Currently, in order to get all constraints from the cached store, 6 different > calls are made within the cached store, which leads to 6 different calls to the raw > store > > DOD > 1. Check only once whether the table exists in the cached store. > 2. Instead of fetching individual constraints from the cached store,
add a method > that returns all constraints at once and, if the data is not consistent, falls back to the raw store. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check
[ https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517125 ] ASF GitHub Bot logged work on HIVE-24259: - Author: ASF GitHub Bot Created on: 26/Nov/20 17:20 Start Date: 26/Nov/20 17:20 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1610: URL: https://github.com/apache/hive/pull/1610#discussion_r531158438 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java ## @@ -2836,14 +2836,32 @@ long getPartsFound() { @Override public SQLAllTableConstraints getAllTableConstraints(String catName, String dbName, String tblName) throws MetaException, NoSuchObjectException { -SQLAllTableConstraints sqlAllTableConstraints = new SQLAllTableConstraints(); -sqlAllTableConstraints.setPrimaryKeys(getPrimaryKeys(catName, dbName, tblName)); -sqlAllTableConstraints.setForeignKeys(getForeignKeys(catName, null, null, dbName, tblName)); -sqlAllTableConstraints.setUniqueConstraints(getUniqueConstraints(catName, dbName, tblName)); - sqlAllTableConstraints.setDefaultConstraints(getDefaultConstraints(catName, dbName, tblName)); -sqlAllTableConstraints.setCheckConstraints(getCheckConstraints(catName, dbName, tblName)); - sqlAllTableConstraints.setNotNullConstraints(getNotNullConstraints(catName, dbName, tblName)); -return sqlAllTableConstraints; + +catName = StringUtils.normalizeIdentifier(catName); +dbName = StringUtils.normalizeIdentifier(dbName); +tblName = StringUtils.normalizeIdentifier(tblName); +if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && rawStore.isActiveTransaction())) { + return rawStore.getAllTableConstraints(catName, dbName, tblName); +} + +Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName); +if (tbl == null) { + // The table containing the constraints is not yet loaded in cache + return rawStore.getAllTableConstraints(catName, 
dbName, tblName); +} +SQLAllTableConstraints constraints = sharedCache.listCachedAllTableConstraints(catName, dbName, tblName); + +// if any of the constraint value is missing then there might be the case of partial constraints are stored in cached. +// So fall back to raw store for correct values +if (constraints != null && CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && CollectionUtils Review comment: Added flag This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517125) Time Spent: 1h 20m (was: 1h 10m) > [CachedStore] Optimise get constraints call by removing redundant table check > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Description - > Problem - > 1. Redundant check of whether the table is present > 2. Currently, in order to get all constraints from the cached store, 6 different > calls are made within the cached store, which leads to 6 different calls to the raw > store > > DOD > 1. Check only once whether the table exists in the cached store. > 2. Instead of fetching individual constraints from the cached store, add a method > that returns all constraints at once and, if the data is not consistent, falls > back to the raw store. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24389) Trailing zeros of constant decimal numbers are removed
[ https://issues.apache.org/jira/browse/HIVE-24389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-24389. --- Resolution: Fixed Pushed to master. Thanks [~jcamachorodriguez] for the review. > Trailing zeros of constant decimal numbers are removed > -- > > Key: HIVE-24389 > URL: https://issues.apache.org/jira/browse/HIVE-24389 > Project: Hive > Issue Type: Bug > Components: Types >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > In some cases Hive removes trailing zeros of constant decimal numbers > {code} > select cast(1.1 as decimal(22, 2)) > 1.1 > {code} > In this case *WritableConstantHiveDecimalObjectInspector* is used and this > object inspector takes its wrapped HiveDecimal's scale instead of the scale > specified in the wrapped typeinfo: > {code} > this = {WritableConstantHiveDecimalObjectInspector@14415} > value = {HiveDecimalWritable@14426} "1.1" > typeInfo = {DecimalTypeInfo@14421} "decimal(22,2)"{code} > However, in the case of an expression with an aggregate function, > *WritableHiveDecimalObjectInspector* is used > {code} > select cast(sum(1.1) as decimal(22, 2)) > 1.10 > {code} > {code} > o = {HiveDecimalWritable@16633} "1.1" > oi = {WritableHiveDecimalObjectInspector@16634} > typeInfo = {DecimalTypeInfo@16640} "decimal(22,2)" > {code} > Casting the expressions to string > {code:java} > select cast(cast(1.1 as decimal(22, 2)) as string), cast(cast(sum(1.1) as > decimal(22, 2)) as string) > 1.1 1.10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
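The scale behavior described in this ticket can be illustrated with plain java.math.BigDecimal, used here only as an analogy for HiveDecimal (this is not Hive's code path): a decimal value carries its own scale, and string conversion honors that scale rather than the declared decimal(22,2) type unless the type's scale is explicitly applied.

```java
import java.math.BigDecimal;

class DecimalScaleDemo {
    public static void main(String[] args) {
        // A literal carries its own scale; a declared type like decimal(22,2)
        // does not take effect until something applies the type's scale.
        BigDecimal literal = new BigDecimal("1.1");    // scale 1
        BigDecimal atTypeScale = literal.setScale(2);  // scale 2; no rounding needed

        System.out.println(literal.toPlainString());     // 1.1  (trailing zero "removed")
        System.out.println(atTypeScale.toPlainString()); // 1.10 (type scale honored)
    }
}
```

This mirrors the report: the constant object inspector formats with the value's own scale (`1.1`), while the aggregate path ends up using the typeinfo's scale (`1.10`).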
[jira] [Work logged] (HIVE-24389) Trailing zeros of constant decimal numbers are removed
[ https://issues.apache.org/jira/browse/HIVE-24389?focusedWorklogId=517095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517095 ] ASF GitHub Bot logged work on HIVE-24389: - Author: ASF GitHub Bot Created on: 26/Nov/20 14:59 Start Date: 26/Nov/20 14:59 Worklog Time Spent: 10m Work Description: kasakrisz merged pull request #1676: URL: https://github.com/apache/hive/pull/1676 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517095) Time Spent: 2h (was: 1h 50m) > Trailing zeros of constant decimal numbers are removed > -- > > Key: HIVE-24389 > URL: https://issues.apache.org/jira/browse/HIVE-24389 > Project: Hive > Issue Type: Bug > Components: Types >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > In some case Hive removes trailing zeros of constant decimal numbers > {code} > select cast(1.1 as decimal(22, 2)) > 1.1 > {code} > In this case *WritableConstantHiveDecimalObjectInspector* is used and this > object inspector takes it's wrapped HiveDecimal scale instead of the scale > specified in the wrapped typeinfo: > {code} > this = {WritableConstantHiveDecimalObjectInspector@14415} > value = {HiveDecimalWritable@14426} "1.1" > typeInfo = {DecimalTypeInfo@14421} "decimal(22,2)"{code} > However in case of an expression with an aggregate function > *WritableHiveDecimalObjectInspector* is used > {code} > select cast(sum(1.1) as decimal(22, 2)) > 1.10 > {code} > {code} > o = {HiveDecimalWritable@16633} "1.1" > oi = {WritableHiveDecimalObjectInspector@16634} > typeInfo = {DecimalTypeInfo@16640} "decimal(22,2)" > {code} > Casting the expressions to string > {code:java} > select cast(cast(1.1 as decimal(22, 
2)) as string), cast(cast(sum(1.1) as > decimal(22, 2)) as string) > 1.1 1.10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-22782) Consolidate metastore call to fetch constraints
[ https://issues.apache.org/jira/browse/HIVE-22782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202575#comment-17202575 ] Sankar Hariappan edited comment on HIVE-22782 at 11/26/20, 2:32 PM: PR merged to master. Thanks [~ashish-kumar-sharma] for the contribution! was (Author: sankarh): PR merged to master. Thanks [~ashish-kumar-sharma] for the constribution! > Consolidate metastore call to fetch constraints > --- > > Key: HIVE-22782 > URL: https://issues.apache.org/jira/browse/HIVE-22782 > Project: Hive > Issue Type: Improvement > Components: Query Planning, Standalone Metastore >Affects Versions: 4.0.0 >Reporter: Vineet Garg >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 7h 20m > Remaining Estimate: 0h > > Currently separate calls are made to metastore to fetch constraints like Pk, > fk, not null etc. Since planner always retrieve these constraints we should > retrieve all of them in one call. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
[ https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24435: -- Labels: pull-request-available (was: ) > Vectorized unix_timestamp is inconsistent with non-vectorized counterpart > - > > Key: HIVE-24435 > URL: https://issues.apache.org/jira/browse/HIVE-24435 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {code} > create table t (d string); > insert into t values('2020-11-16 22:18:40 UTC'); > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > set hive.fetch.task.conversion=none; > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > {code} > results: > {code} > -- std udf: > >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 > >2020-11-16 > -- vectorized udf > >2020-11-16 22:18:40 UTC< NULLNULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
[ https://issues.apache.org/jira/browse/HIVE-24435?focusedWorklogId=517035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517035 ] ASF GitHub Bot logged work on HIVE-24435: - Author: ASF GitHub Bot Created on: 26/Nov/20 12:02 Start Date: 26/Nov/20 12:02 Worklog Time Spent: 10m Work Description: kgyrtkirk opened a new pull request #1713: URL: https://github.com/apache/hive/pull/1713 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517035) Remaining Estimate: 0h Time Spent: 10m > Vectorized unix_timestamp is inconsistent with non-vectorized counterpart > - > > Key: HIVE-24435 > URL: https://issues.apache.org/jira/browse/HIVE-24435 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > {code} > create table t (d string); > insert into t values('2020-11-16 22:18:40 UTC'); > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > set hive.fetch.task.conversion=none; > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > {code} > results: > {code} > -- std udf: > >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 > >2020-11-16 > -- vectorized udf > >2020-11-16 22:18:40 UTC< NULLNULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
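A plausible mechanism for the NULL results in this ticket is the trailing time-zone token: a parser whose pattern lacks a zone field rejects the whole string, which a lenient UDF surfaces as NULL. The java.time sketch below only illustrates that difference; it is not the vectorized implementation's actual parsing code.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class ZoneSuffixParseDemo {
    public static void main(String[] args) {
        String input = "2020-11-16 22:18:40 UTC";

        // A pattern with a zone-name field ("zzz") consumes the whole string.
        DateTimeFormatter withZone =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss zzz", Locale.ENGLISH);
        long epoch = ZonedDateTime.parse(input, withZone).toEpochSecond();
        System.out.println(epoch);  // 1605565120 (22:18:40 UTC as an absolute instant)

        // A pattern without the zone field fails on the unparsed trailing " UTC";
        // a lenient engine may report that failure as NULL instead of an error.
        DateTimeFormatter noZone =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss", Locale.ENGLISH);
        try {
            noZone.parse(input);
        } catch (DateTimeParseException e) {
            System.out.println("unparsed trailing text -> NULL in a lenient engine");
        }
    }
}
```

Note the std-UDF output in the report (1605593920) differs from the UTC instant by a fixed offset, consistent with the non-vectorized path interpreting the string in a local time zone rather than honoring the suffix.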
[jira] [Work logged] (HIVE-24409) Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc
[ https://issues.apache.org/jira/browse/HIVE-24409?focusedWorklogId=517027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517027 ] ASF GitHub Bot logged work on HIVE-24409: - Author: ASF GitHub Bot Created on: 26/Nov/20 11:49 Start Date: 26/Nov/20 11:49 Worklog Time Spent: 10m Work Description: maheshk114 merged pull request #1708: URL: https://github.com/apache/hive/pull/1708 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517027) Time Spent: 0.5h (was: 20m) > Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc > -- > > Key: HIVE-24409 > URL: https://issues.apache.org/jira/browse/HIVE-24409 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-11-23 at 10.52.49 AM.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > !Screenshot 2020-11-23 at 10.52.49 AM.png|width=858,height=493! > Lines of interest: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L535] > (non-vectorized path due to stats) > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java#L581] > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
[ https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-24435: Comment: was deleted (was: looks like there are some things here * unix_timestamp is deprecated - recommends to use current_timestamp - however current_timestamp doesnt take any argument; so its puzzling to suggest to use that * GenericUDFUnixTimeStamp has some implementation; but it also extends GenericUDFToUnixTimeStamp which also has a few vectorized implementations attached * unix_timestamp behaves 100% the same as to_unix_timestamp in case an argument is specfied) > Vectorized unix_timestamp is inconsistent with non-vectorized counterpart > - > > Key: HIVE-24435 > URL: https://issues.apache.org/jira/browse/HIVE-24435 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > > {code} > create table t (d string); > insert into t values('2020-11-16 22:18:40 UTC'); > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > set hive.fetch.task.conversion=none; > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > {code} > results: > {code} > -- std udf: > >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 > >2020-11-16 > -- vectorized udf > >2020-11-16 22:18:40 UTC< NULLNULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
[ https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239211#comment-17239211 ] Zoltan Haindrich commented on HIVE-24435: - looks like there are some things here * unix_timestamp is deprecated - it recommends using current_timestamp - however current_timestamp doesn't take any argument, so it's puzzling to suggest using that * GenericUDFUnixTimeStamp has some implementation; but it also extends GenericUDFToUnixTimeStamp which also has a few vectorized implementations attached * unix_timestamp behaves 100% the same as to_unix_timestamp in case an argument is specified > Vectorized unix_timestamp is inconsistent with non-vectorized counterpart > - > > Key: HIVE-24435 > URL: https://issues.apache.org/jira/browse/HIVE-24435 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > > {code} > create table t (d string); > insert into t values('2020-11-16 22:18:40 UTC'); > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > set hive.fetch.task.conversion=none; > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > {code} > results: > {code} > -- std udf: > >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 > >2020-11-16 > -- vectorized udf > >2020-11-16 22:18:40 UTC< NULLNULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
[ https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-24435: --- > Vectorized unix_timestamp is inconsistent with non-vectorized counterpart > - > > Key: HIVE-24435 > URL: https://issues.apache.org/jira/browse/HIVE-24435 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > > {code} > create table t (d string); > insert into t values('2020-11-16 22:18:40 UTC'); > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > set hive.fetch.task.conversion=none; > select > '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), > to_date(from_unixtime(unix_timestamp(d))) > from t > ; > {code} > results: > {code} > -- std udf: > >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 > >2020-11-16 > -- vectorized udf > >2020-11-16 22:18:40 UTC< NULLNULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517001 ] ASF GitHub Bot logged work on HIVE-24424: - Author: ASF GitHub Bot Created on: 26/Nov/20 11:09 Start Date: 26/Nov/20 11:09 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #1704: URL: https://github.com/apache/hive/pull/1704#discussion_r530952029 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -970,28 +971,44 @@ private static void close(ResultSet rs) { } } - private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String sequence) + /** + * Get the next notification log ID. + * + * @return The next ID to use for a notification log message + * @throws SQLException if a database access error occurs or this method is + * called on a closed connection + * @throws MetaException if the sequence table is not properly initialized + */ + private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String sequence) throws SQLException, MetaException { -String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " + -"\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + quoteString(sequence)); -LOG.debug("Going to execute query <" + s + ">"); -ResultSet rs = null; -try { - rs = stmt.executeQuery(s); - if (!rs.next()) { -throw new MetaException("Transaction database not properly configured, can't find next NL id."); +final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?"; +final String upd_sql = "update \"SEQUENCE_TABLE\" set \"NEXT_VAL\" = ? where \"SEQUENCE_NAME\" = ?"; + +final String sou_sql = sqlGenerator.addForUpdateClause(seq_sql); Review comment: Please use camelCase for variables within a function. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517001) Time Spent: 50m (was: 40m) > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable, -- This message was sent by Atlassian Jira (v8.3.4#803005)
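The review comments above concern the PreparedStatement rewrite of getNextNLId. A minimal, framework-free sketch of that SEQUENCE_TABLE pattern looks like the following; the constant names are illustrative, and the dialect-specific FOR UPDATE clause that the real patch adds via SQLGenerator is omitted here:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class SequenceIds {
    // Bind parameters replace quoteString(...) concatenation, so the SQL text
    // is constant and the sequence name is passed as a parameter.
    static final String SELECT_NEXT_VAL =
        "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?";
    static final String UPDATE_NEXT_VAL =
        "update \"SEQUENCE_TABLE\" set \"NEXT_VAL\" = ? where \"SEQUENCE_NAME\" = ?";

    /** Reads the current NEXT_VAL for a sequence and advances it by one. */
    static long nextId(Connection con, String sequence) throws SQLException {
        try (PreparedStatement sel = con.prepareStatement(SELECT_NEXT_VAL)) {
            sel.setString(1, sequence);
            try (ResultSet rs = sel.executeQuery()) {
                if (!rs.next()) {
                    throw new IllegalStateException("sequence row missing: " + sequence);
                }
                long next = rs.getLong(1);
                try (PreparedStatement upd = con.prepareStatement(UPDATE_NEXT_VAL)) {
                    upd.setLong(1, next + 1);
                    upd.setString(2, sequence);
                    upd.executeUpdate();
                }
                return next;
            }
        }
    }
}
```

The method needs a live JDBC Connection inside a transaction; in the real code the SELECT is additionally locked (FOR UPDATE) so concurrent callers cannot read the same NEXT_VAL.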
[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517000 ] ASF GitHub Bot logged work on HIVE-24424: - Author: ASF GitHub Bot Created on: 26/Nov/20 11:08 Start Date: 26/Nov/20 11:08 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #1704: URL: https://github.com/apache/hive/pull/1704#discussion_r530951752 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -970,28 +971,44 @@ private static void close(ResultSet rs) { } } - private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String sequence) + /** + * Get the next notification log ID. + * + * @return The next ID to use for a notification log message + * @throws SQLException if a database access error occurs or this method is + * called on a closed connection + * @throws MetaException if the sequence table is not properly initialized + */ + private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String sequence) throws SQLException, MetaException { -String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " + -"\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + quoteString(sequence)); -LOG.debug("Going to execute query <" + s + ">"); -ResultSet rs = null; -try { - rs = stmt.executeQuery(s); - if (!rs.next()) { -throw new MetaException("Transaction database not properly configured, can't find next NL id."); +final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?"; Review comment: These two are constants, please extract them. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517000) Time Spent: 40m (was: 0.5h) > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable, -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24423) Improve DbNotificationListener Thread
[ https://issues.apache.org/jira/browse/HIVE-24423?focusedWorklogId=516999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516999 ] ASF GitHub Bot logged work on HIVE-24423: - Author: ASF GitHub Bot Created on: 26/Nov/20 11:01 Start Date: 26/Nov/20 11:01 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #1703: URL: https://github.com/apache/hive/pull/1703#discussion_r530947449 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1242,64 +1244,50 @@ private void process(NotificationEvent event, ListenerEvent listenerEvent) throw } private static class CleanerThread extends Thread { -private RawStore rs; +private final RawStore rs; private int ttl; -private boolean shouldRun = true; private long sleepTime; CleanerThread(Configuration conf, RawStore rs) { super("DB-Notification-Cleaner"); - this.rs = rs; - boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED); - if(isReplEnabled){ -setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.REPL_EVENT_DB_LISTENER_TTL, -TimeUnit.SECONDS)); - } - else { -setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_TTL, -TimeUnit.SECONDS)); - } - setCleanupInterval(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, - TimeUnit.MILLISECONDS)); setDaemon(true); + this.rs = Objects.requireNonNull(rs); + + boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED); + ConfVars ttlConf = (isReplEnabled) ? 
ConfVars.REPL_EVENT_DB_LISTENER_TTL : ConfVars.EVENT_DB_LISTENER_TTL; + setTimeToLive(MetastoreConf.getTimeVar(conf, ttlConf, TimeUnit.SECONDS)); + setCleanupInterval( + MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS)); } @Override public void run() { - while (shouldRun) { + while (true) { +LOG.debug("Cleaner thread running"); try { rs.cleanNotificationEvents(ttl); rs.cleanWriteNotificationEvents(ttl); } catch (Exception ex) { - //catching exceptions here makes sure that the thread doesn't die in case of unexpected - //exceptions - LOG.warn("Exception received while cleaning notifications: ", ex); + LOG.warn("Exception received while cleaning notifications", ex); Review comment: What if interruption occurs while the execution of CleanerThread is within this try block, wouldn't the InterruptedException be caught by this catch block, and the thread will go on? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 516999) Time Spent: 20m (was: 10m) > Improve DbNotificationListener Thread > - > > Key: HIVE-24423 > URL: https://issues.apache.org/jira/browse/HIVE-24423 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Clean up and simplify {{DbNotificationListener}} thread class. > Most importantly, stop the thread and wait for it to finish before launching > a new thread. -- This message was sent by Atlassian Jira (v8.3.4#803005)
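The reviewer's question, whether an interrupt landing inside the try block gets swallowed by the broad catch, points at a standard pattern: restore the interrupt flag when InterruptedException is caught and test it in the loop condition. A generic sketch of that pattern, not the actual DbNotificationListener code:

```java
/** Daemon cleaner loop that survives cleanup failures but still honors interrupts. */
class CleanerLoop extends Thread {
    private final Runnable cleanup;
    private final long sleepMillis;

    CleanerLoop(Runnable cleanup, long sleepMillis) {
        super("db-notification-cleaner");
        setDaemon(true);
        this.cleanup = cleanup;
        this.sleepMillis = sleepMillis;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                cleanup.run();           // may throw; must not kill the loop
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                // Restore the flag so the while-condition sees it and the loop
                // exits, instead of the broad catch below swallowing the interrupt.
                Thread.currentThread().interrupt();
            } catch (Exception ex) {
                // Unexpected cleanup failure: log and keep going.
                System.err.println("cleanup failed: " + ex);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CleanerLoop loop = new CleanerLoop(() -> System.out.println("cleaning"), 100L);
        loop.start();
        Thread.sleep(250L);
        loop.interrupt();  // loop exits promptly instead of looping forever
        loop.join();
    }
}
```

If the interrupt arrives while `cleanup.run()` is executing (rather than during the sleep), no exception is thrown, but the interrupt flag stays set and the loop condition exits on the next iteration.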
[jira] [Resolved] (HIVE-24359) Hive Compaction hangs because of doAs when worker set to HS2
[ https://issues.apache.org/jira/browse/HIVE-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage resolved HIVE-24359. -- Resolution: Duplicate HIVE-24410 duplicates this, and is Fixed. > Hive Compaction hangs because of doAs when worker set to HS2 > > > Key: HIVE-24359 > URL: https://issues.apache.org/jira/browse/HIVE-24359 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Transactions >Reporter: Chiran Ravani >Priority: Critical > > When creating a managed table and inserting data using Impala, with the > compaction worker set to HiveServer2 in a secured environment (Kerberized > cluster), the worker thread hangs indefinitely, expecting the user to provide > Kerberos credentials from STDIN. > The problem appears to be that no login context is sent from HS2 to > HMS as part of QueryCompactor, and the HS2 JVM has the property > javax.security.auth.useSubjectCredsOnly set to false, which causes it > to prompt for a login via stdin; setting it to true also does not help, as > the context does not seem to be passed in any case. > Below is observed in the HS2 jstack. 
If you see, the thread is waiting for > stdin "com.sun.security.auth.module.Krb5LoginModule.promptForName" > {code} > "c570-node2.abc.host.com-44_executor" #47 daemon prio=1 os_prio=0 > tid=0x01506000 nid=0x1348 runnable [0x7f1beea95000] >java.lang.Thread.State: RUNNABLE > at java.io.FileInputStream.readBytes(Native Method) > at java.io.FileInputStream.read(FileInputStream.java:255) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <0x9fa38c90> (a java.io.BufferedInputStream) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:161) > at java.io.BufferedReader.readLine(BufferedReader.java:324) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.BufferedReader.readLine(BufferedReader.java:389) > at > com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153) > at > com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120) > at > com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862) > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708) > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > at > 
javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > at java.security.AccessController.doPrivileged(Native Method) > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > at sun.security.jgss.GSSUtil.login(GSSUtil.java:258) > at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336) > at > sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146) > at > sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) > at > sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189) > at > sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) > at > sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
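The jstack above shows `Krb5LoginModule` falling back to `TextCallbackHandler`, which reads from stdin. A long-lived service should never reach an interactive prompt; one common JAAS safeguard is to supply a `CallbackHandler` that refuses interactive callbacks, so a misconfigured login fails fast instead of hanging the worker thread. A JDK-only sketch (hypothetical class name, not Hive code):

```java
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.UnsupportedCallbackException;

// Hypothetical guard: pass this to LoginContext instead of relying on the
// default TextCallbackHandler, so JAAS cannot block on stdin in a server JVM.
public class NoPromptHandler implements CallbackHandler {
    @Override
    public void handle(Callback[] callbacks) throws UnsupportedCallbackException {
        for (Callback cb : callbacks) {
            // Refusing every callback turns the silent stdin hang seen in the
            // jstack into an immediate, diagnosable login failure.
            throw new UnsupportedCallbackException(cb,
                "interactive Kerberos login disabled in service context");
        }
    }
}
```

In a Kerberized service the credentials should instead come from a keytab-based login established at process start (the fix direction HIVE-24410 takes); a handler like this is only a safety net that makes the misconfiguration visible.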
[jira] [Work logged] (HIVE-24409) Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc
[ https://issues.apache.org/jira/browse/HIVE-24409?focusedWorklogId=516954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516954 ] ASF GitHub Bot logged work on HIVE-24409: - Author: ASF GitHub Bot Created on: 26/Nov/20 09:03 Start Date: 26/Nov/20 09:03 Worklog Time Spent: 10m Work Description: rbalamohan commented on pull request #1708: URL: https://github.com/apache/hive/pull/1708#issuecomment-734168724 LGTM. +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 516954) Time Spent: 20m (was: 10m) > Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc > -- > > Key: HIVE-24409 > URL: https://issues.apache.org/jira/browse/HIVE-24409 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-11-23 at 10.52.49 AM.png > > Time Spent: 20m > Remaining Estimate: 0h > > !Screenshot 2020-11-23 at 10.52.49 AM.png|width=858,height=493! > Lines of interest: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L535] > (non-vectorized path due to stats) > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java#L581] > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R updated HIVE-24433: -- Description: PartitionKeyValue is getting converted into lower case in the below 2 places:
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]
Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries with the proper partition values. When the query completes, the entry moves from TXN_COMPONENTS to COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the partition & considers it an invalid partition.
{code:java}
create table abc(name string) partitioned by(city string) stored as orc tblproperties('transactional'='true');
insert into abc partition(city='Bangalore') values('aaa');
{code}
Example entry in COMPLETED_TXN_COMPONENTS:
{noformat}
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
| CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION  | CTC_TIMESTAMP       | CTC_WRITEID | CTC_UPDATE_DELETE |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
| 2         | default      | abc       | city=bangalore | 2020-11-25 09:26:59 | 1           | N                 |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
{noformat}
AutoCompaction fails to get triggered, with the below error:
{code:java}
2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(98)) - Checking to see if we should compact default.abc.city=bangalore
2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(155)) - Can't find partition default.compaction_test.city=bangalore, assuming it has been dropped and moving on
{code}
I verified the below 4 SQLs with my PR; they all produced the correct PartitionKeyValue, i.e. COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore"
{code:java}
insert into table abc PARTITION(CitY='Bangalore') values('Dan');
insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
update table abc set Name='xy' where CiTy='Bangalore';
delete from abc where CiTy='Bangalore';
{code}

was: PartitionKeyValue is getting converted into lower case in the below 2 places:
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]
Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries with the proper partition values. When the query completes, the entry moves from TXN_COMPONENTS to COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the partition & considers it an invalid partition.
{code:java}
create table abc(name string) partitioned by(city string) stored as orc tblproperties('transactional'='true');
insert into abc partition(city='Bangalore') values('aaa');
{code}
Example entry in COMPLETED_TXN_COMPONENTS:
{noformat}
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
| CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION  | CTC_TIMESTAMP       | CTC_WRITEID | CTC_UPDATE_DELETE |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
| 2         | default      | abc       | city=bangalore | 2020-11-25 09:26:59 | 1           | N                 |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
{noformat}
AutoCompaction fails to get triggered, with the below error:
{code:java}
2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(98)) - Checking to see if we should compact default.abc.city=bangalore
2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(155)) - Can't find partition default.compaction_test.city=bhubaneshwar, assuming it has been dropped and moving on
{code}
I verified the below 4 SQLs with my PR; they all produced the correct PartitionKeyValue, i.e. COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore"
{code:java}
insert into table abc PARTITION(CitY='Bangalore') values('Dan');
insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
update table abc set Name='xy' where CiTy='Bangalore';
delete from abc where CiTy='Bangalore';
{code}

> AutoCompaction is not getting trigger
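The bug described above boils down to normalizing the whole partition spec, value included, to lower case. A hedged sketch (hypothetical helper, not the actual TxnHandler code, which builds these names differently) contrasting the buggy and fixed normalization:

```java
// Hypothetical illustration of the case-handling mistake reported in HIVE-24433.
public class PartitionName {

    // Buggy: lower-casing the whole "key=value" string loses the value's case,
    // so the Initiator later looks up "city=bangalore" and finds no partition.
    static String buggy(String key, String value) {
        return (key + "=" + value).toLowerCase();
    }

    // Fixed: Hive column names are case-insensitive, but partition *values*
    // are not -- only the key may safely be normalized.
    static String fixed(String key, String value) {
        return key.toLowerCase() + "=" + value;
    }
}
```

For example, `buggy("city", "Bangalore")` yields `city=bangalore`, which no longer matches the stored partition `city=Bangalore`, while `fixed(...)` preserves the value's casing so the Initiator can find it.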
[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=516952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516952 ] ASF GitHub Bot logged work on HIVE-24433: - Author: ASF GitHub Bot Created on: 26/Nov/20 08:44 Start Date: 26/Nov/20 08:44 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1712: URL: https://github.com/apache/hive/pull/1712#issuecomment-734158908 Hi @nareshpr thanks for picking this up. Could you add a test case to TestInitiator that covers this use case. cc @klcopp @deniskuzZ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 516952) Time Spent: 20m (was: 10m) > AutoCompaction is not getting triggered for CamelCase Partition Values > -- > > Key: HIVE-24433 > URL: https://issues.apache.org/jira/browse/HIVE-24433 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > PartionKeyValue is getting converted into lowerCase in below 2 places. > [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728] > [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851] > Because of which TXN_COMPONENTS & HIVE_LOCKS tables are not having entries > from proper partition values. > When query completes, the entry moves from TXN_COMPONENTS to > COMPLETED_TXN_COMPONENTS. 
Hive AutoCompaction will not recognize the > partition & considers it as invalid partition > {code:java} > create table abc(name string) partitioned by(city string) stored as orc > tblproperties('transactional'='true'); > insert into abc partition(city='Bangalore') values('aaa'); > {code} > Example entry in COMPLETED_TXN_COMPONENTS > {noformat} > +---+--++---+-+-+---+ > | CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION | > CTC_TIMESTAMP | CTC_WRITEID | CTC_UPDATE_DELETE | > +---+--++---+-+-+---+ > | 2 | default | abc | city=bangalore | 2020-11-25 09:26:59 > | 1 | N | > +---+--++---+-+-+---+ > {noformat} > > AutoCompaction fails to get triggered with below error > {code:java} > 2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(98)) - Checking to see if we should compact > default.abc.city=bangalore > 2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(155)) - Can't find partition > default.compaction_test.city=bhubaneshwar, assuming it has been dropped and > moving on{code} > I verifed below 4 SQL's with my PR, those all produced correct > PartitionKeyValue > i.e, COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore" > {code:java} > insert into table abc PARTITION(CitY='Bangalore') values('Dan'); > insert overwrite table abc partition(CiTy='Bangalore') select Name from abc; > update table abc set Name='xy' where CiTy='Bangalore'; > delete from abc where CiTy='Bangalore';{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24410) Query-based compaction hangs because of doAs
[ https://issues.apache.org/jira/browse/HIVE-24410?focusedWorklogId=516951&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516951 ] ASF GitHub Bot logged work on HIVE-24410: - Author: ASF GitHub Bot Created on: 26/Nov/20 08:43 Start Date: 26/Nov/20 08:43 Worklog Time Spent: 10m Work Description: klcopp merged pull request #1693: URL: https://github.com/apache/hive/pull/1693 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 516951) Time Spent: 3h (was: 2h 50m) > Query-based compaction hangs because of doAs > > > Key: HIVE-24410 > URL: https://issues.apache.org/jira/browse/HIVE-24410 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3h > Remaining Estimate: 0h > > QB compaction runs within a doas +and+ hive.server2.enable.doAs is set to > true (as of HIVE-24089). 
On a secure cluster with Worker threads running in > HS2, this results in HMS client not receiving a login context during > compaction queries, so kerberos prompts for a login via stdin which causes > the worker thread to hang until it times out: > {code:java} > "node-x.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 > nid=0x1348 runnable [0x7f1beea95000] >java.lang.Thread.State: RUNNABLE > at java.io.FileInputStream.readBytes(Native Method) > at java.io.FileInputStream.read(FileInputStream.java:255) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <0x9fa38c90> (a java.io.BufferedInputStream) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:161) > at java.io.BufferedReader.readLine(BufferedReader.java:324) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.BufferedReader.readLine(BufferedReader.java:389) > at > com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153) > at > com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120) > at > com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862) > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708) > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at > javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > at java.security.AccessController.doPrivileged(Native Method) > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > at sun.security.jgss.GSSUtil.login(GSSUtil.java:258) > at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336) > at > sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146) > at > sun.security.jgss.
[jira] [Resolved] (HIVE-24410) Query-based compaction hangs because of doAs
[ https://issues.apache.org/jira/browse/HIVE-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage resolved HIVE-24410. -- Resolution: Fixed Committed to master branch. Thanks [~pvargacl] and [~pvary] for reviewing! > Query-based compaction hangs because of doAs > > > Key: HIVE-24410 > URL: https://issues.apache.org/jira/browse/HIVE-24410 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3h > Remaining Estimate: 0h > > QB compaction runs within a doas +and+ hive.server2.enable.doAs is set to > true (as of HIVE-24089). On a secure cluster with Worker threads running in > HS2, this results in HMS client not receiving a login context during > compaction queries, so kerberos prompts for a login via stdin which causes > the worker thread to hang until it times out: > {code:java} > "node-x.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 > nid=0x1348 runnable [0x7f1beea95000] >java.lang.Thread.State: RUNNABLE > at java.io.FileInputStream.readBytes(Native Method) > at java.io.FileInputStream.read(FileInputStream.java:255) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <0x9fa38c90> (a java.io.BufferedInputStream) > at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) > at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) > at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.InputStreamReader.read(InputStreamReader.java:184) > at java.io.BufferedReader.fill(BufferedReader.java:161) > at java.io.BufferedReader.readLine(BufferedReader.java:324) > - locked <0x8c7d5010> (a java.io.InputStreamReader) > at java.io.BufferedReader.readLine(BufferedReader.java:389) > at > 
com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153) > at > com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120) > at > com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862) > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708) > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > at java.security.AccessController.doPrivileged(Native Method) > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > at sun.security.jgss.GSSUtil.login(GSSUtil.java:258) > at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341) > at > sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336) > at > sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146) > at > sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) > at > sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189) > 
at > sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) > at > sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) > at > sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransp