[jira] [Updated] (HIVE-27523) Implement array_union UDF in Hive
[ https://issues.apache.org/jira/browse/HIVE-27523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-27523:
----------------------------------
    Labels: pull-request-available  (was: )

> Implement array_union UDF in Hive
> ---------------------------------
>
>          Key: HIVE-27523
>          URL: https://issues.apache.org/jira/browse/HIVE-27523
>      Project: Hive
>   Issue Type: Sub-task
>     Reporter: Taraka Rama Rao Lethavadla
>     Assignee: Taraka Rama Rao Lethavadla
>     Priority: Major
>       Labels: pull-request-available
>
> *array_union(array1, array2)*
> Returns an array of the elements in the union of {{array1}} and {{array2}} without duplicates.
> {noformat}
> SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
> [1,2,3,5]
> {noformat}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
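The following HiveQL sketch expands on the spec above using only constant arguments. The first SELECT is taken from the issue description; the extra calls and the element order shown in the expected output are assumptions about the new UDF rather than confirmed behaviour.

{code:sql}
-- Deduplicated union of two integer arrays (example from the issue description):
SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
-- expected: [1,2,3,5]

-- The same idea with string elements; the relative order of elements in the
-- result is an assumption here, only the "no duplicates" property is specified:
SELECT array_union(array('a', 'b', 'b'), array('b', 'c'));
-- expected: ["a","b","c"]

-- When the second array adds nothing new, the result is just the
-- deduplicated first array:
SELECT array_union(array(1, 2), array(2));
-- expected: [1,2]
{code}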
[jira] [Created] (HIVE-27525) Ease the write permissions on external table during create table operation
Sai Hemanth Gantasala created HIVE-27525:
-----------------------------------------
    Summary: Ease the write permissions on external table during create table operation
        Key: HIVE-27525
        URL: https://issues.apache.org/jira/browse/HIVE-27525
    Project: Hive
 Issue Type: Improvement
 Components: Standalone Metastore
   Reporter: Sai Hemanth Gantasala
   Assignee: Sai Hemanth Gantasala

During the creation of an external table with a specified location, the general expectation is that the data is already present, or that data may be added to the location externally without involving HMS. So read and write permissions on the location are not really required at table creation time. This enhancement addresses a security concern: currently, users have to be granted unnecessary write permissions on an external file location even when the table is only used for reading data. Update/delete operations would still require write permissions.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
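A minimal HiveQL sketch of the scenario this improvement targets: the data already sits at the location (written outside of HMS) and the table is only ever read through Hive. The path, table name, and storage format below are hypothetical.

{code:sql}
-- Hypothetical example: the ORC files under this path were produced by an
-- external process, so Hive never needs to write to the location.
CREATE EXTERNAL TABLE sales_raw (
  id     INT,
  amount DECIMAL(10,2)
)
STORED AS ORC
LOCATION '/data/landing/sales_raw';

-- Read-only usage; no write to the location is ever issued through Hive.
SELECT count(*) FROM sales_raw;
{code}

As the issue reads, the CREATE statement above would then only need read access on '/data/landing/sales_raw', while update/delete and other write operations would still require write permissions.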
[jira] [Created] (HIVE-27524) Create Hive datasource for Grafana
Jeeshan Das created HIVE-27524:
-------------------------------
    Summary: Create Hive datasource for Grafana
        Key: HIVE-27524
        URL: https://issues.apache.org/jira/browse/HIVE-27524
    Project: Hive
 Issue Type: New Feature
 Components: Hive
   Reporter: Jeeshan Das

Hi,
Currently there is no direct way to connect to Hive from the visualization web application Grafana. Existing list of data sources: [https://grafana.com/grafana/plugins/?type=datasource]
The requirement is to contribute a Hive datasource plugin for Grafana, so that reporting dashboards can be built in Grafana directly on top of data in Hive tables. Related communication with Grafana Labs: [https://community.grafana.com/t/grafana-hive-connector/91855]
Thanks

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest
[ https://issues.apache.org/jira/browse/HIVE-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24771: -- Labels: pull-request-available (was: ) > Fix hang of TransactionalKafkaWriterTest > - > > Key: HIVE-24771 > URL: https://issues.apache.org/jira/browse/HIVE-24771 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Kokila N >Priority: Major > Labels: pull-request-available > Attachments: hive.log.gz, jstack.1, jstack.2, jstack.3 > > > this test seems to hang randomly - I've launched 3 checks against it - all of > which started to hang after some time > http://ci.hive.apache.org/job/hive-flaky-check/187/ > http://ci.hive.apache.org/job/hive-flaky-check/188/ > http://ci.hive.apache.org/job/hive-flaky-check/189/ > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on > condition [0x7f1d59381000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x894b3ed8> (a > java.util.concurrent.CountDownLatch$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56) > at > org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187) > at > org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123) > at > org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189) > at > org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at >
[jira] [Assigned] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest
[ https://issues.apache.org/jira/browse/HIVE-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kokila N reassigned HIVE-24771: --- Assignee: Kokila N > Fix hang of TransactionalKafkaWriterTest > - > > Key: HIVE-24771 > URL: https://issues.apache.org/jira/browse/HIVE-24771 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Kokila N >Priority: Major > Attachments: hive.log.gz, jstack.1, jstack.2, jstack.3 > > > this test seems to hang randomly - I've launched 3 checks against it - all of > which started to hang after some time > http://ci.hive.apache.org/job/hive-flaky-check/187/ > http://ci.hive.apache.org/job/hive-flaky-check/188/ > http://ci.hive.apache.org/job/hive-flaky-check/189/ > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on > condition [0x7f1d59381000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x894b3ed8> (a > java.util.concurrent.CountDownLatch$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56) > at > org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187) > at > org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123) > at > org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189) > at > org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at >
[jira] [Created] (HIVE-27523) Implement array_union UDF in Hive
Taraka Rama Rao Lethavadla created HIVE-27523:
----------------------------------------------
    Summary: Implement array_union UDF in Hive
        Key: HIVE-27523
        URL: https://issues.apache.org/jira/browse/HIVE-27523
    Project: Hive
 Issue Type: Sub-task
   Reporter: Taraka Rama Rao Lethavadla
   Assignee: Taraka Rama Rao Lethavadla

*array_union(array1, array2)*
Returns an array of the elements in the union of {{array1}} and {{array2}} without duplicates.
{noformat}
SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
[1,2,3,5]
{noformat}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27195) Add database authorization for drop table command
[ https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745576#comment-17745576 ]

Stamatis Zampetakis commented on HIVE-27195:
--------------------------------------------
Thanks for your hard work Riju! I went over the results in the spreadsheet and I have a few questions.

Q1. Is it normal that when the table or database is missing, the behavior of DROP TABLE is the same (NOOP) with and without the IF EXISTS clause? The [Hive wiki|https://cwiki.apache.org/confluence/display/hive/languagemanual+ddl#LanguageManualDDL-DropTable] mentions the following: "In Hive 0.7.0 or later, DROP returns an error if the table doesn't exist, unless IF EXISTS is specified or the configuration variable hive.exec.drop.ignorenonexistent is set to true."

Q2. I noticed that for non-temporary tables there is a "GRANT DROP ON TABLE" statement in the sample test case. Why is this needed? I also left a related comment in the PR.

Q3. I observed that DROP TABLE *IF EXISTS* will throw an authorization error even when the operation is a NOOP (i.e., the database/table does not exist). I am wondering what happens with respect to authorization if we do CREATE TABLE *IF NOT EXISTS* and the table is already there. Do we perform the authorization anyway, or do we simply return as a NOOP? Maybe it's worth keeping the behavior of the two operations consistent. Anyway, I am not an authorization expert, so I will defer the decision about the expected output to [~rmani] or [~hemanth619].

> Add database authorization for drop table command
> --------------------------------------------------
>
>          Key: HIVE-27195
>          URL: https://issues.apache.org/jira/browse/HIVE-27195
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Riju Trivedi
>     Assignee: Riju Trivedi
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 0.5h
>   Remaining Estimate: 0h
>
> Include authorization of the database object during the "drop table" command. Similar to "create table", DB permissions should be verified in the case of "drop table" too. Add the database object along with the table object to the list of output objects sent for verifying privileges. This change would ensure that, in the case of a non-existent table or a temporary table (skipped from authorization after HIVE-20051), the authorizer will verify privileges for the database object.
> This would also prevent DROP TABLE IF EXISTS command failures for temporary or non-existing tables with `RangerHiveAuthorizer`. In the case of a temporary/non-existing table, empty input and output HivePrivilege objects are sent to the Ranger authorizer, and after https://issues.apache.org/jira/browse/RANGER-3407 the authorization request is built from the command in case of empty objects. Hence, the DROP TABLE IF EXISTS command fails with HiveAccessControlException.
> Steps to repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [rtrivedi] does not have [DROP] privilege on [test/temp_table] (state=42000,code=4) {code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
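For the consistency question raised in Q3, here is a small hedged sketch of the two conditional DDL forms being compared; the database and table names are made up, and whether authorization runs on the NOOP paths is exactly the open question.

{code:sql}
-- Table does not exist: the statement is a NOOP, yet (per the observation in
-- Q3) an authorization error can still be raised.
DROP TABLE IF EXISTS test.missing_table;

-- Table already exists: also a NOOP. The open question is whether this path
-- performs authorization as well, so that the two behaviours stay consistent.
CREATE TABLE IF NOT EXISTS test.existing_table (id INT);
{code}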
[jira] [Updated] (HIVE-27522) Iceberg: Bucket partition transformation date type support
[ https://issues.apache.org/jira/browse/HIVE-27522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27522: -- Labels: pull-request-available (was: ) > Iceberg: Bucket partition transformation date type support > -- > > Key: HIVE-27522 > URL: https://issues.apache.org/jira/browse/HIVE-27522 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > > {code} > Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: > ICEBERG_BUCKET() only takes > STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first > argument, got DATE > at > org.apache.iceberg.mr.hive.GenericUDFIcebergBucket.initialize(GenericUDFIcebergBucket.java:162) > ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149) > ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > at > org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:235) > ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > at > org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.lambda$null$0(HiveIcebergStorageHandler.java:142) > ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > at > org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.allStaticPartitions(SortedDynPartitionOptimizer.java:420) > ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > at > org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.process(SortedDynPartitionOptimizer.java:195) > ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
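Below is a hedged sketch of the kind of DDL/DML that runs into the stack trace above: an Iceberg table whose partition spec applies the bucket transform to a DATE column. The names are hypothetical, and the PARTITIONED BY SPEC / STORED BY ICEBERG syntax is assumed to be the Hive Iceberg integration syntax on master.

{code:sql}
-- Hypothetical repro sketch: bucket() partition transform over a DATE column.
CREATE TABLE events (
  id         BIGINT,
  event_date DATE
)
PARTITIONED BY SPEC (bucket(16, event_date))
STORED BY ICEBERG;

-- Writing into the table evaluates ICEBERG_BUCKET() on the DATE column, which
-- is where the UDFArgumentException above is thrown before this fix.
INSERT INTO events VALUES (1, DATE '2023-07-21');
{code}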
[jira] [Commented] (HIVE-27195) Add database authorization for drop table command
[ https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745509#comment-17745509 ] Riju Trivedi commented on HIVE-27195: - Thank you [~zabetak] for reviewing and consolidating test scenarios. I have updated the test results to the [sheet|https://docs.google.com/spreadsheets/d/1CJ1U0LOCpK7TfxY5RSSM4Wmbmt7GiKt5VQrWt1x2tfs/edit?pli=1#gid=0] and uploaded tests to the PR. > Add database authorization for drop table command > - > > Key: HIVE-27195 > URL: https://issues.apache.org/jira/browse/HIVE-27195 > Project: Hive > Issue Type: Bug >Reporter: Riju Trivedi >Assignee: Riju Trivedi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Include authorization of the database object during the "drop table" command. > Similar to "Create table", DB permissions should be verified in the case of > "drop table" too. Add the database object along with the table object to the > list of output objects sent for verifying privileges. This change would > ensure that in case of a non-existent table or temporary table (skipped from > authorization after HIVE-20051), the authorizer will verify privileges for > the database object. > This would also prevent DROP TABLE IF EXISTS command failure for temporary or > non-existing tables with `RangerHiveAuthorizer`. In case of > temporary/non-existing table, empty input and output HivePrivilege Objects > are sent to Ranger authorizer and after > https://issues.apache.org/jira/browse/RANGER-3407 authorization request is > built from command in case of empty objects. Hence, the drop table if Exists > command fails with HiveAccessControlException. > Steps to Repro: > {code:java} > use test; CREATE TEMPORARY TABLE temp_table (id int); > drop table if exists test.temp_table; > Error: Error while compiling statement: FAILED: HiveAccessControlException > Permission denied: user [rtrivedi] does not have [DROP] privilege on > [test/temp_table] (state=42000,code=4) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27236) Iceberg: DROP BRANCH SQL implementation
[ https://issues.apache.org/jira/browse/HIVE-27236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27236: -- Labels: pull-request-available (was: ) > Iceberg: DROP BRANCH SQL implementation > --- > > Key: HIVE-27236 > URL: https://issues.apache.org/jira/browse/HIVE-27236 > Project: Hive > Issue Type: Sub-task > Components: Iceberg integration >Reporter: zhangbutao >Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27484) Limit pushdown with offset generate wrong results
[ https://issues.apache.org/jira/browse/HIVE-27484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-27484: -- Resolution: Fixed Status: Resolved (was: Patch Available) Merged to master. Thanks [~okumin] for the patch and [~aturoczy] for review. > Limit pushdown with offset generate wrong results > - > > Key: HIVE-27484 > URL: https://issues.apache.org/jira/browse/HIVE-27484 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0-alpha-2 >Reporter: okumin >Assignee: okumin >Priority: Major > Labels: pull-request-available > > With `hive.optimize.limittranspose` in CBO, Hive can generate incorrect > results. > For example, I'd say this case should generate 1 row. > https://github.com/apache/hive/blob/rel/release-4.0.0-alpha-2/ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out#L1328-L1341 -- This message was sent by Atlassian Jira (v8.20.10#820010)
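A self-contained sketch of the affected pattern, using made-up tables rather than the exact q-file case linked in the description: a LIMIT with an offset over an outer join while hive.optimize.limittranspose is enabled.

{code:sql}
-- Illustrative only; not the exact limit_join_transpose.q case.
SET hive.optimize.limittranspose=true;

CREATE TABLE t1 (id INT);
CREATE TABLE t2 (id INT);
INSERT INTO t1 VALUES (1), (2), (3);
INSERT INTO t2 VALUES (2), (3);

-- LIMIT with an offset (Hive's "LIMIT offset,rows" form): skip 1 row, fetch 1.
-- If the limit is transposed below the join without accounting for the offset,
-- rows that the outer LIMIT/OFFSET still needs can be dropped, which is how
-- wrong results can appear.
SELECT t1.id, t2.id
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.id
ORDER BY t1.id
LIMIT 1,1;
{code}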
[jira] [Updated] (HIVE-25585) Put dagId to MDC once it's available in HS2
[ https://issues.apache.org/jira/browse/HIVE-25585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-25585: Description: This is about putting dagID to MDC once the DAG is submitted. This way, dagId can be easily appended to log messages by log4j. Like: {code} hiveserver2 <14>1 2022-10-25T10:51:05.496Z hiveserver2-0 hiveserver2 1 24a33001-3523-415b-a6a8-7733708507af [mdc@18060 class="monitoring.RenderStrategy$LogToFileFunction" dagId="dag_194252139__3" level="INFO" operationLogLevel="EXECUTION" queryId="hive_20221025105055_ec8135b4-1b0e-46b8-bb3f-4b58b41cf53b" sessionId="6fc7adf6-d3b0-470a-813f-03c79a0ca20d" thread="HiveServer2-Background-Pool: Thread-196"] Map 1: 1/1Map 2: 0(+11)/208Map 7: 1/1Map 8: 12(+11)/23Map 9: 1/1Reducer 3: 0/738 Reducer 4: 0/642Reducer 5: 0/1Reducer 6: 0/782 {code} was:This is about putting dagID to MDC once the DAG is submitted. This way, dagId can be easily appended to log messages by log4j. > Put dagId to MDC once it's available in HS2 > --- > > Key: HIVE-25585 > URL: https://issues.apache.org/jira/browse/HIVE-25585 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-beta-1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This is about putting dagID to MDC once the DAG is submitted. This way, dagId > can be easily appended to log messages by log4j. > Like: > {code} > hiveserver2 <14>1 2022-10-25T10:51:05.496Z hiveserver2-0 hiveserver2 1 > 24a33001-3523-415b-a6a8-7733708507af [mdc@18060 > class="monitoring.RenderStrategy$LogToFileFunction" > dagId="dag_194252139__3" level="INFO" operationLogLevel="EXECUTION" > queryId="hive_20221025105055_ec8135b4-1b0e-46b8-bb3f-4b58b41cf53b" > sessionId="6fc7adf6-d3b0-470a-813f-03c79a0ca20d" > thread="HiveServer2-Background-Pool: Thread-196"] Map 1: 1/1Map 2: > 0(+11)/208Map 7: 1/1Map 8: 12(+11)/23Map 9: 1/1Reducer 3: > 0/738Reducer 4: 0/642Reducer 5: 0/1Reducer 6: 0/782 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-25585) Put dagId to MDC once it's available in HS2
[ https://issues.apache.org/jira/browse/HIVE-25585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-25585: Description: This is about putting dagID to MDC once the DAG is submitted. This way, dagId can be easily appended to log messages by log4j. > Put dagId to MDC once it's available in HS2 > --- > > Key: HIVE-25585 > URL: https://issues.apache.org/jira/browse/HIVE-25585 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-beta-1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This is about putting dagID to MDC once the DAG is submitted. This way, dagId > can be easily appended to log messages by log4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27522) Iceberg: Bucket partition transformation date type support
Denys Kuzmenko created HIVE-27522: - Summary: Iceberg: Bucket partition transformation date type support Key: HIVE-27522 URL: https://issues.apache.org/jira/browse/HIVE-27522 Project: Hive Issue Type: Task Reporter: Denys Kuzmenko {code} Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: ICEBERG_BUCKET() only takes STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first argument, got DATE at org.apache.iceberg.mr.hive.GenericUDFIcebergBucket.initialize(GenericUDFIcebergBucket.java:162) ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149) ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:235) ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.lambda$null$0(HiveIcebergStorageHandler.java:142) ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] at org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.allStaticPartitions(SortedDynPartitionOptimizer.java:420) ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] at org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.process(SortedDynPartitionOptimizer.java:195) ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27437) Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
[ https://issues.apache.org/jira/browse/HIVE-27437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko resolved HIVE-27437.
-----------------------------------
    Fix Version/s: 4.0.0-beta-1
       Resolution: Fixed

> Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
> --------------------------------------------------------------------------------------------
>
>          Key: HIVE-27437
>          URL: https://issues.apache.org/jira/browse/HIVE-27437
>      Project: Hive
>   Issue Type: Task
>     Reporter: Alagappan Maruthappan
>     Assignee: Alagappan Maruthappan
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 4.0.0-beta-1
>
> There seems to be a memory leak in VectorizedOrcRecordReader. When MapColumnVector or ListColumnVector is used and the VectorizedRowBatch is not reset after every read, the vector keeps growing and a lot of time is spent allocating memory.
> The reset happens in VectorizedParquetRecordReader - https://github.com/apache/hive/blob/f78ca5df80c0bcb566f0915cda65112268df492c/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L400

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27437) Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
[ https://issues.apache.org/jira/browse/HIVE-27437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745470#comment-17745470 ]

Denys Kuzmenko commented on HIVE-27437:
---------------------------------------
Merged to master. [~maswin], thank you for the contribution!

> Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing
> --------------------------------------------------------------------------------------------
>
>          Key: HIVE-27437
>          URL: https://issues.apache.org/jira/browse/HIVE-27437
>      Project: Hive
>   Issue Type: Task
>     Reporter: Alagappan Maruthappan
>     Assignee: Alagappan Maruthappan
>     Priority: Major
>       Labels: pull-request-available
>
> There seems to be a memory leak in VectorizedOrcRecordReader. When MapColumnVector or ListColumnVector is used and the VectorizedRowBatch is not reset after every read, the vector keeps growing and a lot of time is spent allocating memory.
> The reset happens in VectorizedParquetRecordReader - https://github.com/apache/hive/blob/f78ca5df80c0bcb566f0915cda65112268df492c/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L400

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70
[ https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-27521: --- Assignee: László Bodor > Bump bouncycastle (bcprov-jdk15on) to 1.70 > -- > > Key: HIVE-27521 > URL: https://issues.apache.org/jira/browse/HIVE-27521 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work started] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70
[ https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-27521 started by László Bodor. --- > Bump bouncycastle (bcprov-jdk15on) to 1.70 > -- > > Key: HIVE-27521 > URL: https://issues.apache.org/jira/browse/HIVE-27521 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70
[ https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27521: -- Labels: pull-request-available (was: ) > Bump bouncycastle (bcprov-jdk15on) to 1.70 > -- > > Key: HIVE-27521 > URL: https://issues.apache.org/jira/browse/HIVE-27521 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70
László Bodor created HIVE-27521: --- Summary: Bump bouncycastle (bcprov-jdk15on) to 1.70 Key: HIVE-27521 URL: https://issues.apache.org/jira/browse/HIVE-27521 Project: Hive Issue Type: Improvement Reporter: László Bodor -- This message was sent by Atlassian Jira (v8.20.10#820010)