[jira] [Updated] (HIVE-27523) Implement array_union UDF in Hive

2023-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27523:
--
Labels: pull-request-available  (was: )

> Implement array_union UDF in Hive
> -
>
> Key: HIVE-27523
> URL: https://issues.apache.org/jira/browse/HIVE-27523
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Major
>  Labels: pull-request-available
>
> *array_union(array1, array2)*
> Returns an array of the elements in the union of {{array1}} and {{array2}} 
> without duplicates.
>  
> {noformat}
> SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
> [1,2,3,5]
> {noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27525) Ease the write permissions on external table during create table operation

2023-07-21 Thread Sai Hemanth Gantasala (Jira)
Sai Hemanth Gantasala created HIVE-27525:


 Summary: Ease the write permissions on external table during 
create table operation
 Key: HIVE-27525
 URL: https://issues.apache.org/jira/browse/HIVE-27525
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Sai Hemanth Gantasala
Assignee: Sai Hemanth Gantasala


During the creation of external tables with a specified location, the general 
expectation is that the data is already present, or that it will be added to the 
location externally without involving HMS. So it is not really necessary to 
require write permissions on an external table's location at creation time.

This enhancement addresses a security concern: currently, users have to be 
granted unnecessary write permissions on an external file location even when 
the table is only used for reading data.

Update/delete operations would anyway require write permissions.
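As a rough sketch of the proposed behavior (the operation names and permission 
sets below are illustrative, not the actual HMS API):

```python
# Hypothetical sketch of the proposed permission model; the operation names
# and permission sets are illustrative, not the real HMS checks.
REQUIRED_PERMS = {
    "create_external_table": {"read"},   # proposed: no write check at create
    "insert": {"read", "write"},
    "delete": {"read", "write"},         # writes still require write access
}

def check_location_access(operation, granted):
    """Return True if the granted permissions cover the operation."""
    return REQUIRED_PERMS[operation] <= set(granted)

# A user with read-only access to the location can create the table...
assert check_location_access("create_external_table", {"read"})
# ...but still cannot delete from it without write access.
assert not check_location_access("delete", {"read"})
```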





[jira] [Created] (HIVE-27524) Create Hive datasource for Grafana

2023-07-21 Thread Jeeshan Das (Jira)
Jeeshan Das created HIVE-27524:
--

 Summary: Create Hive datasource for Grafana
 Key: HIVE-27524
 URL: https://issues.apache.org/jira/browse/HIVE-27524
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Reporter: Jeeshan Das


Hi,

Currently there is no direct way to connect to Hive from the visualization web 
application Grafana. 

Existing list of data sources:

[https://grafana.com/grafana/plugins/?type=datasource]

The requirement is to contribute a Hive datasource for Grafana so that 
reporting dashboards can be created in Grafana, consuming data directly from 
Hive tables.

Related communication from Grafana Labs:

[https://community.grafana.com/t/grafana-hive-connector/91855]

Thanks





[jira] [Updated] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest

2023-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24771:
--
Labels: pull-request-available  (was: )

> Fix hang of TransactionalKafkaWriterTest 
> -
>
> Key: HIVE-24771
> URL: https://issues.apache.org/jira/browse/HIVE-24771
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Kokila N
>Priority: Major
>  Labels: pull-request-available
> Attachments: hive.log.gz, jstack.1, jstack.2, jstack.3
>
>
> This test seems to hang randomly - I've launched three checks against it, all 
> of which started to hang after some time:
> http://ci.hive.apache.org/job/hive-flaky-check/187/
> http://ci.hive.apache.org/job/hive-flaky-check/188/
> http://ci.hive.apache.org/job/hive-flaky-check/189/
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on 
> condition [0x7f1d59381000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x894b3ed8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56)
> at 
> org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187)
> at 
> org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123)
> at 
> org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189)
> at 
> org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> 

[jira] [Assigned] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest

2023-07-21 Thread Kokila N (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kokila N reassigned HIVE-24771:
---

Assignee: Kokila N

> Fix hang of TransactionalKafkaWriterTest 
> -
>
> Key: HIVE-24771
> URL: https://issues.apache.org/jira/browse/HIVE-24771
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Kokila N
>Priority: Major
> Attachments: hive.log.gz, jstack.1, jstack.2, jstack.3
>
>
> This test seems to hang randomly - I've launched three checks against it, all 
> of which started to hang after some time:
> http://ci.hive.apache.org/job/hive-flaky-check/187/
> http://ci.hive.apache.org/job/hive-flaky-check/188/
> http://ci.hive.apache.org/job/hive-flaky-check/189/
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on 
> condition [0x7f1d59381000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x894b3ed8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56)
> at 
> org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187)
> at 
> org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123)
> at 
> org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189)
> at 
> org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> 

[jira] [Created] (HIVE-27523) Implement array_union UDF in Hive

2023-07-21 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-27523:
-

 Summary: Implement array_union UDF in Hive
 Key: HIVE-27523
 URL: https://issues.apache.org/jira/browse/HIVE-27523
 Project: Hive
  Issue Type: Sub-task
Reporter: Taraka Rama Rao Lethavadla
Assignee: Taraka Rama Rao Lethavadla


*array_union(array1, array2)*

Returns an array of the elements in the union of {{array1}} and {{array2}} 
without duplicates.

 
{noformat}
SELECT array_union(array(1, 2, 2, 3), array(1, 3, 5));
[1,2,3,5]
{noformat}
 
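As an editorial illustration of the expected semantics (a sketch, not the Hive 
implementation):

```python
def array_union(a, b):
    # Union of the two arrays without duplicates, keeping first-seen order,
    # matching the SELECT example above.
    seen = set()
    out = []
    for x in a + b:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

print(array_union([1, 2, 2, 3], [1, 3, 5]))  # [1, 2, 3, 5]
```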





[jira] [Commented] (HIVE-27195) Add database authorization for drop table command

2023-07-21 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745576#comment-17745576
 ] 

Stamatis Zampetakis commented on HIVE-27195:


Thanks for your hard work, Riju! 

I went over the results in the spreadsheet and I have a few questions.

Q1. Is it normal that, when the table or database is missing, the behavior of 
DROP TABLE is the same (a NOOP) with and without the IF EXISTS clause? 
The [Hive 
wiki|https://cwiki.apache.org/confluence/display/hive/languagemanual+ddl#LanguageManualDDL-DropTable]
 mentions the following:

"In Hive 0.7.0 or later, DROP returns an error if the table doesn't exist, 
unless IF EXISTS is specified or the configuration variable 
hive.exec.drop.ignorenonexistent is set to true."

Q2. I noticed that for non-temporary tables there is a "GRANT DROP ON TABLE" 
statement in the sample test case. Why is this needed? I also left a related 
comment in the PR.

Q3. I observed that DROP TABLE *IF EXISTS* will throw an authorization error 
even when the operation is a NOOP (i.e., the database/table does not exist). I 
am wondering what happens with respect to authorization if we do CREATE TABLE 
*IF NOT EXISTS* and the table is already there. Do we perform the authorization 
anyway, or do we simply return as a NOOP? Maybe it's worth keeping the behavior 
of the two operations consistent. Anyway, I am not an authorization expert, so I 
will defer the decision about the expected output to [~rmani] or [~hemanth619]. 

> Add database authorization for drop table command
> -
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Include authorization of the database object during the "drop table" command. 
> Similar to "create table", DB permissions should be verified in the case of 
> "drop table" too. Add the database object, along with the table object, to the 
> list of output objects sent for verifying privileges. This change would 
> ensure that in the case of a non-existent table or a temporary table (skipped 
> from authorization after HIVE-20051), the authorizer still verifies privileges 
> for the database object.
> This would also prevent DROP TABLE IF EXISTS command failures for temporary or 
> non-existent tables with `RangerHiveAuthorizer`. In the case of a 
> temporary/non-existent table, empty input and output HivePrivilegeObject lists 
> are sent to the Ranger authorizer, and after 
> https://issues.apache.org/jira/browse/RANGER-3407 the authorization request is 
> built from the command when the objects are empty. Hence, the DROP TABLE IF 
> EXISTS command fails with HiveAccessControlException.
> Steps to Repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: user [rtrivedi] does not have [DROP] privilege on 
> [test/temp_table] (state=42000,code=4) {code}





[jira] [Updated] (HIVE-27522) Iceberg: Bucket partition transformation date type support

2023-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27522:
--
Labels: pull-request-available  (was: )

> Iceberg: Bucket partition transformation date type support
> --
>
> Key: HIVE-27522
> URL: https://issues.apache.org/jira/browse/HIVE-27522
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException:  
> ICEBERG_BUCKET() only takes 
> STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first 
> argument, got DATE
> at 
> org.apache.iceberg.mr.hive.GenericUDFIcebergBucket.initialize(GenericUDFIcebergBucket.java:162)
>  ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149)
>  ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:235)
>  ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> at 
> org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.lambda$null$0(HiveIcebergStorageHandler.java:142)
>  ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> at 
> org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.allStaticPartitions(SortedDynPartitionOptimizer.java:420)
>  ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> at 
> org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.process(SortedDynPartitionOptimizer.java:195)
>  ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
> {code}





[jira] [Commented] (HIVE-27195) Add database authorization for drop table command

2023-07-21 Thread Riju Trivedi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745509#comment-17745509
 ] 

Riju Trivedi commented on HIVE-27195:
-

Thank you [~zabetak] for reviewing and consolidating the test scenarios. I have 
updated the test results in the 
[sheet|https://docs.google.com/spreadsheets/d/1CJ1U0LOCpK7TfxY5RSSM4Wmbmt7GiKt5VQrWt1x2tfs/edit?pli=1#gid=0]
 and uploaded the tests to the PR.

> Add database authorization for drop table command
> -
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Include authorization of the database object during the "drop table" command. 
> Similar to "create table", DB permissions should be verified in the case of 
> "drop table" too. Add the database object, along with the table object, to the 
> list of output objects sent for verifying privileges. This change would 
> ensure that in the case of a non-existent table or a temporary table (skipped 
> from authorization after HIVE-20051), the authorizer still verifies privileges 
> for the database object.
> This would also prevent DROP TABLE IF EXISTS command failures for temporary or 
> non-existent tables with `RangerHiveAuthorizer`. In the case of a 
> temporary/non-existent table, empty input and output HivePrivilegeObject lists 
> are sent to the Ranger authorizer, and after 
> https://issues.apache.org/jira/browse/RANGER-3407 the authorization request is 
> built from the command when the objects are empty. Hence, the DROP TABLE IF 
> EXISTS command fails with HiveAccessControlException.
> Steps to Repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: user [rtrivedi] does not have [DROP] privilege on 
> [test/temp_table] (state=42000,code=4) {code}





[jira] [Updated] (HIVE-27236) Iceberg: DROP BRANCH SQL implementation

2023-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27236:
--
Labels: pull-request-available  (was: )

> Iceberg: DROP BRANCH SQL implementation
> ---
>
> Key: HIVE-27236
> URL: https://issues.apache.org/jira/browse/HIVE-27236
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (HIVE-27484) Limit pushdown with offset generate wrong results

2023-07-21 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-27484:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks [~okumin] for the patch and [~aturoczy] for review.

> Limit pushdown with offset generate wrong results
> -
>
> Key: HIVE-27484
> URL: https://issues.apache.org/jira/browse/HIVE-27484
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0-alpha-2
>Reporter: okumin
>Assignee: okumin
>Priority: Major
>  Labels: pull-request-available
>
> With `hive.optimize.limittranspose` in CBO, Hive can generate incorrect 
> results.
> For example, I'd say this case should generate 1 row.
> https://github.com/apache/hive/blob/rel/release-4.0.0-alpha-2/ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out#L1328-L1341
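A toy model of why a pushed-down limit must account for the offset (plain 
Python lists standing in for row streams; this is an illustration, not Hive's 
optimizer logic):

```python
rows = list(range(10))
limit, offset = 2, 3

# Reference semantics of LIMIT 2 OFFSET 3 applied once, at the top.
expected = rows[offset:offset + limit]

# Safe pushdown: the input may be cut to limit+offset rows, because the
# outer operator still skips the first `offset` of them.
safe = rows[:limit + offset][offset:offset + limit]
assert safe == expected

# Unsafe pushdown: cutting the input to just `limit` rows starves the
# outer operator once the offset is applied, producing the wrong result.
unsafe = rows[:limit][offset:offset + limit]
assert unsafe == []   # rows 3 and 4 were discarded too early
```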





[jira] [Updated] (HIVE-25585) Put dagId to MDC once it's available in HS2

2023-07-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25585:

Description: 
This is about putting dagID to MDC once the DAG is submitted. This way, dagId 
can be easily appended to log messages by log4j.
Like:
{code}
hiveserver2 <14>1 2022-10-25T10:51:05.496Z hiveserver2-0 hiveserver2 1 
24a33001-3523-415b-a6a8-7733708507af [mdc@18060 
class="monitoring.RenderStrategy$LogToFileFunction" 
dagId="dag_194252139__3" level="INFO" operationLogLevel="EXECUTION" 
queryId="hive_20221025105055_ec8135b4-1b0e-46b8-bb3f-4b58b41cf53b" 
sessionId="6fc7adf6-d3b0-470a-813f-03c79a0ca20d" 
thread="HiveServer2-Background-Pool: Thread-196"] Map 1: 1/1  Map 2: 
0(+11)/208  Map 7: 1/1  Map 8: 12(+11)/23  Map 9: 1/1  Reducer 3: 0/738  
Reducer 4: 0/642  Reducer 5: 0/1  Reducer 6: 0/782
{code}
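The same idea can be mimicked outside Hive with a logging filter that injects a 
context field into every record (a Python sketch of the MDC pattern, not the 
actual HS2 code):

```python
import logging

class MdcFilter(logging.Filter):
    """Attach a dagId field to every log record, mimicking log4j's MDC."""
    def __init__(self):
        super().__init__()
        self.context = {}

    def filter(self, record):
        record.dagId = self.context.get("dagId", "-")
        return True

mdc = MdcFilter()
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("dagId=%(dagId)s %(message)s"))
logger = logging.getLogger("hs2")
logger.addFilter(mdc)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Once the DAG is submitted, set the id; every later message carries it.
mdc.context["dagId"] = "dag_123_1"
logger.info("Map 1: 1/1")   # -> dagId=dag_123_1 Map 1: 1/1
```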

  was:This is about putting dagID to MDC once the DAG is submitted. This way, 
dagId can be easily appended to log messages by log4j.


> Put dagId to MDC once it's available in HS2
> ---
>
> Key: HIVE-25585
> URL: https://issues.apache.org/jira/browse/HIVE-25585
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is about putting dagID to MDC once the DAG is submitted. This way, dagId 
> can be easily appended to log messages by log4j.
> Like:
> {code}
> hiveserver2 <14>1 2022-10-25T10:51:05.496Z hiveserver2-0 hiveserver2 1 
> 24a33001-3523-415b-a6a8-7733708507af [mdc@18060 
> class="monitoring.RenderStrategy$LogToFileFunction" 
> dagId="dag_194252139__3" level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20221025105055_ec8135b4-1b0e-46b8-bb3f-4b58b41cf53b" 
> sessionId="6fc7adf6-d3b0-470a-813f-03c79a0ca20d" 
> thread="HiveServer2-Background-Pool: Thread-196"] Map 1: 1/1  Map 2: 
> 0(+11)/208  Map 7: 1/1  Map 8: 12(+11)/23  Map 9: 1/1  Reducer 3: 0/738  
> Reducer 4: 0/642  Reducer 5: 0/1  Reducer 6: 0/782
> {code}





[jira] [Updated] (HIVE-25585) Put dagId to MDC once it's available in HS2

2023-07-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25585:

Description: This is about putting dagID to MDC once the DAG is submitted. 
This way, dagId can be easily appended to log messages by log4j.

> Put dagId to MDC once it's available in HS2
> ---
>
> Key: HIVE-25585
> URL: https://issues.apache.org/jira/browse/HIVE-25585
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is about putting dagID to MDC once the DAG is submitted. This way, dagId 
> can be easily appended to log messages by log4j.





[jira] [Created] (HIVE-27522) Iceberg: Bucket partition transformation date type support

2023-07-21 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-27522:
-

 Summary: Iceberg: Bucket partition transformation date type support
 Key: HIVE-27522
 URL: https://issues.apache.org/jira/browse/HIVE-27522
 Project: Hive
  Issue Type: Task
Reporter: Denys Kuzmenko


{code}
Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException:  
ICEBERG_BUCKET() only takes 
STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first 
argument, got DATE
at 
org.apache.iceberg.mr.hive.GenericUDFIcebergBucket.initialize(GenericUDFIcebergBucket.java:162)
 ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149)
 ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
at 
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:235)
 ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
at 
org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.lambda$null$0(HiveIcebergStorageHandler.java:142)
 ~[hive-iceberg-handler-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
at 
org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.allStaticPartitions(SortedDynPartitionOptimizer.java:420)
 ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
at 
org.apache.hadoop.hive.ql.optimizer.SortedDynPartitionOptimizer$SortedDynamicPartitionProc.process(SortedDynPartitionOptimizer.java:195)
 ~[hive-exec-3.1.3000.7.2.17.0-335.jar:3.1.3000.7.2.17.0-335]
{code}
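For reference, Iceberg's bucket transform hashes the value with a 32-bit 
Murmur3 hash and takes it modulo the bucket count, with dates represented as 
days since the Unix epoch. A simplified sketch (Python's built-in hash stands 
in for Murmur3, so the bucket numbers will not match Iceberg's):

```python
from datetime import date

EPOCH = date(1970, 1, 1)

def bucket_date(d, num_buckets, hash_fn=hash):
    # Iceberg stores a date as days since 1970-01-01; the bucket transform
    # hashes that integer. hash_fn is a stand-in for Iceberg's Murmur3 hash.
    days = (d - EPOCH).days
    return hash_fn(days) % num_buckets

# Any date maps deterministically into one of num_buckets partitions.
b = bucket_date(date(2023, 7, 21), 16)
assert 0 <= b < 16
```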





[jira] [Resolved] (HIVE-27437) Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing

2023-07-21 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27437.
---
Fix Version/s: 4.0.0-beta-1
   Resolution: Fixed

> Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch 
> after processing
> ---
>
> Key: HIVE-27437
> URL: https://issues.apache.org/jira/browse/HIVE-27437
> Project: Hive
>  Issue Type: Task
>Reporter: Alagappan Maruthappan
>Assignee: Alagappan Maruthappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>
> There seems to be a memory leak in VectorizedOrcRecordReader. When a 
> MapColumnVector or ListColumnVector is used and the VectorizedRowBatch is not 
> reset after every read, the vector keeps growing and a lot of time is spent 
> allocating memory.
> The reset happens in VectorizedParquetRecordReader:
> https://github.com/apache/hive/blob/f78ca5df80c0bcb566f0915cda65112268df492c/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L400
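The growth pattern can be modeled in a few lines (a toy stand-in, not the 
actual ColumnVector classes):

```python
class ToyBatch:
    """Models a reused row batch whose child vector only ever appends."""
    def __init__(self):
        self.child = []

    def read(self, rows, reset):
        if reset:
            self.child.clear()   # what the Parquet reader does before refilling
        self.child.extend(rows)

leaky, fixed = ToyBatch(), ToyBatch()
for _ in range(1000):
    leaky.read([0] * 10, reset=False)   # grows on every batch
    fixed.read([0] * 10, reset=True)    # stays at one batch's size

assert len(leaky.child) == 10_000
assert len(fixed.child) == 10
```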





[jira] [Commented] (HIVE-27437) Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch after processing

2023-07-21 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745470#comment-17745470
 ] 

Denys Kuzmenko commented on HIVE-27437:
---

Merged to master.
[~maswin], thank you for the contribution!

> Vectorization: VectorizedOrcRecordReader does not reset VectorizedRowBatch 
> after processing
> ---
>
> Key: HIVE-27437
> URL: https://issues.apache.org/jira/browse/HIVE-27437
> Project: Hive
>  Issue Type: Task
>Reporter: Alagappan Maruthappan
>Assignee: Alagappan Maruthappan
>Priority: Major
>  Labels: pull-request-available
>
> There seems to be a memory leak in VectorizedOrcRecordReader. When a 
> MapColumnVector or ListColumnVector is used and the VectorizedRowBatch is not 
> reset after every read, the vector keeps growing and a lot of time is spent 
> allocating memory.
> The reset happens in VectorizedParquetRecordReader:
> https://github.com/apache/hive/blob/f78ca5df80c0bcb566f0915cda65112268df492c/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L400





[jira] [Assigned] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70

2023-07-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-27521:
---

Assignee: László Bodor

> Bump bouncycastle (bcprov-jdk15on) to 1.70
> --
>
> Key: HIVE-27521
> URL: https://issues.apache.org/jira/browse/HIVE-27521
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Work started] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70

2023-07-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27521 started by László Bodor.
---
> Bump bouncycastle (bcprov-jdk15on) to 1.70
> --
>
> Key: HIVE-27521
> URL: https://issues.apache.org/jira/browse/HIVE-27521
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70

2023-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27521:
--
Labels: pull-request-available  (was: )

> Bump bouncycastle (bcprov-jdk15on) to 1.70
> --
>
> Key: HIVE-27521
> URL: https://issues.apache.org/jira/browse/HIVE-27521
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Created] (HIVE-27521) Bump bouncycastle (bcprov-jdk15on) to 1.70

2023-07-21 Thread Jira
László Bodor created HIVE-27521:
---

 Summary: Bump bouncycastle (bcprov-jdk15on) to 1.70
 Key: HIVE-27521
 URL: https://issues.apache.org/jira/browse/HIVE-27521
 Project: Hive
  Issue Type: Improvement
Reporter: László Bodor





