[jira] [Assigned] (HIVE-24484) Upgrade Hadoop to 3.3.3

2022-06-02 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HIVE-24484:
---

Assignee: Ayush Saxena  (was: David Mollitor)

> Upgrade Hadoop to 3.3.3
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 13m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26264) Iceberg integration: Fetch virtual columns on demand

2022-06-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-26264.
---
Resolution: Fixed

Pushed to master. Thanks [~pvary] for review.

> Iceberg integration: Fetch virtual columns on demand
> 
>
> Key: HIVE-26264
> URL: https://issues.apache.org/jira/browse/HIVE-26264
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Currently virtual columns are fetched from Iceberg tables when the statement 
> being executed is a delete or update, and the setting is global, so it 
> affects every table touched by the statement. The read and write schemas 
> also depend on this operation-level setting.
> Some statements fail due to an invalid schema:
> {code}
> create external table tbl_ice(a int, b string, c int) stored by iceberg 
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54), (111, 'one', 55), (333, 'two', 56);
> update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4);
> {code}
> {code}
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
>  org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, 
> vertexName=Map 3, vertexId=vertex_1653493839723_0001_3_01, diagnostics=[Task 
> failed, taskId=task_1653493839723_0001_3_01_00, diagnostics=[TaskAttempt 
> 0 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1653493839723_0001_3_01_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:110)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293)
>   ... 15 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101)
>   ... 18 more
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.lang.Integer
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan.evaluate(GenericUDFOPLessThan.java:127)
>   at 
> 

[jira] [Work logged] (HIVE-26264) Iceberg integration: Fetch virtual columns on demand

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26264?focusedWorklogId=777920&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777920
 ]

ASF GitHub Bot logged work on HIVE-26264:
-

Author: ASF GitHub Bot
Created on: 03/Jun/22 04:13
Start Date: 03/Jun/22 04:13
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged PR #3324:
URL: https://github.com/apache/hive/pull/3324




Issue Time Tracking
---

Worklog Id: (was: 777920)
Time Spent: 5.5h  (was: 5h 20m)

> Iceberg integration: Fetch virtual columns on demand
> 
>
> Key: HIVE-26264
> URL: https://issues.apache.org/jira/browse/HIVE-26264
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Currently virtual columns are fetched from Iceberg tables when the statement 
> being executed is a delete or update, and the setting is global, so it 
> affects every table touched by the statement. The read and write schemas 
> also depend on this operation-level setting.
> Some statements fail due to an invalid schema:
> {code}
> create external table tbl_ice(a int, b string, c int) stored by iceberg 
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54), (111, 'one', 55), (333, 'two', 56);
> update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4);
> {code}
> {code}
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
>  org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, 
> vertexName=Map 3, vertexId=vertex_1653493839723_0001_3_01, diagnostics=[Task 
> failed, taskId=task_1653493839723_0001_3_01_00, diagnostics=[TaskAttempt 
> 0 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1653493839723_0001_3_01_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:110)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293)
>   ... 15 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101)
>   ... 18 more
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> 

[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777663
 ]

ASF GitHub Bot logged work on HIVE-25980:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 15:59
Start Date: 02/Jun/22 15:59
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3053:
URL: https://github.com/apache/hive/pull/3053#discussion_r888123399


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
##
@@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, 
byte[] filterExp, CheckRes
   CheckResult.PartitionResult prFromMetastore = new 
CheckResult.PartitionResult();
   prFromMetastore.setPartitionName(getPartitionName(table, partition));
   prFromMetastore.setTableName(partition.getTableName());
-  if (!fs.exists(partPath)) {
-result.getPartitionsNotOnFs().add(prFromMetastore);
-  } else {
+  if (allPartDirs.contains(partPath)) {
 result.getCorrectPartitions().add(prFromMetastore);
+allPartDirs.remove(partPath);
+  }

Review Comment:
   nit:
   ```
   }  else {
   // Comment here
   ```





Issue Time Tracking
---

Worklog Id: (was: 777663)
Time Spent: 6h 50m  (was: 6h 40m)

> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> MSCK REPAIR TABLE on a table with many partitions can be slow on cloud 
> storage such as S3; in one case we investigated, the slowness was observed 
> in HiveMetaStoreChecker.checkTable.
> {code:java}
> "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 
> tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   at java.net.SocketInputStream.read(SocketInputStream.java:171)
>   at java.net.SocketInputStream.read(SocketInputStream.java:141)
>   at 
> sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
>   at 
> sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
>   at 
> sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
>   at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
>   at 
> sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>   at 
> com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>   at 
> 

[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777666
 ]

ASF GitHub Bot logged work on HIVE-25980:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 15:59
Start Date: 02/Jun/22 15:59
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3053:
URL: https://github.com/apache/hive/pull/3053#discussion_r888123783


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
##
@@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, 
byte[] filterExp, CheckRes
   CheckResult.PartitionResult prFromMetastore = new 
CheckResult.PartitionResult();
   prFromMetastore.setPartitionName(getPartitionName(table, partition));
   prFromMetastore.setTableName(partition.getTableName());
-  if (!fs.exists(partPath)) {
-result.getPartitionsNotOnFs().add(prFromMetastore);
-  } else {
+  if (allPartDirs.contains(partPath)) {
 result.getCorrectPartitions().add(prFromMetastore);
+allPartDirs.remove(partPath);
+  }
+  // There can be edge case where user can define partition directory 
outside of table directory
+  // to avoid eviction of such partitions
+  // we check existence of partition path which are not in table directory
+  // and add to result for getPartitionsNotOnFs.
+  else {
+if (!partPath.toString().contains(tablePath.toString())) {
+  if(!fs.exists(partPath)) {
+result.getPartitionsNotOnFs().add(prFromMetastore);
+  }

Review Comment:
   nit: `} else {`



##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
##
@@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, 
byte[] filterExp, CheckRes
   CheckResult.PartitionResult prFromMetastore = new 
CheckResult.PartitionResult();
   prFromMetastore.setPartitionName(getPartitionName(table, partition));
   prFromMetastore.setTableName(partition.getTableName());
-  if (!fs.exists(partPath)) {
-result.getPartitionsNotOnFs().add(prFromMetastore);
-  } else {
+  if (allPartDirs.contains(partPath)) {
 result.getCorrectPartitions().add(prFromMetastore);
+allPartDirs.remove(partPath);
+  }
+  // There can be edge case where user can define partition directory 
outside of table directory
+  // to avoid eviction of such partitions
+  // we check existence of partition path which are not in table directory
+  // and add to result for getPartitionsNotOnFs.
+  else {
+if (!partPath.toString().contains(tablePath.toString())) {
+  if(!fs.exists(partPath)) {

Review Comment:
   nit `if (`
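   Taken together, the two nits amount to collapsing the separated `else` into an `else if` chain. Below is a toy, self-contained sketch of the resulting control flow, not the actual Hive code: plain `String`s stand in for Hadoop `Path`s, a `Set` stands in for the real filesystem, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CheckTableSketch {
    // Classify one partition path, mirroring the reviewed branch:
    // directories found by the table listing are "correct"; unlisted
    // directories outside the table directory fall back to a direct
    // existence check before being reported as missing.
    static void classify(String partPath, String tablePath,
                         Set<String> allPartDirs, Set<String> fsPaths,
                         List<String> correct, List<String> notOnFs) {
        if (allPartDirs.contains(partPath)) {
            correct.add(partPath);
            allPartDirs.remove(partPath);
        } else if (!partPath.contains(tablePath)) {
            // partition dir defined outside the table dir: it was never
            // listed, so check existence directly
            if (!fsPaths.contains(partPath)) {
                notOnFs.add(partPath);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> listed = new HashSet<>(Set.of("/wh/t/p=1"));
        Set<String> fs = Set.of("/wh/t/p=1", "/elsewhere/p=2");
        List<String> correct = new ArrayList<>();
        List<String> notOnFs = new ArrayList<>();
        classify("/wh/t/p=1", "/wh/t", listed, fs, correct, notOnFs);
        classify("/elsewhere/p=2", "/wh/t", listed, fs, correct, notOnFs);
        classify("/elsewhere/p=3", "/wh/t", listed, fs, correct, notOnFs);
        System.out.println(correct + " " + notOnFs);  // [/wh/t/p=1] [/elsewhere/p=3]
    }
}
```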





Issue Time Tracking
---

Worklog Id: (was: 777666)
Time Spent: 7h  (was: 6h 50m)

> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> MSCK REPAIR TABLE on a table with many partitions can be slow on cloud 
> storage such as S3; in one case we investigated, the slowness was observed 
> in HiveMetaStoreChecker.checkTable.
> {code:java}
> "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 
> tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   at java.net.SocketInputStream.read(SocketInputStream.java:171)
>   at java.net.SocketInputStream.read(SocketInputStream.java:141)
>   at 
> sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
>   at 
> sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
>   at 
> sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
>   at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
>   at 
> sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>   at 
> 

[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777662
 ]

ASF GitHub Bot logged work on HIVE-25980:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 15:57
Start Date: 02/Jun/22 15:57
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3053:
URL: https://github.com/apache/hive/pull/3053#discussion_r888121662


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
##
@@ -308,8 +308,38 @@ void checkTable(Table table, PartitionIterable parts, 
byte[] filterExp, CheckRes
   result.getTablesNotOnFs().add(table.getTableName());
   return;
 }
+
+// now check the table folder and see if we find anything
+// that isn't in the metastore
+Set<Path> allPartDirs = new HashSet<>();
+List<FieldSchema> partColumns = table.getPartitionKeys();
+checkPartitionDirs(tablePath, allPartDirs, 
Collections.unmodifiableList(getPartColNames(table)));
+
+if (filterExp != null) {
+  PartitionExpressionProxy expressionProxy = createExpressionProxy(conf);
+  List partitions = new ArrayList<>();
+  Set partDirs = new HashSet();
+  String tablePathStr = tablePath.toString();
+  for (Path path : allPartDirs) {

Review Comment:
   Could we do this with Stream, and filter and friends?
   Also we should calculate `tablePathStr.endsWith("/")` only once
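   A minimal, hypothetical sketch of what that suggestion could look like: plain `String` paths and made-up names (`PartDirFilter`, `dirsUnderTable`) stand in for Hadoop's `Path` and the actual local variables, and the trailing-slash normalization is computed once outside the stream.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class PartDirFilter {
    // Hypothetical stream-based version of the loop over allPartDirs:
    // normalize the table path once, then keep only directories under it.
    static Set<String> dirsUnderTable(Set<String> allPartDirs, String tablePath) {
        // compute the endsWith("/") check a single time, per the review comment
        String prefix = tablePath.endsWith("/") ? tablePath : tablePath + "/";
        return allPartDirs.stream()
                .filter(p -> p.startsWith(prefix))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> dirs = Set.of(
                "/warehouse/tbl/part=1",
                "/warehouse/tbl/part=2",
                "/other/dir/part=3");
        System.out.println(dirsUnderTable(dirs, "/warehouse/tbl"));
    }
}
```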





Issue Time Tracking
---

Worklog Id: (was: 777662)
Time Spent: 6h 40m  (was: 6.5h)

> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> MSCK REPAIR TABLE on a table with many partitions can be slow on cloud 
> storage such as S3; in one case we investigated, the slowness was observed 
> in HiveMetaStoreChecker.checkTable.
> {code:java}
> "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 
> tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   at java.net.SocketInputStream.read(SocketInputStream.java:171)
>   at java.net.SocketInputStream.read(SocketInputStream.java:141)
>   at 
> sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
>   at 
> sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
>   at 
> sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
>   at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
>   at 
> sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>   at 
> com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>   at 
> 

[jira] [Updated] (HIVE-26277) Add unit tests for ColumnStatsAggregator classes

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26277:
--
Labels: pull-request-available  (was: )

> Add unit tests for ColumnStatsAggregator classes
> 
>
> Key: HIVE-26277
> URL: https://issues.apache.org/jira/browse/HIVE-26277
> Project: Hive
>  Issue Type: Test
>  Components: Standalone Metastore, Statistics, Tests
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have no unit tests covering these classes, which also happen to contain 
> some complicated logic, making the absence of tests even more risky.





[jira] [Work logged] (HIVE-26277) Add unit tests for ColumnStatsAggregator classes

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26277?focusedWorklogId=777643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777643
 ]

ASF GitHub Bot logged work on HIVE-26277:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 15:27
Start Date: 02/Jun/22 15:27
Worklog Time Spent: 10m 
  Work Description: asolimando opened a new pull request, #3339:
URL: https://github.com/apache/hive/pull/3339

   
   
   ### What changes were proposed in this pull request?
   
   
   Adding unit tests for *ColumnStatsAggregator classes (first commit), fixing 
bugs discovered while writing the UTs (second commit), and fixing some warnings 
(subsequent commits, at the discretion of the reviewers).
   
   ### Why are the changes needed?
   
   
   Lack of unit tests is detrimental to code quality, as highlighted by the 
bugs discovered while writing these tests.
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   No.
   
   ### How was this patch tested?
   
   
   `mvn test 
-Dtest.groups=org.apache.hadoop.hive.metastore.annotation.MetastoreUnitTest 
-Dtest='*ColumnStatsAggregatorTest.java' -pl 
standalone-metastore/metastore-server`




Issue Time Tracking
---

Worklog Id: (was: 777643)
Remaining Estimate: 0h
Time Spent: 10m

> Add unit tests for ColumnStatsAggregator classes
> 
>
> Key: HIVE-26277
> URL: https://issues.apache.org/jira/browse/HIVE-26277
> Project: Hive
>  Issue Type: Test
>  Components: Standalone Metastore, Statistics, Tests
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have no unit tests covering these classes, which also happen to contain 
> some complicated logic, making the absence of tests even more risky.





[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.3

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=777603&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777603
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 14:30
Start Date: 02/Jun/22 14:30
Worklog Time Spent: 10m 
  Work Description: TheCodeTracer commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1144935338

   I am sorry if this is the incorrect forum, but since this PR tracks the 
Hadoop 3.3 upgrade in Hive, I just wanted to confirm whether there is a 
possibility that WebHCat might not work with Hadoop 3.3.x yet 
(https://issues.apache.org/jira/browse/HIVE-24083). Thanks a lot. Right now, 
the WebHCat tests do seem to be breaking anyway due to a different issue 
(https://issues.apache.org/jira/browse/HIVE-26286).




Issue Time Tracking
---

Worklog Id: (was: 777603)
Time Spent: 12h 13m  (was: 12.05h)

> Upgrade Hadoop to 3.3.3
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 13m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing

2022-06-02 Thread Anmol Sundaram (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated HIVE-26286:
--
Description: 
The Hive TestWebHCatE2e tests seem to be failing due to:

 
{quote}templeton: Server failed to start: null
[main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to start: 
java.lang.NullPointerException
at org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:174)
at 
org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:220)
at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:143)
at org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295)
at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252)
at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147)
at 
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote}
This seems to be caused by the change in HIVE-18728.

  was:
The Hive TestWebHCatE2e  tests seem to be failing due to 

 
{quote}
templeton: Server failed to start: null2022-05-31T21:51:13,711 ERROR [main] 
templeton.Main: Server failed to start: java.lang.NullPointerException: null at 
org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
 ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) 
~[classes/:?] at 
org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?]
 
{quote}
This seems to be caused due to HIVE-18728 , which is breaking.


> Hive WebHCat Tests are failing
> --
>
> Key: HIVE-26286
> URL: https://issues.apache.org/jira/browse/HIVE-26286
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 4.0.0
> Environment: [link title|http://example.com]
>Reporter: Anmol Sundaram
>Priority: Major
>
> The Hive TestWebHCatE2e tests seem to be failing due to:
>  
> {quote}templeton: Server failed to start: null
> [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to 
> start: 
> java.lang.NullPointerException
> at 
> org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:174)
> at 
> org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
> at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:220)
> at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:143)
> at 
> org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295)
> at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252)
> at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147)
> at 
> org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote}
> This seems to be caused by HIVE-18728, which introduced a breaking change.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing

2022-06-02 Thread Anmol Sundaram (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated HIVE-26286:
--
Description: 
The Hive TestWebHCatE2e  tests seem to be failing due to 

 
{quote}
templeton: Server failed to start: null
2022-05-31T21:51:13,711 ERROR [main] 
templeton.Main: Server failed to start: java.lang.NullPointerException: null at 
org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
 ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) 
~[classes/:?] at 
org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?]
 
{quote}
This seems to be caused by HIVE-18728, which introduced a breaking change.

  was:
The Hive TestWebHCatE2e  tests seem to be failing due to 
templeton: Server failed to start: null
2022-05-31T21:51:13,711 ERROR [main] 
templeton.Main: Server failed to start: java.lang.NullPointerException: null
at 
org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629]  at 
org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
 ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629]  at 
org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) 
~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629]  at 
org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) 
~[classes/:?]  at 
org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?]
 

This seems to be caused by 
[HIVE-18728|https://issues.apache.org/jira/browse/HIVE-18728], which introduced a 
breaking change.


> Hive WebHCat Tests are failing
> --
>
> Key: HIVE-26286
> URL: https://issues.apache.org/jira/browse/HIVE-26286
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 4.0.0
> Environment: [link title|http://example.com]
>Reporter: Anmol Sundaram
>Priority: Major
>
> The Hive TestWebHCatE2e  tests seem to be failing due to 
>  
> {quote}
> templeton: Server failed to start: null
> 2022-05-31T21:51:13,711 ERROR [main] 
> templeton.Main: Server failed to start: java.lang.NullPointerException: null 
> at 
> org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) 
> ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
> org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44)
>  ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
> org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) 
> ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
> org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) 
> ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at 
> org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) 
> ~[classes/:?] at 
> org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?]
>  
> {quote}
> This seems to be caused by HIVE-18728, which introduced a breaking change.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777524
 ]

ASF GitHub Bot logged work on HIVE-26228:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:28
Start Date: 02/Jun/22 13:28
Worklog Time Spent: 10m 
  Work Description: szlta commented on code in PR #3287:
URL: https://github.com/apache/hive/pull/3287#discussion_r887952384


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java:
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.execute;
+
+import org.apache.hadoop.hive.common.TableName;
+import org.apache.hadoop.hive.common.type.TimestampTZ;
+import org.apache.hadoop.hive.common.type.TimestampTZUtil;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.ddl.DDLWork;
+import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer;
+import org.apache.hadoop.hive.ql.ddl.table.AlterTableType;
+import org.apache.hadoop.hive.ql.exec.TaskFactory;
+import org.apache.hadoop.hive.ql.hooks.ReadEntity;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.plan.PlanUtils;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.time.ZoneId;
+import java.util.Map;
+
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION;
+
+/**
+ * Analyzer for ALTER TABLE ... EXECUTE commands.
+ */
+@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE)
+public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer {
+
+  public AlterTableExecuteAnalyzer(QueryState queryState) throws 
SemanticException {
+super(queryState);
+  }
+
+  @Override
+  protected void analyzeCommand(TableName tableName, Map<String, String> 
partitionSpec, ASTNode command)
+  throws SemanticException {
+Table table = getTable(tableName);
+// the first child must be the execute operation type
+ASTNode executeCommandType = (ASTNode) command.getChild(0);
+validateAlterTableType(table, AlterTableType.EXECUTE, false);
+inputs.add(new ReadEntity(table));
+AlterTableExecuteDesc desc = null;
+if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) {
+  AlterTableExecuteSpec spec;
+  // the second child must be the rollback parameter
+  ASTNode child = (ASTNode) command.getChild(1);
+
+  if (child.getType() == HiveParser.StringLiteral) {

Review Comment:
   I see, thanks!





Issue Time Tracking
---

Worklog Id: (was: 777524)
Time Spent: 5h  (was: 4h 50m)

> Implement Iceberg table rollback feature
> 
>
> Key: HIVE-26228
> URL: https://issues.apache.org/jira/browse/HIVE-26228
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> We should allow rolling back iceberg table's data to the state at an older 
> table snapshot. 
> Rollback to the last snapshot before a specific timestamp
> {code:java}
> ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00')
> {code}
> Rollback to a specific snapshot ID
> {code:java}
>  ALTER TABLE ice_t EXECUTE ROLLBACK(); 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777513&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777513
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:19
Start Date: 02/Jun/22 13:19
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887943205


##
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
##
@@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends 
RewriteSemanticAnalyzer {
 super(queryState);
   }
 
-  protected void analyze(ASTNode tree) throws SemanticException {
+  @Override
+  protected ASTNode getTargetTableNode(ASTNode tree) {
+// The first child should be the table we are updating / deleting from
+ASTNode tabName = (ASTNode)tree.getChild(0);
+assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
+"Expected tablename as first child of " + operation + " but found 
" + tabName.getName();
+return tabName;
+  }
+
+  protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) 
throws SemanticException {
 switch (tree.getToken().getType()) {
 case HiveParser.TOK_DELETE_FROM:
-  analyzeDelete(tree);
+  analyzeDelete(tree, tabNameNode, table);
   break;
 case HiveParser.TOK_UPDATE_TABLE:
-  analyzeUpdate(tree);
+  analyzeUpdate(tree, tabNameNode, table);
   break;
 default:
   throw new RuntimeException("Asked to parse token " + tree.getName() + " 
in " +
   "UpdateDeleteSemanticAnalyzer");
 }
   }
 
-  private void analyzeUpdate(ASTNode tree) throws SemanticException {
+  private void analyzeUpdate(ASTNode tree, ASTNode tabNameNode, Table mTable) 
throws SemanticException {
 operation = Context.Operation.UPDATE;
-reparseAndSuperAnalyze(tree);
+boolean nonNativeAcid = AcidUtils.isNonNativeAcidTable(mTable);
+
+if (HiveConf.getBoolVar(queryState.getConf(), 
HiveConf.ConfVars.SPLIT_UPDATE) && !nonNativeAcid) {
+  analyzeSplitUpdate(tree, mTable, tabNameNode);
+} else {
+  reparseAndSuperAnalyze(tree, tabNameNode, mTable);

Review Comment:
   Altered the order to use everywhere `tree,mTable, tableNameNode`





Issue Time Tracking
---

Worklog Id: (was: 777513)
Time Spent: 4h 10m  (was: 4h)

> Rewrite Update statement as Multi-insert and do Update split early
> --
>
> Key: HIVE-21160
> URL: https://issues.apache.org/jira/browse/HIVE-21160
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777514
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:19
Start Date: 02/Jun/22 13:19
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887943545


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8292,8 +8294,8 @@ private WriteEntity generateTableWriteEntity(String dest, 
Table dest_tab,
 if ((dpCtx == null || dpCtx.getNumDPCols() == 0)) {
   output = new WriteEntity(dest_tab, determineWriteType(ltd, dest));
   if (!outputs.add(output)) {
-if(!((this instanceof MergeSemanticAnalyzer) &&
-conf.getBoolVar(ConfVars.MERGE_SPLIT_UPDATE))) {
+if(!((this instanceof MergeSemanticAnalyzer || this instanceof 
UpdateDeleteSemanticAnalyzer) &&

Review Comment:
   Extracted to instance method.
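For illustration, the "extracted to instance method" refactor mentioned above can be sketched like this (class and method names below are hypothetical, not the actual Hive ones): the repeated instanceof-plus-config check at the call site becomes a single overridable method.

```java
// Hypothetical sketch of extracting a repeated check into an instance method.
abstract class AnalyzerSketch {
    // Default: a table may appear as a write output only once.
    boolean allowOutputDuplication() {
        return false;
    }
}

class MergeAnalyzerSketch extends AnalyzerSketch {
    @Override
    boolean allowOutputDuplication() {
        // Merge/update-split rewrites insert into the target table more than
        // once (e.g. an update branch and a delete branch), so the same
        // WriteEntity legitimately shows up twice.
        return true;
    }
}
```

The call site then asks `analyzer.allowOutputDuplication()` instead of testing the analyzer's concrete type.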



##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java:
##
@@ -251,6 +252,44 @@ public void testExceucteUpdateCounts() throws Exception {
 assertEquals("1 row PreparedStatement update", 1, count);
   }
 
+  @Test
+  public void testExceucteMergeCounts() throws Exception {
+testExceucteMergeCounts(true);
+  }
+
+  @Test
+  public void testExceucteMergeCountsNoSplitUpdate() throws Exception {
+testExceucteMergeCounts(false);
+  }
+
+  private void testExceucteMergeCounts(boolean splitUpdateEarly) throws 
Exception {
+
+Statement stmt =  con.createStatement();
+stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + 
splitUpdateEarly);
+stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true");

Review Comment:
   Changed to SPLIT_UPDATE





Issue Time Tracking
---

Worklog Id: (was: 777514)
Time Spent: 4h 20m  (was: 4h 10m)

> Rewrite Update statement as Multi-insert and do Update split early
> --
>
> Key: HIVE-21160
> URL: https://issues.apache.org/jira/browse/HIVE-21160
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777510
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:18
Start Date: 02/Jun/22 13:18
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887942105


##
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
##
@@ -91,23 +110,15 @@ private void analyzeDelete(ASTNode tree) throws 
SemanticException {
* The sort by clause is put in there so that records come out in the right 
order to enable
* merge on read.
*/
-  private void reparseAndSuperAnalyze(ASTNode tree) throws SemanticException {
+  private void reparseAndSuperAnalyze(ASTNode tree, ASTNode tabNameNode, Table 
mTable) throws SemanticException {
 List<? extends Node> children = tree.getChildren();
 
-// The first child should be the table we are updating / deleting from
-ASTNode tabName = (ASTNode)children.get(0);
-assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
-"Expected tablename as first child of " + operation + " but found " + 
tabName.getName();
-Table mTable = getTargetTable(tabName);
-validateTxnManager(mTable);
-validateTargetTable(mTable);
-
 // save the operation type into the query state
 SessionStateUtil.addResource(conf, 
Context.Operation.class.getSimpleName(), operation.name());
 
 StringBuilder rewrittenQueryStr = new StringBuilder();
 rewrittenQueryStr.append("insert into table ");
-rewrittenQueryStr.append(getFullTableNameForSQL(tabName));
+rewrittenQueryStr.append(getFullTableNameForSQL(tabNameNode));

Review Comment:
   It is also used in `MergeSemanticAnalyzer`
   ```
   String targetName = getSimpleTableName(targetNameNode);
   ...
   appendTarget(rewrittenQueryStr, targetNameNode, targetName);
   ...
   ```
   and in `AcidExportSemanticAnalyzer`
   ```
   StringBuilder rewrittenQueryStr = generateExportQuery(
   newTable.getPartCols(), tokRefOrNameExportTable, (ASTNode) 
tokRefOrNameExportTable.parent, newTableName);
   ```
   





Issue Time Tracking
---

Worklog Id: (was: 777510)
Time Spent: 4h  (was: 3h 50m)

> Rewrite Update statement as Multi-insert and do Update split early
> --
>
> Key: HIVE-21160
> URL: https://issues.apache.org/jira/browse/HIVE-21160
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777504
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:09
Start Date: 02/Jun/22 13:09
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887933328


##
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
##
@@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends 
RewriteSemanticAnalyzer {
 super(queryState);
   }
 
-  protected void analyze(ASTNode tree) throws SemanticException {
+  @Override
+  protected ASTNode getTargetTableNode(ASTNode tree) {
+// The first child should be the table we are updating / deleting from
+ASTNode tabName = (ASTNode)tree.getChild(0);
+assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
+"Expected tablename as first child of " + operation + " but found 
" + tabName.getName();
+return tabName;
+  }
+
+  protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) 
throws SemanticException {

Review Comment:
   We need the quoted version of the db and table name. The `TableName` object 
can't generate it when the identifier itself contains a backtick character, 
because it does not escape the id but only surrounds it with backticks.
   ```
 public String getEscapedNotEmptyDbTable() {
   return
   db == null || db.trim().isEmpty() ?
   "`" + table + "`" : "`" + db + "`" + 
DatabaseName.CAT_DB_TABLE_SEPARATOR + "`" + table + "`";
 }
   ```
   
   Table object is good only in the update case. However the target table may 
have an alias defined in merge statements and we need that alias for the 
rewrite. It can be extracted from the `tabNameNode` AST.
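As a standalone illustration of why naive backtick-wrapping can be insufficient (a hypothetical helper, not Hive's actual `TableName` API): an identifier that itself contains a backtick must have it doubled, or the quoted form is ambiguous.

```java
// Hypothetical helper illustrating the quoting concern discussed above;
// this is NOT Hive's TableName API, just a minimal sketch.
class IdentifierQuoting {

    // Naive quoting: wrap the identifier in backticks without escaping.
    // An identifier containing a backtick becomes ambiguous.
    static String naiveQuote(String id) {
        return "`" + id + "`";
    }

    // Safe quoting: double every embedded backtick, then wrap.
    static String escapedQuote(String id) {
        return "`" + id.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        String id = "odd`name";
        System.out.println(naiveQuote(id));   // `odd`name`  (ambiguous)
        System.out.println(escapedQuote(id)); // `odd``name` (round-trippable)
    }
}
```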





Issue Time Tracking
---

Worklog Id: (was: 777504)
Time Spent: 3h 50m  (was: 3h 40m)

> Rewrite Update statement as Multi-insert and do Update split early
> --
>
> Key: HIVE-21160
> URL: https://issues.apache.org/jira/browse/HIVE-21160
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777497
 ]

ASF GitHub Bot logged work on HIVE-26228:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 13:02
Start Date: 02/Jun/22 13:02
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on code in PR #3287:
URL: https://github.com/apache/hive/pull/3287#discussion_r887926106


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java:
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.execute;
+
+import org.apache.hadoop.hive.common.TableName;
+import org.apache.hadoop.hive.common.type.TimestampTZ;
+import org.apache.hadoop.hive.common.type.TimestampTZUtil;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.ddl.DDLWork;
+import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer;
+import org.apache.hadoop.hive.ql.ddl.table.AlterTableType;
+import org.apache.hadoop.hive.ql.exec.TaskFactory;
+import org.apache.hadoop.hive.ql.hooks.ReadEntity;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.plan.PlanUtils;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.time.ZoneId;
+import java.util.Map;
+
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION;
+
+/**
+ * Analyzer for ALTER TABLE ... EXECUTE commands.
+ */
+@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE)
+public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer {
+
+  public AlterTableExecuteAnalyzer(QueryState queryState) throws 
SemanticException {
+super(queryState);
+  }
+
+  @Override
+  protected void analyzeCommand(TableName tableName, Map<String, String> 
partitionSpec, ASTNode command)
+  throws SemanticException {
+Table table = getTable(tableName);
+// the first child must be the execute operation type
+ASTNode executeCommandType = (ASTNode) command.getChild(0);
+validateAlterTableType(table, AlterTableType.EXECUTE, false);
+inputs.add(new ReadEntity(table));
+AlterTableExecuteDesc desc = null;
+if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) {
+  AlterTableExecuteSpec spec;
+  // the second child must be the rollback parameter
+  ASTNode child = (ASTNode) command.getChild(1);
+
+  if (child.getType() == HiveParser.StringLiteral) {

Review Comment:
   If a timezone is present we will try to use it. 
   
https://github.com/apache/hive/blob/63326ff775206e59547b6b1332e25279e90ef5ee/common/src/java/org/apache/hadoop/hive/common/type/TimestampTZUtil.java#L103
   Otherwise use the preconfigured time zone.
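   As a rough, standalone java.time sketch of that fallback (this is NOT the actual `TimestampTZUtil` code; formatter patterns and names are assumptions for illustration): parse with the zone carried by the input when one is present, otherwise attach the preconfigured default zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Sketch of "use the input's zone if present, else a default zone".
class TsParse {

    private static final DateTimeFormatter WITH_ZONE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss VV");
    private static final DateTimeFormatter WITHOUT_ZONE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static ZonedDateTime parseWithDefault(String text, ZoneId defaultZone) {
        try {
            // Input carries its own zone id, e.g. "2022-05-12 00:00:00 Europe/Budapest"
            return ZonedDateTime.parse(text, WITH_ZONE);
        } catch (DateTimeParseException e) {
            // No zone in the input: fall back to the preconfigured default zone
            return LocalDateTime.parse(text, WITHOUT_ZONE).atZone(defaultZone);
        }
    }
}
```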





Issue Time Tracking
---

Worklog Id: (was: 777497)
Time Spent: 4h 50m  (was: 4h 40m)

> Implement Iceberg table rollback feature
> 
>
> Key: HIVE-26228
> URL: https://issues.apache.org/jira/browse/HIVE-26228
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> We should allow rolling back iceberg table's data to the state at an older 
> table snapshot. 
> Rollback to the last snapshot before a specific timestamp
> {code:java}
> ALTER TABLE ice_t EXECUTE 

[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777493&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777493
 ]

ASF GitHub Bot logged work on HIVE-26228:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:58
Start Date: 02/Jun/22 12:58
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on code in PR #3287:
URL: https://github.com/apache/hive/pull/3287#discussion_r887922632


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractBaseAlterTableAnalyzer.java:
##
@@ -161,7 +161,8 @@ protected void validateAlterTableType(Table table, 
AlterTableType op, boolean ex
 }
 if (table.isNonNative() && table.getStorageHandler() != null) {
   if (!table.getStorageHandler().isAllowedAlterOperation(op) ||

Review Comment:
   Yes, you are right, all this can be combined into isAllowedAlterOperation()





Issue Time Tracking
---

Worklog Id: (was: 777493)
Time Spent: 4h 40m  (was: 4.5h)

> Implement Iceberg table rollback feature
> 
>
> Key: HIVE-26228
> URL: https://issues.apache.org/jira/browse/HIVE-26228
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We should allow rolling back iceberg table's data to the state at an older 
> table snapshot. 
> Rollback to the last snapshot before a specific timestamp
> {code:java}
> ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00')
> {code}
> Rollback to a specific snapshot ID
> {code:java}
>  ALTER TABLE ice_t EXECUTE ROLLBACK(); 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777488&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777488
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:39
Start Date: 02/Jun/22 12:39
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887904369


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

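The `getInstance()` shown in the patch above uses the double-checked locking idiom. A minimal standalone sketch of that idiom (hypothetical class name, not the Hive service itself) looks like this; the `volatile` on the instance field is what makes the unlocked first check safe.

```java
// Minimal sketch of double-checked locking lazy initialization.
class LazySingleton {

    private static volatile LazySingleton INSTANCE;  // volatile is essential

    private LazySingleton() { }

    static LazySingleton getInstance() {
        LazySingleton local = INSTANCE;              // one volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = INSTANCE;
                if (local == null) {                 // re-check while holding the lock
                    INSTANCE = local = new LazySingleton();
                }
            }
        }
        return local;
    }
}
```

Reading `INSTANCE` into the `local` variable avoids a second volatile read on the common already-initialized path.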
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777483
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:30
Start Date: 02/Jun/22 12:30
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887896665


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777481
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:30
Start Date: 02/Jun/22 12:30
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887896104


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java:
##
@@ -251,6 +252,44 @@ public void testExceucteUpdateCounts() throws Exception {
 assertEquals("1 row PreparedStatement update", 1, count);
   }
 
+  @Test
+  public void testExceucteMergeCounts() throws Exception {
+testExceucteMergeCounts(true);
+  }
+
+  @Test
+  public void testExceucteMergeCountsNoSplitUpdate() throws Exception {
+testExceucteMergeCounts(false);
+  }
+
+  private void testExceucteMergeCounts(boolean splitUpdateEarly) throws 
Exception {
+
+Statement stmt =  con.createStatement();
+stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + 
splitUpdateEarly);
+stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true");
+stmt.execute("set " + ConfVars.HIVE_TXN_MANAGER.varname +
+"=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
+
+stmt.execute("drop table if exists transactional_crud");
+stmt.execute("drop table if exists source");
+
+stmt.execute("create table transactional_crud (a int, b int) stored as orc 
" +
+"tblproperties('transactional'='true', 
'transactional_properties'='default')");
+stmt.executeUpdate("insert into transactional_crud 
values(1,2),(3,4),(5,6),(7,8),(9,10)");
+
+stmt.execute("create table source (a int, b int) stored as orc " +
+"tblproperties('transactional'='true', 
'transactional_properties'='default')");
+stmt.executeUpdate("insert into source 
values(1,12),(3,14),(9,19),(100,100)");
+
+int count = stmt.executeUpdate("MERGE INTO transactional_crud as t 
using source as s ON t.a = s.a\n" +
+"WHEN MATCHED AND s.a > 7 THEN DELETE\n" +
+"WHEN MATCHED AND s.a <= 8 THEN UPDATE set b = 100\n" +
+"WHEN NOT MATCHED THEN INSERT VALUES (s.a, s.b)");
+stmt.close();
+
+assertEquals("Statement merge", 4, count);

Review Comment:
   The number of affected rows. With split update I found that it was not 4,
so I had to introduce some logic because updated records were counted twice.
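The double counting described in that comment can be shown with a back-of-the-envelope tally for the MERGE in the test above. The branch row counts below are derived from the test data (matched rows a = 1, 3, 9; unmatched source row a = 100); this is an illustrative sketch, not Hive code:

```java
public class MergeCountSketch {
    // Row counts per MERGE branch, inferred from the test data:
    static final int DELETED  = 1; // a = 9 hits "MATCHED AND s.a > 7 THEN DELETE"
    static final int UPDATED  = 2; // a = 1, 3 hit "MATCHED AND s.a <= 8 THEN UPDATE"
    static final int INSERTED = 1; // a = 100 hits "NOT MATCHED THEN INSERT"

    // Expected affected-row count: each source row is counted once.
    static int expectedCount() {
        return DELETED + UPDATED + INSERTED; // 1 + 2 + 1 = 4
    }

    // With split update each UPDATE is rewritten as a DELETE plus an INSERT,
    // so a naive sum over the rewritten branches counts updated rows twice.
    static int naiveSplitUpdateCount() {
        return DELETED + UPDATED + UPDATED + INSERTED; // 1 + 2 + 2 + 1 = 6
    }

    // The fix the comment refers to: count both halves of an update as one row.
    static int correctedSplitUpdateCount() {
        return naiveSplitUpdateCount() - UPDATED; // back to 4
    }

    public static void main(String[] args) {
        System.out.println(expectedCount());             // 4
        System.out.println(naiveSplitUpdateCount());     // 6
        System.out.println(correctedSplitUpdateCount()); // 4
    }
}
```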





Issue Time Tracking
---

Worklog Id: (was: 777481)
Time Spent: 3h 40m  (was: 3.5h)

> Rewrite Update statement as Multi-insert and do Update split early
> --
>
> Key: HIVE-21160
> URL: https://issues.apache.org/jira/browse/HIVE-21160
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777478
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:22
Start Date: 02/Jun/22 12:22
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887889651


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777477
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:21
Start Date: 02/Jun/22 12:21
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887888610


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777476
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:20
Start Date: 02/Jun/22 12:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887887873


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777474
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:18
Start Date: 02/Jun/22 12:18
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887886357


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777473
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:17
Start Date: 02/Jun/22 12:17
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887884610


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment:
   and? you are not manipulating the iterator
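
   The reviewer's point can be demonstrated in isolation: ConcurrentHashMap's
iterators are weakly consistent, so removing entries through the map itself
while iterating does not throw ConcurrentModificationException. A minimal
standalone sketch (names are illustrative, not from the patch):

```java
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIterationDemo {

    // Build a map of fake heartbeat tasks and remove the even txn ids
    // while iterating over the key set.
    static int removeEvenWhileIterating() {
        ConcurrentHashMap<Long, String> tasks = new ConcurrentHashMap<>();
        for (long txnId = 1; txnId <= 5; txnId++) {
            tasks.put(txnId, "heartbeat-" + txnId);
        }
        // Unlike a plain HashMap, ConcurrentHashMap's iterators are weakly
        // consistent: removing through the map during iteration is safe
        // because the iterator itself is never mutated.
        for (Long txnId : tasks.keySet()) {
            if (txnId % 2 == 0) {
                tasks.remove(txnId);
            }
        }
        return tasks.size();
    }

    public static void main(String[] args) {
        System.out.println(removeEvenWhileIterating()); // 3 (txns 1, 3, 5 remain)
    }
}
```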





Issue Time Tracking
---

Worklog Id: (was: 777473)
Time Spent: 3h 40m  (was: 3.5h)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when the transaction ends
>  * Instead of having a dedicated heartbeater thread for each Compaction 
> transaction, 
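
A minimal sketch of the second bullet's idea: one shared ScheduledExecutorService
heartbeats many compaction transactions instead of one dedicated thread per
transaction. All names, the pool size, and the demo timings are illustrative
assumptions, not the actual patch:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SharedHeartbeatSketch {

    // One shared scheduler for all transactions (pool size is an assumption).
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
    private final ConcurrentHashMap<Long, Future<?>> tasks = new ConcurrentHashMap<>();

    // Register a periodic heartbeat for a transaction; at most one per txn id.
    void start(long txnId, Runnable heartbeat, long periodMs) {
        tasks.computeIfAbsent(txnId, id ->
            scheduler.scheduleAtFixedRate(heartbeat, 0, periodMs, TimeUnit.MILLISECONDS));
    }

    // Cancel the heartbeat when the compaction transaction ends.
    void stop(long txnId) {
        Future<?> task = tasks.remove(txnId);
        if (task != null) {
            task.cancel(false);
        }
    }

    // Drive the sketch: wait for three heartbeats of txn 42, then stop it.
    static boolean demo() {
        SharedHeartbeatSketch service = new SharedHeartbeatSketch();
        CountDownLatch beats = new CountDownLatch(3);
        service.start(42L, beats::countDown, 5);
        boolean sawThreeBeats;
        try {
            sawThreeBeats = beats.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            sawThreeBeats = false;
        }
        service.stop(42L);
        service.scheduler.shutdown();
        return sawThreeBeats;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true
    }
}
```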

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777472
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:16
Start Date: 02/Jun/22 12:16
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887884610


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);

Review Comment:
   and? 





Issue Time Tracking
---

Worklog Id: (was: 777472)
Time Spent: 3.5h  (was: 3h 20m)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when the transaction ends
>  * Instead of having a dedicated heartbeater thread for each Compaction 
> transaction, there should be a shared heartbeater 

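The two improvements described above (pooled, reusable metastore clients and a single shared scheduler instead of one dedicated thread per compaction transaction) can be sketched with JDK primitives alone. This is an illustrative sketch under assumed names (`HeartbeatSketch`, `isTracked`), not the actual PR code; the real service heartbeats via a pooled `IMetaStoreClient` instead of printing:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class HeartbeatSketch {

    // One shared scheduler heartbeats every tracked transaction,
    // instead of a dedicated thread per compaction txn.
    private final ScheduledExecutorService executor =
        Executors.newScheduledThreadPool(2);
    private final Map<Long, ScheduledFuture<?>> tasks = new ConcurrentHashMap<>();

    void startHeartbeat(long txnId, long periodMillis) {
        // compute() makes registration atomic, so two concurrent calls
        // for the same txn cannot both schedule a task.
        tasks.compute(txnId, (id, existing) -> {
            if (existing != null) {
                throw new IllegalStateException("Heartbeat already started for TXN " + id);
            }
            // In the real service the Runnable would borrow a pooled
            // metastore client, send the heartbeat, and return the client.
            return executor.scheduleAtFixedRate(
                () -> System.out.println("heartbeat for txn " + id),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        });
    }

    void stopHeartbeat(long txnId) {
        ScheduledFuture<?> task = tasks.remove(txnId);
        if (task != null) {
            task.cancel(false);
        }
    }

    boolean isTracked(long txnId) {
        return tasks.containsKey(txnId);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

Note the atomic `compute()` registration; a plain `containsKey` check followed by `put` would leave a small race window between the check and the insert.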
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777470
 ]

ASF GitHub Bot logged work on HIVE-21160:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:15
Start Date: 02/Jun/22 12:15
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r887876292


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8292,8 +8294,8 @@ private WriteEntity generateTableWriteEntity(String dest, 
Table dest_tab,
 if ((dpCtx == null || dpCtx.getNumDPCols() == 0)) {
   output = new WriteEntity(dest_tab, determineWriteType(ltd, dest));
   if (!outputs.add(output)) {
-if(!((this instanceof MergeSemanticAnalyzer) &&
-conf.getBoolVar(ConfVars.MERGE_SPLIT_UPDATE))) {
+if(!((this instanceof MergeSemanticAnalyzer || this instanceof 
UpdateDeleteSemanticAnalyzer) &&

Review Comment:
   I believe this could be more readable if the condition would be placed into 
an instance method; which could be overriden by `MergeSemanticAnalyzer` / 
`UpdateDeleteSemanticAnalyzer`
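The refactor suggested above could look roughly like the following. The class and method names here are illustrative stand-ins only (the real classes are `SemanticAnalyzer`, `MergeSemanticAnalyzer`, and `UpdateDeleteSemanticAnalyzer` in Hive):

```java
public class OverridablePredicate {

    // Base analyzer: by default, targeting the same table twice is an error.
    static class Analyzer {
        protected boolean allowsDuplicateWriteEntity() {
            return false;
        }
    }

    // Subclasses that rewrite a statement into a multi-insert may
    // legitimately write the same table twice, so they override the predicate
    // instead of the caller chaining instanceof checks.
    static class MergeAnalyzer extends Analyzer {
        @Override
        protected boolean allowsDuplicateWriteEntity() {
            return true;
        }
    }

    static class UpdateDeleteAnalyzer extends Analyzer {
        @Override
        protected boolean allowsDuplicateWriteEntity() {
            return true;
        }
    }

    public static void main(String[] args) {
        // The instanceof chain becomes a single virtual call:
        System.out.println(new MergeAnalyzer().allowsDuplicateWriteEntity());
        System.out.println(new Analyzer().allowsDuplicateWriteEntity());
    }
}
```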



##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java:
##
@@ -251,6 +252,44 @@ public void testExceucteUpdateCounts() throws Exception {
 assertEquals("1 row PreparedStatement update", 1, count);
   }
 
+  @Test
+  public void testExceucteMergeCounts() throws Exception {
+testExceucteMergeCounts(true);
+  }
+
+  @Test
+  public void testExceucteMergeCountsNoSplitUpdate() throws Exception {
+testExceucteMergeCounts(false);
+  }
+
+  private void testExceucteMergeCounts(boolean splitUpdateEarly) throws 
Exception {
+
+Statement stmt =  con.createStatement();
+stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + 
splitUpdateEarly);
+stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true");

Review Comment:
   isn't `MERGE_SPLIT_UPDATE` the old option's name, while the new one is 
named `SPLIT_UPDATE`?



##
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
##
@@ -91,23 +110,15 @@ private void analyzeDelete(ASTNode tree) throws 
SemanticException {
* The sort by clause is put in there so that records come out in the right 
order to enable
* merge on read.
*/
-  private void reparseAndSuperAnalyze(ASTNode tree) throws SemanticException {
+  private void reparseAndSuperAnalyze(ASTNode tree, ASTNode tabNameNode, Table 
mTable) throws SemanticException {
 List children = tree.getChildren();
 
-// The first child should be the table we are updating / deleting from
-ASTNode tabName = (ASTNode)children.get(0);
-assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
-"Expected tablename as first child of " + operation + " but found " + 
tabName.getName();
-Table mTable = getTargetTable(tabName);
-validateTxnManager(mTable);
-validateTargetTable(mTable);
-
 // save the operation type into the query state
 SessionStateUtil.addResource(conf, 
Context.Operation.class.getSimpleName(), operation.name());
 
 StringBuilder rewrittenQueryStr = new StringBuilder();
 rewrittenQueryStr.append("insert into table ");
-rewrittenQueryStr.append(getFullTableNameForSQL(tabName));
+rewrittenQueryStr.append(getFullTableNameForSQL(tabNameNode));

Review Comment:
   it seems like `tabNameNode` is only used as an argument of this method...



##
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
##
@@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends 
RewriteSemanticAnalyzer {
 super(queryState);
   }
 
-  protected void analyze(ASTNode tree) throws SemanticException {
+  @Override
+  protected ASTNode getTargetTableNode(ASTNode tree) {
+// The first child should be the table we are updating / deleting from
+ASTNode tabName = (ASTNode)tree.getChild(0);
+assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
+"Expected tablename as first child of " + operation + " but found 
" + tabName.getName();
+return tabName;
+  }
+
+  protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) 
throws SemanticException {
 switch (tree.getToken().getType()) {
 case HiveParser.TOK_DELETE_FROM:
-  analyzeDelete(tree);
+  analyzeDelete(tree, tabNameNode, table);
   break;
 case HiveParser.TOK_UPDATE_TABLE:
-  analyzeUpdate(tree);
+  analyzeUpdate(tree, tabNameNode, table);
   break;
 default:
   throw new RuntimeException("Asked to parse token " + tree.getName() + " 
in " +
   "UpdateDeleteSemanticAnalyzer");
 }
   }
 
-  private void analyzeUpdate(ASTNode tree) throws 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777467
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:13
Start Date: 02/Jun/22 12:13
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887882124


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777464
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:11
Start Date: 02/Jun/22 12:11
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887880115


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777462
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:10
Start Date: 02/Jun/22 12:10
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887879180


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777461
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:06
Start Date: 02/Jun/22 12:06
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887876080


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777457
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:05
Start Date: 02/Jun/22 12:05
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887875032


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only 
during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new 
ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future submittedTask = 
heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, 
TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777455
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:04
Start Date: 02/Jun/22 12:04
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887874363


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+    if (tasks.containsKey(txnId)) {
+      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
+    }
+    LOG.info("Submitting heartbeat task for TXN {}", txnId);
+    CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName);
+    Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+    tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 
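The `getInstance()` method in the quoted class uses double-checked locking on a volatile field. A minimal standalone sketch of the same pattern (hypothetical `Service` class, not Hive code):

```java
// Double-checked locking sketch. The INSTANCE field must be volatile so that
// a thread observing a non-null reference also observes a fully constructed
// object under the Java memory model.
class Service {
    private static volatile Service INSTANCE;

    static Service getInstance() {
        if (INSTANCE == null) {                  // first check, no lock taken
            synchronized (Service.class) {
                if (INSTANCE == null) {          // second check, under the lock
                    INSTANCE = new Service();
                }
            }
        }
        return INSTANCE;
    }

    private Service() { }
}

public class Main {
    public static void main(String[] args) {
        // Every caller gets the same instance.
        Service a = Service.getInstance();
        Service b = Service.getInstance();
        System.out.println(a == b); // prints "true"
    }
}
```

Without the `volatile` modifier, the unsynchronized first check could see a reference to a partially initialized object, which is why the plain double `null` check alone is not sufficient.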

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777451
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:02
Start Date: 02/Jun/22 12:02
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887872810


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777453&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777453
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 12:03
Start Date: 02/Jun/22 12:03
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887873535


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777450
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:59
Start Date: 02/Jun/22 11:59
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887870652


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777449
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:58
Start Date: 02/Jun/22 11:58
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887870152


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777447
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:57
Start Date: 02/Jun/22 11:57
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887868543


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777444
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:54
Start Date: 02/Jun/22 11:54
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887866285


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment:
   The map is used in the `startHeartbeat()` and `stopHeartBeat()` methods, 
which can be called concurrently if there are multiple worker threads 
configured.
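The concurrency concern above can be sketched with a small, self-contained example (the class and map contents are illustrative, not Hive's actual code): several worker threads register and deregister entries in a shared `ConcurrentHashMap` with no external locking, relying on the map's atomic operations.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of worker threads concurrently starting and stopping
// heartbeat tasks keyed by transaction id, as in startHeartbeat()/stopHeartbeat().
public class ConcurrentTaskMapDemo {
  public static void main(String[] args) throws Exception {
    ConcurrentHashMap<Long, String> tasks = new ConcurrentHashMap<>();
    ExecutorService workers = Executors.newFixedThreadPool(4);
    for (long txnId = 0; txnId < 100; txnId++) {
      final long id = txnId;
      workers.submit(() -> {
        // putIfAbsent and remove are atomic; a plain HashMap would need
        // external synchronization to be safe across worker threads.
        tasks.putIfAbsent(id, "heartbeat-" + id);
        tasks.remove(id);
      });
    }
    workers.shutdown();
    workers.awaitTermination(10, TimeUnit.SECONDS);
    System.out.println("remaining=" + tasks.size());
  }
}
```

Each worker removes only the entry it added, so the map ends empty regardless of interleaving.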





Issue Time Tracking
---

Worklog Id: (was: 777444)
Time Spent: 1.5h  (was: 1h 20m)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777443
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:52
Start Date: 02/Jun/22 11:52
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887865099


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777441
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:50
Start Date: 02/Jun/22 11:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887863531


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777434
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:46
Start Date: 02/Jun/22 11:46
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887860283


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn 
has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+if (tasks.containsKey(txnId)) {
+  throw new IllegalStateException("Heartbeat already started for TXN " + 
txnId);
+}
+LOG.info("Submitting heartbeat task for TXN {}", txnId);
+CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, 
lockId, tableName);
+Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops 

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777429
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:41
Start Date: 02/Jun/22 11:41
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887856299


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+try {
+  INSTANCE.shutdown();
+} catch (InterruptedException e) {
+  Thread.currentThread().interrupt();
+}
+  }));
+}
+  }
+}
+if (INSTANCE.heartbeatExecutor.isShutdown()) {
+  throw new IllegalStateException("The CompactionHeartbeatService is 
already shut down!");
+}
+return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment:
   Why should it be a concurrent map? I don't see any concurrent usage of it in the code.



##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 

[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777427
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:38
Start Date: 02/Jun/22 11:38
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887852145


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), 
rqst.getTablename(;
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment:
   I googled this before splitting, and this approach was the suggested one. 
If you check the Java 
[spec](https://docs.oracle.com/javase/specs/jls/se7/html/jls-14.html#jls-14.20.3.2),
 the extended try-with-resources construct is translated to a nested 
try-catch. The inner try-catch is responsible for closing the resources, so 
by the time the outer catch clause runs (`dbConn.rollback();`), the connection 
object is already closed. So as I understand it, you cannot execute code in 
the catch/finally blocks that requires the resource to be open. Did I miss 
something here?
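The JLS behavior described above is easy to demonstrate: in the desugared form, the generated inner try closes the resource before control reaches the outer catch clause. A minimal sketch (the `Resource` class is hypothetical, standing in for the JDBC connection):

```java
// Demonstrates that try-with-resources closes the resource before the
// catch clause of the extended form executes (JLS 14.20.3.2).
public class TryWithResourcesOrder {
  static class Resource implements AutoCloseable {
    @Override public void close() { System.out.println("close()"); }
    void work() { throw new IllegalStateException("boom"); }
  }

  public static void main(String[] args) {
    try (Resource r = new Resource()) {
      r.work();  // throws; close() still runs before the catch below
    } catch (IllegalStateException e) {
      // The resource is already closed here, so e.g. a rollback on it
      // would fail -- which is the point veghlaci05 makes above.
      System.out.println("catch runs after close()");
    }
  }
}
```

Running this prints `close()` first, then the catch message, confirming the ordering.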





Issue Time Tracking
---

Worklog Id: (was: 777427)
Time Spent: 1h 50m  (was: 1h 40m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
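For context, the pattern under discussion builds SQL with `?` placeholders while collecting the bind values in a parallel list, appending an extra predicate only when a partition name is present. A standalone sketch of that pattern (the database/table values here are illustrative, not the actual TxnHandler code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of building a parameterized query plus its bind-value list,
// mirroring the conditional partition predicate in TxnHandler#compact.
public class QueryBuilderSketch {
  public static void main(String[] args) {
    String partition = null;  // e.g. rqst.getPartitionname()
    List<String> params = new ArrayList<>();
    StringBuilder sb = new StringBuilder(
        "SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\""
        + " WHERE \"CQ_DATABASE\"=? AND \"CQ_TABLE\"=? AND ");
    params.add("default");   // bind value for CQ_DATABASE
    params.add("tbl_ice");   // bind value for CQ_TABLE
    if (partition == null) {
      sb.append("\"CQ_PARTITION\" is null");  // no placeholder, no bind value
    } else {
      sb.append("\"CQ_PARTITION\"=?");
      params.add(partition);
    }
    System.out.println(sb);
    System.out.println("params=" + params);
  }
}
```

Keeping the placeholder count and the params list in lockstep is what the Postgres-compatible prepared statement depends on.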


[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777419&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777419
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:31
Start Date: 02/Jun/22 11:31
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887849154


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been 
shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+if (INSTANCE == null) {
+  synchronized (CompactionHeartbeatService.class) {
+if (INSTANCE == null) {
+  LOG.debug("Initializing compaction txn heartbeater service.");
+  INSTANCE = new CompactionHeartbeatService(conf);
+  Runtime.getRuntime().addShutdownHook(new Thread(() -> {

Review Comment:
   should we use ShutdownHookManager.addShutdownHook(runnable, 
SHUTDOWN_HOOK_PRIORITY);





Issue Time Tracking
---

Worklog Id: (was: 777419)
Time Spent: 40m  (was: 0.5h)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when the transaction ends
>  * Instead of having a dedicated heartbeater thread for each Compaction 
> transaction, there should be shared a heartbeater executor where the 
> heartbeat tasks can be scheduled/submitted.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777395
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:43
Start Date: 02/Jun/22 10:43
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), 
rqst.getTablename(;
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment:
   no need for an extra try, you could initialize dbConn and stmt in a single 
try:
   
 try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED);
  Statement stmt = dbConn.createStatement()) {
   
   





Issue Time Tracking
---

Worklog Id: (was: 777395)
Time Spent: 1h 40m  (was: 1.5h)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.
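A common way such an assembled prepared statement breaks on Postgres is a mismatch between the number of `?` placeholders and the supplied parameters, or binding a numeric column with a string parameter (Postgres rejects e.g. `bigint = varchar` where other databases coerce). The exact root cause here is in the linked PR; the following is only a hypothetical pre-flight check, not Hive's `sqlGenerator` API:

```java
import java.util.List;

// Hypothetical helper: before preparing a statement, verify that the number
// of '?' placeholders in the SQL matches the parameter list. (Simplification:
// this naive count would miscount a literal '?' inside a quoted string.)
public class PlaceholderCheck {

    public static int countPlaceholders(String sql) {
        int n = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == '?') n++;
        }
        return n;
    }

    public static void check(String sql, List<String> params) {
        int expected = countPlaceholders(sql);
        if (expected != params.size()) {
            throw new IllegalArgumentException("SQL has " + expected
                + " placeholders but " + params.size() + " parameters");
        }
    }

    public static void main(String[] args) {
        String sql = "SELECT \"CQ_ID\" FROM \"COMPACTION_QUEUE\""
            + " WHERE \"CQ_DATABASE\"=? AND \"CQ_TABLE\"=? AND \"CQ_PARTITION\"=?";
        check(sql, List.of("default", "tbl", "p=1")); // 3 placeholders, 3 params: passes
        System.out.println("ok");
    }
}
```

A check like this fails fast at assembly time instead of surfacing as a database-specific error at execution time.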



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777393
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:43
Start Date: 02/Jun/22 10:43
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment:
   no need for an extra try, you could initialize dbConn and stmt in a single try:
   
 try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED);
  Statement stmt = dbConn.createStatement()) {
   
   





Issue Time Tracking
---

Worklog Id: (was: 777393)
Time Spent: 1.5h  (was: 1h 20m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777388
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:36
Start Date: 02/Jun/22 10:36
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment:
   no need for an extra try, you could initialize dbConn and stmt in the above 
block (line 3693) in a single try:
   
 try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED);
  Statement stmt = dbConn.createStatement()) {
   
   





Issue Time Tracking
---

Worklog Id: (was: 777388)
Time Spent: 1h 20m  (was: 1h 10m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777387
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:34
Start Date: 02/Jun/22 10:34
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment:
   no need for an extra try, you could initialize dbConn in the above block





Issue Time Tracking
---

Worklog Id: (was: 777387)
Time Spent: 1h 10m  (was: 1h)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777386
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:28
Start Date: 02/Jun/22 10:28
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887804677


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
+  try (Statement stmt = dbConn.createStatement()) {
+
+long id = generateCompactionQueueId(stmt);
+
+GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
+Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
+final ValidCompactorWriteIdList tblValidWriteIds =
+
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
+LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
+
+StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", 
\"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
+append(" (\"CQ_STATE\" IN(").
+
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
+append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
+append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
+append(" AND \"CQ_DATABASE\"=?").
+append(" AND \"CQ_TABLE\"=?").append(" AND ");
+if(rqst.getPartitionname() == null) {
+  sb.append("\"CQ_PARTITION\" is null");
+} else {
+  sb.append("\"CQ_PARTITION\"=?");
+}
 
-pst = sqlGenerator.prepareStmtWithParameters(dbConn, sb.toString(), 
params);
-LOG.debug("Going to execute query <" + sb + ">");
-ResultSet rs = pst.executeQuery();
-if(rs.next()) {
-  long enqueuedId = rs.getLong(1);
-  String state = compactorStateToResponse(rs.getString(2).charAt(0));
-  LOG.info("Ignoring request to compact " + rqst.getDbname() + "/" + 
rqst.getTablename() +
-"/" + rqst.getPartitionname() + " since it is already " + 
quoteString(state) +
-" with id=" + enqueuedId);
-  CompactionResponse resp = new CompactionResponse(-1, 
REFUSED_RESPONSE, false);
-  resp.setErrormessage("Compaction is already scheduled with state=" + 
quoteString(state) +
-  " and id=" + enqueuedId);
-  return resp;
-}
-close(rs);
-closeStmt(pst);
-params.clear();
-StringBuilder buf = new StringBuilder("INSERT INTO 
\"COMPACTION_QUEUE\" (\"CQ_ID\", \"CQ_DATABASE\", " +
-  "\"CQ_TABLE\", ");
-String partName = rqst.getPartitionname();
-if (partName != null) buf.append("\"CQ_PARTITION\", ");
-

[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777385&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777385
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:27
Start Date: 02/Jun/22 10:27
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887804398


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
  * compactions for any resource.
  */
 handle = 
getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-stmt = dbConn.createStatement();
-
-long id = generateCompactionQueueId(stmt);
-
-GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-final ValidCompactorWriteIdList tblValidWriteIds =
-
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
-
-List<String> params = new ArrayList<>();
-StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
-  append(" (\"CQ_STATE\" IN(").
-
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
-append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-append(" AND \"CQ_DATABASE\"=?").
-append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
-if(rqst.getPartitionname() == null) {
-  sb.append("\"CQ_PARTITION\" is null");
-} else {
-  sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
-}
+try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
+  try (Statement stmt = dbConn.createStatement()) {
+

Review Comment:
   nit: new line





Issue Time Tracking
---

Worklog Id: (was: 777385)
Time Spent: 50m  (was: 40m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777380
 ]

ASF GitHub Bot logged work on HIVE-26228:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 09:52
Start Date: 02/Jun/22 09:52
Worklog Time Spent: 10m 
  Work Description: szlta commented on code in PR #3287:
URL: https://github.com/apache/hive/pull/3287#discussion_r887684398


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractBaseAlterTableAnalyzer.java:
##
@@ -161,7 +161,8 @@ protected void validateAlterTableType(Table table, 
AlterTableType op, boolean ex
 }
 if (table.isNonNative() && table.getStorageHandler() != null) {
   if (!table.getStorageHandler().isAllowedAlterOperation(op) ||

Review Comment:
   I may be missing the context here, but could you remind me why we don't 
incorporate these operation type checks within isAllowedAlterOperation() method 
of the storagehandler? (Meaning both setpartition and execute)



##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java:
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.execute;
+
+import org.apache.hadoop.hive.common.TableName;
+import org.apache.hadoop.hive.common.type.TimestampTZ;
+import org.apache.hadoop.hive.common.type.TimestampTZUtil;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.ddl.DDLWork;
+import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer;
+import org.apache.hadoop.hive.ql.ddl.table.AlterTableType;
+import org.apache.hadoop.hive.ql.exec.TaskFactory;
+import org.apache.hadoop.hive.ql.hooks.ReadEntity;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.plan.PlanUtils;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.time.ZoneId;
+import java.util.Map;
+
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME;
+import static 
org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION;
+
+/**
+ * Analyzer for ALTER TABLE ... EXECUTE commands.
+ */
+@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE)
+public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer {
+
+  public AlterTableExecuteAnalyzer(QueryState queryState) throws 
SemanticException {
+super(queryState);
+  }
+
+  @Override
+  protected void analyzeCommand(TableName tableName, Map 
partitionSpec, ASTNode command)
+  throws SemanticException {
+Table table = getTable(tableName);
+// the first child must be the execute operation type
+ASTNode executeCommandType = (ASTNode) command.getChild(0);
+validateAlterTableType(table, AlterTableType.EXECUTE, false);
+inputs.add(new ReadEntity(table));
+AlterTableExecuteDesc desc = null;
+if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) {
+  AlterTableExecuteSpec spec;
+  // the second child must be the rollback parameter
+  ASTNode child = (ASTNode) command.getChild(1);
+
+  if (child.getType() == HiveParser.StringLiteral) {

Review Comment:
   What happens if the user specifies a timezone in this literal?





Issue Time Tracking
---

Worklog Id: (was: 777380)
Time Spent: 4.5h  (was: 4h 20m)

> Implement Iceberg table rollback feature
> 
>
> Key: HIVE-26228
> URL: https://issues.apache.org/jira/browse/HIVE-26228
> Project: Hive
>  Issue Type: New Feature
>

[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777359
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:56
Start Date: 02/Jun/22 08:56
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887728099


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;

Review Comment:
   shouldn't be uppercase if it's not marked as `final`





Issue Time Tracking
---

Worklog Id: (was: 777359)
Time Spent: 0.5h  (was: 20m)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when the transaction ends
>  * Instead of having a dedicated heartbeater thread for each Compaction 
> transaction, there should be a shared heartbeater executor where the 
> heartbeat tasks can be scheduled/submitted.
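The second bullet can be sketched with a shared `ScheduledExecutorService` that heartbeats many transactions from a small pool, instead of one dedicated thread per transaction. All names below are illustrative, not Hive's; in the real service each task would borrow a pooled metastore client and call its heartbeat RPC:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: one shared scheduler heartbeats txnCount transactions.
public class SharedHeartbeaterDemo {

    public static boolean heartbeatAll(int txnCount) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        CountDownLatch firstBeats = new CountDownLatch(txnCount);
        List<ScheduledFuture<?>> tasks = new ArrayList<>();
        try {
            for (int txn = 0; txn < txnCount; txn++) {
                // In Hive this task would reuse a pooled metastore client
                // and send a heartbeat for this compaction transaction.
                tasks.add(pool.scheduleAtFixedRate(
                    firstBeats::countDown, 0, 100, TimeUnit.MILLISECONDS));
            }
            // Wait until every transaction has been heartbeated at least once.
            return firstBeats.await(5, TimeUnit.SECONDS);
        } finally {
            tasks.forEach(t -> t.cancel(true)); // transaction ended: stop its heartbeats
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(heartbeatAll(3));
    }
}
```

Cancelling a task when its transaction ends, rather than killing a dedicated thread, is what lets a small fixed pool serve an arbitrary number of concurrent compactions.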



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777358
 ]

ASF GitHub Bot logged work on HIVE-26242:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:53
Start Date: 02/Jun/22 08:53
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887725752


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;

Review Comment:
   why not `on demand holder` idiom
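   The "on demand holder" idiom the reviewer refers to can be sketched as below (class names are illustrative, not the actual `CompactionHeartbeatService` code). The JVM initializes the nested `Holder` class lazily, on first access, and class initialization is guaranteed to be thread-safe and to run exactly once, so no `volatile` field or double-checked locking is needed:

```java
// Initialization-on-demand holder idiom: lazy, thread-safe singleton
// without volatile or synchronized.
public class HolderIdiomDemo {

    static class Service {
        private Service() { }  // no external instantiation
    }

    private static class Holder {
        // Assigned when Holder is first loaded, i.e. on first getInstance() call.
        static final Service INSTANCE = new Service();
    }

    public static Service getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every call returns the same instance.
        System.out.println(getInstance() == getInstance()); // true
    }
}
```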





Issue Time Tracking
---

Worklog Id: (was: 777358)
Time Spent: 20m  (was: 10m)

> Compaction heartbeater improvements
> ---
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
>  * The metastore clients should be reused between heartbeats and closed only 
> at the end, when the transaction ends
>  * Instead of having a dedicated heartbeater thread for each Compaction 
> transaction, there should be a shared heartbeater executor where the 
> heartbeat tasks can be scheduled/submitted.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777355
 ]

ASF GitHub Bot logged work on HIVE-26278:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:43
Start Date: 02/Jun/22 08:43
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3335:
URL: https://github.com/apache/hive/pull/3335#discussion_r887717257


##
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java:
##
@@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws 
HiveException {
 hive.getPartitionsByNames(t, new ArrayList<>(), true);
   }
 
+  @Test
+  public void testGetPartitionsByNamesWithSingleBatch() throws HiveException {
+hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true);
+  }
+
+  @Test
+  public void testGetPartitionsByNamesWithMultipleEqualSizeBatches()
+  throws HiveException {
+List names = Arrays.asList("Greece", "Italy", "France", "Spain");
+hive.getPartitionsByNames(t, names, true);
+  }
+
+  @Test
+  public void testGetPartitionsByNamesWithMultipleUnequalSizeBatches()
+  throws HiveException {
+List names =
+Arrays.asList("Greece", "Italy", "France", "Spain", "Hungary");
+hive.getPartitionsByNames(t, names, true);

Review Comment:
   The goal of the tests in this class are explained in the javadoc of 
`TestHiveMetaStoreClient`. 
   
   > Tests in this class ensure that both tableId and validWriteIdList are sent 
from HS2
   
   Asserting more things seems to be out of scope for this PR.





Issue Time Tracking
---

Worklog Id: (was: 777355)
Time Spent: 1h  (was: 50m)

> Add unit tests for Hive#getPartitionsByNames using batching
> ---
>
> Key: HIVE-26278
> URL: https://issues.apache.org/jira/browse/HIVE-26278
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
>  supports decomposing requests in batches but there are no unit tests 
> checking for the ValidWriteIdList when batching is used.
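The batch decomposition the tests exercise can be sketched generically as below. The batch size of 2, which makes four names split into two equal batches and five names into unequal ones (2, 2, 1), is an assumption for illustration; in Hive the batch size is configurable and this is not the actual `Hive#getPartitionsByNames` code:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of splitting a request list into fixed-size batches.
public class BatchSplit {

    public static <T> List<List<T>> partition(List<T> names, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            // subList's end index is clamped so the last batch may be smaller.
            batches.add(names.subList(i, Math.min(i + batchSize, names.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five names with batch size 2: batches of 2, 2 and 1 (the "unequal" case).
        List<String> names = List.of("Greece", "Italy", "France", "Spain", "Hungary");
        System.out.println(partition(names, 2));
        // [[Greece, Italy], [France, Spain], [Hungary]]
    }
}
```

The point of the new tests is that arguments such as the tableId and ValidWriteIdList must be forwarded on every batch's RPC, not just the first, which is easy to get wrong in a loop like the one above.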



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777352&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777352
 ]

ASF GitHub Bot logged work on HIVE-26278:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:40
Start Date: 02/Jun/22 08:40
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3335:
URL: https://github.com/apache/hive/pull/3335#discussion_r887714198


##
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java:
##
@@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws 
HiveException {
 hive.getPartitionsByNames(t, new ArrayList<>(), true);
   }
 
+  @Test
+  public void testGetPartitionsByNamesWithSingleBatch() throws HiveException {
+hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true);

Review Comment:
   The assertions are in `TestHiveMetaStoreClient`.





Issue Time Tracking
---

Worklog Id: (was: 777352)
Time Spent: 50m  (was: 40m)

> Add unit tests for Hive#getPartitionsByNames using batching
> ---
>
> Key: HIVE-26278
> URL: https://issues.apache.org/jira/browse/HIVE-26278
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
>  supports decomposing requests in batches but there are no unit tests 
> checking for the ValidWriteIdList when batching is used.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777350
 ]

ASF GitHub Bot logged work on HIVE-26278:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:36
Start Date: 02/Jun/22 08:36
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #3335:
URL: https://github.com/apache/hive/pull/3335#discussion_r887711290


##
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java:
##
@@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws 
HiveException {
 hive.getPartitionsByNames(t, new ArrayList<>(), true);
   }
 
+  @Test
+  public void testGetPartitionsByNamesWithSingleBatch() throws HiveException {
+hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true);
+  }
+
+  @Test
+  public void testGetPartitionsByNamesWithMultipleEqualSizeBatches()
+  throws HiveException {
+    List<String> names = Arrays.asList("Greece", "Italy", "France", "Spain");
+hive.getPartitionsByNames(t, names, true);
+  }
+
+  @Test
+  public void testGetPartitionsByNamesWithMultipleUnequalSizeBatches()
+  throws HiveException {
+    List<String> names =
+        Arrays.asList("Greece", "Italy", "France", "Spain", "Hungary");
+hive.getPartitionsByNames(t, names, true);

Review Comment:
   Could you please add some asserts to these test methods, e.g., checking that
the result of `getPartitionsByNames` contains the partitions we expect.
   Example:
   ```
   List<StoredProcedure> result =
       objectStore.getAllStoredProcedures(new ListStoredProcedureRequest("hive"));
   Assert.assertEquals(5, result.size());
   assertThat(result.get(0).getName(), is("Greece"));
   ...
   ```
   It would also be good to know how many times the method
`IMetaStoreClient.getPartitionsByNames` was called. I think a mock framework
should be used for this; Mockito is already used in some Hive tests.
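
Without pulling in Mockito, the call counting the reviewer asks for can also be sketched with a hand-rolled stub; the `PartitionClient` interface and `CountingClient` class below are hypothetical stand-ins for `IMetaStoreClient`, illustrating what a `verify(..., times(n))` assertion would check:

```java
import java.util.Arrays;
import java.util.List;

public class CallCountingSketch {
    // Hypothetical client interface standing in for IMetaStoreClient.
    interface PartitionClient {
        List<String> getPartitionsByNames(List<String> names);
    }

    // Counting stub: records how many times the method was invoked, which is
    // the fact a Mockito verify(...) call would assert.
    static class CountingClient implements PartitionClient {
        int calls = 0;

        @Override
        public List<String> getPartitionsByNames(List<String> names) {
            calls++;
            return names; // echo the names back as a fake result
        }
    }

    public static void main(String[] args) {
        CountingClient client = new CountingClient();
        // Simulate batching: two batches of two names each -> two calls.
        client.getPartitionsByNames(Arrays.asList("Greece", "Italy"));
        client.getPartitionsByNames(Arrays.asList("France", "Spain"));
        System.out.println(client.calls); // prints 2
    }
}
```

A batching test could assert the counter equals the expected number of batches for each list size.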
   





Issue Time Tracking
---

Worklog Id: (was: 777350)
Time Spent: 40m  (was: 0.5h)

> Add unit tests for Hive#getPartitionsByNames using batching
> ---
>
> Key: HIVE-26278
> URL: https://issues.apache.org/jira/browse/HIVE-26278
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
>  supports decomposing requests in batches but there are no unit tests 
> checking for the ValidWriteIdList when batching is used.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777349
 ]

ASF GitHub Bot logged work on HIVE-26278:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:36
Start Date: 02/Jun/22 08:36
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on code in PR #3335:
URL: https://github.com/apache/hive/pull/3335#discussion_r887711311


##
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java:
##
@@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws 
HiveException {
 hive.getPartitionsByNames(t, new ArrayList<>(), true);
   }
 
+  @Test
+  public void testGetPartitionsByNamesWithSingleBatch() throws HiveException {
+hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true);

Review Comment:
   I don't see any assertions - how do these tests work?





Issue Time Tracking
---

Worklog Id: (was: 777349)
Time Spent: 0.5h  (was: 20m)

> Add unit tests for Hive#getPartitionsByNames using batching
> ---
>
> Key: HIVE-26278
> URL: https://issues.apache.org/jira/browse/HIVE-26278
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
>  supports decomposing requests in batches but there are no unit tests 
> checking for the ValidWriteIdList when batching is used.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777343
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:27
Start Date: 02/Jun/22 08:27
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887702617


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3712,25 +3712,26 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
 
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
 LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
 
-    List<String> params = new ArrayList<>();
 StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
   append(" (\"CQ_STATE\" IN(").
 
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
 append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
 append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
 append(" AND \"CQ_DATABASE\"=?").
 append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
 if(rqst.getPartitionname() == null) {
   sb.append("\"CQ_PARTITION\" is null");
 } else {
   sb.append("\"CQ_PARTITION\"=?");

Review Comment:
   no need to check explicitly for null; you can use `pstmt.setObject()`
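
The null branch discussed here only changes the predicate text; a minimal, simplified sketch of that construction (the `buildCompactionLookupSql` helper name is hypothetical, and the query is trimmed to the relevant predicates):

```java
public class CompactionSqlSketch {
    // Hypothetical helper mirroring the predicate logic in TxnHandler#compact:
    // a null partition name selects rows where CQ_PARTITION IS NULL, otherwise
    // a bind parameter is appended.
    static String buildCompactionLookupSql(String partitionName) {
        StringBuilder sb = new StringBuilder(
            "SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\"" +
            " WHERE \"CQ_DATABASE\"=? AND \"CQ_TABLE\"=? AND ");
        if (partitionName == null) {
            sb.append("\"CQ_PARTITION\" IS NULL");
        } else {
            sb.append("\"CQ_PARTITION\"=?");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildCompactionLookupSql(null));
        System.out.println(buildCompactionLookupSql("p=1"));
    }
}
```

Note that binding NULL into an `=?` predicate would never match under SQL three-valued logic, so the explicit `IS NULL` branch in the query text stays necessary regardless of how the non-null value is bound.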





Issue Time Tracking
---

Worklog Id: (was: 777343)
Time Spent: 40m  (was: 0.5h)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777338
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:22
Start Date: 02/Jun/22 08:22
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887697843


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Since suffixing must be done for transactional MVs, I added a condition.
Updated.





Issue Time Tracking
---

Worklog Id: (was: 777338)
Time Spent: 12h 40m  (was: 12.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777339
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:22
Start Date: 02/Jun/22 08:22
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887629211


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Should we do a transactional check for materialised views?



##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777339)
Time Spent: 12h 50m  (was: 12h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777334
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:09
Start Date: 02/Jun/22 08:09
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887686514


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3746,7 +3747,7 @@ public CompactionResponse compact(CompactionRequest rqst) 
throws MetaException {
 }
 close(rs);

Review Comment:
   could we wrap the prepared statement in try-with-resources so no explicit close is required
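
The suggestion (automatic close instead of an explicit `close(rs)`) can be illustrated with a generic try-with-resources sketch; the `Resource` class here is a stand-in for a JDBC `PreparedStatement`, which also implements `AutoCloseable`:

```java
public class TryWithResourcesSketch {
    // Stand-in for a JDBC PreparedStatement: close() runs automatically when
    // the try-with-resources block exits, even if an exception is thrown.
    static class Resource implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Kept only so the sketch can observe the resource after the block exits.
    static Resource lastResource;

    static void useResource() {
        try (Resource r = new Resource()) {
            lastResource = r;
            // work with the resource; no explicit r.close() needed
        }
    }

    public static void main(String[] args) {
        useResource();
        System.out.println(lastResource.closed); // prints true
    }
}
```

With a real `PreparedStatement`, the statement (and any `ResultSet` opened in the same try header) would be closed on every exit path, removing the need for explicit cleanup in a finally block.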





Issue Time Tracking
---

Worklog Id: (was: 777334)
Time Spent: 0.5h  (was: 20m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777333
 ]

ASF GitHub Bot logged work on HIVE-26267:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:07
Start Date: 02/Jun/22 08:07
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887685084


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3712,25 +3712,26 @@ public CompactionResponse compact(CompactionRequest 
rqst) throws MetaException {
 
TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
 LOG.debug("ValidCompactWriteIdList: " + 
tblValidWriteIds.writeToString());
 
-    List<String> params = new ArrayList<>();
 StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" 
FROM \"COMPACTION_QUEUE\" WHERE").
   append(" (\"CQ_STATE\" IN(").
 
append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
 append(") OR (\"CQ_STATE\" = 
").append(quoteChar(READY_FOR_CLEANING)).
 append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
 append(" AND \"CQ_DATABASE\"=?").
 append(" AND \"CQ_TABLE\"=?").append(" AND ");
-params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-params.add(rqst.getDbname());
-params.add(rqst.getTablename());
 if(rqst.getPartitionname() == null) {
   sb.append("\"CQ_PARTITION\" is null");
 } else {
   sb.append("\"CQ_PARTITION\"=?");
-  params.add(rqst.getPartitionname());
 }
 
-pst = sqlGenerator.prepareStmtWithParameters(dbConn, sb.toString(), 
params);
+pst = 
dbConn.prepareStatement(sqlGenerator.addEscapeCharacters(sb.toString()));
+pst.setLong(1, tblValidWriteIds.getHighWatermark());
+pst.setString(2, rqst.getDbname());
+pst.setString(3, rqst.getTablename());
+if(rqst.getPartitionname() != null) {

Review Comment:
   nit: space





Issue Time Tracking
---

Worklog Id: (was: 777333)
Time Spent: 20m  (was: 10m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> -
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777332
 ]

ASF GitHub Bot logged work on HIVE-26278:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:58
Start Date: 02/Jun/22 07:58
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3335:
URL: https://github.com/apache/hive/pull/3335#discussion_r887676826


##
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java:
##
@@ -38,10 +38,13 @@
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
 import org.apache.thrift.TException;
+
+import org.junit.Assert;

Review Comment:
   Unused import, will remove before merging.





Issue Time Tracking
---

Worklog Id: (was: 777332)
Time Spent: 20m  (was: 10m)

> Add unit tests for Hive#getPartitionsByNames using batching
> ---
>
> Key: HIVE-26278
> URL: https://issues.apache.org/jira/browse/HIVE-26278
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155]
>  supports decomposing requests in batches but there are no unit tests 
> checking for the ValidWriteIdList when batching is used.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files

2022-06-02 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita resolved HIVE-25421.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master. Thanks for the review [~lpinter] 

> Fallback from vectorization when reading Iceberg's time columns from ORC files
> --
>
> Key: HIVE-25421
> URL: https://issues.apache.org/jira/browse/HIVE-25421
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As discussed in HIVE-25420, the time column is not a native Hive type;
> reading it is more complicated and is not supported for vectorized reads.
> Trying this currently results in an exception for ORC-formatted files, so we
> should make an effort to gracefully fall back to non-vectorized reads when
> there's such a column in the query's projection.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files

2022-06-02 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-25421:
--
Description: As discussed in HIVE-25420 time column is not native Hive 
type, reading it is more complicated, and is not supported for vectorized read. 
Trying this currently results in an exception for ORC formatted files, so we 
should make an effort to gracefully fall back to non-vectorized reads when 
there's such a column in the query's projection.  (was: As discussed in 
HIVE-25420 time column is not native Hive type, reading it is more complicated, 
and is not supported for vectorized read. Trying this currently results in an 
exception, so we should make an effort to
 * either gracefully fall back to non-vectorized reads when there's such a 
column in the query's projection
 * or work around the reading issue on the execution side.)

> Fallback from vectorization when reading Iceberg's time columns from ORC files
> --
>
> Key: HIVE-25421
> URL: https://issues.apache.org/jira/browse/HIVE-25421
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in HIVE-25420, the time column is not a native Hive type;
> reading it is more complicated and is not supported for vectorized reads.
> Trying this currently results in an exception for ORC-formatted files, so we
> should make an effort to gracefully fall back to non-vectorized reads when
> there's such a column in the query's projection.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25421?focusedWorklogId=777331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777331
 ]

ASF GitHub Bot logged work on HIVE-25421:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:50
Start Date: 02/Jun/22 07:50
Worklog Time Spent: 10m 
  Work Description: szlta merged PR #3334:
URL: https://github.com/apache/hive/pull/3334




Issue Time Tracking
---

Worklog Id: (was: 777331)
Time Spent: 20m  (was: 10m)

> Fallback from vectorization when reading Iceberg's time columns from ORC files
> --
>
> Key: HIVE-25421
> URL: https://issues.apache.org/jira/browse/HIVE-25421
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As discussed in HIVE-25420, the time column is not a native Hive type;
> reading it is more complicated and is not supported for vectorized reads.
> Trying this currently results in an exception for ORC-formatted files, so we
> should make an effort to gracefully fall back to non-vectorized reads when
> there's such a column in the query's projection.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files

2022-06-02 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-25421:
--
Summary: Fallback from vectorization when reading Iceberg's time columns 
from ORC files  (was: Fallback from vectorization when reading Iceberg's time 
columns)

> Fallback from vectorization when reading Iceberg's time columns from ORC files
> --
>
> Key: HIVE-25421
> URL: https://issues.apache.org/jira/browse/HIVE-25421
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in HIVE-25420, the time column is not a native Hive type;
> reading it is more complicated and is not supported for vectorized reads.
> Trying this currently results in an exception, so we should make an effort to
>  * either gracefully fall back to non-vectorized reads when there's such a
> column in the query's projection
>  * or work around the reading issue on the execution side.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777328
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:33
Start Date: 02/Jun/22 07:33
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created 
table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + 
lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == 
LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead in the checkLocks
method; it's already a sensitive part performance-wise. I think we should try
to optimize: if it's a CTAS, we know it could only be blocked by another
artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no
need to run the expensive checkLock `BIG` query. That would also mean we could
give up without checking against the TXNS table for the type of the blocking
TXN.
   Also, I would expect IOW to behave similarly to CTAS: currently it doesn't
fail and is executed in sequential order; however, it doesn't require any extra
cleanup in case of failure. So I am OK with the selected approach, but we
should try to optimize if possible.





Issue Time Tracking
---

Worklog Id: (was: 777328)
Time Spent: 2h 50m  (was: 2h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777325
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:28
Start Date: 02/Jun/22 07:28
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created 
table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + 
lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == 
LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead in the checkLocks
method; it's already a sensitive part performance-wise. I think we should try
to optimize: if it's a CTAS, we know it could only be blocked by another
artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no
need to run the expensive checkLock `BIG` query.
   Also, I would expect IOW to behave similarly to CTAS: currently it doesn't
fail and is executed in sequential order; however, it doesn't require any extra
cleanup in case of failure. So I am OK with the selected approach, but we
should try to optimize if possible.





Issue Time Tracking
---

Worklog Id: (was: 777325)
Time Spent: 2h 40m  (was: 2.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777322
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:26
Start Date: 02/Jun/22 07:26
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created 
table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + 
lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == 
LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead in the checkLocks method; it is already a performance-sensitive part. I think we should try to optimize: if it's a CTAS, we know it could only be blocked by another artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no need to run the expensive checkLock `BIG` query.
   Also, I would expect IOW to behave similarly to CTAS: currently it doesn't fail and is executed in sequential order.





Issue Time Tracking
---

Worklog Id: (was: 777322)
Time Spent: 2.5h  (was: 2h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777316
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:22
Start Date: 02/Jun/22 07:22
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887645373


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13910,7 +13911,13 @@ private void addDbAndTabToOutputs(String[] 
qualifiedTabName, TableType type,
 for (Map.Entry<String, String> serdeMap : storageFormat.getSerdeProps().entrySet()) {
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps != null &&
+tblProps.get(TABLE_IS_CTAS) == "true" &&

Review Comment:
   could we do `Boolean.parseBoolean(..)` instead of comparing with the string
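The pitfall behind this suggestion can be shown in isolation. The sketch below is illustrative only: `TABLE_IS_CTAS` and its value are hypothetical stand-ins for the Hive constant, but the `==` vs `Boolean.parseBoolean` behavior is standard Java.

```java
import java.util.HashMap;
import java.util.Map;

public class CtasFlagCheck {
    // Hypothetical property key; the real constant lives in Hive's source.
    static final String TABLE_IS_CTAS = "created_with_ctas";

    // Fragile: == compares references, so it only matches the interned literal "true".
    static boolean isCtasByReference(Map<String, String> tblProps) {
        return tblProps != null && tblProps.get(TABLE_IS_CTAS) == "true";
    }

    // Robust: value comparison, null-safe (Boolean.parseBoolean(null) is simply false).
    static boolean isCtasParsed(Map<String, String> tblProps) {
        return tblProps != null && Boolean.parseBoolean(tblProps.get(TABLE_IS_CTAS));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(TABLE_IS_CTAS, new String("true")); // a distinct, non-interned String
        System.out.println(isCtasByReference(props)); // prints "false"
        System.out.println(isCtasParsed(props));      // prints "true"
    }
}
```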





Issue Time Tracking
---

Worklog Id: (was: 777316)
Time Spent: 2h 20m  (was: 2h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777300
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:00
Start Date: 02/Jun/22 07:00
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625663


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Updated.



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, 
enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that 
suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + 
SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777300)
Time Spent: 11h 40m  (was: 11.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777312&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777312
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:15
Start Date: 02/Jun/22 07:15
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created 
table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + 
lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == 
LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead in the checkLocks method; it is already a performance-sensitive part. I think we should try to optimize: if it's a CTAS, we know it could only be blocked by another artificial CTAS or a DROP DATABASE, so there is no need to run the expensive checkLock `BIG` query.
   Also, I would expect IOW to behave similarly to CTAS: currently it doesn't fail and is executed in sequential order.





Issue Time Tracking
---

Worklog Id: (was: 777312)
Time Spent: 2h 10m  (was: 2h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777307&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777307
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:04
Start Date: 02/Jun/22 07:04
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887629211


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticException
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Should we do a transactional check for materialized views?





Issue Time Tracking
---

Worklog Id: (was: 777307)
Time Spent: 12.5h  (was: 12h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777305&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777305
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:03
Start Date: 02/Jun/22 07:03
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887627534


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticException
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777305)
Time Spent: 12h 10m  (was: 12h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777301
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:00
Start Date: 02/Jun/22 07:00
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625817


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {
+String protoName = tblDesc.getDbTableName();
+String[] names = Utilities.getDbTableName(protoName);
+if (enableSuffixing) {
+  long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
+if (!db.databaseExists(names[0])) {
+  throw new SemanticException("ERROR: The database " + names[0] + " 
does not exist.");
+}
+
+Warehouse wh = new Warehouse(conf);
+location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + 
suffix, false);
+  } else {
+location = new Path(tblDesc.getLocation());
+  }
+
+  // Handle table translation
+  // Property modifications of the table is handled later.
+  // We are interested in the location if it has changed
+  // due to table translation.
+  Table tbl = tblDesc.toTable(conf);
+  tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Updated.
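The suffix construction in the quoted diff above can be sketched in isolation. The constant values below are assumptions for illustration; in Hive, `SOFT_DELETE_PATH_SUFFIX` and `DELTA_DIGITS` come from `AcidUtils` and may differ.

```java
public class SoftDeleteSuffix {
    // Assumed values for illustration; the real constants live in Hive's AcidUtils.
    static final String SOFT_DELETE_PATH_SUFFIX = "_v";
    static final String DELTA_DIGITS = "%07d";

    /** Builds the suffixed table directory name used for soft-delete/lockless reads. */
    static String suffixedName(String tableName, long txnId) {
        return tableName + SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
    }

    public static void main(String[] args) {
        // Table created by the transaction with id 42 gets a zero-padded suffix.
        System.out.println(suffixedName("tbl_ice", 42L)); // prints "tbl_ice_v0000042"
    }
}
```

Because the suffix embeds the creating transaction's id, a failed CTAS leaves behind a uniquely named directory that cleanup can identify later.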





Issue Time Tracking
---

Worklog Id: (was: 777301)
Time Spent: 11h 50m  (was: 11h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777298
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887624982


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticException
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Renamed to createTableOrMVUseSuffix. Updated.





Issue Time Tracking
---

Worklog Id: (was: 777298)
Time Spent: 11h 20m  (was: 11h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777306
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:03
Start Date: 02/Jun/22 07:03
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887628060


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticException
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())

Review Comment:
   Updated and added a new function called "getTableOrMVSuffix"





Issue Time Tracking
---

Worklog Id: (was: 777306)
Time Spent: 12h 20m  (was: 12h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777304
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:02
Start Date: 02/Jun/22 07:02
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887627260


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {

Review Comment:
   Location check happens now at the point where suffixing is decided. Updated.
   
https://github.com/apache/hive/pull/3281/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7616-R7618
   



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777304)
Time Spent: 12h  (was: 11h 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777299
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625106


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticException
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);

Review Comment:
   Updated.



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8229,9 +8299,17 @@ private void handleLineage(LoadTableDesc ltd, Operator 
output)
   Path tlocation = null;
   String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
   try {
+String suffix = "";
+if (AcidUtils.isTransactionalTable(destinationTable)) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777299)
Time Spent: 11.5h  (was: 11h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26274) No vectorization if query has upper case window function

2022-06-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-26274.
---
Resolution: Fixed

Pushed to master. Thanks [~abstractdog] for review.

> No vectorization if query has upper case window function
> 
>
> Key: HIVE-26274
> URL: https://issues.apache.org/jira/browse/HIVE-26274
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> CREATE TABLE t1 (a int, b int);
> EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1;
> {code}
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
>   Vertices:
> Map 1 
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vector.serde.deserialize IS true
> inputFormatFeatureSupport: [DECIMAL_64]
> featureSupportInUse: [DECIMAL_64]
> inputFileFormats: org.apache.hadoop.mapred.TextInputFormat
> allNative: true
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: llap
> Reduce Vectorization:
> enabled: true
> enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez] IS true
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> vectorized: false
>   Stage: Stage-0
> Fetch Operator
> {code}
> {code}
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26274) No vectorization if query has upper case window function

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26274?focusedWorklogId=777296&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777296
 ]

ASF GitHub Bot logged work on HIVE-26274:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:54
Start Date: 02/Jun/22 06:54
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged PR #3332:
URL: https://github.com/apache/hive/pull/3332




Issue Time Tracking
---

Worklog Id: (was: 777296)
Time Spent: 0.5h  (was: 20m)

> No vectorization if query has upper case window function
> 
>
> Key: HIVE-26274
> URL: https://issues.apache.org/jira/browse/HIVE-26274
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> CREATE TABLE t1 (a int, b int);
> EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1;
> {code}
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
>   Vertices:
> Map 1 
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vector.serde.deserialize IS true
> inputFormatFeatureSupport: [DECIMAL_64]
> featureSupportInUse: [DECIMAL_64]
> inputFileFormats: org.apache.hadoop.mapred.TextInputFormat
> allNative: true
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: llap
> Reduce Vectorization:
> enabled: true
> enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez] IS true
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> vectorized: false
>   Stage: Stage-0
> Fetch Operator
> {code}
> {code}
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26274) No vectorization if query has upper case window function

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26274?focusedWorklogId=777293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777293
 ]

ASF GitHub Bot logged work on HIVE-26274:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:46
Start Date: 02/Jun/22 06:46
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on PR #3332:
URL: https://github.com/apache/hive/pull/3332#issuecomment-1144501074

   LGTM, thanks for the patch @kasakrisz 




Issue Time Tracking
---

Worklog Id: (was: 777293)
Time Spent: 20m  (was: 10m)

> No vectorization if query has upper case window function
> 
>
> Key: HIVE-26274
> URL: https://issues.apache.org/jira/browse/HIVE-26274
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> CREATE TABLE t1 (a int, b int);
> EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1;
> {code}
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
>   Vertices:
> Map 1 
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vector.serde.deserialize IS true
> inputFormatFeatureSupport: [DECIMAL_64]
> featureSupportInUse: [DECIMAL_64]
> inputFileFormats: org.apache.hadoop.mapred.TextInputFormat
> allNative: true
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: llap
> Reduce Vectorization:
> enabled: true
> enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez] IS true
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> vectorized: false
>   Stage: Stage-0
> Fetch Operator
> {code}
> {code}
> notVectorizedReason: PTF operator: ROW_NUMBER not in 
> supported functions [avg, count, dense_rank, first_value, lag, last_value, 
> lead, max, min, rank, row_number, sum]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26285) Overwrite database metadata on original source in optimised failover.

2022-06-02 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla updated HIVE-26285:
--
Parent: HIVE-25699
Issue Type: Sub-task  (was: Bug)

> Overwrite database metadata on original source in optimised failover.
> -
>
> Key: HIVE-26285
> URL: https://issues.apache.org/jira/browse/HIVE-26285
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)