[jira] [Assigned] (HIVE-24484) Upgrade Hadoop to 3.3.3
[ https://issues.apache.org/jira/browse/HIVE-24484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reassigned HIVE-24484: --- Assignee: Ayush Saxena (was: David Mollitor) > Upgrade Hadoop to 3.3.3 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 12h 13m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26264) Iceberg integration: Fetch virtual columns on demand
[ https://issues.apache.org/jira/browse/HIVE-26264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-26264. --- Resolution: Fixed Pushed to master. Thanks [~pvary] for the review. > Iceberg integration: Fetch virtual columns on demand > > > Key: HIVE-26264 > URL: https://issues.apache.org/jira/browse/HIVE-26264 > Project: Hive > Issue Type: Bug > Components: File Formats >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Currently virtual columns are fetched from Iceberg tables if the statement > being executed is a delete or update statement, and the setting is global, so it > affects all tables referenced by the statement. The read and > write schema also depends on the operation setting. > Some statements fail due to an invalid schema: > {code} > create external table tbl_ice(a int, b string, c int) stored by iceberg > stored as orc tblproperties ('format-version'='2'); > insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), > (4, 'four', 53), (5, 'five', 54), (111, 'one', 55), (333, 'two', 56); > update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4); > {code} > {code} > See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, > or check ./ql/target/surefire-reports or > ./itests/qtest/target/surefire-reports/ for specific test case logs. 
> org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, > vertexName=Map 3, vertexId=vertex_1653493839723_0001_3_01, diagnostics=[Task > failed, taskId=task_1653493839723_0001_3_01_00, diagnostics=[TaskAttempt > 0 failed, info=[Error: Error while running task ( failure ) : > attempt_1653493839723_0001_3_01_00_0:java.lang.RuntimeException: > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:110) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:83) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > ... 18 more > Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to > java.lang.Integer > at > org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40) > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan.evaluate(GenericUDFOPLessThan.java:127) > at >
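The root cause in the trace above is a ClassCastException: when the read schema and the operation-dependent write schema disagree, a value read as a String reaches an inspector that expects an Integer. As a minimal, self-contained illustration of that failure mode (plain Java, not the actual Hive ObjectInspector classes from the trace; `badCast` is an invented helper name):

```java
public class CastFailureDemo {
    // Returns the ClassCastException message produced by the bad cast,
    // or null if the cast succeeds. This mirrors the unchecked cast that
    // JavaIntObjectInspector.get effectively performs in the stack trace.
    static String badCast(Object valueFromRow) {
        try {
            Integer a = (Integer) valueFromRow;
            return null; // cast succeeded, no error
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Column 'b' (string) surfaced where column 'a' (int) was expected,
        // as happens in HIVE-26264 when the schemas get mixed up.
        System.out.println(badCast("one"));
    }
}
```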
[jira] [Work logged] (HIVE-26264) Iceberg integration: Fetch virtual columns on demand
[ https://issues.apache.org/jira/browse/HIVE-26264?focusedWorklogId=777920&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777920 ] ASF GitHub Bot logged work on HIVE-26264: - Author: ASF GitHub Bot Created on: 03/Jun/22 04:13 Start Date: 03/Jun/22 04:13 Worklog Time Spent: 10m Work Description: kasakrisz merged PR #3324: URL: https://github.com/apache/hive/pull/3324 Issue Time Tracking --- Worklog Id: (was: 777920) Time Spent: 5.5h (was: 5h 20m) > Iceberg integration: Fetch virtual columns on demand > > > Key: HIVE-26264 > URL: https://issues.apache.org/jira/browse/HIVE-26264 > Project: Hive > Issue Type: Bug > Components: File Formats >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Currently virtual columns are fetched from Iceberg tables if the statement > being executed is a delete or update statement, and the setting is global, so it > affects all tables referenced by the statement. The read and > write schema also depends on the operation setting. > Some statements fail due to an invalid schema: > {code} > create external table tbl_ice(a int, b string, c int) stored by iceberg > stored as orc tblproperties ('format-version'='2'); > insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), > (4, 'four', 53), (5, 'five', 54), (111, 'one', 55), (333, 'two', 56); > update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4); > {code} > {code} > See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, > or check ./ql/target/surefire-reports or > ./itests/qtest/target/surefire-reports/ for specific test case logs. 
> org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, > vertexName=Map 3, vertexId=vertex_1653493839723_0001_3_01, diagnostics=[Task > failed, taskId=task_1653493839723_0001_3_01_00, diagnostics=[TaskAttempt > 0 failed, info=[Error: Error while running task ( failure ) : > attempt_1653493839723_0001_3_01_00_0:java.lang.RuntimeException: > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:110) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:83) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > ... 18 more > Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to >
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777663 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 02/Jun/22 15:59 Start Date: 02/Jun/22 15:59 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r888123399 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes CheckResult.PartitionResult prFromMetastore = new CheckResult.PartitionResult(); prFromMetastore.setPartitionName(getPartitionName(table, partition)); prFromMetastore.setTableName(partition.getTableName()); - if (!fs.exists(partPath)) { -result.getPartitionsNotOnFs().add(prFromMetastore); - } else { + if (allPartDirs.contains(partPath)) { result.getCorrectPartitions().add(prFromMetastore); +allPartDirs.remove(partPath); + } Review Comment: nit: ``` } else { // Comment here ``` Issue Time Tracking --- Worklog Id: (was: 777663) Time Spent: 6h 50m (was: 6h 40m) > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 6h 50m > Remaining Estimate: 0h > > MSCK REPAIR TABLE on a table with many partitions can be slow on cloud storage > such as S3; one case where we observed this slowness was in > HiveMetaStoreChecker.checkTable. 
> {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > 
com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at > com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) > at >
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777666 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 02/Jun/22 15:59 Start Date: 02/Jun/22 15:59 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r888123783 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes CheckResult.PartitionResult prFromMetastore = new CheckResult.PartitionResult(); prFromMetastore.setPartitionName(getPartitionName(table, partition)); prFromMetastore.setTableName(partition.getTableName()); - if (!fs.exists(partPath)) { -result.getPartitionsNotOnFs().add(prFromMetastore); - } else { + if (allPartDirs.contains(partPath)) { result.getCorrectPartitions().add(prFromMetastore); +allPartDirs.remove(partPath); + } + // There can be edge case where user can define partition directory outside of table directory + // to avoid eviction of such partitions + // we check existence of partition path which are not in table directory + // and add to result for getPartitionsNotOnFs. 
+ else { +if (!partPath.toString().contains(tablePath.toString())) { + if(!fs.exists(partPath)) { +result.getPartitionsNotOnFs().add(prFromMetastore); + } Review Comment: nit: `} else {` ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -326,10 +356,29 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes CheckResult.PartitionResult prFromMetastore = new CheckResult.PartitionResult(); prFromMetastore.setPartitionName(getPartitionName(table, partition)); prFromMetastore.setTableName(partition.getTableName()); - if (!fs.exists(partPath)) { -result.getPartitionsNotOnFs().add(prFromMetastore); - } else { + if (allPartDirs.contains(partPath)) { result.getCorrectPartitions().add(prFromMetastore); +allPartDirs.remove(partPath); + } + // There can be edge case where user can define partition directory outside of table directory + // to avoid eviction of such partitions + // we check existence of partition path which are not in table directory + // and add to result for getPartitionsNotOnFs. + else { +if (!partPath.toString().contains(tablePath.toString())) { + if(!fs.exists(partPath)) { Review Comment: nit `if (` Issue Time Tracking --- Worklog Id: (was: 777666) Time Spent: 7h (was: 6h 50m) > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 7h > Remaining Estimate: 0h > > MSCK REPAIR TABLE on a table with many partitions can be slow on cloud storage > such as S3; one case where we observed this slowness was in > HiveMetaStoreChecker.checkTable. 
> {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at >
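Stripped of the Hive types, the control flow the patch above converges on (incorporating the reviewer's nits) can be sketched as follows. This is a dependency-free illustration, not the actual Hive code: paths are plain Strings, `fs.exists` is simulated with a set, and the table-directory prefix check is simplified to `startsWith`:

```java
import java.util.Set;

public class PartitionCheckSketch {
    // Sketch: classify one metastore partition path. 'allPartDirs' is assumed
    // to have been collected with a single recursive listing of the table
    // directory, so partitions under the table dir need no extra fs.exists()
    // call; only partitions located OUTSIDE the table directory (the edge
    // case called out in the patch comment) fall back to an existence check.
    static String classify(String partPath, String tablePathWithSlash,
                           Set<String> allPartDirs, Set<String> existingPaths) {
        if (allPartDirs.remove(partPath)) {
            return "correct";                 // found by the bulk listing
        } else if (!partPath.startsWith(tablePathWithSlash)) {
            // Partition directory defined outside the table directory:
            // check the filesystem directly (simulated here by a set).
            return existingPaths.contains(partPath) ? "correct" : "notOnFs";
        } else {
            return "notOnFs";                 // under table dir but missing
        }
    }
}
```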
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=777662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777662 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 02/Jun/22 15:57 Start Date: 02/Jun/22 15:57 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r888121662 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -308,8 +308,38 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes result.getTablesNotOnFs().add(table.getTableName()); return; } + +// now check the table folder and see if we find anything +// that isn't in the metastore +Set<Path> allPartDirs = new HashSet<>(); +List<FieldSchema> partColumns = table.getPartitionKeys(); +checkPartitionDirs(tablePath, allPartDirs, Collections.unmodifiableList(getPartColNames(table))); + +if (filterExp != null) { + PartitionExpressionProxy expressionProxy = createExpressionProxy(conf); + List partitions = new ArrayList<>(); + Set partDirs = new HashSet(); + String tablePathStr = tablePath.toString(); + for (Path path : allPartDirs) { Review Comment: Could we do this with Stream, and filter and friends? 
Also we should calculate `tablePathStr.endsWith("/")` only once Issue Time Tracking --- Worklog Id: (was: 777662) Time Spent: 6h 40m (was: 6.5h) > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 6h 40m > Remaining Estimate: 0h > > MSCK REPAIR TABLE on a table with many partitions can be slow on cloud storage > such as S3; one case where we observed this slowness was in > HiveMetaStoreChecker.checkTable. > {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > 
com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at >
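A hedged sketch of what the reviewer's two suggestions in the thread above could look like when combined (plain String paths stand in for Hadoop `Path` objects, and `partDirsUnderTable` is an invented helper name, not the actual method in the patch):

```java
import java.util.Set;
import java.util.stream.Collectors;

public class PartDirStreamSketch {
    // Reviewer suggestions from the worklog: replace the explicit for-loop
    // with Stream/filter, and evaluate tablePathStr.endsWith("/") only once
    // by normalizing the prefix up front instead of per iteration.
    static Set<String> partDirsUnderTable(String tablePathStr, Set<String> allPartDirs) {
        String prefix = tablePathStr.endsWith("/") ? tablePathStr : tablePathStr + "/";
        return allPartDirs.stream()
                .filter(dir -> dir.startsWith(prefix))
                .collect(Collectors.toSet());
    }
}
```

Normalizing to a slash-terminated prefix also avoids falsely matching a sibling directory such as `/wh/table2` against table path `/wh/table`.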
[jira] [Updated] (HIVE-26277) Add unit tests for ColumnStatsAggregator classes
[ https://issues.apache.org/jira/browse/HIVE-26277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26277: -- Labels: pull-request-available (was: ) > Add unit tests for ColumnStatsAggregator classes > > > Key: HIVE-26277 > URL: https://issues.apache.org/jira/browse/HIVE-26277 > Project: Hive > Issue Type: Test > Components: Standalone Metastore, Statistics, Tests >Affects Versions: 4.0.0-alpha-2 >Reporter: Alessandro Solimando >Assignee: Alessandro Solimando >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We have no unit tests covering these classes, which also happen to contain > some complicated logic, making the absence of tests even more risky. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26277) Add unit tests for ColumnStatsAggregator classes
[ https://issues.apache.org/jira/browse/HIVE-26277?focusedWorklogId=777643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777643 ] ASF GitHub Bot logged work on HIVE-26277: - Author: ASF GitHub Bot Created on: 02/Jun/22 15:27 Start Date: 02/Jun/22 15:27 Worklog Time Spent: 10m Work Description: asolimando opened a new pull request, #3339: URL: https://github.com/apache/hive/pull/3339 ### What changes were proposed in this pull request? Adding unit tests for *ColumnStatsAggregator classes (first commit), fixing bugs discovered while writing the UTs (second commit) and fixed some warnings (subsequent commits, at the discretion of the reviewers). ### Why are the changes needed? Lack of unit tests is detrimental for code quality, as highlighted by the bugs discovered while writing the tests. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? `mvn test -Dtest.groups=org.apache.hadoop.hive.metastore.annotation.MetastoreUnitTest -Dtest='*ColumnStatsAggregatorTest.java' -pl standalone-metastore/metastore-server` Issue Time Tracking --- Worklog Id: (was: 777643) Remaining Estimate: 0h Time Spent: 10m > Add unit tests for ColumnStatsAggregator classes > > > Key: HIVE-26277 > URL: https://issues.apache.org/jira/browse/HIVE-26277 > Project: Hive > Issue Type: Test > Components: Standalone Metastore, Statistics, Tests >Affects Versions: 4.0.0-alpha-2 >Reporter: Alessandro Solimando >Assignee: Alessandro Solimando >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We have no unit tests covering these classes, which also happen to contain > some complicated logic, making the absence of tests even more risky. -- This message was sent by Atlassian Jira (v8.20.7#820007)
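The real classes under test live in `org.apache.hadoop.hive.metastore.columnstats.aggr` and are not reproduced here. As a dependency-free illustration of the kind of merge logic such unit tests pin down, consider a toy long-column aggregator (all names in this sketch are invented for the example):

```java
public class ToyLongStatsAggregator {
    // Toy stand-in for a per-partition long-column statistics record.
    record LongStats(long min, long max, long numNulls, long numDVs) {}

    // Aggregate partition-level stats into table-level stats: min of mins,
    // max of maxes, sum of null counts. Without sketches, the distinct-value
    // count can only be lower-bounded, so take the max as a cheap estimate.
    static LongStats aggregate(LongStats a, LongStats b) {
        return new LongStats(
                Math.min(a.min(), b.min()),
                Math.max(a.max(), b.max()),
                a.numNulls() + b.numNulls(),
                Math.max(a.numDVs(), b.numDVs()));
    }
}
```

Even this trivial merge has edge cases (empty partitions, null stats, overflow of the null-count sum) of the sort the new unit tests are meant to catch.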
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.3
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=777603&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777603 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 02/Jun/22 14:30 Start Date: 02/Jun/22 14:30 Worklog Time Spent: 10m Work Description: TheCodeTracer commented on PR #3279: URL: https://github.com/apache/hive/pull/3279#issuecomment-1144935338 I am sorry if it's the incorrect forum, but since this is tracking the Hadoop 3.3 upgrade in Hive, I just wanted to confirm whether there is a possibility that WebHCat might not work with Hadoop 3.3.x yet (https://issues.apache.org/jira/browse/HIVE-24083 ). Thanks a lot. Right now, it does seem that WebHCat tests are breaking anyway due to a different issue (https://issues.apache.org/jira/browse/HIVE-26286) Issue Time Tracking --- Worklog Id: (was: 777603) Time Spent: 12h 13m (was: 12.05h) > Upgrade Hadoop to 3.3.3 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 12h 13m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anmol Sundaram updated HIVE-26286: -- Description: The Hive TestWebHCatE2e tests seem to be failing due to {quote}templeton: Server failed to start: null [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to start: java.lang.NullPointerException at org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:174) at org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:220) at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:143) at org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) at org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} This seems to be caused by HIVE-18728, which introduced the breakage. 
was: The Hive TestWebHCatE2e tests seem to be failing due to {quote} templeton: Server failed to start: null 2022-05-31T21:51:13,711 ERROR [main] templeton.Main: Server failed to start: java.lang.NullPointerException: null at org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) ~[classes/:?] at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?] {quote} This seems to be caused by HIVE-18728, which introduced the breakage. > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at 
org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > This seems to be caused by HIVE-18728, which introduced the breakage. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anmol Sundaram updated HIVE-26286: -- Description: The Hive TestWebHCatE2e tests seem to be failing due to {quote} templeton: Server failed to start: null 2022-05-31T21:51:13,711 ERROR [main] templeton.Main: Server failed to start: java.lang.NullPointerException: null at org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) ~[classes/:?] at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?] {quote} This seems to be caused by HIVE-18728, which introduced the breakage. was: The Hive TestWebHCatE2e tests seem to be failing due to templeton: Server failed to start: null 2022-05-31T21:51:13,711 ERROR [main] templeton.Main: Server failed to start: java.lang.NullPointerException: null at org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) ~[classes/:?] 
at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?] This seems to be caused by [HIVE-18728|https://issues.apache.org/jira/browse/HIVE-18728], which introduced the breakage. > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote} > templeton: Server failed to start: null 2022-05-31T21:51:13,711 ERROR [main] > templeton.Main: Server failed to start: java.lang.NullPointerException: null > at > org.eclipse.jetty.server.AbstractConnector.<init>(AbstractConnector.java:186) > ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at > org.eclipse.jetty.server.AbstractNetworkConnector.<init>(AbstractNetworkConnector.java:44) > ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at > org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:215) > ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at > org.eclipse.jetty.server.ServerConnector.<init>(ServerConnector.java:145) > ~[jetty-server-9.4.43.v20210629.jar:9.4.43.v20210629] at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > ~[classes/:?] at > org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) ~[classes/:?] > > {quote} > This seems to be caused by HIVE-18728, which introduced the breakage. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777524=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777524 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:28 Start Date: 02/Jun/22 13:28 Worklog Time Spent: 10m Work Description: szlta commented on code in PR #3287: URL: https://github.com/apache/hive/pull/3287#discussion_r887952384 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.execute; + +import org.apache.hadoop.hive.common.TableName; +import org.apache.hadoop.hive.common.type.TimestampTZ; +import org.apache.hadoop.hive.common.type.TimestampTZUtil; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.ddl.DDLWork; +import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer; +import org.apache.hadoop.hive.ql.ddl.table.AlterTableType; +import org.apache.hadoop.hive.ql.exec.TaskFactory; +import org.apache.hadoop.hive.ql.hooks.ReadEntity; +import org.apache.hadoop.hive.ql.metadata.Table; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.plan.PlanUtils; +import org.apache.hadoop.hive.ql.session.SessionState; + +import java.time.ZoneId; +import java.util.Map; + +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION; + +/** + * Analyzer for ALTER TABLE ... EXECUTE commands. 
+ */ +@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE) +public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer { + + public AlterTableExecuteAnalyzer(QueryState queryState) throws SemanticException { +super(queryState); + } + + @Override + protected void analyzeCommand(TableName tableName, Map partitionSpec, ASTNode command) + throws SemanticException { +Table table = getTable(tableName); +// the first child must be the execute operation type +ASTNode executeCommandType = (ASTNode) command.getChild(0); +validateAlterTableType(table, AlterTableType.EXECUTE, false); +inputs.add(new ReadEntity(table)); +AlterTableExecuteDesc desc = null; +if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) { + AlterTableExecuteSpec spec; + // the second child must be the rollback parameter + ASTNode child = (ASTNode) command.getChild(1); + + if (child.getType() == HiveParser.StringLiteral) { Review Comment: I see, thanks! Issue Time Tracking --- Worklog Id: (was: 777524) Time Spent: 5h (was: 4h 50m) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 5h > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00') > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
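The rollback-to-timestamp variant in the issue description above picks the last snapshot committed at or before the given time. A minimal, self-contained sketch of that selection rule follows; the class and method names are hypothetical, plain Java, and not the Hive or Iceberg implementation:

```java
public class RollbackTargetResolver {

  /**
   * Illustrative only: returns the id of the latest snapshot whose commit time is
   * at or before asOfMillis, or -1 when every snapshot is newer than the timestamp.
   * The two arrays are parallel and ordered by commit time, oldest first.
   */
  public static long findRollbackTarget(long[] snapshotIds, long[] commitTimesMillis, long asOfMillis) {
    long target = -1;
    for (int i = 0; i < snapshotIds.length; i++) {
      if (commitTimesMillis[i] <= asOfMillis) {
        target = snapshotIds[i];  // keep advancing: the last match is the latest eligible snapshot
      }
    }
    return target;
  }
}
```

Rolling back to an explicit snapshot id, by contrast, needs no such search: the id names the target snapshot directly.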
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777513=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777513 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:19 Start Date: 02/Jun/22 13:19 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887943205 ## ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java: ## @@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends RewriteSemanticAnalyzer { super(queryState); } - protected void analyze(ASTNode tree) throws SemanticException { + @Override + protected ASTNode getTargetTableNode(ASTNode tree) { +// The first child should be the table we are updating / deleting from +ASTNode tabName = (ASTNode)tree.getChild(0); +assert tabName.getToken().getType() == HiveParser.TOK_TABNAME : +"Expected tablename as first child of " + operation + " but found " + tabName.getName(); +return tabName; + } + + protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) throws SemanticException { switch (tree.getToken().getType()) { case HiveParser.TOK_DELETE_FROM: - analyzeDelete(tree); + analyzeDelete(tree, tabNameNode, table); break; case HiveParser.TOK_UPDATE_TABLE: - analyzeUpdate(tree); + analyzeUpdate(tree, tabNameNode, table); break; default: throw new RuntimeException("Asked to parse token " + tree.getName() + " in " + "UpdateDeleteSemanticAnalyzer"); } } - private void analyzeUpdate(ASTNode tree) throws SemanticException { + private void analyzeUpdate(ASTNode tree, ASTNode tabNameNode, Table mTable) throws SemanticException { operation = Context.Operation.UPDATE; -reparseAndSuperAnalyze(tree); +boolean nonNativeAcid = AcidUtils.isNonNativeAcidTable(mTable); + +if (HiveConf.getBoolVar(queryState.getConf(), HiveConf.ConfVars.SPLIT_UPDATE) && !nonNativeAcid) { + analyzeSplitUpdate(tree, mTable, tabNameNode); +} 
else { + reparseAndSuperAnalyze(tree, tabNameNode, mTable); Review Comment: Altered the order to use everywhere `tree,mTable, tableNameNode` Issue Time Tracking --- Worklog Id: (was: 777513) Time Spent: 4h 10m (was: 4h) > Rewrite Update statement as Multi-insert and do Update split early > -- > > Key: HIVE-21160 > URL: https://issues.apache.org/jira/browse/HIVE-21160 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
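The `reparseAndSuperAnalyze` path discussed in this review rewrites an UPDATE statement into an INSERT over the same table. A simplified sketch of that textual rewrite is below; the helper name, the flattened clause handling, and the exact shape of the generated SQL are assumptions for illustration, not Hive's actual rewrite:

```java
// Sketch only: shows the flavor of UpdateDeleteSemanticAnalyzer's string-based
// rewrite. The real analyzer works on the AST and handles partitions, quoting,
// and ACID details that are omitted here.
public class UpdateRewriteSketch {

  static String rewriteUpdate(String fullTableName, String selectCols, String whereClause) {
    StringBuilder sb = new StringBuilder("insert into table ");
    sb.append(fullTableName)
        .append(" select ROW__ID, ").append(selectCols)  // ROW__ID identifies the rows being updated
        .append(" from ").append(fullTableName)
        .append(" where ").append(whereClause)
        .append(" sort by ROW__ID");  // ordering enables merge on read, per the analyzer's doc comment
    return sb.toString();
  }
}
```

With split-update enabled, the same idea is taken further: the statement becomes a multi-insert that writes the delete deltas and the new row versions as separate branches.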
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777514 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:19 Start Date: 02/Jun/22 13:19 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887943545 ## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: ## @@ -8292,8 +8294,8 @@ private WriteEntity generateTableWriteEntity(String dest, Table dest_tab, if ((dpCtx == null || dpCtx.getNumDPCols() == 0)) { output = new WriteEntity(dest_tab, determineWriteType(ltd, dest)); if (!outputs.add(output)) { -if(!((this instanceof MergeSemanticAnalyzer) && -conf.getBoolVar(ConfVars.MERGE_SPLIT_UPDATE))) { +if(!((this instanceof MergeSemanticAnalyzer || this instanceof UpdateDeleteSemanticAnalyzer) && Review Comment: Extracted to instance method. ## itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java: ## @@ -251,6 +252,44 @@ public void testExceucteUpdateCounts() throws Exception { assertEquals("1 row PreparedStatement update", 1, count); } + @Test + public void testExceucteMergeCounts() throws Exception { +testExceucteMergeCounts(true); + } + + @Test + public void testExceucteMergeCountsNoSplitUpdate() throws Exception { +testExceucteMergeCounts(false); + } + + private void testExceucteMergeCounts(boolean splitUpdateEarly) throws Exception { + +Statement stmt = con.createStatement(); +stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + splitUpdateEarly); +stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true"); Review Comment: Changed to SPLIT_UPDATE Issue Time Tracking --- Worklog Id: (was: 777514) Time Spent: 4h 20m (was: 4h 10m) > Rewrite Update statement as Multi-insert and do Update split early > -- > > Key: HIVE-21160 > URL: https://issues.apache.org/jira/browse/HIVE-21160 > 
Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 4h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777510=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777510 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:18 Start Date: 02/Jun/22 13:18 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887942105 ## ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java: ## @@ -91,23 +110,15 @@ private void analyzeDelete(ASTNode tree) throws SemanticException { * The sort by clause is put in there so that records come out in the right order to enable * merge on read. */ - private void reparseAndSuperAnalyze(ASTNode tree) throws SemanticException { + private void reparseAndSuperAnalyze(ASTNode tree, ASTNode tabNameNode, Table mTable) throws SemanticException { List children = tree.getChildren(); -// The first child should be the table we are updating / deleting from -ASTNode tabName = (ASTNode)children.get(0); -assert tabName.getToken().getType() == HiveParser.TOK_TABNAME : -"Expected tablename as first child of " + operation + " but found " + tabName.getName(); -Table mTable = getTargetTable(tabName); -validateTxnManager(mTable); -validateTargetTable(mTable); - // save the operation type into the query state SessionStateUtil.addResource(conf, Context.Operation.class.getSimpleName(), operation.name()); StringBuilder rewrittenQueryStr = new StringBuilder(); rewrittenQueryStr.append("insert into table "); -rewrittenQueryStr.append(getFullTableNameForSQL(tabName)); +rewrittenQueryStr.append(getFullTableNameForSQL(tabNameNode)); Review Comment: It is also used in `MergeSemanticAnalyzer` ``` String targetName = getSimpleTableName(targetNameNode); ... appendTarget(rewrittenQueryStr, targetNameNode, targetName); ... 
``` and in `AcidExportSemanticAnalyzer` ``` StringBuilder rewrittenQueryStr = generateExportQuery( newTable.getPartCols(), tokRefOrNameExportTable, (ASTNode) tokRefOrNameExportTable.parent, newTableName); ``` Issue Time Tracking --- Worklog Id: (was: 777510) Time Spent: 4h (was: 3h 50m) > Rewrite Update statement as Multi-insert and do Update split early > -- > > Key: HIVE-21160 > URL: https://issues.apache.org/jira/browse/HIVE-21160 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777504 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:09 Start Date: 02/Jun/22 13:09 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887933328 ## ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java: ## @@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends RewriteSemanticAnalyzer { super(queryState); } - protected void analyze(ASTNode tree) throws SemanticException { + @Override + protected ASTNode getTargetTableNode(ASTNode tree) { +// The first child should be the table we are updating / deleting from +ASTNode tabName = (ASTNode)tree.getChild(0); +assert tabName.getToken().getType() == HiveParser.TOK_TABNAME : +"Expected tablename as first child of " + operation + " but found " + tabName.getName(); +return tabName; + } + + protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) throws SemanticException { Review Comment: We need the quoted version of the db and table name. `TableName` object can't generate it when identifier contains `'` character because it does not escape the id but surround it with backticks. ``` public String getEscapedNotEmptyDbTable() { return db == null || db.trim().isEmpty() ? "`" + table + "`" : "`" + db + "`" + DatabaseName.CAT_DB_TABLE_SEPARATOR + "`" + table + "`"; } ``` Table object is good only in the update case. However the target table may have an alias defined in merge statements and we need that alias for the rewrite. It can be extracted from the `tabNameNode` AST. 
Issue Time Tracking --- Worklog Id: (was: 777504) Time Spent: 3h 50m (was: 3h 40m) > Rewrite Update statement as Multi-insert and do Update split early > -- > > Key: HIVE-21160 > URL: https://issues.apache.org/jira/browse/HIVE-21160 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
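The escaping gap described in the comment above — `getEscapedNotEmptyDbTable` surrounds the identifier with backticks without escaping backticks inside it — can be closed by doubling any embedded backtick before wrapping. A minimal sketch, assuming hypothetical helper names rather than Hive's `TableName` code:

```java
public class IdentifierEscaping {

  // Double embedded backticks, then wrap; this is the standard Hive-style quoting rule.
  static String quote(String identifier) {
    return "`" + identifier.replace("`", "``") + "`";
  }

  // Mirrors the db-optional logic quoted in the review comment, but with real escaping.
  static String quoteDbTable(String db, String table) {
    return (db == null || db.trim().isEmpty())
        ? quote(table)
        : quote(db) + "." + quote(table);
  }
}
```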
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777497=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777497 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 02/Jun/22 13:02 Start Date: 02/Jun/22 13:02 Worklog Time Spent: 10m Work Description: lcspinter commented on code in PR #3287: URL: https://github.com/apache/hive/pull/3287#discussion_r887926106 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.execute; + +import org.apache.hadoop.hive.common.TableName; +import org.apache.hadoop.hive.common.type.TimestampTZ; +import org.apache.hadoop.hive.common.type.TimestampTZUtil; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.ddl.DDLWork; +import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer; +import org.apache.hadoop.hive.ql.ddl.table.AlterTableType; +import org.apache.hadoop.hive.ql.exec.TaskFactory; +import org.apache.hadoop.hive.ql.hooks.ReadEntity; +import org.apache.hadoop.hive.ql.metadata.Table; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.plan.PlanUtils; +import org.apache.hadoop.hive.ql.session.SessionState; + +import java.time.ZoneId; +import java.util.Map; + +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION; + +/** + * Analyzer for ALTER TABLE ... EXECUTE commands. 
+ */ +@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE) +public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer { + + public AlterTableExecuteAnalyzer(QueryState queryState) throws SemanticException { +super(queryState); + } + + @Override + protected void analyzeCommand(TableName tableName, Map partitionSpec, ASTNode command) + throws SemanticException { +Table table = getTable(tableName); +// the first child must be the execute operation type +ASTNode executeCommandType = (ASTNode) command.getChild(0); +validateAlterTableType(table, AlterTableType.EXECUTE, false); +inputs.add(new ReadEntity(table)); +AlterTableExecuteDesc desc = null; +if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) { + AlterTableExecuteSpec spec; + // the second child must be the rollback parameter + ASTNode child = (ASTNode) command.getChild(1); + + if (child.getType() == HiveParser.StringLiteral) { Review Comment: If a timezone is present we will try to use it. https://github.com/apache/hive/blob/63326ff775206e59547b6b1332e25279e90ef5ee/common/src/java/org/apache/hadoop/hive/common/type/TimestampTZUtil.java#L103 Otherwise use the preconfigured time zone. Issue Time Tracking --- Worklog Id: (was: 777497) Time Spent: 4h 50m (was: 4h 40m) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 4h 50m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE
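The timezone behavior described in the review comment above — honor a zone if the timestamp literal carries one, otherwise fall back to the preconfigured zone — can be sketched with plain `java.time`. This mimics `TimestampTZUtil.parse` only conceptually; the formatter pattern and class name are assumptions, not the Hive implementation:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

public class ZoneFallbackParse {

  // "yyyy-MM-dd HH:mm:ss" with an optional trailing zone or offset id.
  private static final DateTimeFormatter FMT = new DateTimeFormatterBuilder()
      .appendPattern("yyyy-MM-dd HH:mm:ss")
      .optionalStart().appendLiteral(' ').appendZoneOrOffsetId().optionalEnd()
      .toFormatter();

  static ZonedDateTime parse(String text, ZoneId sessionZone) {
    TemporalAccessor parsed = FMT.parse(text);
    ZoneId zone = parsed.query(TemporalQueries.zone());  // null when the literal has no zone
    LocalDateTime ldt = LocalDateTime.from(parsed);
    return ldt.atZone(zone != null ? zone : sessionZone); // fall back to the session zone
  }
}
```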
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777493=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777493 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:58 Start Date: 02/Jun/22 12:58 Worklog Time Spent: 10m Work Description: lcspinter commented on code in PR #3287: URL: https://github.com/apache/hive/pull/3287#discussion_r887922632 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractBaseAlterTableAnalyzer.java: ## @@ -161,7 +161,8 @@ protected void validateAlterTableType(Table table, AlterTableType op, boolean ex } if (table.isNonNative() && table.getStorageHandler() != null) { if (!table.getStorageHandler().isAllowedAlterOperation(op) || Review Comment: Yes, you are right, all this can be combined into isAllowedAlterOperation() Issue Time Tracking --- Worklog Id: (was: 777493) Time Spent: 4h 40m (was: 4.5h) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 4h 40m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00') > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777488 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:39 Start Date: 02/Jun/22 12:39 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887904369 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ## @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.txn.compactor; + +import org.apache.commons.pool2.BasePooledObjectFactory; +import org.apache.commons.pool2.ObjectPool; +import org.apache.commons.pool2.PooledObject; +import org.apache.commons.pool2.impl.DefaultPooledObject; +import org.apache.commons.pool2.impl.GenericObjectPool; +import org.apache.commons.pool2.impl.GenericObjectPoolConfig; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils; +import org.apache.hadoop.hive.metastore.IMetaStoreClient; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.NoSuchElementException; +import java.util.Objects; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Singleton service responsible for heartbeating the compaction transactions. + */ +class CompactionHeartbeatService { + + + private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class); + + private static volatile CompactionHeartbeatService INSTANCE; + + /** + * Return the singleton instance of this class. + * @param conf The {@link HiveConf} used to create the service. Used only during the first call + * @return Returns the singleton {@link CompactionHeartbeatService} + * @throws IllegalStateException Thrown when the service has already been shut down.
+ */ + static CompactionHeartbeatService getInstance(HiveConf conf) { +if (INSTANCE == null) { + synchronized (CompactionHeartbeatService.class) { +if (INSTANCE == null) { + LOG.debug("Initializing compaction txn heartbeater service."); + INSTANCE = new CompactionHeartbeatService(conf); + Runtime.getRuntime().addShutdownHook(new Thread(() -> { +try { + INSTANCE.shutdown(); +} catch (InterruptedException e) { + Thread.currentThread().interrupt(); +} + })); +} + } +} +if (INSTANCE.heartbeatExecutor.isShutdown()) { + throw new IllegalStateException("The CompactionHeartbeatService is already shut down!"); +} +return INSTANCE; + } + + private final ScheduledExecutorService heartbeatExecutor; + private final ObjectPool clientPool; + private final long initialDelay; + private final long period; + private final ConcurrentHashMap tasks = new ConcurrentHashMap<>(30); + + /** + * Starts the heartbeat for the given transaction + * @param txnId The id of the compaction txn + * @param lockId The id of the lock associated with the txn + * @param tableName Required for logging only + * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started. + */ + void startHeartbeat(long txnId, long lockId, String tableName) { +if (tasks.containsKey(txnId)) { + throw new IllegalStateException("Heartbeat already started for TXN " + txnId); +} +LOG.info("Submitting heartbeat task for TXN {}", txnId); +CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName); +Future submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS); +tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask)); + } + + /** + *
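The `getInstance` body quoted above is a standard double-checked locking singleton: a `volatile` field plus a second null check under the lock so only one instance is ever created. Stripped to its essentials as an illustrative stand-in (not the Hive class, no shutdown hook):

```java
public class LazySingleton {

  // volatile guarantees safe publication of the fully constructed instance
  private static volatile LazySingleton instance;

  private LazySingleton() { }

  static LazySingleton getInstance() {
    if (instance == null) {                 // first, unsynchronized check (fast path)
      synchronized (LazySingleton.class) {
        if (instance == null) {             // second check under the lock
          instance = new LazySingleton();
        }
      }
    }
    return instance;
  }
}
```

Without `volatile`, a thread on the fast path could observe a partially constructed instance; the pattern is only correct with both pieces in place.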
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777483 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:30 Start Date: 02/Jun/22 12:30 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887896665 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ## @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.txn.compactor; + +import org.apache.commons.pool2.BasePooledObjectFactory; +import org.apache.commons.pool2.ObjectPool; +import org.apache.commons.pool2.PooledObject; +import org.apache.commons.pool2.impl.DefaultPooledObject; +import org.apache.commons.pool2.impl.GenericObjectPool; +import org.apache.commons.pool2.impl.GenericObjectPoolConfig; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils; +import org.apache.hadoop.hive.metastore.IMetaStoreClient; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.NoSuchElementException; +import java.util.Objects; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Singleton service responsible for heartbeating the compaction transactions. + */ +class CompactionHeartbeatService { + + + private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class); + + private static volatile CompactionHeartbeatService INSTANCE; + + /** + * Return the singleton instance of this class. + * @param conf The {@link HiveConf} used to create the service. Used only during the first call + * @return Returns the singleton {@link CompactionHeartbeatService} + * @throws IllegalStateException Thrown when the service has already been shut down.
+ */ + static CompactionHeartbeatService getInstance(HiveConf conf) { +if (INSTANCE == null) { + synchronized (CompactionHeartbeatService.class) { +if (INSTANCE == null) { + LOG.debug("Initializing compaction txn heartbeater service."); + INSTANCE = new CompactionHeartbeatService(conf); + Runtime.getRuntime().addShutdownHook(new Thread(() -> { +try { + INSTANCE.shutdown(); +} catch (InterruptedException e) { + Thread.currentThread().interrupt(); +} + })); +} + } +} +if (INSTANCE.heartbeatExecutor.isShutdown()) { + throw new IllegalStateException("The CompactionHeartbeatService is already shut down!"); +} +return INSTANCE; + } + + private final ScheduledExecutorService heartbeatExecutor; + private final ObjectPool clientPool; + private final long initialDelay; + private final long period; + private final ConcurrentHashMap tasks = new ConcurrentHashMap<>(30); + + /** + * Starts the heartbeat for the given transaction + * @param txnId The id of the compaction txn + * @param lockId The id of the lock associated with the txn + * @param tableName Required for logging only + * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started. + */ + void startHeartbeat(long txnId, long lockId, String tableName) { +if (tasks.containsKey(txnId)) { + throw new IllegalStateException("Heartbeat already started for TXN " + txnId); +} +LOG.info("Submitting heartbeat task for TXN {}", txnId); +CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName); +Future submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS); +tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask)); + } + + /** + *
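The `startHeartbeat` method quoted above pairs `scheduleAtFixedRate` with a `ConcurrentHashMap` registry keyed by transaction id, so each heartbeat can later be cancelled. A simplified stand-in showing the same scheme; the period, pool size, and class name are assumptions, and the metastore client pool is omitted:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class HeartbeatSketch {

  private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
  private final ConcurrentHashMap<Long, ScheduledFuture<?>> tasks = new ConcurrentHashMap<>();

  // One fixed-rate task per txn; refuses duplicates, as the quoted code does.
  void startHeartbeat(long txnId, Runnable heartbeat) {
    if (tasks.containsKey(txnId)) {
      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
    }
    // initial delay 0 and a 100 ms period for illustration; the real service
    // derives both from the configured txn timeout
    tasks.put(txnId, executor.scheduleAtFixedRate(heartbeat, 0, 100, TimeUnit.MILLISECONDS));
  }

  void stopHeartbeat(long txnId) {
    ScheduledFuture<?> task = tasks.remove(txnId);
    if (task != null) {
      task.cancel(false);  // let an in-flight heartbeat finish rather than interrupt it
    }
  }

  void shutdown() {
    executor.shutdownNow();
  }
}
```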
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777481 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:30 Start Date: 02/Jun/22 12:30 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887896104 ## itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java: ##

@@ -251,6 +252,44 @@ public void testExceucteUpdateCounts() throws Exception {
     assertEquals("1 row PreparedStatement update", 1, count);
   }

+  @Test
+  public void testExceucteMergeCounts() throws Exception {
+    testExceucteMergeCounts(true);
+  }
+
+  @Test
+  public void testExceucteMergeCountsNoSplitUpdate() throws Exception {
+    testExceucteMergeCounts(false);
+  }
+
+  private void testExceucteMergeCounts(boolean splitUpdateEarly) throws Exception {
+
+    Statement stmt = con.createStatement();
+    stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + splitUpdateEarly);
+    stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true");
+    stmt.execute("set " + ConfVars.HIVE_TXN_MANAGER.varname +
+        "=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
+
+    stmt.execute("drop table if exists transactional_crud");
+    stmt.execute("drop table if exists source");
+
+    stmt.execute("create table transactional_crud (a int, b int) stored as orc " +
+        "tblproperties('transactional'='true', 'transactional_properties'='default')");
+    stmt.executeUpdate("insert into transactional_crud values(1,2),(3,4),(5,6),(7,8),(9,10)");
+
+    stmt.execute("create table source (a int, b int) stored as orc " +
+        "tblproperties('transactional'='true', 'transactional_properties'='default')");
+    stmt.executeUpdate("insert into source values(1,12),(3,14),(9,19),(100,100)");
+
+    int count = stmt.executeUpdate("MERGE INTO transactional_crud as t using source as s ON t.a = s.a\n" +
+        "WHEN MATCHED AND s.a > 7 THEN DELETE\n" +
+        "WHEN MATCHED AND s.a <= 8 THEN UPDATE set b = 100\n" +
+        "WHEN NOT MATCHED THEN INSERT VALUES (s.a, s.b)");
+    stmt.close();
+
+    assertEquals("Statement merge", 4, count);

Review Comment: The number of affected rows. With split update I experienced that it was not 4, so I had to introduce some logic because updated records were counted twice. Issue Time Tracking --- Worklog Id: (was: 777481) Time Spent: 3h 40m (was: 3.5h) > Rewrite Update statement as Multi-insert and do Update split early > -- > > Key: HIVE-21160 > URL: https://issues.apache.org/jira/browse/HIVE-21160 > Project: Hive > Issue Type: Sub-task > Components: Transactions > Affects Versions: 3.0.0 > Reporter: Eugene Koifman > Assignee: Krisztian Kasa > Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
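The expected count of 4 in the test above can be re-derived outside Hive: each source row fires exactly one WHEN branch of the MERGE, so it should be counted once. The following standalone sketch (class name and structure are illustrative, not Hive code) replays the test data from the review thread and recomputes the affected-row count:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class MergeCountSketch {

  // Replays the MERGE from the test against the same data:
  // target = (1,2),(3,4),(5,6),(7,8),(9,10); source = (1,12),(3,14),(9,19),(100,100)
  static int mergeCount() {
    Map<Integer, Integer> target = new LinkedHashMap<>();
    target.put(1, 2); target.put(3, 4); target.put(5, 6); target.put(7, 8); target.put(9, 10);
    Map<Integer, Integer> source = new TreeMap<>();
    source.put(1, 12); source.put(3, 14); source.put(9, 19); source.put(100, 100);

    int affected = 0;
    for (Map.Entry<Integer, Integer> s : source.entrySet()) {
      if (target.containsKey(s.getKey())) {
        if (s.getKey() > 7) {
          target.remove(s.getKey());            // WHEN MATCHED AND s.a > 7 THEN DELETE  (a = 9)
        } else {
          target.put(s.getKey(), 100);          // WHEN MATCHED AND s.a <= 8 THEN UPDATE (a = 1, 3)
        }
      } else {
        target.put(s.getKey(), s.getValue());   // WHEN NOT MATCHED THEN INSERT          (a = 100)
      }
      affected++;  // each source row matches exactly one branch, so it is counted once
    }
    return affected;
  }

  public static void main(String[] args) {
    System.out.println(mergeCount()); // 4: one delete, two updates, one insert
  }
}
```

This also illustrates the double-counting risk the reviewer mentions: if the update branch were rewritten as a delete plus an insert (split update) and both halves incremented the counter, the same computation would report 6 instead of 4.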
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777478 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:22 Start Date: 02/Jun/22 12:22 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887889651 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777477 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:21 Start Date: 02/Jun/22 12:21 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887888610 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777476 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:20 Start Date: 02/Jun/22 12:20 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887887873 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777474 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:18 Start Date: 02/Jun/22 12:18 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887886357 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777473 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:17 Start Date: 02/Jun/22 12:17 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887884610 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##

+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment: and? you are not manipulating the iterator

Issue Time Tracking --- Worklog Id: (was: 777473) Time Spent: 3h 40m (was: 3.5h) > Compaction heartbeater improvements > --- > > Key: HIVE-26242 > URL: https://issues.apache.org/jira/browse/HIVE-26242 > Project: Hive > Issue Type: Improvement > Reporter: László Végh > Assignee: László Végh > Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > The Compaction heartbeater should be improved in the following ways: > * The metastore clients should be reused between heartbeats and closed only > at the end, when the transaction ends > * Instead of having a dedicated heartbeater thread for each Compaction > transaction,
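The reviewer's point ("you are not manipulating the iterator") rests on a documented property of `ConcurrentHashMap`: its iterators are weakly consistent, so removing entries from the map while iterating never throws `ConcurrentModificationException`. A minimal standalone demonstration (class name is illustrative, not Hive code):

```java
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIterationDemo {

  // Removes entries from a ConcurrentHashMap while iterating its key set.
  // Unlike HashMap, whose fail-fast iterator would throw
  // ConcurrentModificationException here, ConcurrentHashMap iterators are
  // weakly consistent and tolerate concurrent structural changes.
  static int removeEvenKeys() {
    ConcurrentHashMap<Long, String> tasks = new ConcurrentHashMap<>(30);
    for (long txnId = 1; txnId <= 10; txnId++) {
      tasks.put(txnId, "heartbeat-task-" + txnId);
    }
    for (Long txnId : tasks.keySet()) {
      if (txnId % 2 == 0) {
        tasks.remove(txnId); // safe: no fail-fast check on CHM iterators
      }
    }
    return tasks.size();
  }

  public static void main(String[] args) {
    System.out.println(removeEvenKeys()); // 5: the odd keys remain
  }
}
```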
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777472 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:16 Start Date: 02/Jun/22 12:16 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887884610 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ##

+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment: and?

Issue Time Tracking --- Worklog Id: (was: 777472) Time Spent: 3.5h (was: 3h 20m) > Compaction heartbeater improvements > --- > > Key: HIVE-26242 > URL: https://issues.apache.org/jira/browse/HIVE-26242 > Project: Hive > Issue Type: Improvement > Reporter: László Végh > Assignee: László Végh > Priority: Major > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > The Compaction heartbeater should be improved in the following ways: > * The metastore clients should be reused between heartbeats and closed only > at the end, when the transaction ends > * Instead of having a dedicated heartbeater thread for each Compaction > transaction, there should be a shared heartbeater
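The second improvement in the issue description — replacing one dedicated thread per compaction transaction with a shared heartbeater — can be sketched with only JDK classes. This is a simplified illustration, not the HIVE-26242 implementation: the class and method names are hypothetical, and a no-op `Runnable` stands in for the real metastore heartbeat RPC.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SharedHeartbeaterSketch {

  // One small scheduled pool serves every transaction instead of a thread per txn.
  private final ScheduledExecutorService heartbeatExecutor = Executors.newScheduledThreadPool(2);
  private final ConcurrentHashMap<Long, Future<?>> tasks = new ConcurrentHashMap<>();

  void startHeartbeat(long txnId, Runnable heartbeat) {
    if (tasks.containsKey(txnId)) {
      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
    }
    // Fixed-rate scheduling keeps the txn alive without a dedicated thread.
    tasks.put(txnId, heartbeatExecutor.scheduleAtFixedRate(heartbeat, 0, 50, TimeUnit.MILLISECONDS));
  }

  void stopHeartbeat(long txnId) {
    Future<?> task = tasks.remove(txnId);
    if (task != null) {
      task.cancel(false); // let an in-flight heartbeat finish, then stop
    }
  }

  int activeHeartbeats() {
    return tasks.size();
  }

  void shutdown() {
    heartbeatExecutor.shutdownNow();
  }

  public static void main(String[] args) throws InterruptedException {
    SharedHeartbeaterSketch service = new SharedHeartbeaterSketch();
    CountDownLatch threeBeats = new CountDownLatch(3);
    service.startHeartbeat(42L, threeBeats::countDown); // stand-in for the lock heartbeat call
    threeBeats.await();                                 // the shared pool fired at least 3 beats
    service.stopHeartbeat(42L);
    System.out.println(service.activeHeartbeats());     // 0
    service.shutdown();
  }
}
```

The first improvement (reusing metastore clients) would sit behind the heartbeat `Runnable`: each tick borrows an `IMetaStoreClient` from a pool such as commons-pool2's `GenericObjectPool` and returns it afterwards, rather than opening a fresh connection per heartbeat.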
[jira] [Work logged] (HIVE-21160) Rewrite Update statement as Multi-insert and do Update split early
[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=777470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777470 ] ASF GitHub Bot logged work on HIVE-21160: - Author: ASF GitHub Bot Created on: 02/Jun/22 12:15 Start Date: 02/Jun/22 12:15 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on code in PR #2855: URL: https://github.com/apache/hive/pull/2855#discussion_r887876292 ## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: ##

@@ -8292,8 +8294,8 @@ private WriteEntity generateTableWriteEntity(String dest, Table dest_tab,
     if ((dpCtx == null || dpCtx.getNumDPCols() == 0)) {
       output = new WriteEntity(dest_tab, determineWriteType(ltd, dest));
       if (!outputs.add(output)) {
-        if(!((this instanceof MergeSemanticAnalyzer) &&
-            conf.getBoolVar(ConfVars.MERGE_SPLIT_UPDATE))) {
+        if(!((this instanceof MergeSemanticAnalyzer || this instanceof UpdateDeleteSemanticAnalyzer) &&

Review Comment: I believe this could be more readable if the condition were placed in an instance method, which could be overridden by `MergeSemanticAnalyzer` / `UpdateDeleteSemanticAnalyzer`.

## itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java: ##

+    stmt.execute("set " + ConfVars.MERGE_SPLIT_UPDATE.varname + "=" + splitUpdateEarly);
+    stmt.execute("set " + ConfVars.HIVE_SUPPORT_CONCURRENCY.varname + "=true");

Review Comment: isn't `MERGE_SPLIT_UPDATE` the old option's name, while the new one is named `SPLIT_UPDATE`?
## ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
@@ -91,23 +110,15 @@ private void analyzeDelete(ASTNode tree) throws SemanticException {
    * The sort by clause is put in there so that records come out in the right order to enable
    * merge on read.
    */
-  private void reparseAndSuperAnalyze(ASTNode tree) throws SemanticException {
+  private void reparseAndSuperAnalyze(ASTNode tree, ASTNode tabNameNode, Table mTable) throws SemanticException {
     List children = tree.getChildren();
-    // The first child should be the table we are updating / deleting from
-    ASTNode tabName = (ASTNode) children.get(0);
-    assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
-        "Expected tablename as first child of " + operation + " but found " + tabName.getName();
-    Table mTable = getTargetTable(tabName);
-    validateTxnManager(mTable);
-    validateTargetTable(mTable);
-
     // save the operation type into the query state
     SessionStateUtil.addResource(conf, Context.Operation.class.getSimpleName(), operation.name());

     StringBuilder rewrittenQueryStr = new StringBuilder();
     rewrittenQueryStr.append("insert into table ");
-    rewrittenQueryStr.append(getFullTableNameForSQL(tabName));
+    rewrittenQueryStr.append(getFullTableNameForSQL(tabNameNode));

Review Comment: It seems like `tabNameNode` is only used as an argument of this method...
## ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java:
@@ -48,28 +52,43 @@ public class UpdateDeleteSemanticAnalyzer extends RewriteSemanticAnalyzer {
     super(queryState);
   }

-  protected void analyze(ASTNode tree) throws SemanticException {
+  @Override
+  protected ASTNode getTargetTableNode(ASTNode tree) {
+    // The first child should be the table we are updating / deleting from
+    ASTNode tabName = (ASTNode) tree.getChild(0);
+    assert tabName.getToken().getType() == HiveParser.TOK_TABNAME :
+        "Expected tablename as first child of " + operation + " but found " + tabName.getName();
+    return tabName;
+  }
+
+  protected void analyze(ASTNode tree, Table table, ASTNode tabNameNode) throws SemanticException {
     switch (tree.getToken().getType()) {
     case HiveParser.TOK_DELETE_FROM:
-      analyzeDelete(tree);
+      analyzeDelete(tree, tabNameNode, table);
       break;
     case HiveParser.TOK_UPDATE_TABLE:
-      analyzeUpdate(tree);
+      analyzeUpdate(tree, tabNameNode, table);
       break;
     default:
       throw new RuntimeException("Asked to parse token " + tree.getName() + " in " +
           "UpdateDeleteSemanticAnalyzer");
     }
   }

-  private void analyzeUpdate(ASTNode tree) throws
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777467 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:13
            Start Date: 02/Jun/22 12:13
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887882124

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+    if (tasks.containsKey(txnId)) {
+      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
+    }
+    LOG.info("Submitting heartbeat task for TXN {}", txnId);
+    CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName);
+    Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+    tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   *
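The `getInstance` method quoted above uses the double-checked-locking idiom. A minimal, self-contained sketch (the class name is illustrative, not the real service) shows the shape of the idiom and why the field must be `volatile`:

```java
// Minimal double-checked-locking sketch mirroring the getInstance above.
// The volatile keyword is essential: without it, a thread passing the first
// (unsynchronized) null check could observe a partially constructed
// instance due to instruction reordering. Class name is illustrative.
class LazySingleton {
    private static volatile LazySingleton INSTANCE;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        if (INSTANCE == null) {                      // fast path, no lock taken
            synchronized (LazySingleton.class) {
                if (INSTANCE == null) {              // re-check under the lock
                    INSTANCE = new LazySingleton();
                }
            }
        }
        return INSTANCE;
    }
}

public class SingletonDemo {
    public static void main(String[] args) {
        LazySingleton a = LazySingleton.getInstance();
        LazySingleton b = LazySingleton.getInstance();
        System.out.println("same instance: " + (a == b));
    }
}
```

The first null check avoids taking the lock on every call once the instance exists; the second check prevents two threads that raced past the first check from each constructing an instance.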
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777464 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:11
            Start Date: 02/Jun/22 12:11
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887880115

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777462 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:10
            Start Date: 02/Jun/22 12:10
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887879180

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777461 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:06
            Start Date: 02/Jun/22 12:06
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887876080

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777457 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:05
            Start Date: 02/Jun/22 12:05
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887875032

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777455 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:04
            Start Date: 02/Jun/22 12:04
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887874363

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
@@ -0,0 +1,256 @@
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777451 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:02
            Start Date: 02/Jun/22 12:02
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887872810

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call.
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction.
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+    if (tasks.containsKey(txnId)) {
+      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
+    }
+    LOG.info("Submitting heartbeat task for TXN {}", txnId);
+    CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName);
+    Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+    tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops
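The quoted hunk declares an `ObjectPool<IMetaStoreClient> clientPool` (commons-pool2), but the pool's construction falls outside the diff context. As a rough illustration of the reuse lifecycle the issue describes (borrow a client per heartbeat, return it instead of closing, create one only when none is idle), here is a minimal stdlib-only analogue; the class and method names are hypothetical and the real patch uses commons-pool2's `GenericObjectPool`, not this sketch:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stdlib analogue of the pooled metastore clients: borrowed per
// heartbeat, returned afterwards, created only when the pool has no idle one.
public class ClientPoolSketch {

  // Stand-in for IMetaStoreClient; the real pool holds live metastore connections.
  static final class Client {
    final int id;
    Client(int id) { this.id = id; }
  }

  private final BlockingQueue<Client> idle;
  private int created = 0;

  ClientPoolSketch(int capacity) {
    this.idle = new ArrayBlockingQueue<>(capacity);
  }

  // Analogue of borrowObject(): reuse an idle client if present, else create one.
  synchronized Client borrow() {
    Client c = idle.poll();
    if (c == null) {
      c = new Client(++created);
    }
    return c;
  }

  // Analogue of returnObject(): keep the client for the next heartbeat
  // instead of closing the connection.
  void giveBack(Client c) {
    idle.offer(c);  // a full pool would instead close the surplus client
  }

  int createdCount() { return created; }

  public static void main(String[] args) {
    ClientPoolSketch pool = new ClientPoolSketch(2);
    Client a = pool.borrow();   // creates client 1
    pool.giveBack(a);
    Client b = pool.borrow();   // reuses client 1; no new connection
    System.out.println(pool.createdCount());  // prints 1
  }
}
```

The point of the design is visible in `createdCount()`: repeated heartbeats share one connection rather than opening and closing a metastore client each time.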
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777453&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777453 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 12:03
            Start Date: 02/Jun/22 12:03
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887873535

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777450 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 11:59
            Start Date: 02/Jun/22 11:59
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887870652

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777449 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 11:58
            Start Date: 02/Jun/22 11:58
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887870152

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777447 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 11:57
            Start Date: 02/Jun/22 11:57
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887868543

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777444 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 11:54
            Start Date: 02/Jun/22 11:54
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887866285

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment:
   The map is used in the `startHeartbeat()` and `stopHeartbeat()` methods, which can be called concurrently if there are multiple worker threads configured.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 777444)
    Time Spent: 1.5h  (was: 1h 20m)

> Compaction heartbeater improvements
> -----------------------------------
>
>                 Key: HIVE-26242
>                 URL: https://issues.apache.org/jira/browse/HIVE-26242
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Végh
>            Assignee: László Végh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
> * The metastore clients should be reused between heartbeats and closed only
> at the end, when
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777443 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 02/Jun/22 11:52
            Start Date: 02/Jun/22 11:52
    Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887865099

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java:
##
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777441 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:50
Start Date: 02/Jun/22 11:50
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887863531

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java ##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+    if (tasks.containsKey(txnId)) {
+      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
+    }
+    LOG.info("Submitting heartbeat task for TXN {}", txnId);
+    CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName);
+    Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+    tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops
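The design under review (one shared ScheduledExecutorService driving all per-transaction heartbeats, instead of a dedicated thread per compaction txn) can be sketched in isolation. This is a minimal illustration under assumed names — `HeartbeatSketch`, `start`, `stop` are made up for the example and are not the actual Hive API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of the shared-executor heartbeat pattern: a small thread pool runs
// one periodic task per transaction, and a map keyed by txn id lets callers
// cancel an individual heartbeat. Illustrative only, not the Hive code.
public class HeartbeatSketch {
  private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
  private final Map<Long, Future<?>> tasks = new ConcurrentHashMap<>();

  void start(long txnId, Runnable heartbeat, long periodMs) {
    if (tasks.containsKey(txnId)) {
      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
    }
    // One periodic task per transaction, all sharing the executor's threads.
    tasks.put(txnId, executor.scheduleAtFixedRate(heartbeat, 0, periodMs, TimeUnit.MILLISECONDS));
  }

  void stop(long txnId) {
    Future<?> task = tasks.remove(txnId);
    if (task != null) {
      task.cancel(true); // interrupts an in-flight heartbeat, if any
    }
  }

  void shutdown() {
    executor.shutdownNow();
  }
}
```

The key trade-off is that N transactions no longer cost N threads; the executor's pool size bounds concurrency regardless of how many compactions are heartbeating.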
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777434 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:46
Start Date: 02/Jun/22 11:46
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887860283

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java ##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);
+
+  /**
+   * Starts the heartbeat for the given transaction
+   * @param txnId The id of the compaction txn
+   * @param lockId The id of the lock associated with the txn
+   * @param tableName Required for logging only
+   * @throws IllegalStateException Thrown when the heartbeat for the given txn has already been started.
+   */
+  void startHeartbeat(long txnId, long lockId, String tableName) {
+    if (tasks.containsKey(txnId)) {
+      throw new IllegalStateException("Heartbeat already started for TXN " + txnId);
+    }
+    LOG.info("Submitting heartbeat task for TXN {}", txnId);
+    CompactionHeartbeater heartbeater = new CompactionHeartbeater(txnId, lockId, tableName);
+    Future<?> submittedTask = heartbeatExecutor.scheduleAtFixedRate(heartbeater, initialDelay, period, TimeUnit.MILLISECONDS);
+    tasks.put(txnId, new TaskWrapper(heartbeater, submittedTask));
+  }
+
+  /**
+   * Stops
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777429 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:41
Start Date: 02/Jun/22 11:41
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887856299

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java ##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
+            try {
+              INSTANCE.shutdown();
+            } catch (InterruptedException e) {
+              Thread.currentThread().interrupt();
+            }
+          }));
+        }
+      }
+    }
+    if (INSTANCE.heartbeatExecutor.isShutdown()) {
+      throw new IllegalStateException("The CompactionHeartbeatService is already shut down!");
+    }
+    return INSTANCE;
+  }
+
+  private final ScheduledExecutorService heartbeatExecutor;
+  private final ObjectPool<IMetaStoreClient> clientPool;
+  private final long initialDelay;
+  private final long period;
+  private final ConcurrentHashMap<Long, TaskWrapper> tasks = new ConcurrentHashMap<>(30);

Review Comment: Why should it be a concurrent map? I don't see any usage of it in the code.

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java ##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
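The review question above (why a concurrent map?) comes down to atomicity: a ConcurrentHashMap only prevents races if the compound check-then-act is replaced by a single atomic call such as putIfAbsent. A small illustrative sketch — class and method names are made up, this is not the Hive code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrates the point behind the review question: a ConcurrentHashMap only
// helps if callers use its atomic operations. A separate containsKey() check
// followed by put() is a check-then-act sequence that two threads can
// interleave even on a concurrent map.
public class AtomicRegistration {
  private final ConcurrentMap<Long, String> tasks = new ConcurrentHashMap<>();

  // Race-prone: two threads can both pass the containsKey check and both put.
  boolean registerRacy(long txnId, String task) {
    if (tasks.containsKey(txnId)) {
      return false;
    }
    tasks.put(txnId, task);
    return true;
  }

  // Atomic: of two concurrent callers, exactly one sees null and wins.
  boolean registerAtomic(long txnId, String task) {
    return tasks.putIfAbsent(txnId, task) == null;
  }
}
```

So the map type is only half the answer; whether the callers can race at all (here, whether startHeartbeat is ever invoked from more than one worker thread) decides if concurrency control is needed.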
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777427 ]

ASF GitHub Bot logged work on HIVE-26267:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:38
Start Date: 02/Jun/22 11:38
Worklog Time Spent: 10m

Work Description: veghlaci05 commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887852145

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException {
          * compactions for any resource.
          */
         handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-        stmt = dbConn.createStatement();
-
-        long id = generateCompactionQueueId(stmt);
-
-        GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-            Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-        final ValidCompactorWriteIdList tblValidWriteIds =
-            TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-        LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString());
-
-        List<String> params = new ArrayList<>();
-        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
-            append(" (\"CQ_STATE\" IN(").
-            append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-            append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)).
-            append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-            append(" AND \"CQ_DATABASE\"=?").
-            append(" AND \"CQ_TABLE\"=?").append(" AND ");
-        params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-        params.add(rqst.getDbname());
-        params.add(rqst.getTablename());
-        if (rqst.getPartitionname() == null) {
-          sb.append("\"CQ_PARTITION\" is null");
-        } else {
-          sb.append("\"CQ_PARTITION\"=?");
-          params.add(rqst.getPartitionname());
-        }
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment: I googled this before splitting, and this approach was the suggested one. If you check the Java spec (https://docs.oracle.com/javase/specs/jls/se7/html/jls-14.html#jls-14.20.3.2), the extended try-with-resources construct is translated to a nested try-catch. The inner try-catch is responsible for closing the resources, so when the outer catch clause is executed (dbConn.rollback();), the connection object should already be closed. So as I understand it, you cannot execute code in the catch/finally blocks which requires the resource to be open. Did I miss something here?

Issue Time Tracking
-------------------
Worklog Id: (was: 777427)
Time Spent: 1h 50m (was: 1h 40m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> ---------------------------------------------------------------------
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
> Issue Type: Bug
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
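The behaviour debated in this thread is specified in JLS §14.20.3.2: an extended try-with-resources closes its resources before any catch or finally clause of the same statement runs, so the catch block always sees an already-closed resource. A small self-contained demo (class names are made up for the example):

```java
// Demonstrates the JLS 14.20.3.2 behaviour discussed in the review: in an
// extended try-with-resources, the resource's close() runs *before* the
// catch clause, so code in the catch block observes a closed resource.
public class CloseOrderDemo {
  static class TrackedResource implements AutoCloseable {
    boolean closed = false;
    @Override
    public void close() {
      closed = true;
    }
  }

  // Returns whether the resource had already been closed when catch executed.
  static boolean closedBeforeCatch() {
    TrackedResource observed = new TrackedResource();
    try (TrackedResource r = observed) {
      throw new RuntimeException("boom");
    } catch (RuntimeException e) {
      return observed.closed; // close() has already run at this point
    }
  }
}
```

This is exactly why a `dbConn.rollback()` in the catch clause of the same try statement would run against a closed connection; the rollback has to live in an outer try block that wraps the try-with-resources.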
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777419&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777419 ]

ASF GitHub Bot logged work on HIVE-26242:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 11:31
Start Date: 02/Jun/22 11:31
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3303:
URL: https://github.com/apache/hive/pull/3303#discussion_r887849154

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java ##
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.commons.pool2.BasePooledObjectFactory;
+import org.apache.commons.pool2.ObjectPool;
+import org.apache.commons.pool2.PooledObject;
+import org.apache.commons.pool2.impl.DefaultPooledObject;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NoSuchElementException;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Singleton service responsible for heartbeating the compaction transactions.
+ */
+class CompactionHeartbeatService {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class);
+
+  private static volatile CompactionHeartbeatService INSTANCE;
+
+  /**
+   * Return the singleton instance of this class.
+   * @param conf The {@link HiveConf} used to create the service. Used only during the first call
+   * @return Returns the singleton {@link CompactionHeartbeatService}
+   * @throws IllegalStateException Thrown when the service has already been shut down.
+   */
+  static CompactionHeartbeatService getInstance(HiveConf conf) {
+    if (INSTANCE == null) {
+      synchronized (CompactionHeartbeatService.class) {
+        if (INSTANCE == null) {
+          LOG.debug("Initializing compaction txn heartbeater service.");
+          INSTANCE = new CompactionHeartbeatService(conf);
+          Runtime.getRuntime().addShutdownHook(new Thread(() -> {

Review Comment: Should we use ShutdownHookManager.addShutdownHook(runnable, SHUTDOWN_HOOK_PRIORITY)?

Issue Time Tracking
-------------------
Worklog Id: (was: 777419)
Time Spent: 40m (was: 0.5h)

> Compaction heartbeater improvements
> -----------------------------------
>
> Key: HIVE-26242
> URL: https://issues.apache.org/jira/browse/HIVE-26242
> Project: Hive
> Issue Type: Improvement
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The Compaction heartbeater should be improved in the following ways:
> * The metastore clients should be reused between heartbeats and closed only
> at the end, when the transaction ends
> * Instead of having a dedicated heartbeater thread for each Compaction
> transaction, there should be a shared heartbeater executor where the
> heartbeat tasks can be scheduled/submitted.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777395 ]

ASF GitHub Bot logged work on HIVE-26267:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:43
Start Date: 02/Jun/22 10:43
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException {
          * compactions for any resource.
          */
         handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-        stmt = dbConn.createStatement();
-
-        long id = generateCompactionQueueId(stmt);
-
-        GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-            Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-        final ValidCompactorWriteIdList tblValidWriteIds =
-            TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-        LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString());
-
-        List<String> params = new ArrayList<>();
-        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
-            append(" (\"CQ_STATE\" IN(").
-            append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-            append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)).
-            append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-            append(" AND \"CQ_DATABASE\"=?").
-            append(" AND \"CQ_TABLE\"=?").append(" AND ");
-        params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-        params.add(rqst.getDbname());
-        params.add(rqst.getTablename());
-        if (rqst.getPartitionname() == null) {
-          sb.append("\"CQ_PARTITION\" is null");
-        } else {
-          sb.append("\"CQ_PARTITION\"=?");
-          params.add(rqst.getPartitionname());
-        }
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment: No need for an extra try; you could initialize dbConn and stmt in a single try:
try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
     Statement stmt = dbConn.createStatement()) {

Issue Time Tracking
-------------------
Worklog Id: (was: 777395)
Time Spent: 1h 40m (was: 1.5h)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> ---------------------------------------------------------------------
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
> Issue Type: Bug
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
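The query-assembly pattern shown in this diff (growing a SQL string and a parameter list in lock-step, appending a '?' placeholder whenever a parameter is added) can be shown in isolation. This is a hedged sketch with made-up names, not the actual TxnHandler code, and it deliberately covers only the optional-partition predicate:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the placeholder/parameter bookkeeping from the review: every
// optional predicate appends both a '?' and a matching entry to the params
// list, so placeholder count and parameter count always stay in sync.
public class QueryAssembly {
  static String buildQuery(String db, String table, String partition, List<String> params) {
    StringBuilder sb = new StringBuilder(
        "SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\""
            + " WHERE \"CQ_DATABASE\"=? AND \"CQ_TABLE\"=? AND ");
    params.add(db);
    params.add(table);
    if (partition == null) {
      sb.append("\"CQ_PARTITION\" IS NULL"); // no placeholder, no parameter
    } else {
      sb.append("\"CQ_PARTITION\"=?");
      params.add(partition);
    }
    return sb.toString();
  }
}
```

Binding then walks the list by index, e.g. `stmt.setString(i + 1, params.get(i))` on the PreparedStatement; keeping string and list in lock-step is what guarantees the indices line up across database backends.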
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777393 ]

ASF GitHub Bot logged work on HIVE-26267:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:43
Start Date: 02/Jun/22 10:43
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException {
          * compactions for any resource.
          */
         handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-        stmt = dbConn.createStatement();
-
-        long id = generateCompactionQueueId(stmt);
-
-        GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-            Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-        final ValidCompactorWriteIdList tblValidWriteIds =
-            TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-        LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString());
-
-        List<String> params = new ArrayList<>();
-        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
-            append(" (\"CQ_STATE\" IN(").
-            append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-            append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)).
-            append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-            append(" AND \"CQ_DATABASE\"=?").
-            append(" AND \"CQ_TABLE\"=?").append(" AND ");
-        params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-        params.add(rqst.getDbname());
-        params.add(rqst.getTablename());
-        if (rqst.getPartitionname() == null) {
-          sb.append("\"CQ_PARTITION\" is null");
-        } else {
-          sb.append("\"CQ_PARTITION\"=?");
-          params.add(rqst.getPartitionname());
-        }
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment: No need for an extra try; you could initialize dbConn and stmt in a single try:
try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
     Statement stmt = dbConn.createStatement()) {

Issue Time Tracking
-------------------
Worklog Id: (was: 777393)
Time Spent: 1.5h (was: 1h 20m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> ---------------------------------------------------------------------
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
> Issue Type: Bug
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777388 ]

ASF GitHub Bot logged work on HIVE-26267:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:36
Start Date: 02/Jun/22 10:36
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException {
          * compactions for any resource.
          */
         handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-        stmt = dbConn.createStatement();
-
-        long id = generateCompactionQueueId(stmt);
-
-        GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-            Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-        final ValidCompactorWriteIdList tblValidWriteIds =
-            TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-        LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString());
-
-        List<String> params = new ArrayList<>();
-        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
-            append(" (\"CQ_STATE\" IN(").
-            append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-            append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)).
-            append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-            append(" AND \"CQ_DATABASE\"=?").
-            append(" AND \"CQ_TABLE\"=?").append(" AND ");
-        params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-        params.add(rqst.getDbname());
-        params.add(rqst.getTablename());
-        if (rqst.getPartitionname() == null) {
-          sb.append("\"CQ_PARTITION\" is null");
-        } else {
-          sb.append("\"CQ_PARTITION\"=?");
-          params.add(rqst.getPartitionname());
-        }
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment: No need for an extra try; you could initialize dbConn and stmt in the above block (line 3693) in a single try:
try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
     Statement stmt = dbConn.createStatement()) {

Issue Time Tracking
-------------------
Worklog Id: (was: 777388)
Time Spent: 1h 20m (was: 1h 10m)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> ---------------------------------------------------------------------
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
> Issue Type: Bug
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777387 ]

ASF GitHub Bot logged work on HIVE-26267:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jun/22 10:34
Start Date: 02/Jun/22 10:34
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3325:
URL: https://github.com/apache/hive/pull/3325#discussion_r887809222

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ##
@@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException {
          * compactions for any resource.
          */
         handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name());
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
-        stmt = dbConn.createStatement();
-
-        long id = generateCompactionQueueId(stmt);
-
-        GetValidWriteIdsRequest request = new GetValidWriteIdsRequest(
-            Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename())));
-        final ValidCompactorWriteIdList tblValidWriteIds =
-            TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0));
-        LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString());
-
-        List<String> params = new ArrayList<>();
-        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE").
-            append(" (\"CQ_STATE\" IN(").
-            append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)).
-            append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)).
-            append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))").
-            append(" AND \"CQ_DATABASE\"=?").
-            append(" AND \"CQ_TABLE\"=?").append(" AND ");
-        params.add(Long.toString(tblValidWriteIds.getHighWatermark()));
-        params.add(rqst.getDbname());
-        params.add(rqst.getTablename());
-        if (rqst.getPartitionname() == null) {
-          sb.append("\"CQ_PARTITION\" is null");
-        } else {
-          sb.append("\"CQ_PARTITION\"=?");
-          params.add(rqst.getPartitionname());
-        }
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {

Review Comment: No need for an extra try; you could initialize dbConn in the above block.

Issue Time Tracking
-------------------
Worklog Id: (was: 777387)
Time Spent: 1h 10m (was: 1h)

> Addendum to HIVE-26107: prepared statement is not working on Postgres
> ---------------------------------------------------------------------
>
> Key: HIVE-26267
> URL: https://issues.apache.org/jira/browse/HIVE-26267
> Project: Hive
> Issue Type: Bug
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> The assembled prepared statement in
> {code:java}
> org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code}
> does not work for Postgres DB.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: perpared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777386=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777386 ] ASF GitHub Bot logged work on HIVE-26267: - Author: ASF GitHub Bot Created on: 02/Jun/22 10:28 Start Date: 02/Jun/22 10:28 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3325: URL: https://github.com/apache/hive/pull/3325#discussion_r887804677 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { * compactions for any resource. */ handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name()); -dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED); -stmt = dbConn.createStatement(); - -long id = generateCompactionQueueId(stmt); - -GetValidWriteIdsRequest request = new GetValidWriteIdsRequest( -Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename(; -final ValidCompactorWriteIdList tblValidWriteIds = - TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0)); -LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString()); - -List params = new ArrayList<>(); -StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE"). - append(" (\"CQ_STATE\" IN("). - append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)). -append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)). -append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))"). -append(" AND \"CQ_DATABASE\"=?"). 
-append(" AND \"CQ_TABLE\"=?").append(" AND "); -params.add(Long.toString(tblValidWriteIds.getHighWatermark())); -params.add(rqst.getDbname()); -params.add(rqst.getTablename()); -if(rqst.getPartitionname() == null) { - sb.append("\"CQ_PARTITION\" is null"); -} else { - sb.append("\"CQ_PARTITION\"=?"); - params.add(rqst.getPartitionname()); -} +try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) { + try (Statement stmt = dbConn.createStatement()) { + +long id = generateCompactionQueueId(stmt); + +GetValidWriteIdsRequest request = new GetValidWriteIdsRequest( +Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename(; +final ValidCompactorWriteIdList tblValidWriteIds = + TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0)); +LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString()); + +StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE"). +append(" (\"CQ_STATE\" IN("). + append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)). +append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)). +append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))"). +append(" AND \"CQ_DATABASE\"=?"). 
+append(" AND \"CQ_TABLE\"=?").append(" AND "); +if(rqst.getPartitionname() == null) { + sb.append("\"CQ_PARTITION\" is null"); +} else { + sb.append("\"CQ_PARTITION\"=?"); +} -pst = sqlGenerator.prepareStmtWithParameters(dbConn, sb.toString(), params); -LOG.debug("Going to execute query <" + sb + ">"); -ResultSet rs = pst.executeQuery(); -if(rs.next()) { - long enqueuedId = rs.getLong(1); - String state = compactorStateToResponse(rs.getString(2).charAt(0)); - LOG.info("Ignoring request to compact " + rqst.getDbname() + "/" + rqst.getTablename() + -"/" + rqst.getPartitionname() + " since it is already " + quoteString(state) + -" with id=" + enqueuedId); - CompactionResponse resp = new CompactionResponse(-1, REFUSED_RESPONSE, false); - resp.setErrormessage("Compaction is already scheduled with state=" + quoteString(state) + - " and id=" + enqueuedId); - return resp; -} -close(rs); -closeStmt(pst); -params.clear(); -StringBuilder buf = new StringBuilder("INSERT INTO \"COMPACTION_QUEUE\" (\"CQ_ID\", \"CQ_DATABASE\", " + - "\"CQ_TABLE\", "); -String partName = rqst.getPartitionname(); -if (partName != null) buf.append("\"CQ_PARTITION\", "); -
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: perpared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777385 ] ASF GitHub Bot logged work on HIVE-26267: - Author: ASF GitHub Bot Created on: 02/Jun/22 10:27 Start Date: 02/Jun/22 10:27 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3325: URL: https://github.com/apache/hive/pull/3325#discussion_r887804398 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3701,122 +3698,125 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { * compactions for any resource. */ handle = getMutexAPI().acquireLock(MUTEX_KEY.CompactionScheduler.name()); -dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED); -stmt = dbConn.createStatement(); - -long id = generateCompactionQueueId(stmt); - -GetValidWriteIdsRequest request = new GetValidWriteIdsRequest( -Collections.singletonList(getFullTableName(rqst.getDbname(), rqst.getTablename(; -final ValidCompactorWriteIdList tblValidWriteIds = - TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0)); -LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString()); - -List params = new ArrayList<>(); -StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE"). - append(" (\"CQ_STATE\" IN("). - append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)). -append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)). -append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))"). -append(" AND \"CQ_DATABASE\"=?"). 
-append(" AND \"CQ_TABLE\"=?").append(" AND "); -params.add(Long.toString(tblValidWriteIds.getHighWatermark())); -params.add(rqst.getDbname()); -params.add(rqst.getTablename()); -if(rqst.getPartitionname() == null) { - sb.append("\"CQ_PARTITION\" is null"); -} else { - sb.append("\"CQ_PARTITION\"=?"); - params.add(rqst.getPartitionname()); -} +try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) { + try (Statement stmt = dbConn.createStatement()) { + Review Comment: nit: new line Issue Time Tracking --- Worklog Id: (was: 777385) Time Spent: 50m (was: 40m) > Addendum to HIVE-26107: perpared statement is not working on Postgres > - > > Key: HIVE-26267 > URL: https://issues.apache.org/jira/browse/HIVE-26267 > Project: Hive > Issue Type: Bug >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > The assembled prepared statement in > {code:java} > org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code} > does not work for Postgres DB. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=777380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777380 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 02/Jun/22 09:52 Start Date: 02/Jun/22 09:52 Worklog Time Spent: 10m Work Description: szlta commented on code in PR #3287: URL: https://github.com/apache/hive/pull/3287#discussion_r887684398 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractBaseAlterTableAnalyzer.java: ## @@ -161,7 +161,8 @@ protected void validateAlterTableType(Table table, AlterTableType op, boolean ex } if (table.isNonNative() && table.getStorageHandler() != null) { if (!table.getStorageHandler().isAllowedAlterOperation(op) || Review Comment: I may be missing the context here, but could you remind me why we don't incorporate these operation type checks within isAllowedAlterOperation() method of the storagehandler? (Meaning both setpartition and execute) ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/execute/AlterTableExecuteAnalyzer.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.execute; + +import org.apache.hadoop.hive.common.TableName; +import org.apache.hadoop.hive.common.type.TimestampTZ; +import org.apache.hadoop.hive.common.type.TimestampTZUtil; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.ddl.DDLWork; +import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer; +import org.apache.hadoop.hive.ql.ddl.table.AlterTableType; +import org.apache.hadoop.hive.ql.exec.TaskFactory; +import org.apache.hadoop.hive.ql.hooks.ReadEntity; +import org.apache.hadoop.hive.ql.metadata.Table; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.plan.PlanUtils; +import org.apache.hadoop.hive.ql.session.SessionState; + +import java.time.ZoneId; +import java.util.Map; + +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.ExecuteOperationType.ROLLBACK; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.TIME; +import static org.apache.hadoop.hive.ql.parse.AlterTableExecuteSpec.RollbackSpec.RollbackType.VERSION; + +/** + * Analyzer for ALTER TABLE ... EXECUTE commands. 
+ */ +@DDLType(types = HiveParser.TOK_ALTERTABLE_EXECUTE) +public class AlterTableExecuteAnalyzer extends AbstractAlterTableAnalyzer { + + public AlterTableExecuteAnalyzer(QueryState queryState) throws SemanticException { +super(queryState); + } + + @Override + protected void analyzeCommand(TableName tableName, Map partitionSpec, ASTNode command) + throws SemanticException { +Table table = getTable(tableName); +// the first child must be the execute operation type +ASTNode executeCommandType = (ASTNode) command.getChild(0); +validateAlterTableType(table, AlterTableType.EXECUTE, false); +inputs.add(new ReadEntity(table)); +AlterTableExecuteDesc desc = null; +if (HiveParser.KW_ROLLBACK == executeCommandType.getType()) { + AlterTableExecuteSpec spec; + // the second child must be the rollback parameter + ASTNode child = (ASTNode) command.getChild(1); + + if (child.getType() == HiveParser.StringLiteral) { Review Comment: What happens if the user specifies a timezone in this literal? Issue Time Tracking --- Worklog Id: (was: 777380) Time Spent: 4.5h (was: 4h 20m) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >
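The review question above — what happens when the user's rollback literal carries its own timezone — hinges on how the string is parsed. Hive's actual parsing goes through `TimestampTZUtil` and may behave differently; the sketch below only illustrates the underlying issue with assumed `java.time` patterns: a literal that names a zone should win over the session zone, while a zoneless literal is interpreted in the session zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class RollbackLiteralDemo {
    // Illustrative patterns, not Hive's actual grammar.
    static final DateTimeFormatter ZONED =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss VV");
    static final DateTimeFormatter LOCAL =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // A literal carrying its own zone wins; otherwise the session zone applies.
    static long toEpochMillis(String literal, ZoneId sessionZone) {
        try {
            return ZonedDateTime.parse(literal, ZONED).toInstant().toEpochMilli();
        } catch (DateTimeParseException e) {
            return LocalDateTime.parse(literal, LOCAL)
                .atZone(sessionZone).toInstant().toEpochMilli();
        }
    }

    public static void main(String[] args) {
        long utc = toEpochMillis("2022-06-02 10:00:00 UTC", ZoneId.of("UTC"));
        long paris = toEpochMillis("2022-06-02 10:00:00", ZoneId.of("Europe/Paris"));
        // Paris is UTC+2 in June, so the zoneless literal resolves to an
        // instant two hours earlier than the explicit-UTC literal.
        System.out.println(utc - paris);
    }
}
```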
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777359 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:56 Start Date: 02/Jun/22 08:56 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887728099 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ## @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.txn.compactor; + +import org.apache.commons.pool2.BasePooledObjectFactory; +import org.apache.commons.pool2.ObjectPool; +import org.apache.commons.pool2.PooledObject; +import org.apache.commons.pool2.impl.DefaultPooledObject; +import org.apache.commons.pool2.impl.GenericObjectPool; +import org.apache.commons.pool2.impl.GenericObjectPoolConfig; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils; +import org.apache.hadoop.hive.metastore.IMetaStoreClient; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.NoSuchElementException; +import java.util.Objects; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Singleton service responsible for heartbeating the compaction transactions. 
+ */ +class CompactionHeartbeatService { + + + private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class); + + private static volatile CompactionHeartbeatService INSTANCE; Review Comment: shouldn't be uppercase if it's not marked as `final` Issue Time Tracking --- Worklog Id: (was: 777359) Time Spent: 0.5h (was: 20m) > Compaction heartbeater improvements > --- > > Key: HIVE-26242 > URL: https://issues.apache.org/jira/browse/HIVE-26242 > Project: Hive > Issue Type: Improvement >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The Compaction heartbeater should be improved the following ways: > * The metastore clients should be reused between heartbeats and closed only > at the end, when the transaction ends > * Instead of having a dedicated heartbeater thread for each Compaction > transaction, there should be shared a heartbeater executor where the > heartbeat tasks can be scheduled/submitted. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26242) Compaction heartbeater improvements
[ https://issues.apache.org/jira/browse/HIVE-26242?focusedWorklogId=777358=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777358 ] ASF GitHub Bot logged work on HIVE-26242: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:53 Start Date: 02/Jun/22 08:53 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3303: URL: https://github.com/apache/hive/pull/3303#discussion_r887725752 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionHeartbeatService.java: ## @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.txn.compactor; + +import org.apache.commons.pool2.BasePooledObjectFactory; +import org.apache.commons.pool2.ObjectPool; +import org.apache.commons.pool2.PooledObject; +import org.apache.commons.pool2.impl.DefaultPooledObject; +import org.apache.commons.pool2.impl.GenericObjectPool; +import org.apache.commons.pool2.impl.GenericObjectPoolConfig; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils; +import org.apache.hadoop.hive.metastore.IMetaStoreClient; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.NoSuchElementException; +import java.util.Objects; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; + +/** + * Singleton service responsible for heartbeating the compaction transactions. 
+ */ +class CompactionHeartbeatService { + + + private static final Logger LOG = LoggerFactory.getLogger(CompactionHeartbeatService.class); + + private static volatile CompactionHeartbeatService INSTANCE; Review Comment: why not `on demand holder` idiom Issue Time Tracking --- Worklog Id: (was: 777358) Time Spent: 20m (was: 10m) > Compaction heartbeater improvements > --- > > Key: HIVE-26242 > URL: https://issues.apache.org/jira/browse/HIVE-26242 > Project: Hive > Issue Type: Improvement >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The Compaction heartbeater should be improved the following ways: > * The metastore clients should be reused between heartbeats and closed only > at the end, when the transaction ends > * Instead of having a dedicated heartbeater thread for each Compaction > transaction, there should be shared a heartbeater executor where the > heartbeat tasks can be scheduled/submitted. -- This message was sent by Atlassian Jira (v8.20.7#820007)
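The "on demand holder" idiom the reviewer mentions replaces the `volatile INSTANCE` field and double-checked locking entirely: the JVM guarantees a nested class is loaded lazily and its static initializer runs exactly once, thread-safely. A minimal sketch (the `Service` stand-in is hypothetical, not the real `CompactionHeartbeatService`):

```java
public class HolderIdiomDemo {
    // Hypothetical stand-in for the singleton service under review.
    static final class Service {
        final String name = "heartbeat";
    }

    // Initialization-on-demand holder: Holder is not loaded until
    // getInstance() is first called, and class loading is serialized
    // by the JVM, so no volatile field or synchronization is needed.
    private static final class Holder {
        static final Service INSTANCE = new Service();
    }

    static Service getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every call observes the same fully constructed instance.
        System.out.println(getInstance() == getInstance());
    }
}
```

This also resolves the naming nit from the other comment: in the holder variant `INSTANCE` really is `final`, so the all-caps constant convention applies cleanly.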
[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching
[ https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777355=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777355 ] ASF GitHub Bot logged work on HIVE-26278: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:43 Start Date: 02/Jun/22 08:43 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3335: URL: https://github.com/apache/hive/pull/3335#discussion_r887717257 ## ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java: ## @@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws HiveException { hive.getPartitionsByNames(t, new ArrayList<>(), true); } + @Test + public void testGetPartitionsByNamesWithSingleBatch() throws HiveException { +hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true); + } + + @Test + public void testGetPartitionsByNamesWithMultipleEqualSizeBatches() + throws HiveException { +List names = Arrays.asList("Greece", "Italy", "France", "Spain"); +hive.getPartitionsByNames(t, names, true); + } + + @Test + public void testGetPartitionsByNamesWithMultipleUnequalSizeBatches() + throws HiveException { +List names = +Arrays.asList("Greece", "Italy", "France", "Spain", "Hungary"); +hive.getPartitionsByNames(t, names, true); Review Comment: The goal of the tests in this class are explained in the javadoc of `TestHiveMetaStoreClient`. > Tests in this class ensure that both tableId and validWriteIdList are sent from HS2 Asserting more things seems to be out of scope for this PR. 
Issue Time Tracking --- Worklog Id: (was: 777355) Time Spent: 1h (was: 50m) > Add unit tests for Hive#getPartitionsByNames using batching > --- > > Key: HIVE-26278 > URL: https://issues.apache.org/jira/browse/HIVE-26278 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155] > supports decomposing requests in batches but there are no unit tests > checking for the ValidWriteIdList when batching is used. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching
[ https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777352=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777352 ] ASF GitHub Bot logged work on HIVE-26278: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:40 Start Date: 02/Jun/22 08:40 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3335: URL: https://github.com/apache/hive/pull/3335#discussion_r887714198 ## ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java: ## @@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws HiveException { hive.getPartitionsByNames(t, new ArrayList<>(), true); } + @Test + public void testGetPartitionsByNamesWithSingleBatch() throws HiveException { +hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true); Review Comment: The assertions are in `TestHiveMetaStoreClient`. Issue Time Tracking --- Worklog Id: (was: 777352) Time Spent: 50m (was: 40m) > Add unit tests for Hive#getPartitionsByNames using batching > --- > > Key: HIVE-26278 > URL: https://issues.apache.org/jira/browse/HIVE-26278 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155] > supports decomposing requests in batches but there are no unit tests > checking for the ValidWriteIdList when batching is used. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching
[ https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777350=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777350 ] ASF GitHub Bot logged work on HIVE-26278: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:36 Start Date: 02/Jun/22 08:36 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #3335: URL: https://github.com/apache/hive/pull/3335#discussion_r887711290 ## ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java: ## @@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws HiveException { hive.getPartitionsByNames(t, new ArrayList<>(), true); } + @Test + public void testGetPartitionsByNamesWithSingleBatch() throws HiveException { +hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true); + } + + @Test + public void testGetPartitionsByNamesWithMultipleEqualSizeBatches() + throws HiveException { +List names = Arrays.asList("Greece", "Italy", "France", "Spain"); +hive.getPartitionsByNames(t, names, true); + } + + @Test + public void testGetPartitionsByNamesWithMultipleUnequalSizeBatches() + throws HiveException { +List names = +Arrays.asList("Greece", "Italy", "France", "Spain", "Hungary"); +hive.getPartitionsByNames(t, names, true); Review Comment: Could you please add some asserts into these test methods like is the result of `getPartitionsByNames` contains partitions we expect. Example: ``` List result = objectStore.getAllStoredProcedures(new ListStoredProcedureRequest("hive")); Assert.assertEquals(5, result.size()); assertThat(result.get(0).getName, is("Greece")); ... ``` It would be also good to know how many times was the method `IMetaStoreClient.getPartitionsByNames` called. I think to get this a mock framework should be used. Mockito is already used in some Hive tests. 
Issue Time Tracking --- Worklog Id: (was: 777350) Time Spent: 40m (was: 0.5h) > Add unit tests for Hive#getPartitionsByNames using batching > --- > > Key: HIVE-26278 > URL: https://issues.apache.org/jira/browse/HIVE-26278 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155] > supports decomposing requests in batches but there are no unit tests > checking for the ValidWriteIdList when batching is used. -- This message was sent by Atlassian Jira (v8.20.7#820007)
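The reviewers above disagree on whether the batching tests should also assert how many underlying metastore calls are made. Independent of Mockito, the expected call count is simple ceiling arithmetic over the batch size; a hand-rolled sketch (the batch size of 2 and the country names are illustrative, not Hive's defaults):

```java
import java.util.Arrays;
import java.util.List;

public class BatchingDemo {
    // Counts how many round trips a batched fetch would perform:
    // one call per subList of at most batchSize names.
    static int batchedFetchCalls(List<String> names, int batchSize) {
        int calls = 0;
        for (int start = 0; start < names.size(); start += batchSize) {
            int end = Math.min(start + batchSize, names.size());
            List<String> batch = names.subList(start, end);
            calls++; // stands in for one client.getPartitionsByNames(batch)
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Greece", "Italy", "France", "Spain", "Hungary");
        // 5 names with batch size 2 -> batches of 2 + 2 + 1.
        System.out.println(batchedFetchCalls(names, 2));
    }
}
```

In a real Mockito-based test the same check would be a `verify(client, times(n))`-style assertion against a mocked client, which is the direction the first reviewer suggests.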
[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching
[ https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777349=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777349 ] ASF GitHub Bot logged work on HIVE-26278: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:36 Start Date: 02/Jun/22 08:36 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on code in PR #3335: URL: https://github.com/apache/hive/pull/3335#discussion_r887711311 ## ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java: ## @@ -146,6 +151,26 @@ public void testGetPartitionsByNames3() throws HiveException { hive.getPartitionsByNames(t, new ArrayList<>(), true); } + @Test + public void testGetPartitionsByNamesWithSingleBatch() throws HiveException { +hive.getPartitionsByNames(t, Arrays.asList("Greece", "Italy"), true); Review Comment: I don't see any assertions - how these tests are working? Issue Time Tracking --- Worklog Id: (was: 777349) Time Spent: 0.5h (was: 20m) > Add unit tests for Hive#getPartitionsByNames using batching > --- > > Key: HIVE-26278 > URL: https://issues.apache.org/jira/browse/HIVE-26278 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155] > supports decomposing requests in batches but there are no unit tests > checking for the ValidWriteIdList when batching is used. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: perpared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777343 ] ASF GitHub Bot logged work on HIVE-26267: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:27 Start Date: 02/Jun/22 08:27 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3325: URL: https://github.com/apache/hive/pull/3325#discussion_r887702617 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3712,25 +3712,26 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0)); LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString()); -List params = new ArrayList<>(); StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE"). append(" (\"CQ_STATE\" IN("). append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)). append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)). append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))"). append(" AND \"CQ_DATABASE\"=?"). 
append(" AND \"CQ_TABLE\"=?").append(" AND "); -params.add(Long.toString(tblValidWriteIds.getHighWatermark())); -params.add(rqst.getDbname()); -params.add(rqst.getTablename()); if(rqst.getPartitionname() == null) { sb.append("\"CQ_PARTITION\" is null"); } else { sb.append("\"CQ_PARTITION\"=?"); Review Comment: no need to check explicitly for null, you can use `pstmt.setObject()` Issue Time Tracking --- Worklog Id: (was: 777343) Time Spent: 40m (was: 0.5h) > Addendum to HIVE-26107: perpared statement is not working on Postgres > - > > Key: HIVE-26267 > URL: https://issues.apache.org/jira/browse/HIVE-26267 > Project: Hive > Issue Type: Bug >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > The assembled prepared statement in > {code:java} > org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code} > does not work for Postgres DB. -- This message was sent by Atlassian Jira (v8.20.7#820007)
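One caveat to the `pstmt.setObject()` suggestion above: while `setObject(null)` is fine for INSERT parameters, in a WHERE clause `"CQ_PARTITION" = NULL` evaluates to UNKNOWN under SQL's three-valued logic and never matches a row, even when the column is NULL — which is why the original code branches to `IS NULL`. A small sketch of that semantics, modelling UNKNOWN as a null `Boolean` (the class and method names are illustrative):

```java
public class SqlNullDemo {
    // SQL equality yields UNKNOWN (modelled here as null) when either
    // operand is NULL; otherwise ordinary equality.
    static Boolean sqlEquals(String a, String b) {
        if (a == null || b == null) return null;
        return a.equals(b);
    }

    // A WHERE clause keeps a row only when the predicate is TRUE,
    // so both FALSE and UNKNOWN filter the row out.
    static boolean whereKeeps(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }

    public static void main(String[] args) {
        // "CQ_PARTITION" = NULL matches nothing, even a NULL column:
        System.out.println(whereKeeps(sqlEquals(null, null)));
    }
}
```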
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777338=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777338 ] ASF GitHub Bot logged work on HIVE-26217: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:22 Start Date: 02/Jun/22 08:22 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #3281: URL: https://github.com/apache/hive/pull/3281#discussion_r887697843 ## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: ## @@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce try { String protoName = null, suffix = ""; boolean isExternal = false; - + boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (pCtx.getQueryProperties().isCTAS()) { protoName = pCtx.getCreateTable().getDbTableName(); isExternal = pCtx.getCreateTable().isExternal(); - +if (!isExternal && useSuffix) { + long txnId = Optional.ofNullable(pCtx.getContext()) + .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L); + suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId); +} } else if (pCtx.getQueryProperties().isMaterializedView()) { protoName = pCtx.getCreateViewDesc().getViewName(); -boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) - || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); -if (createMVUseSuffix) { +if (useSuffix) { Review Comment: Since suffixing must be done for transactional MV, added a condition. Updated. 
Issue Time Tracking --- Worklog Id: (was: 777338) Time Spent: 12h 40m (was: 12.5h) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12h 40m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777339 ] ASF GitHub Bot logged work on HIVE-26217: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:22 Start Date: 02/Jun/22 08:22 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #3281: URL: https://github.com/apache/hive/pull/3281#discussion_r887629211 ## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: ## @@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce try { String protoName = null, suffix = ""; boolean isExternal = false; - + boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (pCtx.getQueryProperties().isCTAS()) { protoName = pCtx.getCreateTable().getDbTableName(); isExternal = pCtx.getCreateTable().isExternal(); - +if (!isExternal && useSuffix) { + long txnId = Optional.ofNullable(pCtx.getContext()) + .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L); + suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId); +} } else if (pCtx.getQueryProperties().isMaterializedView()) { protoName = pCtx.getCreateViewDesc().getViewName(); -boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) - || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); -if (createMVUseSuffix) { +if (useSuffix) { Review Comment: Should we do transactional check for Materialised views? 
## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: ## @@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce try { String protoName = null, suffix = ""; boolean isExternal = false; - + boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (pCtx.getQueryProperties().isCTAS()) { protoName = pCtx.getCreateTable().getDbTableName(); isExternal = pCtx.getCreateTable().isExternal(); - +if (!isExternal && useSuffix) { + long txnId = Optional.ofNullable(pCtx.getContext()) + .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L); + suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId); +} } else if (pCtx.getQueryProperties().isMaterializedView()) { protoName = pCtx.getCreateViewDesc().getViewName(); -boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) - || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); -if (createMVUseSuffix) { +if (useSuffix) { Review Comment: Updated. Issue Time Tracking --- Worklog Id: (was: 777339) Time Spent: 12h 50m (was: 12h 40m) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12h 50m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. -- This message was sent by Atlassian Jira (v8.20.7#820007)
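The diff above also uses an `Optional` chain so a missing `Context` degrades to transaction id 0 instead of a `NullPointerException`. A self-contained sketch of that pattern, using stand-in interfaces (Hive's real `Context` and `HiveTxnManager` have richer APIs):

```java
import java.util.Optional;

public class TxnIdFallback {
    // Stand-in interfaces; not Hive's actual types.
    interface HiveTxnManager { long getCurrentTxnId(); }
    interface Context { HiveTxnManager getHiveTxnManager(); }

    // Mirrors the null-safe chain in the diff: a missing Context yields txn id 0.
    static long txnIdOrZero(Context ctx) {
        return Optional.ofNullable(ctx)
                .map(c -> c.getHiveTxnManager().getCurrentTxnId())
                .orElse(0L);
    }

    public static void main(String[] args) {
        Context ctx = () -> () -> 7L; // a Context whose txn manager reports id 7
        System.out.println(txnIdOrZero(ctx));  // 7
        System.out.println(txnIdOrZero(null)); // 0
    }
}
```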
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres

[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777334 ] ASF GitHub Bot logged work on HIVE-26267: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:09 Start Date: 02/Jun/22 08:09 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3325: URL: https://github.com/apache/hive/pull/3325#discussion_r887686514 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3746,7 +3747,7 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { } close(rs); Review Comment: could we wrap prep statement in try-catch so no explicit close is required Issue Time Tracking --- Worklog Id: (was: 777334) Time Spent: 0.5h (was: 20m) > Addendum to HIVE-26107: perpared statement is not working on Postgres > - > > Key: HIVE-26267 > URL: https://issues.apache.org/jira/browse/HIVE-26267 > Project: Hive > Issue Type: Bug >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The assembled prepared statement in > {code:java} > org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code} > does not work for Postgres DB. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26267) Addendum to HIVE-26107: prepared statement is not working on Postgres
[ https://issues.apache.org/jira/browse/HIVE-26267?focusedWorklogId=777333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777333 ] ASF GitHub Bot logged work on HIVE-26267: - Author: ASF GitHub Bot Created on: 02/Jun/22 08:07 Start Date: 02/Jun/22 08:07 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3325: URL: https://github.com/apache/hive/pull/3325#discussion_r887685084 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3712,25 +3712,26 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { TxnUtils.createValidCompactWriteIdList(getValidWriteIds(request).getTblValidWriteIds().get(0)); LOG.debug("ValidCompactWriteIdList: " + tblValidWriteIds.writeToString()); -List params = new ArrayList<>(); StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\", \"CQ_STATE\" FROM \"COMPACTION_QUEUE\" WHERE"). append(" (\"CQ_STATE\" IN("). append(quoteChar(INITIATED_STATE)).append(",").append(quoteChar(WORKING_STATE)). append(") OR (\"CQ_STATE\" = ").append(quoteChar(READY_FOR_CLEANING)). append(" AND \"CQ_HIGHEST_WRITE_ID\" = ?))"). append(" AND \"CQ_DATABASE\"=?"). 
append(" AND \"CQ_TABLE\"=?").append(" AND "); -params.add(Long.toString(tblValidWriteIds.getHighWatermark())); -params.add(rqst.getDbname()); -params.add(rqst.getTablename()); if(rqst.getPartitionname() == null) { sb.append("\"CQ_PARTITION\" is null"); } else { sb.append("\"CQ_PARTITION\"=?"); - params.add(rqst.getPartitionname()); } -pst = sqlGenerator.prepareStmtWithParameters(dbConn, sb.toString(), params); +pst = dbConn.prepareStatement(sqlGenerator.addEscapeCharacters(sb.toString())); +pst.setLong(1, tblValidWriteIds.getHighWatermark()); +pst.setString(2, rqst.getDbname()); +pst.setString(3, rqst.getTablename()); +if(rqst.getPartitionname() != null) { Review Comment: nit: space Issue Time Tracking --- Worklog Id: (was: 777333) Time Spent: 20m (was: 10m) > Addendum to HIVE-26107: perpared statement is not working on Postgres > - > > Key: HIVE-26267 > URL: https://issues.apache.org/jira/browse/HIVE-26267 > Project: Hive > Issue Type: Bug >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The assembled prepared statement in > {code:java} > org.apache.hadoop.hive.metastore.txn.TxnHandler#compact{code} > does not work for Postgres DB. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26278) Add unit tests for Hive#getPartitionsByNames using batching
[ https://issues.apache.org/jira/browse/HIVE-26278?focusedWorklogId=777332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777332 ] ASF GitHub Bot logged work on HIVE-26278: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:58 Start Date: 02/Jun/22 07:58 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3335: URL: https://github.com/apache/hive/pull/3335#discussion_r887676826 ## ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreClientApiArgumentsChecker.java: ## @@ -38,10 +38,13 @@ import org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan; import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; import org.apache.thrift.TException; + +import org.junit.Assert; Review Comment: Unused import, will remove before merging. Issue Time Tracking --- Worklog Id: (was: 777332) Time Spent: 20m (was: 10m) > Add unit tests for Hive#getPartitionsByNames using batching > --- > > Key: HIVE-26278 > URL: https://issues.apache.org/jira/browse/HIVE-26278 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > [Hive#getPartitionsByNames|https://github.com/apache/hive/blob/6626b5564ee206db5a656d2f611ed71f10a0ffc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4155] > supports decomposing requests in batches but there are no unit tests > checking for the ValidWriteIdList when batching is used. -- This message was sent by Atlassian Jira (v8.20.7#820007)
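The batching that `Hive#getPartitionsByNames` performs amounts to splitting a long partition-name list into fixed-size requests. A generic sketch of that decomposition (the helper name here is ours, not Hive's):

```java
import java.util.ArrayList;
import java.util.List;

public class Batching {
    // Split a list into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> inBatches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(inBatches(List.of(1, 2, 3, 4, 5), 2)); // [[1, 2], [3, 4], [5]]
    }
}
```

The gap the new tests target is that every batched request must carry the same `ValidWriteIdList` as the original un-batched call, so that all batches observe a consistent snapshot.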
[jira] [Resolved] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files
[ https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita resolved HIVE-25421. --- Fix Version/s: 4.0.0 Resolution: Fixed Committed to master. Thanks for the review [~lpinter] > Fallback from vectorization when reading Iceberg's time columns from ORC files > -- > > Key: HIVE-25421 > URL: https://issues.apache.org/jira/browse/HIVE-25421 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > As discussed in HIVE-25420 time column is not native Hive type, reading it is > more complicated, and is not supported for vectorized read. Trying this > currently results in an exception for ORC formatted files, so we should make > an effort to gracefully fall back to non-vectorized reads when there's such a > column in the query's projection. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files
[ https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita updated HIVE-25421: -- Description: As discussed in HIVE-25420 time column is not native Hive type, reading it is more complicated, and is not supported for vectorized read. Trying this currently results in an exception for ORC formatted files, so we should make an effort to gracefully fall back to non-vectorized reads when there's such a column in the query's projection. (was: As discussed in HIVE-25420 time column is not native Hive type, reading it is more complicated, and is not supported for vectorized read. Trying this currently results in an exception, so we should make an effort to * either gracefully fall back to non-vectorized reads when there's such a column in the query's projection * or work around the reading issue on the execution side.) > Fallback from vectorization when reading Iceberg's time columns from ORC files > -- > > Key: HIVE-25421 > URL: https://issues.apache.org/jira/browse/HIVE-25421 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > As discussed in HIVE-25420 time column is not native Hive type, reading it is > more complicated, and is not supported for vectorized read. Trying this > currently results in an exception for ORC formatted files, so we should make > an effort to gracefully fall back to non-vectorized reads when there's such a > column in the query's projection. -- This message was sent by Atlassian Jira (v8.20.7#820007)
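The graceful fallback described above boils down to inspecting the projected schema before choosing a reader: if any projected column has Iceberg's TIME type, row-mode reading is used instead of the vectorized path. A hedged sketch with illustrative type tags (Hive and Iceberg model column types differently):

```java
import java.util.List;

public class VectorizationCheck {
    // Illustrative type tags, not Hive's or Iceberg's actual type model.
    enum ColType { INT, STRING, TIMESTAMP, TIME }

    // Fall back to non-vectorized reads when the projection contains a TIME
    // column, as the issue above describes for ORC-backed Iceberg tables.
    static boolean canUseVectorizedReader(List<ColType> projectedTypes) {
        return projectedTypes.stream().noneMatch(t -> t == ColType.TIME);
    }

    public static void main(String[] args) {
        System.out.println(canUseVectorizedReader(List.of(ColType.INT, ColType.STRING))); // true
        System.out.println(canUseVectorizedReader(List.of(ColType.TIME)));                // false
    }
}
```

Checking the projection rather than the full table schema means queries that never touch the TIME column keep the fast vectorized path.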
[jira] [Work logged] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files
[ https://issues.apache.org/jira/browse/HIVE-25421?focusedWorklogId=777331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777331 ] ASF GitHub Bot logged work on HIVE-25421: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:50 Start Date: 02/Jun/22 07:50 Worklog Time Spent: 10m Work Description: szlta merged PR #3334: URL: https://github.com/apache/hive/pull/3334 Issue Time Tracking --- Worklog Id: (was: 777331) Time Spent: 20m (was: 10m) > Fallback from vectorization when reading Iceberg's time columns from ORC files > -- > > Key: HIVE-25421 > URL: https://issues.apache.org/jira/browse/HIVE-25421 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > As discussed in HIVE-25420 time column is not native Hive type, reading it is > more complicated, and is not supported for vectorized read. Trying this > currently results in an exception for ORC formatted files, so we should make > an effort to gracefully fall back to non-vectorized reads when there's such a > column in the query's projection. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-25421) Fallback from vectorization when reading Iceberg's time columns from ORC files
[ https://issues.apache.org/jira/browse/HIVE-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita updated HIVE-25421: -- Summary: Fallback from vectorization when reading Iceberg's time columns from ORC files (was: Fallback from vectorization when reading Iceberg's time columns) > Fallback from vectorization when reading Iceberg's time columns from ORC files > -- > > Key: HIVE-25421 > URL: https://issues.apache.org/jira/browse/HIVE-25421 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > As discussed in HIVE-25420 time column is not native Hive type, reading it is > more complicated, and is not supported for vectorized read. Trying this > currently results in an exception, so we should make an effort to > * either gracefully fall back to non-vectorized reads when there's such a > column in the query's projection > * or work around the reading issue on the execution side. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas
[ https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777328=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777328 ] ASF GitHub Bot logged work on HIVE-26244: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:33 Start Date: 02/Jun/22 07:33 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3307: URL: https://github.com/apache/hive/pull/3307#discussion_r887639411 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc). return response; } } + +if (isValidTxn(txnId)) { + LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar) + .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar)); + + if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) { Review Comment: I don't really like that we are adding extra overhead in checkLocks method, it's already a sensitive part performance-wise. I think we should try to optimize: if it's CTAS we know that it could only be blocked by another artificial CTAS or DROP database (EXCLUSIVE + EXCL_WRITE), so no need to run an expensive checkLock `BIG` query. Also that would mean that we can just give up and do not check against TXNS table what is the type of blocking TXN. Also, I would expect IOW to behave similarly to CTAS, currently it doesn't fail and is executed in sequential order, however, it doesn't require any extra cleanup in case of failure. So I am OK with the selected approach, but we should try to optimize if possible. 
Issue Time Tracking --- Worklog Id: (was: 777328) Time Spent: 2h 50m (was: 2h 40m) > Implementing locking for concurrent ctas > > > Key: HIVE-26244 > URL: https://issues.apache.org/jira/browse/HIVE-26244 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri G >Assignee: Simhadri G >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
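The optimization proposed in the review can be sketched as a conflict fast path: whether this matches Hive's full lock-compatibility matrix is an assumption drawn only from the comment above, which states a CTAS's artificial EXCL_WRITE lock can only be blocked by EXCLUSIVE (e.g. DROP DATABASE) or another EXCL_WRITE.

```java
public class LockFastPath {
    enum LockType { SHARED_READ, SHARED_WRITE, EXCL_WRITE, EXCLUSIVE }

    // Assumed fast path per the review comment: only EXCLUSIVE or another
    // EXCL_WRITE holder blocks a CTAS's EXCL_WRITE request, so the expensive
    // checkLocks "BIG" query can be skipped for other held lock types.
    static boolean ctasBlockedBy(LockType held) {
        return held == LockType.EXCLUSIVE || held == LockType.EXCL_WRITE;
    }

    public static void main(String[] args) {
        System.out.println(ctasBlockedBy(LockType.SHARED_READ)); // false
        System.out.println(ctasBlockedBy(LockType.EXCLUSIVE));   // true
    }
}
```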
[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas
[ https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777325 ] ASF GitHub Bot logged work on HIVE-26244: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:28 Start Date: 02/Jun/22 07:28 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3307: URL: https://github.com/apache/hive/pull/3307#discussion_r887639411 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc). return response; } } + +if (isValidTxn(txnId)) { + LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar) + .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar)); + + if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) { Review Comment: I don't really like that we are adding extra overhead in checkLocks method, it's already a sensitive part performance-wise. I think we should try to optimize: if it's CTAS we know that it could only be blocked by another artificial CTAS or DROP database (EXCLUSIVE + EXCL_WRITE), so no need to run expensive checkLock `BIG` query. Also, I would expect IOW to behave similarly to CTAS, currently it doesn't fail and is executed in sequential order, however, it doesn't require any extra cleanup in case of failure. So I am OK with the selected approach, but we should try to optimize if possible. Issue Time Tracking --- Worklog Id: (was: 777325) Time Spent: 2h 40m (was: 2.5h) > Implementing locking for concurrent ctas > > > Key: HIVE-26244 > URL: https://issues.apache.org/jira/browse/HIVE-26244 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri G >Assignee: Simhadri G >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas
[ https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777322 ] ASF GitHub Bot logged work on HIVE-26244: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:26 Start Date: 02/Jun/22 07:26 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3307: URL: https://github.com/apache/hive/pull/3307#discussion_r887639411 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc). return response; } } + +if (isValidTxn(txnId)) { + LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar) + .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar)); + + if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) { Review Comment: I don't really like that we are adding extra overhead in checkLocks method, it's already a sensitive part performance-wise. I think we should try to optimize: if it's CTAS we know that it could only be blocked by another artificial CTAS or DROP database (EXCLUSIVE + EXCL_WRITE), so no need to run expensive checkLock `BIG` query. Also, I would expect IOW to behave similarly to CTAS, currently it doesn't fail and is executed in sequential order. Issue Time Tracking --- Worklog Id: (was: 777322) Time Spent: 2.5h (was: 2h 20m) > Implementing locking for concurrent ctas > > > Key: HIVE-26244 > URL: https://issues.apache.org/jira/browse/HIVE-26244 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri G >Assignee: Simhadri G >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas
[ https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777316 ] ASF GitHub Bot logged work on HIVE-26244: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:22 Start Date: 02/Jun/22 07:22 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3307: URL: https://github.com/apache/hive/pull/3307#discussion_r887645373 ## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: ## @@ -13910,7 +13911,13 @@ private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type, for(Map.Entry serdeMap : storageFormat.getSerdeProps().entrySet()){ t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue()); } -outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK)); +if (tblProps != null && +tblProps.get(TABLE_IS_CTAS) == "true" && Review Comment: could we do `Boolean.parseBoolean(..)` instead of comparing with the string Issue Time Tracking --- Worklog Id: (was: 777316) Time Spent: 2h 20m (was: 2h 10m) > Implementing locking for concurrent ctas > > > Key: HIVE-26244 > URL: https://issues.apache.org/jira/browse/HIVE-26244 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri G >Assignee: Simhadri G >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
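The `Boolean.parseBoolean(..)` suggestion above fixes a real pitfall: `tblProps.get(TABLE_IS_CTAS) == "true"` compares object identity, not content, so it fails whenever the map value is not the interned literal. A minimal demonstration:

```java
import java.util.HashMap;
import java.util.Map;

public class StringEqualityPitfall {
    public static void main(String[] args) {
        Map<String, String> tblProps = new HashMap<>();
        tblProps.put("CTAS", new String("true")); // a distinct String instance

        // Reference comparison, as in the flagged code: identity, not content.
        System.out.println(tblProps.get("CTAS") == "true");             // false

        // Value-based check suggested in the review.
        System.out.println(Boolean.parseBoolean(tblProps.get("CTAS"))); // true
    }
}
```

`Boolean.parseBoolean` also handles a `null` value safely (it returns `false`), whereas `"true".equals(value)` would be the equivalent null-safe string comparison.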
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777300 ] ASF GitHub Bot logged work on HIVE-26217: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:00 Start Date: 02/Jun/22 07:00 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #3281: URL: https://github.com/apache/hive/pull/3281#discussion_r887625663 ## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: ## @@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input) destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps); if (destTableIsTransactional) { +isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps); +boolean isCtas = tblDesc != null && tblDesc.isCTAS(); +isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps); +if (!isNonNativeTable && !destTableIsTemporary && isCtas) { + destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps); + acidOperation = getAcidType(dest); + isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation); + boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) Review Comment: Updated. 
## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: ## @@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input) destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps); if (destTableIsTransactional) { +isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps); +boolean isCtas = tblDesc != null && tblDesc.isCTAS(); +isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps); +if (!isNonNativeTable && !destTableIsTemporary && isCtas) { + destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps); + acidOperation = getAcidType(dest); + isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation); + boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (isDirectInsert || isMmTable) { +destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing); +// Setting the location so that metadata transformers +// does not change the location later while creating the table. +tblDesc.setLocation(destinationPath.toString()); +// Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used. +if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) { Review Comment: Updated. Issue Time Tracking --- Worklog Id: (was: 777300) Time Spent: 11h 40m (was: 11.5h) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 11h 40m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. 
[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas
[ https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777312 ] ASF GitHub Bot logged work on HIVE-26244: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:15 Start Date: 02/Jun/22 07:15 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3307: URL: https://github.com/apache/hive/pull/3307#discussion_r887639411 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc). return response; } } + +if (isValidTxn(txnId)) { + LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar) + .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar)); + + if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) { Review Comment: I don't really like that we are adding extra overhead in checkLocks method, it's already a sensitive part performance-wise. I think we should try to optimize: if it's CTAS we know that it could only be blocked by another artificial CTAS or DROP database, so no need to run expensive checkLock `BIG` query. Also, I would expect IOW to behave similarly to CTAS, currently it doesn't fail and is executed in sequential order. Issue Time Tracking --- Worklog Id: (was: 777312) Time Spent: 2h 10m (was: 2h) > Implementing locking for concurrent ctas > > > Key: HIVE-26244 > URL: https://issues.apache.org/jira/browse/HIVE-26244 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri G >Assignee: Simhadri G >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777307 ] ASF GitHub Bot logged work on HIVE-26217: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:04 Start Date: 02/Jun/22 07:04 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #3281: URL: https://github.com/apache/hive/pull/3281#discussion_r887629211 ## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: ## @@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce try { String protoName = null, suffix = ""; boolean isExternal = false; - + boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (pCtx.getQueryProperties().isCTAS()) { protoName = pCtx.getCreateTable().getDbTableName(); isExternal = pCtx.getCreateTable().isExternal(); - +if (!isExternal && useSuffix) { + long txnId = Optional.ofNullable(pCtx.getContext()) + .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L); + suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId); +} } else if (pCtx.getQueryProperties().isMaterializedView()) { protoName = pCtx.getCreateViewDesc().getViewName(); -boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) - || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); -if (createMVUseSuffix) { +if (useSuffix) { Review Comment: Should we do transactional check for Materialised views? 
Issue Time Tracking --- Worklog Id: (was: 777307) Time Spent: 12.5h (was: 12h 20m) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12.5h > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777305=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777305 ] ASF GitHub Bot logged work on HIVE-26217: - Author: ASF GitHub Bot Created on: 02/Jun/22 07:03 Start Date: 02/Jun/22 07:03 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #3281: URL: https://github.com/apache/hive/pull/3281#discussion_r887627534 ## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: ## @@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce try { String protoName = null, suffix = ""; boolean isExternal = false; - + boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX) + || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); + if (pCtx.getQueryProperties().isCTAS()) { protoName = pCtx.getCreateTable().getDbTableName(); isExternal = pCtx.getCreateTable().isExternal(); - +if (!isExternal && useSuffix) { Review Comment: Updated. Issue Time Tracking --- Worklog Id: (was: 777305) Time Spent: 12h 10m (was: 12h) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12h 10m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777301 ]

ASF GitHub Bot logged work on HIVE-26217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 07:00
Start Date: 02/Jun/22 07:00
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625817

## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
     return output;
   }

+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {
+    Path location;
+    String suffix = "";
+    try {
+      // When location is specified, suffix is not added
+      if (tblDesc.getLocation() == null) {
+        String protoName = tblDesc.getDbTableName();
+        String[] names = Utilities.getDbTableName(protoName);
+        if (enableSuffixing) {
+          long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+          suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+        }
+        if (!db.databaseExists(names[0])) {
+          throw new SemanticException("ERROR: The database " + names[0] + " does not exist.");
+        }
+
+        Warehouse wh = new Warehouse(conf);
+        location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + suffix, false);
+      } else {
+        location = new Path(tblDesc.getLocation());
+      }
+
+      // Handle table translation
+      // Property modifications of the table is handled later.
+      // We are interested in the location if it has changed
+      // due to table translation.
+      Table tbl = tblDesc.toTable(conf);
+      tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment: Updated.

Issue Time Tracking
-------------------
Worklog Id: (was: 777301)
Time Spent: 11h 50m (was: 11h 40m)

> Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 11h 50m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777298 ]

ASF GitHub Bot logged work on HIVE-26217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887624982

## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
     try {
       String protoName = null, suffix = "";
       boolean isExternal = false;
-
+      boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment: Renamed to createTableOrMVUseSuffix. Updated.

Issue Time Tracking
-------------------
Worklog Id: (was: 777298)
Time Spent: 11h 20m (was: 11h 10m)

> Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 11h 20m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777306 ]

ASF GitHub Bot logged work on HIVE-26217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 07:03
Start Date: 02/Jun/22 07:03
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887628060

## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
     try {
       String protoName = null, suffix = "";
       boolean isExternal = false;
-
+      boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+          || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
       if (pCtx.getQueryProperties().isCTAS()) {
         protoName = pCtx.getCreateTable().getDbTableName();
         isExternal = pCtx.getCreateTable().isExternal();
-
+        if (!isExternal && useSuffix) {
+          long txnId = Optional.ofNullable(pCtx.getContext())

Review Comment: Updated and added a new function called "getTableOrMVSuffix"

Issue Time Tracking
-------------------
Worklog Id: (was: 777306)
Time Spent: 12h 20m (was: 12h 10m)

> Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12h 20m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
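The Optional.ofNullable chain discussed in this review defaults the transaction id to 0 when no compilation context is available, instead of risking a NullPointerException. A minimal standalone sketch of that pattern (the Context and TxnManager classes below are hypothetical stand-ins, not Hive's real types):

```java
import java.util.Optional;

// Sketch of the null-safe fallback pattern from the diff: if the context is
// null, map() is skipped and orElse() supplies a default txn id of 0L.
class TxnIdFallback {
    // Stand-in for Hive's transaction manager (hypothetical).
    static class TxnManager {
        long getCurrentTxnId() { return 7L; }
    }

    // Stand-in for the compilation context (hypothetical).
    static class Context {
        TxnManager getHiveTxnManager() { return new TxnManager(); }
    }

    static long txnIdOrZero(Context ctx) {
        return Optional.ofNullable(ctx)
            .map(c -> c.getHiveTxnManager().getCurrentTxnId())
            .orElse(0L);
    }

    public static void main(String[] args) {
        System.out.println(txnIdOrZero(new Context())); // 7
        System.out.println(txnIdOrZero(null));          // 0
    }
}
```

The orElse(0L) branch matters because a suffix built from txn id 0 is still a deterministic path, whereas an unguarded `pCtx.getContext().getHiveTxnManager()` would throw during compilation when no context exists.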
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777304 ]

ASF GitHub Bot logged work on HIVE-26217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 07:02
Start Date: 02/Jun/22 07:02
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887627260

## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
     return output;
   }

+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {
+    Path location;
+    String suffix = "";
+    try {
+      // When location is specified, suffix is not added
+      if (tblDesc.getLocation() == null) {

Review Comment: Location check happens now at the point where suffixing is decided. Updated. https://github.com/apache/hive/pull/3281/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7616-R7618

## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
     return output;
   }

+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {

Review Comment: Updated.

Issue Time Tracking
-------------------
Worklog Id: (was: 777304)
Time Spent: 12h (was: 11h 50m)

> Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 12h > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777299 ]

ASF GitHub Bot logged work on HIVE-26217:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625106

## ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
     try {
       String protoName = null, suffix = "";
       boolean isExternal = false;
-
+      boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+          || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
       if (pCtx.getQueryProperties().isCTAS()) {
         protoName = pCtx.getCreateTable().getDbTableName();
         isExternal = pCtx.getCreateTable().isExternal();
-
+        if (!isExternal && useSuffix) {
+          long txnId = Optional.ofNullable(pCtx.getContext())
+              .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+          suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);

Review Comment: Updated.

## ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
@@ -8229,9 +8299,17 @@ private void handleLineage(LoadTableDesc ltd, Operator output)
     Path tlocation = null;
     String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
     try {
+      String suffix = "";
+      if (AcidUtils.isTransactionalTable(destinationTable)) {

Review Comment: Updated.

Issue Time Tracking
-------------------
Worklog Id: (was: 777299)
Time Spent: 11.5h (was: 11h 20m)

> Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 11.5h > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26274) No vectorization if query has upper case window function
[ https://issues.apache.org/jira/browse/HIVE-26274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-26274. --- Resolution: Fixed Pushed to master. Thanks [~abstractdog] for review. > No vectorization if query has upper case window function > > > Key: HIVE-26274 > URL: https://issues.apache.org/jira/browse/HIVE-26274 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > {code} > CREATE TABLE t1 (a int, b int); > EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1; > {code} > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Vertices: > Map 1 > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vector.serde.deserialize IS true > inputFormatFeatureSupport: [DECIMAL_64] > featureSupportInUse: [DECIMAL_64] > inputFileFormats: org.apache.hadoop.mapred.TextInputFormat > allNative: true > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez] IS true > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > vectorized: false > Stage: Stage-0 > Fetch Operator > {code} > {code} > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > 
{code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
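The notVectorizedReason above shows the root cause: the PTF function name arrives in upper case (ROW_NUMBER) while the supported-functions list holds lower-case names, so a case-sensitive membership test misses. A small illustration of the miss and of a case-insensitive lookup (the helper below is a hypothetical sketch, not the actual Hive fix):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// The set is copied from the notVectorizedReason message above; the
// isSupported helper is a hypothetical sketch of a case-insensitive check.
class WindowFnLookup {
    static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
        "avg", "count", "dense_rank", "first_value", "lag", "last_value",
        "lead", "max", "min", "rank", "row_number", "sum"));

    // Normalize case before the membership test so ROW_NUMBER matches row_number.
    static boolean isSupported(String fnName) {
        return SUPPORTED.contains(fnName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(SUPPORTED.contains("ROW_NUMBER")); // false: case-sensitive miss
        System.out.println(isSupported("ROW_NUMBER"));        // true
    }
}
```

Using Locale.ROOT avoids locale-dependent lowercasing surprises (e.g. the Turkish dotless i), which matters for identifier comparisons like this.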
[jira] [Work logged] (HIVE-26274) No vectorization if query has upper case window function
[ https://issues.apache.org/jira/browse/HIVE-26274?focusedWorklogId=777296&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777296 ]

ASF GitHub Bot logged work on HIVE-26274:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 06:54
Start Date: 02/Jun/22 06:54
Worklog Time Spent: 10m

Work Description: kasakrisz merged PR #3332:
URL: https://github.com/apache/hive/pull/3332

Issue Time Tracking
-------------------
Worklog Id: (was: 777296)
Time Spent: 0.5h (was: 20m)

> No vectorization if query has upper case window function > > > Key: HIVE-26274 > URL: https://issues.apache.org/jira/browse/HIVE-26274 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > {code} > CREATE TABLE t1 (a int, b int); > EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1; > {code} > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Vertices: > Map 1 > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vector.serde.deserialize IS true > inputFormatFeatureSupport: [DECIMAL_64] > featureSupportInUse: [DECIMAL_64] > inputFileFormats: org.apache.hadoop.mapred.TextInputFormat > allNative: true > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez] IS true > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > 
vectorized: false > Stage: Stage-0 > Fetch Operator > {code} > {code} > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26274) No vectorization if query has upper case window function
[ https://issues.apache.org/jira/browse/HIVE-26274?focusedWorklogId=777293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777293 ]

ASF GitHub Bot logged work on HIVE-26274:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 06:46
Start Date: 02/Jun/22 06:46
Worklog Time Spent: 10m

Work Description: abstractdog commented on PR #3332:
URL: https://github.com/apache/hive/pull/3332#issuecomment-1144501074

LGTM, thanks for the patch @kasakrisz

Issue Time Tracking
-------------------
Worklog Id: (was: 777293)
Time Spent: 20m (was: 10m)

> No vectorization if query has upper case window function > > > Key: HIVE-26274 > URL: https://issues.apache.org/jira/browse/HIVE-26274 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code} > CREATE TABLE t1 (a int, b int); > EXPLAIN VECTORIZATION ONLY SELECT ROW_NUMBER() OVER(order by a) AS rn FROM t1; > {code} > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Vertices: > Map 1 > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vector.serde.deserialize IS true > inputFormatFeatureSupport: [DECIMAL_64] > featureSupportInUse: [DECIMAL_64] > inputFileFormats: org.apache.hadoop.mapred.TextInputFormat > allNative: true > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez] IS true > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, 
first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > vectorized: false > Stage: Stage-0 > Fetch Operator > {code} > {code} > notVectorizedReason: PTF operator: ROW_NUMBER not in > supported functions [avg, count, dense_rank, first_value, lag, last_value, > lead, max, min, rank, row_number, sum] > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26285) Overwrite database metadata on original source in optimised failover.
[ https://issues.apache.org/jira/browse/HIVE-26285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haymant Mangla updated HIVE-26285: -- Parent: HIVE-25699 Issue Type: Sub-task (was: Bug) > Overwrite database metadata on original source in optimised failover. > - > > Key: HIVE-26285 > URL: https://issues.apache.org/jira/browse/HIVE-26285 > Project: Hive > Issue Type: Sub-task >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)