[ https://issues.apache.org/jira/browse/HIVE-26704?focusedWorklogId=847928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-847928 ]
ASF GitHub Bot logged work on HIVE-26704:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Feb/23 04:31
            Start Date: 28/Feb/23 04:31
    Worklog Time Spent: 10m
      Work Description: SourabhBadhya commented on code in PR #3576:
URL: https://github.com/apache/hive/pull/3576#discussion_r1119563491


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
##########
@@ -1535,20 +1556,14 @@ public void setHadoopJobId(String hadoopJobId, long id) {
   @Override
   @RetrySemantics.Idempotent
   public long findMinOpenTxnIdForCleaner() throws MetaException {
-    Connection dbConn = null;
     try {
-      try {
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction);
+      try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction)) {
         return getMinOpenTxnIdWaterMark(dbConn);
       } catch (SQLException e) {
-        LOG.error("Unable to getMinOpenTxnIdForCleaner", e);
-        rollbackDBConn(dbConn);
-        checkRetryable(e, "getMinOpenTxnForCleaner");
-        throw new MetaException("Unable to execute getMinOpenTxnIfForCleaner() " +
-            e.getMessage());
-      } finally {
-        closeDbConn(dbConn);
-      }
+        LOG.error("Unable to findMinOpenTxnIdForCleaner", e);
+        checkRetryable(e, "findMinOpenTxnIdForCleaner");
+        throw new MetaException("Unable to execute getMinOpenTxnIfForCleaner() " + e.getMessage());

Review Comment:
   nit: Typo - `getMinOpenTxnIdForCleaner` instead of `getMinOpenTxnIfForCleaner`


Issue Time Tracking
-------------------

    Worklog Id:     (was: 847928)
    Time Spent: 5h 20m  (was: 5h 10m)

> Cleaner shouldn't be blocked by global min open txnId
> -----------------------------------------------------
>
>                 Key: HIVE-26704
>                 URL: https://issues.apache.org/jira/browse/HIVE-26704
>             Project: Hive
>          Issue Type: Task
>            Reporter: Denys Kuzmenko
>            Assignee: Denys Kuzmenko
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> *Single transaction blocks cluster-wide Cleaner operations*
> Currently, a single long-running transaction can prevent the Cleaner from
> cleaning up any tables. This causes file buildup in tables, which can
> degrade performance when listing the directories (note that compaction is
> not blocked by this, so unnecessary data is not read, but the files remain
> in place, which causes a performance penalty).
> We can reduce the set of files protected by the open transaction if we have
> query-table correlation data stored in the backend DB, but this change would
> require revisiting the current method of recording that detail.
> The naive and somewhat backward-compatible approach is to capture the
> minOpenWriteIds per table. It involves a non-mutation operation (as in, there
> is no need for the HMS DB to wait for another user's operation to record it).
> This does spew data writes into the HMS backend DB, but this is a blind
> insert operation that can be group-committed across many users.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
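
A note on the reviewed hunk: with try-with-resources, Java closes the resource before any catch block runs, which is presumably why the explicit `rollbackDBConn(dbConn)` and `closeDbConn(dbConn)` calls could be dropped (the connection is no longer in scope inside the catch). The following is a minimal standalone sketch of that ordering, not Hive code. `FakeConnection`, `readWatermark`, and `findWatermark` are hypothetical stand-ins for the JDBC connection and the watermark query, and a plain `Exception` stands in for `SQLException`.

```java
import java.util.ArrayList;
import java.util.List;

public class TryWithResourcesSketch {
    // Records the order of events so the close-before-catch ordering is visible.
    public static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for a JDBC Connection; not a Hive or JDBC class.
    public static class FakeConnection implements AutoCloseable {
        @Override
        public void close() {
            events.add("close");
        }
    }

    // Hypothetical query that either returns a watermark or fails.
    public static long readWatermark(FakeConnection conn, boolean fail) throws Exception {
        events.add("query");
        if (fail) {
            throw new Exception("simulated failure");
        }
        return 42L;
    }

    // Mirrors the shape of the refactored method: no finally block, no manual close.
    public static long findWatermark(boolean fail) {
        try (FakeConnection conn = new FakeConnection()) {
            return readWatermark(conn, fail);
        } catch (Exception e) {
            // The resource has already been closed and is out of scope here.
            events.add("catch");
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println(findWatermark(false) + " " + events);
        events.clear();
        System.out.println(findWatermark(true) + " " + events);
    }
}
```

Running `main` prints `42 [query, close]` followed by `-1 [query, close, catch]`: the connection is closed on both paths, and on the failure path the close happens before the catch body executes.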