Yu-Wen Lai created HIVE-25113:
---------------------------------

             Summary: Connection starvation in TxnHandler.getValidWriteIds
                 Key: HIVE-25113
                 URL: https://issues.apache.org/jira/browse/HIVE-25113
             Project: Hive
          Issue Type: Bug
          Components: Transactions
            Reporter: Yu-Wen Lai
            Assignee: Yu-Wen Lai

The current code looks like this:

{code:java}
dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
validTxnList = TxnUtils.createValidReadTxnList(getOpenTxns(), 0);
{code}

The function getOpenTxns requests another connection from the pool. That is, the thread already holds one connection and then asks for a second one while still holding the first.

When there are more than 10 (the default connection pool size) simultaneous getValidWriteIds requests, this can cause connection starvation: each thread holds a connection and waits for another one. After the timeout, we see the following exception:

{code:java}
metastore.RetryingHMSHandler: MetaException(message:Unable to select from transaction database, java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
{code}
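
To make the failure mode concrete, here is a minimal, self-contained sketch of the hold-one-and-wait-for-another pattern. It is not the Hive code: the connection pool is modeled as a plain java.util.concurrent.Semaphore, and the class PoolStarvationSketch, the method run, and its parameters are hypothetical names introduced only for illustration. With nested set to true, each worker takes a second permit while holding the first, which mirrors what the snippet above does through getOpenTxns. With nested set to false, each worker does all of its work on the single permit it already holds, which is one possible direction for a fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolStarvationSketch {

    /**
     * Hypothetical model: starts exactly poolSize workers against a pool with
     * poolSize permits and returns how many workers timed out while waiting
     * for a second "connection".
     */
    public static int run(int poolSize, long timeoutMs, boolean nested) {
        Semaphore pool = new Semaphore(poolSize);            // stands in for the connection pool
        CountDownLatch allHoldFirst = new CountDownLatch(poolSize);
        CountDownLatch allAttempted = new CountDownLatch(poolSize);
        AtomicInteger timeouts = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            executor.submit(() -> {
                try {
                    pool.acquire();                          // outer connection (like getDbConn)
                    try {
                        // Wait until every worker holds its first permit.
                        allHoldFirst.countDown();
                        allHoldFirst.await();
                        if (nested) {
                            // Inner request while still holding the outer permit.
                            if (pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                                pool.release();
                            } else {
                                timeouts.incrementAndGet();  // analogous to the pool timeout exception
                            }
                        }
                        // Hold the outer permit until all workers are done, so the
                        // count is deterministic in this sketch.
                        allAttempted.countDown();
                        allAttempted.await();
                    } finally {
                        pool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        executor.shutdown();
        try {
            executor.awaitTermination(timeoutMs + 5000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return timeouts.get();
    }

    public static void main(String[] args) {
        System.out.println("timeouts with nested acquire: " + run(3, 200, true));
        System.out.println("timeouts with single acquire: " + run(3, 200, false));
    }
}
```

In this toy model every worker times out in the nested case, because all permits are held by workers that are themselves waiting. In the single-acquire case no worker waits at all. The real pool behaves less uniformly, since a thread that times out releases its connection and may unblock another waiter.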