Yongzhi Chen created HIVE-24236:
-----------------------------------

             Summary: Connection leak in TxnHandler
                 Key: HIVE-24236
                 URL: https://issues.apache.org/jira/browse/HIVE-24236
             Project: Hive
          Issue Type: Bug
          Components: Metastore
            Reporter: Yongzhi Chen
            Assignee: Yongzhi Chen


We see failures in QE tests with cannot allocate connections errors. The 
exception stack like following:
{noformat}
2020-09-29T18:44:26,563 INFO  [Heartbeater-0]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(3733)) - Non-retryable error in 
heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, 
general error (SQLState=null, ErrorCode=0)
2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to 
select from transaction database org.apache.commons.dbcp.SQLNestedException: 
Cannot get a connection, general error
        at 
org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118)
        at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605)
        at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598)
        at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
        at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy63.heartbeat(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247)
        at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
        at com.sun.proxy.$Proxy64.heartbeat(Unknown Source)
        at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671)
        at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
        at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at 
org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112)
        at 
org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
        ... 29 more
)
        at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
        at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
{noformat}

and
{noformat}
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
        at 
org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
        at 
org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
        ... 53 more
)
        at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:3375)
        at 
org.apache.hadoop.hive.metastore.AcidEventListener.onDropTable(AcidEventListener.java:65)
        at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:103)
        at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:285)
        at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:347)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2986)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:3240)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:3227)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database_core(HiveMetaStore.java:1879)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database(HiveMetaStore.java:1978)
        at sun.reflect.GeneratedMethodAccessor369.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy63.drop_database(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabaseCascadePerDb(HiveMetaStoreClient.java:1393)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1324)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1277)
        at sun.reflect.GeneratedMethodAccessor368.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
        at com.sun.proxy.$Proxy64.dropDatabase(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:618)
        at 
org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseOperation.execute(DropDatabaseOperation.java:50)
        at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
        at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
        at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
        at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
        at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
        at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:740)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:495)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:489)
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
{noformat}

It is not easy to reproduce, so there are possible connection leaks under some 
error conditions. This jira is to fix code that may cause connection leaks in 
TxnHandler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to