Diego Jaramillo created HDDS-6102:
-------------------------------------

Summary: Integrating Ozone with Hive produce a thread leak in HS2 server
Key: HDDS-6102
URL: https://issues.apache.org/jira/browse/HDDS-6102
Project: Apache Ozone
Issue Type: Bug
Components: OFS
Affects Versions: 1.1.0
Environment:
ozone: 1.1.0
hadoop: 3.1.1
hive: 3.1.0
tez: 0.10.0 (this version is needed because of [TEZ-4032|https://issues.apache.org/jira/browse/TEZ-4032])
Both the Hadoop cluster and Ozone are secured using Kerberos.
Reporter: Diego Jaramillo

Integrating Ozone with Hive produces a thread leak in HS2. In this sample, 12 open connections to Hive produced 149 threads, and the count kept increasing until HS2 had to be restarted.

SETTINGS:

HDFS integration using the following settings
- viewfs-mount-table:
fs.viewfs.mounttable.clusters.link./cluster1=hdfs://cluster1
fs.viewfs.mounttable.clusters.link./ozfs1=ofs://ozfs1
- core-site.xml:
fs.ofs.impl=org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
fs.AbstractFileSystem.o3fs.impl=org.apache.hadoop.fs.ozone.OzFs
- hdfs-site.xml:
ozone.om.service.ids=ozfs1
ozone.om.nodes.ozfs1=om1,om2
ozone.om.address.ozfs1.om1=ozone1.domain.com:9862
ozone.om.address.ozfs1.om2=ozone2.domain.com:9862
dfs.nameservices=cluster1,ozfs1
ozone.om.kerberos.keytab.file=/etc/security/keytabs/om.service.keytab
ozone.om.kerberos.principal=om/_h...@domain.com

Hive integration using the following setting
- hive-site.xml:
tez.job.fs-servers=hdfs://cluster1,ofs://ozfs1
mapreduce.job.hdfs-servers=hdfs://cluster1,ofs://ozfs1

In Hive's stack trace we see many threads like these:

{{Thread 4958 (Truststore reloader thread):}}
{{ State: TIMED_WAITING}}
{{ Blocked count: 0}}
{{ Waited count: 221}}
{{ Stack:}}
{{ java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{ java.lang.Thread.sleep(Thread.java)}}
{{ org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{ java.lang.Thread.run(Thread.java:748)}}
{{Thread 4948 (Truststore reloader thread):}}
{{ State: TIMED_WAITING}}
{{ Blocked count: 79}}
{{ Waited count: 221}}
{{ Stack:}}
{{ java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{ java.lang.Thread.sleep(Thread.java)}}
{{ org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{ java.lang.Thread.run(Thread.java:748)}}
{{Thread 4777 (Truststore reloader thread):}}
{{ State: TIMED_WAITING}}
{{ Blocked count: 0}}
{{ Waited count: 252}}
{{ Stack:}}
{{ java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{ java.lang.Thread.sleep(Thread.java)}}
{{ org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{ java.lang.Thread.run(Thread.java:748)}}

Using YourKit we identified the following two call paths that create these threads:

{{
java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.getSSLFactory(Configuration) TimelineConnector.java:181
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.serviceInit(Configuration) TimelineConnector.java:108
org.apache.hadoop.service.AbstractService.init(Configuration) AbstractService.java:164
org.apache.hadoop.service.CompositeService.serviceInit(Configuration) CompositeService.java:108
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(Configuration) TimelineClientImpl.java:130
org.apache.hadoop.service.AbstractService.init(Configuration) AbstractService.java:164
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken() YarnClientImpl.java:405
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(ContainerLaunchContext) YarnClientImpl.java:381
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext) YarnClientImpl.java:300
org.apache.tez.client.TezYarnClient.submitApplication(ApplicationSubmissionContext) TezYarnClient.java:77
org.apache.tez.client.TezClient.start() TezClient.java:402
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient, HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState, String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748
----
java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.crypto.key.kms.KMSClientProvider.<init>(URI, Configuration) KMSClientProvider.java:390
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProviders(Configuration, URL, int, String) KMSClientProvider.java:318
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProvider(URI, Configuration) KMSClientProvider.java:303
org.apache.hadoop.crypto.key.KeyProviderFactory.get(URI, Configuration) KeyProviderFactory.java:96
org.apache.hadoop.util.KMSUtil.createKeyProviderFromUri(Configuration, URI) KMSUtil.java:83
org.apache.hadoop.ozone.client.rpc.OzoneKMSUtil.getKeyProvider(ConfigurationSource, URI) OzoneKMSUtil.java:138
org.apache.hadoop.ozone.client.rpc.RpcClient.getKeyProvider() RpcClient.java:1310
org.apache.hadoop.ozone.client.ObjectStore.getKeyProvider() ObjectStore.java:222
org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.getKeyProvider() BasicRootedOzoneClientAdapterImpl.java:785
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getKeyProvider() RootedOzoneFileSystem.java:54
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getAdditionalTokenIssuers() RootedOzoneFileSystem.java:67
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer, String, Credentials, List) DelegationTokenIssuer.java:104
org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(String, Credentials) DelegationTokenIssuer.java:76
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(FileSystem, Credentials, Configuration) TokenCache.java:140
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(Credentials, Path[], Configuration) TokenCache.java:101
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(Credentials, Path[], Configuration) TokenCache.java:77
org.apache.tez.client.TezClientUtils.populateTokenCache(TezConfiguration, Credentials) TezClientUtils.java:746
org.apache.tez.client.TezClientUtils.prepareAmLaunchCredentials(AMConfiguration, Credentials, TezConfiguration, Path) TezClientUtils.java:722
org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(ApplicationId, DAG, String, AMConfiguration, Map, Credentials, boolean, TezApiVersionInfo, ServicePluginsDescriptor, JavaOptsChecker) TezClientUtils.java:487
org.apache.tez.client.TezClient.setupApplicationContext() TezClient.java:501
org.apache.tez.client.TezClient.start() TezClient.java:401
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient, HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState, String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748
}}

This looks similar to [HDFS-14037|https://issues.apache.org/jira/browse/HDFS-14037]
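
One way to quantify the growth is to count threads by name in successive thread dumps. The following is only a minimal sketch: it assumes the dump uses the "Thread <id> (<name>):" header lines shown in this report, with the Jira {{ }} markup stripped, and the class ThreadDumpCounter and its methods are hypothetical helpers that are not part of Hadoop, Hive, Tez or Ozone.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts threads by name in a plain-text thread dump. */
public class ThreadDumpCounter {

    // Matches header lines such as "Thread 4958 (Truststore reloader thread):".
    // The greedy group keeps parentheses that are part of the thread name itself.
    private static final Pattern HEADER =
            Pattern.compile("^[ \\t]*Thread \\d+ \\((.*)\\):[ \\t]*$", Pattern.MULTILINE);

    /** Returns a sorted map from thread name to the number of threads with that name. */
    public static Map<String, Integer> countByName(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = HEADER.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns how many threads in the dump have exactly the given name. */
    public static int count(String dump, String threadName) {
        return countByName(dump).getOrDefault(threadName, 0);
    }

    /** Reads a dump from standard input and prints "count<TAB>name" per thread name. */
    public static void main(String[] args) throws Exception {
        String dump = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
        countByName(dump).forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

Feeding it dumps taken a few minutes apart and comparing the count reported for "Truststore reloader thread" would show whether the number of reloader threads is still climbing.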