Diego Jaramillo created HDDS-6102:
-------------------------------------

             Summary: Integrating Ozone with Hive produces a thread leak in the HS2 server
                 Key: HDDS-6102
                 URL: https://issues.apache.org/jira/browse/HDDS-6102
             Project: Apache Ozone
          Issue Type: Bug
          Components: OFS
    Affects Versions: 1.1.0
         Environment: ozone: 1.1.0
hadoop: 3.1.1
hive: 3.1.0
tez: 0.10.0 (this version is needed because of 
[TEZ-4032|https://issues.apache.org/jira/browse/TEZ-4032])

Both the Hadoop cluster and Ozone are secured using Kerberos.
            Reporter: Diego Jaramillo


Integrating Ozone with Hive produces a thread leak in HS2. In this sample, 
12 open connections to Hive produced 149 threads, and the count kept increasing 
until HS2 had to be restarted.

SETTINGS:

The HDFS integration uses the following settings:
- viewfs-mount-table:
  fs.viewfs.mounttable.clusters.link./cluster1=hdfs://cluster1
  fs.viewfs.mounttable.clusters.link./ozfs1=ofs://ozfs1


- core-site.xml:
  fs.ofs.impl=org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
  fs.AbstractFileSystem.o3fs.impl=org.apache.hadoop.fs.ozone.OzFs
 

- hdfs-site.xml:
  ozone.om.service.ids=ozfs1
  ozone.om.nodes.ozfs1=om1,om2
  ozone.om.address.ozfs1.om1=ozone1.domain.com:9862
  ozone.om.address.ozfs1.om2=ozone2.domain.com:9862
  dfs.nameservices=cluster1,ozfs1
  ozone.om.kerberos.keytab.file=/etc/security/keytabs/om.service.keytab
  ozone.om.kerberos.principal=om/_h...@domain.com

 

The Hive integration uses the following settings:
- hive-site.xml:
  tez.job.fs-servers=hdfs://cluster1,ofs://ozfs1
  mapreduce.job.hdfs-servers=hdfs://cluster1,ofs://ozfs1

 

From Hive's stack trace we see many threads like these:

{{Thread 4958 (Truststore reloader thread):}}
{{  State: TIMED_WAITING}}
{{  Blocked count: 0}}
{{  Waited count: 221}}
{{  Stack:}}
{{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{    java.lang.Thread.sleep(Thread.java)}}
{{    org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{    java.lang.Thread.run(Thread.java:748)}}
{{Thread 4948 (Truststore reloader thread):}}
{{  State: TIMED_WAITING}}
{{  Blocked count: 79}}
{{  Waited count: 221}}
{{  Stack:}}
{{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{    java.lang.Thread.sleep(Thread.java)}}
{{    org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{    java.lang.Thread.run(Thread.java:748)}}
{{Thread 4777 (Truststore reloader thread):}}
{{  State: TIMED_WAITING}}
{{  Blocked count: 0}}
{{  Waited count: 252}}
{{  Stack:}}
{{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
{{    java.lang.Thread.sleep(Thread.java)}}
{{    org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
{{    java.lang.Thread.run(Thread.java:748)}}
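
To track how fast these threads accumulate, a dump in the format above can be summarized by thread name. The following is a minimal sketch, not part of Hive or Ozone: the class name {{ThreadDumpSummary}} and the helper {{countByName}} are hypothetical, and the parser assumes header lines of the form "Thread <id> (<name>):" as shown in the excerpt (without the surrounding Jira markup).

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadDumpSummary {
    // Matches header lines such as "Thread 4958 (Truststore reloader thread):"
    // (format assumed from the dump excerpt in this report).
    private static final Pattern HEADER =
        Pattern.compile("^\\s*Thread \\d+ \\((.+)\\):\\s*$");

    /** Counts thread headers in a dump, grouped by thread name. */
    static Map<String, Integer> countByName(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input abbreviated from the three threads quoted above.
        String sample = String.join("\n",
            "Thread 4958 (Truststore reloader thread):",
            "  State: TIMED_WAITING",
            "Thread 4948 (Truststore reloader thread):",
            "  State: TIMED_WAITING",
            "Thread 4777 (Truststore reloader thread):",
            "  State: TIMED_WAITING");
        System.out.println(countByName(sample)); // prints {Truststore reloader thread=3}
    }
}
```

Running this against dumps taken at intervals would show whether the "Truststore reloader thread" count grows with each new Tez session.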

 

Using YourKit, we identified the following thread-creation stacks:

{{java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() 
ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) 
FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.getSSLFactory(Configuration)
 TimelineConnector.java:181
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.serviceInit(Configuration)
 TimelineConnector.java:108
org.apache.hadoop.service.AbstractService.init(Configuration) 
AbstractService.java:164
org.apache.hadoop.service.CompositeService.serviceInit(Configuration) 
CompositeService.java:108
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(Configuration)
 TimelineClientImpl.java:130
org.apache.hadoop.service.AbstractService.init(Configuration) 
AbstractService.java:164
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken()
 YarnClientImpl.java:405
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(ContainerLaunchContext)
 YarnClientImpl.java:381
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext)
 YarnClientImpl.java:300
org.apache.tez.client.TezYarnClient.submitApplication(ApplicationSubmissionContext)
 TezYarnClient.java:77
org.apache.tez.client.TezClient.start() TezClient.java:402
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient,
 HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], 
boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], 
boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) 
TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState,
 String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) 
TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, 
DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() 
SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) 
SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() 
SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) 
Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) 
UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() 
SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748
----
java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() 
ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) 
FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.crypto.key.kms.KMSClientProvider.<init>(URI, Configuration) 
KMSClientProvider.java:390
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProviders(Configuration,
 URL, int, String) KMSClientProvider.java:318
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProvider(URI, 
Configuration) KMSClientProvider.java:303
org.apache.hadoop.crypto.key.KeyProviderFactory.get(URI, Configuration) 
KeyProviderFactory.java:96
org.apache.hadoop.util.KMSUtil.createKeyProviderFromUri(Configuration, URI) 
KMSUtil.java:83
org.apache.hadoop.ozone.client.rpc.OzoneKMSUtil.getKeyProvider(ConfigurationSource,
 URI) OzoneKMSUtil.java:138
org.apache.hadoop.ozone.client.rpc.RpcClient.getKeyProvider() 
RpcClient.java:1310
org.apache.hadoop.ozone.client.ObjectStore.getKeyProvider() ObjectStore.java:222
org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.getKeyProvider() 
BasicRootedOzoneClientAdapterImpl.java:785
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getKeyProvider() 
RootedOzoneFileSystem.java:54
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getAdditionalTokenIssuers() 
RootedOzoneFileSystem.java:67
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer,
 String, Credentials, List) DelegationTokenIssuer.java:104
org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(String,
 Credentials) DelegationTokenIssuer.java:76
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(FileSystem,
 Credentials, Configuration) TokenCache.java:140
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(Credentials,
 Path[], Configuration) TokenCache.java:101
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(Credentials,
 Path[], Configuration) TokenCache.java:77
org.apache.tez.client.TezClientUtils.populateTokenCache(TezConfiguration, 
Credentials) TezClientUtils.java:746
org.apache.tez.client.TezClientUtils.prepareAmLaunchCredentials(AMConfiguration,
 Credentials, TezConfiguration, Path) TezClientUtils.java:722
org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(ApplicationId,
 DAG, String, AMConfiguration, Map, Credentials, boolean, TezApiVersionInfo, 
ServicePluginsDescriptor, JavaOptsChecker) TezClientUtils.java:487
org.apache.tez.client.TezClient.setupApplicationContext() TezClient.java:501
org.apache.tez.client.TezClient.start() TezClient.java:401
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient,
 HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], 
boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], 
boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) 
TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState,
 String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) 
TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, 
DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() 
SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) 
SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() 
SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) 
Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) 
UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() 
SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748}}
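
Both stacks create the thread in {{ReloadingX509TrustManager.init()}}, reached through {{SSLFactory.init()}}. The pattern this suggests is an object that starts a background thread when initialized and only stops it when explicitly destroyed, so every instance that is dropped without cleanup leaves one thread behind. The sketch below reproduces that pattern with a self-contained stand-in. It is not Hadoop code: {{ReloadingResource}}, the thread name "Sketch reloader thread", and the helper methods are hypothetical, and the sketch makes no claim about where the missing cleanup call belongs in Ozone, Tez, or YARN.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ReloaderLeakSketch {
    static final String NAME = "Sketch reloader thread";

    /** Hypothetical stand-in for an object that owns a background reloader thread. */
    static class ReloadingResource {
        private final AtomicBoolean running = new AtomicBoolean();
        private Thread reloader;

        void init() {
            running.set(true);
            reloader = new Thread(() -> {
                while (running.get()) {
                    try {
                        Thread.sleep(10_000); // periodic reload check
                    } catch (InterruptedException e) {
                        return; // destroy() interrupts the sleep to stop promptly
                    }
                }
            }, NAME);
            reloader.setDaemon(true);
            reloader.start();
        }

        void destroy() {
            running.set(false);
            reloader.interrupt();
            try {
                reloader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Number of live sketch reloader threads in this JVM. */
    static long liveReloaders() {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(t -> t.getName().equals(NAME))
            .count();
    }

    /** Initializes n resources and drops them; returns the growth in live threads. */
    static long openWithoutDestroy(int n) {
        long before = liveReloaders();
        for (int i = 0; i < n; i++) {
            new ReloadingResource().init(); // reference lost, thread keeps running
        }
        return liveReloaders() - before;
    }

    /** Initializes and destroys n resources; returns the growth in live threads. */
    static long openAndDestroy(int n) {
        long before = liveReloaders();
        for (int i = 0; i < n; i++) {
            ReloadingResource r = new ReloadingResource();
            r.init();
            r.destroy();
        }
        return liveReloaders() - before;
    }

    public static void main(String[] args) {
        System.out.println("growth without destroy: " + openWithoutDestroy(5));
        System.out.println("growth with destroy:    " + openAndDestroy(5));
    }
}
```

In the sketch, the thread count grows by one per undestroyed instance and stays flat when each instance is destroyed, which matches the shape of the growth reported for HS2.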

 

This looks similar to 
[HDFS-14037|https://issues.apache.org/jira/browse/HDFS-14037]


