[ https://issues.apache.org/jira/browse/IGNITE-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Puviarasu updated IGNITE-8890:
------------------------------
    Labels: Ignite kerberos yarn  (was: )

> Ignite YARN Kerberos - Delegation Token renewal
> -----------------------------------------------
>
>                 Key: IGNITE-8890
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8890
>             Project: Ignite
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.3
>         Environment: Kerberos cluster
> Ignite Version : 2.3.0
> Module : Ignite-YARN
> Class : ApplicationMaster
>
>            Reporter: Puviarasu
>            Priority: Blocker
>              Labels: Ignite, kerberos, yarn
>
> As Ignite-YARN is a long-running application in a YARN environment, it should
> have a mechanism to renew the delegation token.
> In Ignite-YARN, when the ApplicationMaster is started, it acquires delegation
> tokens and stores them in a ByteBuffer [Class: ApplicationMaster, Method: init()].
> This ByteBuffer with the token information is given to all the containers
> received from the ResourceManager [Class: ApplicationMaster, Method:
> onContainersAllocated()].
> Everything works fine until the delegation token reaches the end of its lifetime.
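
The description above implies that the token buffer is built once in init() and reused for every later container. One possible direction, sketched here with JDK classes only, is to hand out the buffer through a holder that re-acquires the tokens once they are older than a chosen interval. `RefreshingTokenHolder`, `tokenFetcher`, and `refreshIntervalMs` are hypothetical names, not part of Ignite or Hadoop. The sketch assumes that in a real ApplicationMaster the fetcher would log in again from a keytab and serialize newly obtained tokens (for example via `FileSystem.addDelegationTokens` and `Credentials.writeTokenStorageToStream`); that wiring is not shown and has not been checked against the Ignite 2.3 sources.

```java
import java.nio.ByteBuffer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Ignite or Hadoop): caches serialized
 * delegation tokens and re-acquires them once they are older than a
 * configured interval, so each container launch gets a current buffer.
 */
final class RefreshingTokenHolder {
    private final Supplier<byte[]> tokenFetcher; // assumed: re-login and fetch new tokens
    private final LongSupplier clockMs;          // injectable clock, eases testing
    private final long refreshIntervalMs;        // should be shorter than the token lifetime

    private byte[] cached;
    private long fetchedAtMs;

    RefreshingTokenHolder(Supplier<byte[]> tokenFetcher, long refreshIntervalMs, LongSupplier clockMs) {
        if (refreshIntervalMs <= 0)
            throw new IllegalArgumentException("refreshIntervalMs must be positive");
        this.tokenFetcher = tokenFetcher;
        this.refreshIntervalMs = refreshIntervalMs;
        this.clockMs = clockMs;
    }

    /** Returns a read-only buffer over tokens that are less than one interval old. */
    synchronized ByteBuffer currentTokens() {
        long now = clockMs.getAsLong();
        if (cached == null || now - fetchedAtMs >= refreshIntervalMs) {
            cached = tokenFetcher.get();
            fetchedAtMs = now;
        }
        // A fresh wrapper per call keeps position/limit independent per container.
        return ByteBuffer.wrap(cached).asReadOnlyBuffer();
    }
}
```

With such a holder, onContainersAllocated() would call `currentTokens()` for each container instead of duplicating the buffer created at startup. The interval is a design choice: it only helps if it is comfortably shorter than the token lifetime configured on the cluster.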
> Once the delegation token expires, the ApplicationMaster is not able to start
> Ignite inside the containers it receives, and the exception below occurs:
> *WARNING: Error launching container*
>
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager*$InvalidToken*):
>  at org.apache.hadoop.ipc.Client.call(Client.java:1504)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1441)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>  at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
>  at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
>  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
>  at org.apache.ignite.yarn.utils.IgniteYarnUtils.setupFile(IgniteYarnUtils.java:65)
>  at org.apache.ignite.yarn.ApplicationMaster.onContainersAllocated(ApplicationMaster.java:131)
>  at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:292)
>
> The ApplicationMaster keeps asking for more and more containers [Class:
> ApplicationMaster, Method: run()] but is not able to start Ignite inside any of
> them because of the expired/missing delegation token. The failed
> containers are not released when the exception occurs.
> *This repeats until all the resources in the cluster are allocated to
> Ignite. As a result, Ignite uses all resources in the cluster and
> no other jobs are able to run.*
> Kindly help in resolving the issue.
> Thanks in Advance!!!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
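
The resource exhaustion reported above comes from two things happening together: a container whose launch fails is never handed back, and the request loop in run() keeps asking for more. A minimal sketch of guarding against both is shown below, using only JDK types. `ContainerLaunchGuard`, `Launcher`, `releaser`, and `maxConsecutiveFailures` are hypothetical names introduced for illustration. It is an assumption that in the real ApplicationMaster the release callback would delegate to something like `AMRMClientAsync.releaseAssignedContainer(ContainerId)`.

```java
import java.util.function.Consumer;

/**
 * Hypothetical guard (not part of Ignite or Hadoop): releases a container
 * whose launch failed and signals when to stop requesting new containers.
 */
final class ContainerLaunchGuard<C> {
    /** Launch step that may fail, e.g. with an InvalidToken error. */
    interface Launcher<C> {
        void launch(C container) throws Exception;
    }

    private final Launcher<C> launcher;
    private final Consumer<C> releaser;       // assumed: returns the container to YARN
    private final int maxConsecutiveFailures; // threshold before requests are paused
    private int consecutiveFailures;

    ContainerLaunchGuard(Launcher<C> launcher, Consumer<C> releaser, int maxConsecutiveFailures) {
        this.launcher = launcher;
        this.releaser = releaser;
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Tries to launch; on failure the container is released instead of leaked. */
    synchronized boolean tryLaunch(C container) {
        try {
            launcher.launch(container);
            consecutiveFailures = 0; // any success resets the failure streak
            return true;
        } catch (Exception e) {
            consecutiveFailures++;
            releaser.accept(container);
            return false;
        }
    }

    /** The request loop can check this before asking for further containers. */
    synchronized boolean shouldRequestMore() {
        return consecutiveFailures < maxConsecutiveFailures;
    }
}
```

Releasing on failure keeps a broken token from pinning cluster resources, and the failure counter gives the request loop a reason to back off rather than escalate. Whether to pause, retry after renewing tokens, or shut the application down once the threshold is hit is a policy decision this sketch leaves open.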