[ 
https://issues.apache.org/jira/browse/TWILL-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189407#comment-14189407
 ] 

Alvin Wang commented on TWILL-106:
----------------------------------

The transaction service is using an old token, even though the token in HDFS 
(in "credential.store") is updated properly. To reproduce this, I made a simple 
Twill app that prints out the tokens of the current user, initializes a 
filesystem, writes a file to it, and then closes the file and the filesystem. 
This Twill app gets the following error:

{code}
00:30:12.876 [message-callback] ERROR o.a.t.i.y.AbstractYarnTwillService - Failed to update secure store.
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629) ~[hadoop-hdfs-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1210) ~[hadoop-hdfs-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:290) ~[hadoop-hdfs-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:286) ~[hadoop-hdfs-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:286) ~[hadoop-hdfs-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763) ~[hadoop-common-2.2.0.2.0.11.0-1.jar:na]
        at org.apache.twill.filesystem.HDFSLocation.getInputStream(HDFSLocation.java:74) ~[hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.yarn.AbstractYarnTwillService.handleSecureStoreUpdate(AbstractYarnTwillService.java:86) ~[hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.container.TwillContainerService.onReceived(TwillContainerService.java:89) [hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.AbstractTwillService.handleMessage(AbstractTwillService.java:323) [hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.AbstractTwillService.access$900(AbstractTwillService.java:84) [hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.AbstractTwillService$4.onSuccess(AbstractTwillService.java:274) [hello-twill-1.0-SNAPSHOT.jar:na]
        at org.apache.twill.internal.AbstractTwillService$4.onSuccess(AbstractTwillService.java:254) [hello-twill-1.0-SNAPSHOT.jar:na]
        at com.google.common.util.concurrent.Futures$6.run(Futures.java:799) [hello-twill-1.0-SNAPSHOT.jar:na]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [na:1.6.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [na:1.6.0_45]
        at java.lang.Thread.run(Thread.java:662) [na:1.6.0_45]
{code}
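
One possible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: Hadoop's FileSystem.get() returns a cached instance that is shared within the JVM, so if the test app closes the filesystem it obtained, the handle Twill later uses to read the secure store may be that same, now closed, object. The sketch below is a stand-alone simulation of that effect in plain Java. FakeFileSystem, FakeFileSystemCache, the URI and the path are all hypothetical stand-ins, not Hadoop or Twill classes.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stand-alone simulation; none of these types are Hadoop or Twill classes.
class SharedHandleDemo {

    /** Hypothetical filesystem client that rejects every call after close(). */
    static final class FakeFileSystem {
        private boolean open = true;

        String read(String path) throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
            return "contents of " + path;
        }

        void close() {
            open = false;
        }
    }

    /** Hypothetical cache that hands every caller the same instance per URI. */
    static final class FakeFileSystemCache {
        private final Map<String, FakeFileSystem> cache =
                new HashMap<String, FakeFileSystem>();

        synchronized FakeFileSystem get(String uri) {
            FakeFileSystem fs = cache.get(uri);
            if (fs == null) {
                fs = new FakeFileSystem();
                cache.put(uri, fs);
            }
            return fs;
        }
    }

    /** Returns what the "framework" sees when it reads after the app has run. */
    static String frameworkReadAfterAppRuns(boolean appClosesFileSystem) {
        FakeFileSystemCache cache = new FakeFileSystemCache();
        String uri = "hdfs://namenode.example:8020"; // placeholder URI

        // The framework side obtains its handle first.
        FakeFileSystem frameworkFs = cache.get(uri);

        // The application asks for "its" filesystem and gets the same object.
        FakeFileSystem appFs = cache.get(uri);
        if (appClosesFileSystem) {
            appFs.close(); // also closes the framework's handle
        }

        try {
            return frameworkFs.read("/credentials"); // placeholder path
        } catch (IOException e) {
            return "ERROR: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(frameworkReadAfterAppRuns(true));  // ERROR: Filesystem closed
        System.out.println(frameworkReadAfterAppRuns(false)); // contents of /credentials
    }
}
```

If this reading is right, not closing the shared instance in application code, or obtaining a private one (Hadoop provides FileSystem.newInstance for that purpose), would be the things to try. Neither has been verified against this issue.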

When the code that interacts with the filesystem is removed, the token is 
updated properly. After the "Secure store updated" message, the HDFS delegation 
token changes from 6562 to 6565:

{code}
00:40:41.896 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.240.135.9:8020, Ident: (HDFS_DELEGATION_TOKEN token 6562 for yarn)
00:40:41.896 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: RM_DELEGATION_TOKEN, Service: 10.240.135.9:8032, Ident: (owner=yarn/[email protected], renewer=yarn, realUser=, issueDate=1414629381163, maxDate=1415234181163, sequenceNumber=13, masterKeyId=18)
00:40:51.897 [TwillContainerService] INFO  examples.HelloWorld - ########## Current time: 1414629651897
00:40:51.898 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.240.135.9:8020, Ident: (HDFS_DELEGATION_TOKEN token 6562 for yarn)
00:40:51.898 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: RM_DELEGATION_TOKEN, Service: 10.240.135.9:8032, Ident: (owner=yarn/[email protected], renewer=yarn, realUser=, issueDate=1414629381163, maxDate=1415234181163, sequenceNumber=13, masterKeyId=18)
00:41:01.898 [TwillContainerService] INFO  examples.HelloWorld - ########## Current time: 1414629661898
00:41:01.899 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.240.135.9:8020, Ident: (HDFS_DELEGATION_TOKEN token 6562 for yarn)
00:41:01.899 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: RM_DELEGATION_TOKEN, Service: 10.240.135.9:8032, Ident: (owner=yarn/[email protected], renewer=yarn, realUser=, issueDate=1414629381163, maxDate=1415234181163, sequenceNumber=13, masterKeyId=18)
00:41:10.259 [message-callback] INFO  o.a.t.i.y.AbstractYarnTwillService - Secure store updated from hdfs://10.240.135.9/user/yarn/HelloWorldRunnable/126031a3-5917-4fe4-b447-27a534111bcd/credentials.st....
00:41:11.899 [TwillContainerService] INFO  examples.HelloWorld - ########## Current time: 1414629671899
00:41:11.900 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.240.135.9:8020, Ident: (HDFS_DELEGATION_TOKEN token 6565 for yarn)
00:41:11.900 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: RM_DELEGATION_TOKEN, Service: 10.240.135.9:8032, Ident: (owner=yarn/[email protected], renewer=yarn, realUser=, issueDate=1414629381163, maxDate=1415234181163, sequenceNumber=13, masterKeyId=18)
00:41:21.901 [TwillContainerService] INFO  examples.HelloWorld - ########## Current time: 1414629681901
00:41:21.902 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.240.135.9:8020, Ident: (HDFS_DELEGATION_TOKEN token 6565 for yarn)
00:41:21.902 [TwillContainerService] INFO  examples.HelloWorld - Token: Kind: RM_DELEGATION_TOKEN, Service: 10.240.135.9:8032, Ident: (owner=yarn/[email protected], renewer=yarn, realUser=, issueDate=1414629381163, maxDate=1415234181163, sequenceNumber=13, masterKeyId=18)
{code} 

> HDFS delegation token is not being refreshed properly
> -----------------------------------------------------
>
>                 Key: TWILL-106
>                 URL: https://issues.apache.org/jira/browse/TWILL-106
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.4.0-incubating
>            Reporter: Poorna Chandra
>
> We have a Twill app that runs in a secure Hadoop cluster. The app starts up 
> fine and runs for a day. The logs show that the secure store was updated 
> regularly. However, after a day I see exceptions that say "token 
> (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache". 
> Exception:
> -------------
> 2014-10-23T04:12:42,101Z ERROR c.c.t.TransactionManager [cdap-secure120-1000.dev.continuuity.net] [tx-snapshot] TransactionManager:abortService(TransactionManager.java:594) - Aborting transaction manager due to: Snapshot (timestamp 1414037562088) failed due to: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
> org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
