[ 
https://issues.apache.org/jira/browse/TWILL-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alvin Wang updated TWILL-106:
-----------------------------
    Comment: was deleted

(was: I see that dfs.namenode.delegation.key.update-interval in the test 
cluster defaulted to 86400000 ms (1 day), which matches how long the CDAP 
transaction service runs before it goes down with this error:

{code}
Aborting transaction manager due to: Snapshot (timestamp 1414037562088) failed 
due to: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in 
cache
org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 4287 
for yarn) can't be found in cache
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
{code}

Also, no other properties were set to 1 day, so it seems likely that the 
error is related to dfs.namenode.delegation.key.update-interval. I also 
remember that we made a fix to Twill for delegation key expiration, and that I 
was able to keep the CDAP transaction service running for longer than a 
day. This error could be caused by a different configuration.)
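
As a quick sanity check on the numbers quoted in this message, here is a minimal Java sketch that converts the reported interval and the snapshot timestamp from the error into readable units. The class and method names are made up for illustration and are not part of Twill, CDAP, or Hadoop. The only inputs are the two literal values quoted in this message.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, for illustration only; not part of Twill, CDAP, or Hadoop.
public class DelegationKeyInterval {
    // Value reported for dfs.namenode.delegation.key.update-interval (milliseconds).
    static final long KEY_UPDATE_INTERVAL_MS = 86_400_000L;

    // Snapshot timestamp taken from the "Aborting transaction manager" message.
    static final long FAILED_SNAPSHOT_MS = 1_414_037_562_088L;

    // The configured interval as a Duration, so it can be printed in hours or days.
    static Duration keyUpdateInterval() {
        return Duration.ofMillis(KEY_UPDATE_INTERVAL_MS);
    }

    // The failed snapshot time as a UTC instant.
    static Instant failedSnapshotTime() {
        return Instant.ofEpochMilli(FAILED_SNAPSHOT_MS);
    }

    public static void main(String[] args) {
        // prints "interval = 24 hours"
        System.out.println("interval = " + keyUpdateInterval().toHours() + " hours");
        // prints "snapshot failed at 2014-10-23T04:12:42.088Z"
        System.out.println("snapshot failed at " + failedSnapshotTime());
    }
}
```

The decoded snapshot time falls in the same second as the 2014-10-23T04:12:42 log entry in the issue description. That only confirms the two messages describe the same failure. It does not by itself show which setting caused the token to expire.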

> HDFS delegation token is not being refreshed properly
> -----------------------------------------------------
>
>                 Key: TWILL-106
>                 URL: https://issues.apache.org/jira/browse/TWILL-106
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.4.0-incubating
>            Reporter: Poorna Chandra
>
> We have a Twill app that runs in a secure Hadoop cluster. The app starts up 
> fine and runs for a day. The logs show that the secure store was updated 
> regularly. However, after a day I see exceptions that say "token 
> (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache". 
> Exception:
> -------------
> 2014-10-23T04:12:42,101Z ERROR c.c.t.TransactionManager 
> [cdap-secure120-1000.dev.continuuity.net] [tx-snapshot] 
> TransactionManager:abortService(TransactionManager.java:594) - Aborting 
> transaction manager due to: Snapshot (timestamp 1414037562088) failed due to: 
> token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
> org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 
> 4287 for yarn) can't be found in cache
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
