[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488180#comment-14488180
 ] 

Hudson commented on YARN-3055:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #7552 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7552/])
YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token 
renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. 
(vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java


 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, 
 YARN-3055.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487620#comment-14487620
 ] 

Daryn Sharp commented on YARN-3055:
---

Two apps could double renew tokens (completely benign) before this patch.  In 
practice the possibility is slim and its harmless. 

However, currently it's quite buggy. Both apps renewed and then stomped over 
each other's dttrs in allTokens.  Now both apps reference separate yet 
equivalent dttr instances, when the intention was only one app should reference 
a token.  A second/duplicate timer task was also scheduled.  Haven't bothered 
to check later fallout from the inconsistencies.

Patch: A double renew can still occur (unavoidable) but only one timer is 
scheduled.  All apps reference the same dttr instance.  Moving the logic down 
only creates 3 loops instead of 2 loops but I'll do if you feel strongly.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488149#comment-14488149
 ] 

Vinod Kumar Vavilapalli commented on YARN-3055:
---

bq. Not related to this patch, it's a bug in YARN-2704 .We should remove the 
token from the allTokens, otherwise, it's a leak in allTokens. it can be fixed 
separately.
Good catch. Agree that this is not related to this patch, can you please file a 
ticket?

Checking this in now.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, 
 YARN-3055.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487482#comment-14487482
 ] 

Daryn Sharp commented on YARN-3055:
---

Thanks Vinod, I'll revise this morning.  The ignores shouldn't be there.  I did 
that for our internal emergency fix because we I didn't handle proxy refresh 
tokens so I didn't care the tests failed.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488115#comment-14488115
 ] 

Jian He commented on YARN-3055:
---

thanks Daryn,  patch looks good to me too.  +1 

Not related to this patch, it's a bug in YARN-2704 .We should remove the token 
from the allTokens, otherwise, it's a leak in allTokens. it can be fixed 
separately.
{code}
if (t.token.getKind().equals(new Text(HDFS_DELEGATION_TOKEN))) {
  iter.remove();
  t.cancelTimer();
  LOG.info(Removed expiring token  + t);
}
{code}

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, 
 YARN-3055.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487781#comment-14487781
 ] 

Hadoop QA commented on YARN-3055:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12724256/YARN-3055.patch
  against trunk revision 6495940.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7276//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7276//console

This message is automatically generated.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, 
 YARN-3055.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487871#comment-14487871
 ] 

Vinod Kumar Vavilapalli commented on YARN-3055:
---

You are right that the previous code also had the same issue. I am good with 
the patch.

Will check this in unless jenkins or [~jianhe] say no.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, 
 YARN-3055.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485461#comment-14485461
 ] 

Daryn Sharp commented on YARN-3055:
---

bq. It does seem odd to get the expiration date by renewing the token

The expiration is metadata associated with the token that is only known to the 
token issuer's secret manager.  The correct fix is for the renewer to not 
reschedule if the next expiration is the same as the last.  The bug wasn't a 
real priority when tokens weren't renewed forever.  If we regress to renewing 
forever, then it does become a problem.

bq.   I think currently the sub-job won't kill the overall workflow.

Correct, I misread in my haste.  It's rather the opposite:  sub-jobs can 
override the original job's request to cancel the tokens.

bq. I think overall the current patch will work, other than few comments I have.

It works but not in a desirable way.  Jason posted my patch we use internally 
on YARN-3439 which is duped to this jira.  I'm updating it to handle the proxy 
refresh cases and will post shortly.  The current semantics of the conf setting 
and the 2.x changes have been nothing but production blockers.  Ref counting 
will solve this once and for all.


 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485638#comment-14485638
 ] 

Jian He commented on YARN-3055:
---

sure, looking forward to your patch.

bq.  The correct fix is for the renewer to not reschedule if the next 
expiration is the same as the last.
Sorry, didn't get what you mean. mind clarifying more ?  The renew call after 
getting the new token is solely to retrieve the expiration date for the token. 
I found given that RM renews all tokens at once for each app on app submission, 
if renew rescheduling becomes a DOS problem, then app submission situation may 
be much worse. 

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485890#comment-14485890
 ] 

Daryn Sharp commented on YARN-3055:
---

The renew at job submission isn't the problem.  It's actually very desirable.  
Years back, a job submitted with bad tokens - that was destined to fail - would 
be launched anyway.  The tasks failed to connect, ipc level retries occurred, 
then higher level retries occurred, and yarn generally caught all exceptions 
and retried.  Tasks were retried, perhaps the app attempt was retried, etc.  In 
the end, a job that _clearly was going to fail_ might tie up cluster resources 
for 20+ minutes.  Why was it launched when a failed renew could have prevented 
the problem?  Not to mention the renewer was hardcoded to assume the expiration 
interval was 24h...  So much for being able to stress test the renewer with 1m 
expirations.

The potential DOS problem is when a token has reached end of life expiration.  
Let's say the token can be renewed twice.The third and subsequent renews 
return the same expiration.
# t1 = submit + renew
# t2 = t1 + renew
# t3 = t2
# t4 = t2

The renew timers fire 90% of the delta between now and the next expiration.  So 
as end of life expiration approaches, the timer fires with an increasing 
frequency.  50 threads doing that virtually non-stop would not be pretty.  The 
solution is stop renewing when the next expiration equals the last expiration.  
That can be addressed in another jira that's not a blocker because if tokens 
aren't renewed forever then it's a rare situation.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486104#comment-14486104
 ] 

Daryn Sharp commented on YARN-3055:
---

I believe you are describing the behavior of 2.6's new proxy token refresh 
feature.  I won't digress into how broken it appears to be except in the 
simplest use case.  With it off, there is no fetching a new token.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485987#comment-14485987
 ] 

Jian He commented on YARN-3055:
---

bq. The third and subsequent renews return the same expiration.
Sorry, I'm still not fully understanding your point. Why would the new token 
return the same expiration date ? RM is obtaining a brand-new token and the new 
token has a different expiration date. In addition, the old expiring token is 
removed and won't be renewed if a new token is acquired.  The token renew 
frequency is not changing.  

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486220#comment-14486220
 ] 

Jian He commented on YARN-3055:
---

Okay, now I understand what you are saying. I think we are thinking completely 
two different things.  I thought you were saying how the token refresh feature 
results in  this token renewal frequency changing. It's actually an existing  
problem.  Yes, the current token renewal will cause the token renewal frequency 
to increase towards the end of the token life and I think this has been like 
that long time ago. thanks for the clarification. 

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch, YARN-3055.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484117#comment-14484117
 ] 

Daryn Sharp commented on YARN-3055:
---

On cursory glance, are you sure this isn't going to leak tokens?  Ie. does it 
remove tokens from data structures in all cases or can a token get left in 
allTokens?

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484201#comment-14484201
 ] 

Daryn Sharp commented on YARN-3055:
---

This appears to go back to the really old days of renewing the token for its 
entire lifetime.  Most unfortunate.

The renewer looks like it may turn into a DOS weapon.  Renewing a token returns 
the next expiration.  The renewer uses a timer to renew 90% before expiration.  
After the last renewal, the same expiration (the wall) will be returned as 
before.  90% of the wall eventually becomes a rapid fire renewal.  There's an 
army of 50 threads prepared to fire concurrently.

My other concern is that it used to be the first job submitted with a given 
token that determined if the token is to be cancelled.  Now any job can 
influence the cancelling.  This patch didn't specifically break that behavior, 
but the original YARN-2704 did, which precipitated YARN-2964 to break it 
differently, and now this jira.

The ramification is we used to tell users to make sure the first job set the 
conf correctly, and essentially don't worry after that.  Now they do have to 
worry.  Any sub-job with the default of canceling tokens will kill the overall 
workflow.  Sub-jobs should not have jurisdiction over the tokens.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484098#comment-14484098
 ] 

Vinod Kumar Vavilapalli commented on YARN-3055:
---

[~daryn]/[~jianhe], I briefly looked at the existing patch on this JIRA and it 
seems like it will work. Can you also take a look?

[~hitliuyi], can you see if you can add a test for this in 
TestDelegationTokenRenewer.java?

This is the last blocker on 2.7.0 as of today. Appreciate all the help I can 
get, thanks all.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-07 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484374#comment-14484374
 ] 

Jian He commented on YARN-3055:
---

bq.  does it remove tokens from data structures in all cases or can a token get 
left in allTokens?
I think it does not have a leak. removeFailedDelegationToken will remove the 
token if renew fails. If there's a leak, the leak exists before  YARN-2704.
bq. The renewer looks like it may turn into a DOS weapon.
It does seem odd to get the expiration date by renewing the token. But there's 
just currently no way to get the expiration date other than the renew method.
bq. Any sub-job with the default of canceling tokens will kill the overall 
workflow.
Am I missing something ? I think currently  the sub-job won't kill the overall 
workflow. the sub-job flag will be ignored, if the first job sets the flag.

Overall, I think overall the current patch will work, other than few comments I 
have.
[~daryn], you mentioned you have another patch. could you share the patch or 
you think the current patch is fine ?

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-06 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481225#comment-14481225
 ] 

Daryn Sharp commented on YARN-3055:
---

Correctly handling the don't cancel setting for jobs submitting job has been 
a recurring issue.  We're internally testing a small patch to continue renewing 
until all jobs using the token(s) have finished.  Handling the auto-fetch of 
proxy tokens proved a bit more difficult so I need to complete the internal 
patch.  I can take this over or post a partial patch if [~hitliuyi] would like 
to finish it.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-04-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395973#comment-14395973
 ] 

Hadoop QA commented on YARN-3055:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch
  against trunk revision 4b3948e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7220//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7220//console

This message is automatically generated.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Blocker
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-01-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14277879#comment-14277879
 ] 

Jason Lowe commented on YARN-3055:
--

Agree with Jian.  I believe the concern being raised here is not specific to 
the changes in YARN-2704 and YARN-2964 but a long-standing issue with YARN's 
handling of delegation tokens shared between jobs.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-01-13 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276275#comment-14276275
 ] 

Yi Liu commented on YARN-3055:
--

Is it possible the launcher job finishes firstly, but sub-jobs are still 
running? If so, the issue exists. If not, then the issue is invalid.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-01-13 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276280#comment-14276280
 ] 

Jian He commented on YARN-3055:
---

bq. Is it possible the launcher job finishes firstly, but sub-jobs are still 
running? 
This is an existing issue as discussed in 
https://issues.apache.org/jira/browse/YARN-2964?focusedCommentId=14252218page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14252218.
  And a long-term solution is to have a group Id for a group of applications so 
that the token lifetime is tied to a group of applications instead of a single 
application.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275291#comment-14275291
 ] 

Hadoop QA commented on YARN-3055:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch
  against trunk revision 08ac062.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6322//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6322//console

This message is automatically generated.

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3055) The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

2015-01-13 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275952#comment-14275952
 ] 

Jian He commented on YARN-3055:
---

bq.  Meanwhile, we should not cancel the timerTask, also we should not remove 
it from allTokens.
IIUC, this is not the case. Because If launcher job first gets added to the 
appTokens map, DelegationTokenRenewer will not add DelegationTokenToRenew 
instance for the sub-job. So the tokens in removeApplicationFromRenewal will 
return empty for the sub-job when the sub-job completes. So the token won’t be 
removed from the allTokens. 

 The token is not renewed properly if it's shared by jobs (oozie) in 
 DelegationTokenRenewer
 --

 Key: YARN-3055
 URL: https://issues.apache.org/jira/browse/YARN-3055
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: YARN-3055.001.patch, YARN-3055.002.patch


 After YARN-2964, there is only one timer to renew the token if it's shared by 
 jobs. 
 In {{removeApplicationFromRenewal}}, when going to remove a token, and the 
 token is shared by other jobs, we will not cancel the token. 
 Meanwhile, we should not cancel the _timerTask_, also we should not remove it 
 from {{allTokens}}. Otherwise for the existing submitted applications which 
 share this token will not get renew any more, and for new submitted 
 applications which share this token, the token will be renew immediately.
 For example, we have 3 applications: app1, app2, app3. And they share the 
 token1. See following scenario:
 *1).* app1 is submitted firstly, then app2, and then app3. In this case, 
 there is only one token renewal timer for token1, and is scheduled when app1 
 is submitted
 *2).* app1 is finished, then the renewal timer is cancelled. token1 will not 
 be renewed any more, but app2 and app3 still use it, so there is problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)