[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4041:
-
Fix Version/s: 2.8.0

> Slow delegation token renewal can severely prolong RM recovery
> --
>
> Key: YARN-4041
> URL: https://issues.apache.org/jira/browse/YARN-4041
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Jason Lowe
> Assignee: Sunil G
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4041.patch, 0002-YARN-4041.patch, 
> 0003-YARN-4041.patch, 0004-YARN-4041.patch, 0005-YARN-4041.patch
>
>
> When the RM does a work-preserving restart it synchronously tries to renew 
> delegation tokens for every active application.  If a token server happens to 
> be down or is running slow and a lot of the active apps were using tokens 
> from that server then it can have a huge impact on the time it takes the RM 
> to process the restart.
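
To make the cost described above concrete, here is a minimal sketch of renewals performed one after another on the recovering thread. It is not the ResourceManager code: the {{Renewable}} interface, the class name, and the 25 ms delay are all hypothetical stand-ins chosen for illustration. The point is only that the total recovery time grows with (number of apps) x (per-renewal latency) when each renewal blocks.

```java
import java.util.ArrayList;
import java.util.List;

public class SyncRenewalCost {
    /** Hypothetical stand-in for a token whose renewal blocks on a remote server. */
    interface Renewable {
        void renew();
    }

    /** Renews every token on the calling thread; returns the elapsed time in ms. */
    static long recoverSynchronously(List<Renewable> tokens) {
        long start = System.nanoTime();
        for (Renewable token : tokens) {
            token.renew(); // the caller cannot continue until this returns
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Simulates a slow token server (assumed latency, for illustration only). */
    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<Renewable> tokens = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            tokens.add(() -> sleepQuietly(25));
        }
        long elapsed = recoverSynchronously(tokens);
        System.out.println("recovered " + tokens.size() + " apps in about " + elapsed + " ms");
    }
}
```

With a slower server or thousands of active applications, the same loop scales linearly, which is the delay the issue describes.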



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2015-10-22 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4041:
--
Attachment: 0005-YARN-4041.patch

Yes [~jlowe], we can compare with the token itself and wait in smaller units. 
In the new patch, I kept the total wait time at 1 second but poll in 10 ms units. 
The test runs noticeably faster locally. Uploading a new patch.
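
The wait strategy described above (a 1 second budget checked in 10 ms steps) can be sketched as a small polling helper. This is an illustration under assumptions, not the code in the attached patch: {{waitFor}}, the class name, and the background thread in {{main}} are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class PollingWait {
    /**
     * Polls the condition every stepMs until it holds or totalMs has elapsed.
     * Returns true as soon as the condition is observed, false on timeout.
     */
    static boolean waitFor(BooleanSupplier condition, long totalMs, long stepMs) {
        long deadline = System.nanoTime() + totalMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(stepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean renewed = new AtomicBoolean(false);
        // Hypothetical background work standing in for the asynchronous renewal.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            renewed.set(true);
        });
        worker.start();
        boolean ok = waitFor(renewed::get, 1000, 10);
        System.out.println("condition met within budget: " + ok);
    }
}
```

Compared with a single fixed sleep, the small step lets the test return shortly after the condition becomes true, while the total budget still bounds the worst case.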



[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2015-10-21 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4041:
--
Attachment: 0004-YARN-4041.patch

Hi [~jlowe] and [~jianhe],
Please find an updated patch. I corrected the test case to wait for the 
{{renewerService}} thread pool executor to process the raised renew event. 
Kindly share your thoughts.



[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2015-10-20 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4041:
--
Attachment: 0003-YARN-4041.patch

Updating patch after test case fix.



[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2015-10-19 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4041:
--
Attachment: 0002-YARN-4041.patch

Thank you [~jianhe] and [~jlowe].
As per the latest Jenkins run, the patch needs a rebase. Attaching a rebased 
version. Tests are passing locally.



[jira] [Updated] (YARN-4041) Slow delegation token renewal can severely prolong RM recovery

2015-08-19 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4041:
--
Attachment: 0001-YARN-4041.patch

Uploading an initial, work-in-progress version of the patch in which token 
renewal is made asynchronous, using {{DelegationTokenRenewerRunnable}}.
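
For readers unfamiliar with the approach, the general idea of handing renewals to a worker pool so the caller returns immediately can be sketched as below. This does not reproduce {{DelegationTokenRenewerRunnable}} or the patch: the {{Renewable}} interface, {{submitRenewals}}, {{awaitAll}}, the pool size, and the simulated delay are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncRenewalSketch {
    /** Hypothetical stand-in for one application's token renewal. */
    interface Renewable {
        void renew();
    }

    /** Queues every renewal on the pool and returns at once with the handles. */
    static List<Future<?>> submitRenewals(ExecutorService pool, List<Renewable> tokens) {
        List<Future<?>> pending = new ArrayList<>();
        for (Renewable token : tokens) {
            Runnable task = token::renew;
            pending.add(pool.submit(task));
        }
        return pending;
    }

    /** Waits up to timeoutMs in total; true only if every renewal finished cleanly. */
    static boolean awaitAll(List<Future<?>> pending, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        try {
            for (Future<?> f : pending) {
                long remaining = Math.max(deadline - System.nanoTime(), 0);
                f.get(remaining, TimeUnit.NANOSECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Renewable> tokens = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            tokens.add(() -> {
                try {
                    Thread.sleep(25); // simulated slow token server
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        long start = System.nanoTime();
        List<Future<?>> pending = submitRenewals(pool, tokens);
        long submitMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("submitted " + pending.size() + " renewals in about " + submitMs + " ms");
        System.out.println("all renewed: " + awaitAll(pending, 5000));
        pool.shutdown();
    }
}
```

The submitting thread pays only the cost of enqueueing the tasks; the slow renewals proceed in the background and can be checked or retried later.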

