[jira] [Updated] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-4247:

Attachment: YARN-4247.001.patch

The fix removes the need for FSAppAttempt to acquire a lock on RMAppAttemptImpl.
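The patch itself is not reproduced in this message. As a rough illustration of the general idea only (one object no longer has to take the other's lock), here is a minimal sketch. Every class, field, and method name in it is hypothetical and none is taken from Hadoop; it assumes the value needed across the two objects can be published through an atomic flag instead of being read through a locked getter.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration only: every name below is hypothetical, none is a Hadoop class.
class CrossLockSketch {

  /** Stand-in for the scheduler-side object (the role FSAppAttempt plays). */
  static final class SchedulerSideAttempt {
    // Published by the other side; readable without taking any foreign lock.
    private final AtomicBoolean started = new AtomicBoolean(false);

    void markStarted() {
      started.set(true); // no monitor is acquired here
    }

    synchronized boolean isWaitingForStart() {
      // Holds only this object's monitor and never calls into StateSideAttempt.
      return !started.get();
    }
  }

  /** Stand-in for the state-machine-side object (the role RMAppAttemptImpl plays). */
  static final class StateSideAttempt {
    private final SchedulerSideAttempt schedulerSide;
    private boolean started;

    StateSideAttempt(SchedulerSideAttempt schedulerSide) {
      this.schedulerSide = schedulerSide;
    }

    synchronized void onStarted() {
      started = true;
      // Push the value out while holding our own lock. The callee takes no
      // lock, so no lock-ordering cycle can form between the two objects.
      schedulerSide.markStarted();
    }

    synchronized boolean isStarted() {
      return started;
    }
  }

  /** Runs a reader and a writer concurrently; returns true if both finish. */
  static boolean runDemo() throws InterruptedException {
    SchedulerSideAttempt schedulerSide = new SchedulerSideAttempt();
    StateSideAttempt stateSide = new StateSideAttempt(schedulerSide);

    Thread reader = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        schedulerSide.isWaitingForStart();
      }
    });
    Thread writer = new Thread(stateSide::onStarted);
    reader.setDaemon(true);
    writer.setDaemon(true);
    reader.start();
    writer.start();
    reader.join(5_000);
    writer.join(5_000);

    return !reader.isAlive() && !writer.isAlive()
        && stateSide.isStarted() && !schedulerSide.isWaitingForStart();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("finished without deadlock: " + runDemo());
  }
}
```

The design choice in this sketch is that data flows one way: the state-machine side pushes the flag, and the scheduler side only reads its own copy. Neither thread ever waits for the other object's monitor while holding its own.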

> Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing 
> events
> -
>
> Key: YARN-4247
> URL: https://issues.apache.org/jira/browse/YARN-4247
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Reporter: Anubhav Dhoot
> Assignee: Anubhav Dhoot
> Priority: Blocker
> Attachments: YARN-4247.001.patch
>
>
> We see this deadlock in our testing, where events do not get processed, and we
> see this in the logs before the RM dies of OOM:
> {noformat}
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-4247:

Attachment: YARN-4247.001.patch

Re-attaching the patch to retrigger Jenkins.

> Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing 
> events
> -
>
> Key: YARN-4247
> URL: https://issues.apache.org/jira/browse/YARN-4247
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Reporter: Anubhav Dhoot
> Assignee: Anubhav Dhoot
> Priority: Blocker
> Attachments: YARN-4247.001.patch, YARN-4247.001.patch
>
>
> We see this deadlock in our testing, where events do not get processed, and we
> see this in the logs before the RM dies of OOM:
> {noformat}
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> {noformat}





[jira] [Updated] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4247:
---
Priority: Blocker  (was: Major)

> Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing 
> events
> -
>
> Key: YARN-4247
> URL: https://issues.apache.org/jira/browse/YARN-4247
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Reporter: Anubhav Dhoot
> Assignee: Anubhav Dhoot
> Priority: Blocker
>
> We see this deadlock in our testing, where events do not get processed, and we
> see this in the logs before the RM dies of OOM:
> {noformat}
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> 2015-10-08 04:48:01,918 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1488000
> {noformat}


