[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread hex108 (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hex108 updated YARN-2612:
-------------------------
Description: 
In YARN-1372, NM reports completed containers to RM until it gets an ACK from 
RM. If AM does not call allocate, which means that AM does not ack RM, RM will 
not ack NM. We ([~chenchun]) have observed these two cases when running the 
MapReduce task 'pi':
1) RM sends completed containers to AM. After receiving them, AM considers its 
work done and needs no more resources, so it does not call allocate again.
2) When AM itself finishes, it cannot ack its own completed container to RM, 
because AM has not yet finished while it can still call allocate.

To solve this problem, we see two possible solutions:
1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so 
RM could then send this AppAttempt's completed containers to NM.
2) In FairScheduler#nodeUpdate, if a completed container sent by NM does not 
have a corresponding RMContainer, RM just acks it to NM.

We prefer solution 2 because it is clearer and more concise. However, RM might 
ack the same completed containers to NM many times.

  was:
In YARN-1372, NM reports completed containers to RM until it gets an ACK from 
RM. If AM does not call allocate, which means that AM does not ack RM, RM will 
not ack NM. We have observed these two cases when running the MapReduce task 
'pi':
1) RM sends completed containers to AM. After receiving them, AM considers its 
work done and needs no more resources, so it does not call allocate again.
2) When AM itself finishes, it cannot ack its own completed container to RM, 
because AM has not yet finished while it can still call allocate.

To solve this problem, we see two possible solutions:
1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so 
RM could then send this AppAttempt's completed containers to NM.
2) In FairScheduler#nodeUpdate, if a completed container sent by NM does not 
have a corresponding RMContainer, RM just acks it to NM.

We prefer solution 2 because it is clearer and more concise. However, RM might 
ack the same completed containers to NM many times.


> Some completed containers are not reported to NM
> 
>
> Key: YARN-2612
> URL: https://issues.apache.org/jira/browse/YARN-2612
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: hex108
> Fix For: 2.6.0
>
> Attachments: YARN-2612.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread hex108 (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hex108 updated YARN-2612:
-------------------------
Attachment: YARN-2612.patch

> Some completed containers are not reported to NM
> 
>
> Key: YARN-2612
> URL: https://issues.apache.org/jira/browse/YARN-2612
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: hex108
> Fix For: 2.6.0
>
> Attachments: YARN-2612.patch





[jira] [Created] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread hex108 (JIRA)
hex108 created YARN-2612:


Summary: Some completed containers are not reported to NM
Key: YARN-2612
URL: https://issues.apache.org/jira/browse/YARN-2612
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: hex108
Fix For: 2.6.0


In YARN-1372, NM reports completed containers to RM until it gets an ACK from 
RM. If AM does not call allocate, which means that AM does not ack RM, RM will 
not ack NM. We have observed these two cases when running the MapReduce task 
'pi':
1) RM sends completed containers to AM. After receiving them, AM considers its 
work done and needs no more resources, so it does not call allocate again.
2) When AM itself finishes, it cannot ack its own completed container to RM, 
because AM has not yet finished while it can still call allocate.

To solve this problem, we see two possible solutions:
1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so 
RM could then send this AppAttempt's completed containers to NM.
2) In FairScheduler#nodeUpdate, if a completed container sent by NM does not 
have a corresponding RMContainer, RM just acks it to NM.

We prefer solution 2 because it is clearer and more concise. However, RM might 
ack the same completed containers to NM many times.
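
To make solution 2 easier to discuss, here is a minimal Java sketch of the idea. 
It is only a toy model under stated assumptions, not the attached patch and not 
the real FairScheduler code: the class name, the string ids, and the methods 
track, nodeUpdate, allocate, attemptFinished and drainAcks are all hypothetical 
stand-ins. It assumes that an AM allocate call acts as the ack to RM, and that 
RM forgets an attempt's containers once the attempt has finished.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the RM side. None of these names are the
// real FairScheduler or RMContainer API.
class CompletedContainerAckSketch {
    // Container id -> owning app attempt, for containers the RM still tracks
    // (stand-in for "has a corresponding RMContainer").
    private final Map<String, String> rmContainers = new HashMap<>();
    // Completed containers waiting for the AM to pull them through allocate.
    private final Set<String> awaitingAm = new LinkedHashSet<>();
    // Container ids the RM will ack to the NM in its next heartbeat response.
    private final List<String> acksForNm = new ArrayList<>();

    void track(String containerId, String attemptId) {
        rmContainers.put(containerId, attemptId);
    }

    // Stand-in for the completed-container handling in nodeUpdate.
    void nodeUpdate(List<String> completedFromNm) {
        for (String id : completedFromNm) {
            if (rmContainers.containsKey(id)) {
                // Normal path: hold the ack until the AM calls allocate.
                awaitingAm.add(id);
            } else {
                // Solution 2: no RMContainer, so ack the NM directly.
                // The same id is acked again if the NM re-reports it.
                acksForNm.add(id);
            }
        }
    }

    // Stand-in for AM allocate: pulling completed containers acks them.
    void allocate(String attemptId) {
        for (String id : new ArrayList<>(awaitingAm)) {
            if (attemptId.equals(rmContainers.get(id))) {
                awaitingAm.remove(id);
                rmContainers.remove(id);
                acksForNm.add(id);
            }
        }
    }

    // Stand-in for the attempt finishing: the RM forgets its containers.
    void attemptFinished(String attemptId) {
        rmContainers.values().removeIf(attemptId::equals);
        awaitingAm.removeIf(id -> !rmContainers.containsKey(id));
    }

    // Returns and clears the acks queued for the NM.
    List<String> drainAcks() {
        List<String> out = new ArrayList<>(acksForNm);
        acksForNm.clear();
        return out;
    }
}
```

In this model, a container whose attempt finished without a further allocate 
call is acked on the next NM report, and it is acked again on every later 
report, which is the repeated-ack drawback mentioned above.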





[jira] [Commented] (YARN-2405) NPE in FairSchedulerAppsBlock (scheduler page)

2014-08-26 Thread hex108 (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111740#comment-14111740
 ] 

hex108 commented on YARN-2405:
------------------------------

If an RMApp has not yet been accepted by the scheduler, it is recorded only in 
the map returned by `rmContext.getRMApps()`. So I think we could first check 
whether it is in the scheduler's `applications` map, and only then decide 
whether to get its fair share. Would that be OK?
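
A rough Java sketch of the check I have in mind, using hypothetical stand-ins 
only: the class, the string app ids, and the -1 placeholder value are 
assumptions for illustration and are not the real FairSchedulerAppsBlock or 
FairSchedulerInfo code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the scheduler-side lookup; not the real Hadoop API.
class FairShareLookupSketch {
    // Apps the scheduler has accepted, mapped to their fair share
    // (stand-in for the scheduler's `applications` map).
    private final Map<String, Integer> schedulerApps = new HashMap<>();

    void accept(String appId, int fairShare) {
        schedulerApps.put(appId, fairShare);
    }

    // Returns the fair share, or -1 (an assumed placeholder) for an app that
    // the scheduler has not accepted yet, instead of dereferencing null.
    int fairShareOrPlaceholder(String appId) {
        if (!schedulerApps.containsKey(appId)) {
            return -1;
        }
        return schedulerApps.get(appId);
    }
}
```

With a guard like this, an app that exists only in the RM's app map would be 
rendered with a placeholder instead of triggering the NPE.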

> NPE in FairSchedulerAppsBlock (scheduler page)
> ----------------------------------------------
>
> Key: YARN-2405
> URL: https://issues.apache.org/jira/browse/YARN-2405
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Maysam Yabandeh
>
> FairSchedulerAppsBlock#render throws NPE at this line
> {code}
>   int fairShare = fsinfo.getAppFairShare(attemptId);
> {code}
> This causes the scheduler page to not show the app, since it lacks the 
> definition of appsTableData
> {code}
>  Uncaught ReferenceError: appsTableData is not defined 
> {code}
> The problem is temporary, meaning that it usually resolves itself either 
> after a retry or after a few hours.



--
This message was sent by Atlassian JIRA
(v6.2#6252)