[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2015-08-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692532#comment-14692532
 ] 

Sangjin Lee commented on YARN-2055:
---

Should this be targeted to 2.6.2? We're trying to release 2.6.1 soon. Let me 
know.

> Preemption: Jobs are failing due to AMs are getting launched and killed 
> multiple times
> --
>
> Key: YARN-2055
> URL: https://issues.apache.org/jira/browse/YARN-2055
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.4.0
> Reporter: Mayank Bansal
>
> If Queue A does not have enough capacity to run an AM, the AM borrows 
> capacity from Queue B. If Queue B then reclaims its capacity, the AM is 
> killed. The AM is then launched and killed again, and after repeated 
> kills the job fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997328#comment-13997328
 ] 

Sunil G commented on YARN-2055:
---

Hi Mayank,
Is this issue the same as YARN-2022?

> Preemption: Jobs are failing due to AMs are getting launched and killed 
> multiple times
> --
>
> Key: YARN-2055
> URL: https://issues.apache.org/jira/browse/YARN-2055
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.4.0
> Reporter: Mayank Bansal
> Fix For: 2.1.0-beta
>
>
> If Queue A does not have enough capacity to run an AM, the AM borrows 
> capacity from Queue B. If Queue B then reclaims its capacity, the AM is 
> killed. The AM is then launched and killed again, and after repeated 
> kills the job fails.





[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-16 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998941#comment-13998941
 ] 

Mayank Bansal commented on YARN-2055:
-

YARN-2022 is about avoiding killing the AM. This issue is more about how we 
launch the AM after preemption: there can be situations where a queue gets some 
capacity for one heartbeat, that capacity is then reclaimed by the other 
queue, and the AM is killed again, so the job fails. Based on the 
comments on YARN-2022, I don't see this case being handled there.

Thanks,
Mayank
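
The launch-and-kill cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not scheduler code: the function `run_job`, the per-heartbeat boolean list, and the `max_attempts` limit are hypothetical names introduced for illustration. Each heartbeat either lets Queue A borrow capacity (the AM is launched) or has Queue B reclaim it (the AM is killed), and every kill counts against the attempt limit.

```python
def run_job(heartbeats, max_attempts=2):
    """Simulate AM launches and preemptions over a series of heartbeats.

    heartbeats: list of bools; True means Queue A can borrow capacity from
    Queue B on that heartbeat, False means Queue B has reclaimed it.
    max_attempts: hypothetical limit on AM attempts before the job fails.
    Returns a (state, attempts) tuple.
    """
    attempts = 0
    am_running = False
    for capacity_available in heartbeats:
        if capacity_available and not am_running:
            # Borrowed capacity is available: launch a new AM attempt.
            attempts += 1
            am_running = True
        elif not capacity_available and am_running:
            # Queue B reclaims its capacity: the running AM is killed.
            am_running = False
            if attempts >= max_attempts:
                # Every attempt was used up by preemption, so the job fails.
                return "FAILED", attempts
    return ("RUNNING" if am_running else "PENDING"), attempts


if __name__ == "__main__":
    # Capacity flips on every heartbeat: launch, kill, launch, kill.
    print(run_job([True, False, True, False]))
```

With capacity flipping on alternate heartbeats, the job exhausts its attempts without the AM ever doing useful work, which is the failure mode reported here.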



[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-16 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999724#comment-13999724
 ] 

Sunil G commented on YARN-2055:
---

Thank you, Mayank, for the clarification.
I have a small question here. In such scenarios, should the scheduler not 
assign any more containers to Queue A?
Assuming Queue B has demand, Queue B's requests have 
to be served first. Am I correct?



[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-19 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002140#comment-14002140
 ] 

Vinod Kumar Vavilapalli commented on YARN-2055:
---

Hi folks, I filed YARN-2074 to address the orthogonal issue of not failing apps 
when repeatedly preempting AM containers.
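
As a rough illustration of that idea (a sketch only, not the YARN-2074 patch): one could count only attempts that ended for reasons other than preemption when deciding whether an app has failed. The function `job_failed`, the exit-reason strings, and `max_attempts` below are hypothetical names chosen for this example.

```python
def job_failed(attempt_exits, max_attempts=2):
    """Decide whether a job has failed, ignoring preempted AM attempts.

    attempt_exits: list of exit reasons for finished AM attempts, using the
    hypothetical labels "PREEMPTED" and "ERROR".
    max_attempts: hypothetical limit on attempts that count toward failure.
    """
    # Attempts killed by preemption are not the application's fault,
    # so they are left out of the failure count.
    counted = sum(1 for reason in attempt_exits if reason != "PREEMPTED")
    return counted >= max_attempts


if __name__ == "__main__":
    print(job_failed(["PREEMPTED", "PREEMPTED", "PREEMPTED"]))
    print(job_failed(["ERROR", "PREEMPTED", "ERROR"]))
```

Under this rule, an AM that is preempted repeatedly does not exhaust the attempt limit, while genuine AM errors still do.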
