[ 
https://issues.apache.org/jira/browse/SLIDER-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805337#comment-14805337
 ] 

Youjie Chen commented on SLIDER-939:
------------------------------------

Yes. Once the outstanding requests are cancelled, YARN should return to normal 
and be ready to accept resource requests from other jobs, instead of the node 
staying reserved for the previous outstanding request. 
One thing to note is that the outstanding request would never be satisfied in 
this case unless we release one worker. It seems that once a worker is 
released, the queue is unblocked and other requests can proceed.
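
The expected flex-down behaviour can be sketched as a toy model. This is
only an illustration under stated assumptions, not Slider's actual code:
WorkerRole, capacity and flex() are hypothetical names, and capacity=6
stands in for "one worker fits per node on a 6-node cluster". In a real
application master, the "cancelled" count would presumably correspond to
withdrawing pending requests through YARN's
AMRMClient.removeContainerRequest() rather than releasing live containers.

```python
class WorkerRole:
    """Hypothetical bookkeeping for one component; not Slider's real class."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed max containers the cluster can host
        self.running = 0           # containers actually allocated
        self.outstanding = 0       # requests sent to the RM but not yet satisfied

    def flex(self, desired):
        """Move towards `desired` instances; return (cancelled, released)."""
        delta = desired - (self.running + self.outstanding)
        cancelled = released = 0
        if delta > 0:
            # Flex up: whatever does not fit stays as an outstanding request.
            granted = min(delta, self.capacity - self.running)
            self.running += granted
            self.outstanding += delta - granted
        elif delta < 0:
            # Flex down: cancel outstanding requests first, and only then
            # release running containers for any remaining surplus.
            surplus = -delta
            cancelled = min(surplus, self.outstanding)
            self.outstanding -= cancelled
            released = surplus - cancelled
            self.running -= released
        return cancelled, released


role = WorkerRole(capacity=6)
role.flex(6)               # six workers, one per node
role.flex(7)               # seventh request can never be met
print(role.flex(6))        # expected behaviour: cancel it, release nothing
```

With this ordering, flexing from 7 back to 6 withdraws the one outstanding
request and leaves all six workers running, which is the behaviour the
issue says is missing.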

> flex down does not cancel the outstanding request
> -------------------------------------------------
>
>                 Key: SLIDER-939
>                 URL: https://issues.apache.org/jira/browse/SLIDER-939
>             Project: Slider
>          Issue Type: Bug
>          Components: core
>    Affects Versions: Slider 0.80
>         Environment: Hadoop 2.7.1 
> Slider 0.80.0
>            Reporter: Youjie Chen
>            Assignee: Steve Loughran
>              Labels: patch
>             Fix For: Slider 0.81
>
>
> I ran a Slider app on a 6-node cluster. To ensure there is only one 
> component (worker) instance on each node, I set yarn.memory to 51% of the total 
> memory. 
> Then I flexed up to 7 workers. This leaves one outstanding worker request 
> that will never be met, which is expected.
> Then I flexed back down to 6 workers, and container requests from every other 
> job were blocked even though plenty of memory and cores were available. The RM 
> log shows the following message repeated continuously:
> capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainersToNode(1240)) - Skipping scheduling 
> since node test.example.com:45454 is reserved by application 
> appattempt_1442384698868_0008_000001
> It seems the outstanding requests are not actually cancelled in the 
> container request queue, but keep being requested.
> After I flexed down to 5 workers, the other blocked jobs could run.
> This is related to JIRA https://issues.apache.org/jira/browse/SLIDER-490



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
