[ https://issues.apache.org/jira/browse/SLIDER-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981940#comment-14981940 ]
kyungwan nam commented on SLIDER-939:
-------------------------------------

Hi.
I've hit the same problem when "yarn.memory" is not a multiple of "yarn.scheduler.minimum-allocation-mb".
I think this issue may be caused by SLIDER-955.

> flex down does not cancel the outstanding request
> -------------------------------------------------
>
>                 Key: SLIDER-939
>                 URL: https://issues.apache.org/jira/browse/SLIDER-939
>             Project: Slider
>          Issue Type: Bug
>          Components: core
>    Affects Versions: Slider 0.80
>         Environment: Hadoop 2.7.1
>                      Slider 0.80.0
>            Reporter: Youjie Chen
>            Assignee: Steve Loughran
>              Labels: patch
>             Fix For: Slider 0.90
>
>
> I run a Slider app on a 6-node cluster. To ensure there is only one
> component (worker) instance on each node, I set yarn.memory to 51% of the
> total memory.
> Then I flexed up to 7 workers. One worker request stays outstanding and
> will never be met; this is expected.
> Then I flexed back down to 6 workers, and any container request for any job
> was blocked even though there was plenty of memory and cores for the job.
> The RM log shows this message repeated continuously:
> capacity.CapacityScheduler
> (CapacityScheduler.java:allocateContainersToNode(1240)) - Skipping scheduling
> since node test.example.com:45454 is reserved by application
> appattempt_1442384698868_0008_000001
> It seems the outstanding requests are not actually cancelled in the
> container request queue; they keep being requested.
> After I flexed down to 5 workers, the other blocked jobs could run.
> This is related to JIRA https://issues.apache.org/jira/browse/SLIDER-490
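
To illustrate the comment's point about "yarn.memory" not being a multiple of "yarn.scheduler.minimum-allocation-mb": as I understand it, the YARN scheduler rounds a memory request up to a multiple of the minimum allocation, so the size that is granted can differ from the size that was asked for. The sketch below only shows that rounding arithmetic. The function names and the numbers are hypothetical, it is not Slider or YARN code, and whether this mismatch is what prevents the outstanding request from being cancelled is the commenter's hypothesis, not something confirmed here.

```python
def normalize_memory(requested_mb: int, minimum_allocation_mb: int) -> int:
    """Round a memory request up to the next multiple of the scheduler minimum.

    Hypothetical helper: it mimics the rounding described above and is not
    taken from the YARN or Slider code base.
    """
    if minimum_allocation_mb <= 0:
        raise ValueError("minimum_allocation_mb must be positive")
    # A request below the minimum is treated as one minimum-sized allocation.
    requested_mb = max(requested_mb, minimum_allocation_mb)
    # Ceiling division using integers only.
    multiples = -(-requested_mb // minimum_allocation_mb)
    return multiples * minimum_allocation_mb


def is_aligned(requested_mb: int, minimum_allocation_mb: int) -> bool:
    """True when the request is already a multiple of the minimum allocation."""
    return requested_mb % minimum_allocation_mb == 0


if __name__ == "__main__":
    # Illustrative values only: yarn.memory = 1500,
    # yarn.scheduler.minimum-allocation-mb = 1024.
    requested, minimum = 1500, 1024
    granted = normalize_memory(requested, minimum)
    print(f"requested={requested} MB, granted={granted} MB, "
          f"aligned={is_aligned(requested, minimum)}")
```

With these illustrative values the request is for 1500 MB but the rounded size is 2048 MB, so code that compares the two sizes exactly would see them as different. Choosing a "yarn.memory" value that is already a multiple of the minimum allocation avoids the difference.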