[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607495#comment-16607495 ]

Wangda Tan commented on YARN-8513:
----------------------------------

I spent a good amount of time looking into this issue.

I found that the scheduler tries to reserve containers on two nodes. What happens is:

1) For the root queue, total resource = 1351680, used resource = 1095680, available 
resource = 256000.
2) The app that gets resources is running under the dev queue, whose maximum 
resource = 8811008 and used resource = 7168.
3) The app always gets a container reserved with size = 360448, which is beyond the 
parent queue's available resource, so the proposal is rejected by the resource 
committer.
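
To make the arithmetic concrete, here is a minimal sketch (plain Java, not the 
actual CapacityScheduler code; class and variable names are illustrative) of the 
check the resource committer effectively applies, using the numbers above:

{code:java}
// Illustrative only: the headroom check with the numbers from the log analysis above.
public class HeadroomCheck {
    public static void main(String[] args) {
        long rootTotal = 1_351_680L;                 // root queue total resource
        long rootUsed = 1_095_680L;                  // root queue used resource
        long rootAvailable = rootTotal - rootUsed;   // 256,000

        long reservedContainer = 360_448L;           // size the app keeps getting reserved

        // The reservation exceeds the parent queue's remaining headroom, so the
        // commit fails, the proposal is retried, and the log line repeats.
        boolean accepted = reservedContainer <= rootAvailable;
        System.out.println("proposal accepted? " + accepted);   // false
    }
}
{code}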

In my mind, this is expected behavior, even though the propose/reject cycle is not 
strictly necessary. The behavior is in line with YARN-4280, where we want an 
under-utilized queue to still get resources when its resource request is large.

Let me use an example to explain this:

The scheduler has two queues, a and b, each with capacity 0.5; max capacity of 
a = 1.0 and of b = 0.8. Assume cluster resource = 100.

There's an app running in a which uses 75 resources, so a's absolute used 
capacity = 0.75. There are still many pending resource requests from a, each of 
size 1.

Then a user submits an app to b, asking for a single container of size 30. In 
that case, the scheduler cannot allocate the container because the cluster's total 
available resource = 25.

If we gave these resources to queue a, queue b could never get the available 
resources, because the smaller resource requests would always be preferred.

Instead, the logic in YARN-4280 is: if queue b doesn't get resources because of 
the parent queue's resource limit, the scheduler holds the resources rather than 
giving them to other queues. So you can see that there are 25 resources 
available, yet no one can get them.
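
A rough sketch of that decision, based on my reading of YARN-4280 (illustrative 
Java, not the real scheduler code; the method and variable names are made up):

{code:java}
// Illustrative sketch of the "hold instead of skip" behavior described above.
public class HoldInsteadOfSkip {

    static boolean tryAllocate(int request, int clusterAvailable) {
        if (request > clusterAvailable) {
            // Hold (reserve) the remaining headroom for this large request
            // instead of handing it to other queues with smaller requests.
            System.out.println("holding " + clusterAvailable
                    + " for a pending request of " + request);
            return false;    // nothing is committed; the resources stay idle
        }
        return true;         // request fits, allocate normally
    }

    public static void main(String[] args) {
        // Numbers from the a/b example: 25 available in the cluster, b asks for 30.
        tryAllocate(30, 25);   // prints the "holding" message and returns false
    }
}
{code}

That is why the 25 units keep showing up as available but are never handed out 
until either the large request can fit or preemption frees resources.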

The problem only occurs in a very busy cluster with few nodes. Turning on 
preemption can alleviate the issue a lot.
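
For reference, CapacityScheduler preemption is enabled through the scheduler 
monitor; a minimal yarn-site.xml sketch looks like the following (standard 
properties, but the preemption policy knobs should still be tuned per cluster):

{code:xml}
<!-- Minimal sketch: enable the CapacityScheduler preemption monitor in yarn-site.xml. -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
{code}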

I prefer to close this as "no fix needed".

Thoughts?

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>         Environment: Ubuntu 14.04.5 and 16.04.4
> YARN is configured with one label and 5 queues.
>            Reporter: Chen Yufei
>            Priority: Major
>         Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, 
> jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, 
> yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, 
> yarn3-resourcemanager.log, yarn3-top
>
>
> ResourceManager sometimes does not respond to any request when a queue is nearly 
> fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After the RM 
> restarts, it can recover running jobs and start accepting new ones.
>  
> It seems like CapacityScheduler is in an infinite loop, printing out the 
> following log messages (more than 25,000 lines in a second):
>  
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> assignedContainer queue=root usedCapacity=0.99816763 
> absoluteUsedCapacity=0.99816763 used=<memory:16170624, vCores:1577> 
> cluster=<memory:29441544, vCores:5792>}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
>  assignedContainer application attempt=appattempt_1530619767030_1652_000001 
> container=null 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943
>  clusterResource=<memory:29441544, vCores:5792> type=NODE_LOCAL 
> requestedPartition=}}
>  
> I have encountered this problem several times after upgrading to YARN 2.9.1, while 
> the same configuration works fine under version 2.7.3.
>  
> YARN-4477 is an infinite-loop bug in FairScheduler; I'm not sure if this is a 
> similar problem.
>  


