[ https://issues.apache.org/jira/browse/YARN-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859709#comment-16859709 ]

Tao Yang edited comment on YARN-9598 at 6/10/19 5:28 AM:
---------------------------------------------------------

Thanks [~jutia] for the review.
{quote}
for this, if re-reservation is disabled, shouldAllocOrReserveNewContainer 
may return false in most cases, and thus even if the scheduler has a chance to 
look up other candidates, it may not assign containers.
{quote}
IIUIC, the shouldAllocOrReserveNewContainer variable is used to reserve more 
resources than required, which I think is not only unnecessary (we can see and 
choose available resources from all nodes) but also harmful in multi-node 
scenarios: this logic can make a low-priority app hold much more resources than 
it needs, which won't be released until all of its needs are satisfied; that is 
inefficient for cluster utilization and can block requests from high-priority 
apps. On the other hand, disabling re-reservation only makes the scheduler skip 
reserving the same container repeatedly and try to allocate on other nodes; it 
won't affect normal scheduling for this app or later apps. Thoughts?
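
To make the idea concrete, here is a minimal, self-contained sketch (plain Java 
with made-up names like Node, Result and tryAssign, not the actual 
CapacityScheduler code) of how skipping re-reservation lets one multi-node pass 
fall through to the remaining candidates instead of repeating the same proposal 
on the node that already holds the reservation:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class MultiNodeReservationSketch {

  // Hypothetical stand-ins for the scheduler's per-request allocation results.
  enum Result { ALLOCATED, RESERVED, LOCALITY_SKIPPED }

  static class Node {
    final String id;
    int availableMb;
    boolean reservedForThisApp; // node already holds a reservation for this request
    Node(String id, int availableMb, boolean reservedForThisApp) {
      this.id = id;
      this.availableMb = availableMb;
      this.reservedForThisApp = reservedForThisApp;
    }
  }

  // One multi-node pass: try to allocate on any candidate first; only if no
  // candidate has room, reserve on a single node, and never re-reserve on a
  // node that already holds a reservation for this request.
  static Result tryAssign(List<Node> candidates, int requestMb) {
    for (Node node : candidates) {
      if (node.availableMb >= requestMb) {
        node.availableMb -= requestMb;
        return Result.ALLOCATED;
      }
    }
    for (Node node : candidates) {
      if (!node.reservedForThisApp) {
        node.reservedForThisApp = true;
        return Result.RESERVED;
      }
    }
    // Nothing new to do on these candidates; the scheduler can move on
    // instead of repeating the same reservation proposal.
    return Result.LOCALITY_SKIPPED;
  }

  public static void main(String[] args) {
    List<Node> candidates = new ArrayList<>();
    candidates.add(new Node("n1", 1024, true));  // already reserved, too small
    candidates.add(new Node("n2", 8192, false)); // has room for the request
    // With re-reservation skipped, n1 no longer blocks the pass and n2 is used.
    System.out.println(tryAssign(candidates, 4096)); // ALLOCATED
  }
}
{code}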
{quote}
I'm wondering why we don't just handle this case like single-node, and change 
the logic in CapacityScheduler#allocateContainersOnMultiNodes like below
{quote}
[~cheersyang] and I have discussed moving allocateFromReservedContainer 
ahead to avoid trying to allocate from reserved containers many times in one 
scheduling cycle for YARN-9432, and chose not to do that after considering 
that it wouldn't be a tiny change and would affect the current scheduling 
process; we preferred to just fix the problem without changing more, same as 
this issue.


> Make reservation work well when multi-node enabled
> --------------------------------------------------
>
>                 Key: YARN-9598
>                 URL: https://issues.apache.org/jira/browse/YARN-9598
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>         Attachments: YARN-9598.001.patch, image-2019-06-10-11-37-43-283.png, 
> image-2019-06-10-11-37-44-975.png
>
>
> This issue is to solve problems with reservation when multi-node scheduling 
> is enabled:
>  # As discussed in YARN-9576, the re-reservation proposal may always be 
> generated on the same node and break scheduling for this app and later apps. 
> I think re-reservation is unnecessary, and we can replace it with 
> LOCALITY_SKIPPED to let the scheduler have a chance to look up the following 
> candidates for this app when multi-node scheduling is enabled.
>  # The scheduler iterates over all nodes and tries to allocate for the 
> reserved container in LeafQueue#allocateFromReservedContainer. There are two 
> problems here:
>  ** The node of the reserved container should be taken as the candidate 
> instead of all nodes when calling FiCaSchedulerApp#assignContainers; 
> otherwise the scheduler may later generate a reservation-fulfilled proposal 
> on another node, which will always be rejected in 
> FiCaScheduler#commonCheckContainerAllocation.
>  ** The assignment returned by FiCaSchedulerApp#assignContainers is never 
> null even if the allocation is just skipped, which will break the normal 
> scheduling process for this leaf queue because of the if clause in 
> LeafQueue#assignContainers: "if (null != assignment) \{ return assignment;}" 
> (see the sketch after this description).
>  # Nodes which have been reserved should be skipped when iterating candidates 
> in RegularContainerAllocator#allocate; otherwise the scheduler may generate 
> an allocation or reservation proposal on these nodes, which will always be 
> rejected in FiCaScheduler#commonCheckContainerAllocation.
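
As a concrete illustration of the second point above, here is a minimal sketch 
(plain Java with hypothetical names, not the real LeafQueue/FiCaSchedulerApp 
code) of why checking only for null on the returned assignment ends the pass 
too early, and how also checking a skipped state lets the queue move on to the 
next application:
{code:java}
import java.util.Arrays;
import java.util.List;

public class LeafQueueNullCheckSketch {

  // Hypothetical stand-in for the assignment object returned per application.
  static class Assignment {
    final int allocatedMb;
    final boolean skipped; // e.g. the app was only LOCALITY_SKIPPED
    Assignment(int allocatedMb, boolean skipped) {
      this.allocatedMb = allocatedMb;
      this.skipped = skipped;
    }
  }

  // Stand-in for FiCaSchedulerApp#assignContainers: it never returns null,
  // it returns a "skipped" assignment when nothing could be done.
  interface App {
    Assignment assignContainers();
  }

  static Assignment assignToQueue(List<App> orderedApps) {
    for (App app : orderedApps) {
      Assignment assignment = app.assignContainers();
      // The buggy form "if (null != assignment) { return assignment; }"
      // would stop here even when the app was merely skipped.
      if (assignment != null && !assignment.skipped) {
        return assignment; // a real allocation or reservation
      }
      // Skipped: fall through so the next application gets a chance.
    }
    return new Assignment(0, true); // nothing assigned in this queue
  }

  public static void main(String[] args) {
    List<App> apps = Arrays.asList(
        () -> new Assignment(0, true),      // reserved app, nothing to do now
        () -> new Assignment(2048, false)); // next app can actually allocate
    System.out.println(assignToQueue(apps).allocatedMb); // 2048
  }
}
{code}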


