[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346404#comment-16346404
 ] 

Tao Yang commented on YARN-7494:
--------------------------------

Thanks [~cheersyang] for the mention.

Some thoughts from my side (some repeat points from my earlier comments):
 # Sorting by nodeLookupPolicy in every allocation process is expensive. We 
plan to add a new service that manages and periodically refreshes a 
per-ordering-policy ordered list of nodes, so the scheduler can filter 
candidate nodes for an app request from the ordered lists without any further 
sorting. We could then define a cluster-level (or default) ordering policy to 
achieve better load balance or other requirements, which would also help 
scheduler performance.
 # This patch iterates over all partition nodes to create a new 
PartitionBasedCandidateNodeSet instance for every scheduling pass in 
CapacityScheduler#getCandidateNodeSet. I think we can keep a single instance 
instead of creating it every time. Furthermore, we can replace it with the 
ordered node list if that plan is acceptable.
 # This patch still iterates over all nodes and triggers the schedule process 
for every node in CapacityScheduler#schedule. That is appropriate for the 
previous scheduler, which does allocation for a single node. But for multiple 
nodes, I think it's better to iterate over all partitions to trigger the 
schedule process. We can move the multiNodePlacementEnabled check from 
CapacityScheduler#getCandidateNodeSet to CapacityScheduler#schedule and use 
different iteration and logic for each case.
 # CandidateNodeSet#getAllNodes returns a Map<NodeId, N>, but there seems to be 
no need to look up nodes by NodeId. Perhaps we can change it to a Set or List 
to support returning ordered nodes.
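
To make the first and last points more concrete, here is a rough sketch of the 
idea: sort once per refresh, then only filter on the allocation path and hand 
back an ordered List. Everything in it (SketchNode, SortedNodeCache, 
registerPolicy, refresh, candidatesFor, the policy name) is hypothetical and 
made up for illustration. None of it is existing YARN code or part of the 
patch, and a real service would also need the periodic refresh thread and the 
scheduler's locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical stand-in for a scheduler node; not a YARN class.
final class SketchNode {
  final String id;
  final String partition;
  final int availableMemoryMb;

  SketchNode(String id, String partition, int availableMemoryMb) {
    this.id = id;
    this.partition = partition;
    this.availableMemoryMb = availableMemoryMb;
  }
}

// Hypothetical service that keeps one pre-sorted node list per ordering policy.
final class SortedNodeCache {
  private final Map<String, Comparator<SketchNode>> policies =
      new ConcurrentHashMap<>();
  private final Map<String, List<SketchNode>> sorted =
      new ConcurrentHashMap<>();

  void registerPolicy(String name, Comparator<SketchNode> order) {
    policies.put(name, order);
    sorted.put(name, Collections.emptyList());
  }

  // Meant to be called periodically, not once per allocation.
  // Each policy gets a freshly sorted, read-only snapshot.
  void refresh(Collection<SketchNode> liveNodes) {
    for (Map.Entry<String, Comparator<SketchNode>> e : policies.entrySet()) {
      List<SketchNode> copy = new ArrayList<>(liveNodes);
      copy.sort(e.getValue());
      sorted.put(e.getKey(), Collections.unmodifiableList(copy));
    }
  }

  // Allocation path: filter only. The order comes from the last refresh,
  // so the result is an ordered List rather than a Map keyed by node id.
  List<SketchNode> candidatesFor(String policy, Predicate<SketchNode> filter) {
    List<SketchNode> result = new ArrayList<>();
    for (SketchNode n : sorted.getOrDefault(policy, Collections.emptyList())) {
      if (filter.test(n)) {
        result.add(n);
      }
    }
    return result;
  }
}

final class SortedNodeCacheDemo {
  public static void main(String[] args) {
    SortedNodeCache cache = new SortedNodeCache();
    // Example policy: nodes with the most free memory first.
    cache.registerPolicy("most-free-memory",
        Comparator.comparingInt((SketchNode n) -> n.availableMemoryMb)
            .reversed());
    cache.refresh(Arrays.asList(
        new SketchNode("n1", "gpu", 1024),
        new SketchNode("n2", "default", 4096),
        new SketchNode("n3", "gpu", 8192)));

    List<SketchNode> gpuNodes = cache.candidatesFor(
        "most-free-memory", node -> "gpu".equals(node.partition));
    for (SketchNode node : gpuNodes) {
      System.out.println(node.id + " " + node.availableMemoryMb);
    }
  }
}
```

The trade-off is staleness: between refreshes the order can lag behind actual 
node usage, so the refresh interval bounds how out of date a placement 
decision can be.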

Thanks.

> Add multi node lookup support for better placement
> --------------------------------------------------
>
>                 Key: YARN-7494
>                 URL: https://issues.apache.org/jira/browse/YARN-7494
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Major
>         Attachments: YARN-7494.001.patch, YARN-7494.v0.patch, 
> YARN-7494.v1.patch
>
>
> Instead of a single node, for effectiveness we can consider a multi-node 
> lookup, based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
