[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16599324#comment-16599324
 ] 

Mike commented on YARN-5139:
----------------------------

Hi [~leftnoteasy],

I am trying to reproduce the simulation results reported in 
{{YARN-5139-Concurrent-scheduling-performance-report.pdf}} on YARN 3.1.1.  
Based on the information provided, I set up my own environment and ran the 
test, but my results do not match those in the report.  My environment is 
described below; please let me know if there is something else I should try.

 

Thanks.

 

OS - CentOS 6.6, 40 cores, 256 GB RAM, Java 1.8.0_101

YARN 3.1.1

Using the CapacityScheduler with a single queue, maximum-applications=100000, 
maximum-am-resource-percent=0.1, DefaultResourceCalculator, 
node-locality-delay=-1, rack-locality-additional-delay=-1, 
schedule-asynchronously.enable=true, schedule-asynchronously.maximum-threads=4

 

SLS - yarn.sls.runner.pool.size=4000, yarn.sls.nm.memory.mb=131072, 
yarn.sls.nm.vcores=128, nm/am heartbeat 1000ms, metrics=true

I had a more complex workload that more closely matched what was given, but for 
simplicity, I'll add an average one here (I ran this too, and got similar 
results to my more complex workload):

{
  "num.nodes": 20000,
  "num.racks": 1000
}
{
  "job.start.ms": 0,
  "job.queue.name": "my_queue",
  "job.count": 47000,
  "job.tasks": [
  {
    "count": 400,
    "container.duration.ms": 120000,
    "container.type": "map"
  } ]
}
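
As a rough sanity check on the size of this workload, the following Python 
sketch totals the containers implied by the definitions above. It assumes that 
"job.count" repeats the job definition that many times and that nodes are 
divided evenly across racks; the function name workload_summary is 
illustrative only and is not part of SLS.

```python
import json

# Values copied from the topology and job definitions above.
topology = {"num.nodes": 20000, "num.racks": 1000}
job = {
    "job.start.ms": 0,
    "job.queue.name": "my_queue",
    "job.count": 47000,
    "job.tasks": [
        {"count": 400, "container.duration.ms": 120000, "container.type": "map"}
    ],
}


def workload_summary(topology, job):
    """Summarize the workload size (hypothetical helper, not an SLS API)."""
    # Containers requested by a single instance of the job.
    containers_per_job = sum(task["count"] for task in job["job.tasks"])
    # Assumption: "job.count" repeats the whole job definition.
    total_containers = containers_per_job * job["job.count"]
    # Total container run time across all jobs, in milliseconds.
    container_ms = job["job.count"] * sum(
        task["count"] * task["container.duration.ms"] for task in job["job.tasks"]
    )
    return {
        # Assumption: nodes are spread evenly over the racks.
        "nodes_per_rack": topology["num.nodes"] // topology["num.racks"],
        "containers_per_job": containers_per_job,
        "total_containers": total_containers,
        "total_container_hours": container_ms / 3_600_000,
    }


summary = workload_summary(topology, job)
print(json.dumps(summary, indent=2))
```

Under those assumptions the run asks for 18,800,000 containers in total on a 
cluster with 20 nodes per rack, which gives a sense of how much scheduling 
work the simulation has to get through.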

 

 

> [Umbrella] Move YARN scheduler towards global scheduler
> -------------------------------------------------------
>
>                 Key: YARN-5139
>                 URL: https://issues.apache.org/jira/browse/YARN-5139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Major
>         Attachments: Explanantions of Global Scheduling (YARN-5139) 
> Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf, 
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf, 
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf, 
> YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch, 
> wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to 
> sub-optimal decisions because the scheduler can only look at one node at a 
> time when scheduling resources.
> Pseudo code of existing scheduling logic looks like:
> {code}
> for node in allNodes:
>    Go to parentQueue
>       Go to leafQueue
>         for application in leafQueue.applications:
>            for resource-request in application.resource-requests
>               try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node 
> constraints (give me "a && b || c") or anti-affinity (do not allocate HBase 
> regionservers and Storm workers on the same host), we may need to consider 
> moving YARN scheduler towards global scheduling.



