[ https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422002#comment-15422002 ]
Xianyin Xin commented on YARN-5479: ----------------------------------- [~He Tianyi], hope YARN-4090 can provide some information, in which the locked resourceusage was snapshoted and such the performance was improved greatly. > FairScheduler: Scheduling performance improvement > ------------------------------------------------- > > Key: YARN-5479 > URL: https://issues.apache.org/jira/browse/YARN-5479 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, resourcemanager > Affects Versions: 2.6.0 > Reporter: He Tianyi > Assignee: He Tianyi > > Currently ResourceManager uses a single thread to handle async events for > scheduling. As number of nodes grows, more events need to be processed in > time in FairScheduler. Also, increased number of applications & queues slows > down processing of each single event. > There are two cases that slow processing of nodeUpdate events is problematic: > A. global throughput is lower than number of nodes through heartbeat rounds. > This keeps resource from being allocated since the inefficiency. > B. global throughput meets the need, but for some of these rounds, events of > some nodes cannot get processed before next heartbeat. This brings > inefficiency handling burst requests (i.e. newly submitted MapReduce > application cannot get its all task launched soon given enough resource). > Pretty sure some people will encounter the problem eventually after a single > cluster is scaled to several K of nodes (even with {{assignmultiple}} > enabled). > This issue proposes to perform several optimization towards performance in > FairScheduler {{nodeUpdate}} method. To be specific: > A. trading off fairness with efficiency, queue & app sorting can be skipped > (or should this be called 'delayed sorting'?). we can either start another > dedicated thread to do the sorting & updating, or actually perform sorting > after current result have been used several times (say sort once in every 100 > calls.) > B. performing calculation on {{Resource}} instances is expensive, since at > least 2 objects ({{ResourceImpl}} and its proto builder) is created each time > (using 'immutable' apis). the overhead can be eliminated with a > light-weighted implementation of Resource, which do not instantiate a builder > until necessary, because most instances are used as intermediate result in > scheduler instead of being exchanged via IPC. Also, {{createResource}} is > using reflection, which can be replaced by a plain {{new}} (for scheduler > usage only). furthermore, perhaps we could 'intern' resource to avoid > allocation. > C. other minor changes: such as move {{updateRootMetrics}} call to > {{update}}, making root queue metrics eventual consistent (which may > satisfies most of the needs). or introduce counters to {{getResourceUsage}} > and make changing of resource incrementally instead of recalculate each time. > With A and B, I was looking at 4 times improvement in a cluster with 2K nodes. > Suggestions? Opinions? -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org