Does the simulator need to emulate the job/task execution?

2014-08-06 4:06 GMT+08:00 Karthik Kambatla (JIRA) <[email protected]>:

>
>      [
> https://issues.apache.org/jira/browse/YARN-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
>
> Karthik Kambatla resolved YARN-2381.
> ------------------------------------
>
>     Resolution: Invalid
>
> I am not sure if this clone was intentional - resolving the JIRA as
> "Invalid".
>
> > CLONE - Yarn Scheduler Load Simulator
> > -------------------------------------
> >
> >                 Key: YARN-2381
> >                 URL: https://issues.apache.org/jira/browse/YARN-2381
> >             Project: Hadoop YARN
> >          Issue Type: New Feature
> >          Components: scheduler
> >            Reporter: JiankunLiu
> >             Fix For: 2.3.0
> >
> >
> > The Yarn Scheduler is a fertile area of interest with different
> > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile,
> > several optimizations are also made to improve scheduler performance for
> > different scenarios and workloads. Each scheduler algorithm has its own
> > set of features, and drives scheduling decisions by many factors, such as
> > fairness, capacity guarantee, resource availability, etc. It is very
> > important to evaluate a scheduler algorithm thoroughly before we deploy
> > it in a production cluster. Unfortunately, it is currently non-trivial to
> > evaluate a scheduling algorithm. Evaluating in a real cluster is always
> > time-consuming and costly, and it is also very hard to find a
> > large-enough cluster. Hence, a simulator that can predict how well a
> > scheduler algorithm performs for a specific workload would be quite
> > useful.
> >
> > We want to build a Scheduler Load Simulator to simulate large-scale Yarn
> > clusters and application loads on a single machine. This would be
> > invaluable in furthering Yarn by providing a tool for researchers and
> > developers to prototype new scheduler features and predict their behavior
> > and performance with a reasonable amount of confidence, thereby aiding
> > rapid innovation.
> >
> > The simulator will exercise the real Yarn ResourceManager, removing the
> > network factor by simulating NodeManagers and ApplicationMasters via
> > handling and dispatching NM/AM heartbeat events from within the same JVM.
> >
> > To keep track of scheduler behavior and performance, a scheduler
> > wrapper will wrap the real scheduler.
> >
> > The simulator will produce real-time metrics while executing, including:
> >
> > * Resource usage for the whole cluster and each queue, which can be
> >   used to configure cluster and queue capacity.
> >
> > * The detailed application execution trace (recorded in relation to
> >   simulated time), which can be analyzed to understand/validate the
> >   scheduler behavior (individual job turnaround time, throughput,
> >   fairness, capacity guarantee, etc).
> >
> > * Several key metrics of the scheduler algorithm, such as the time cost
> >   of each scheduler operation (allocate, handle, etc), which can be used
> >   by Hadoop developers to find hot spots in the code and scalability
> >   limits.
> >
> > The simulator will provide real-time charts showing the behavior of the
> > scheduler and its performance.
> >
> > A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE,
> > showing how to use the simulator to simulate the Fair Scheduler and
> > Capacity Scheduler.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
>

--
*Regards,*
*Zhaojie*
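
For readers following the thread, the design in the quoted description (NM and AM heartbeats dispatched inside one process in simulated-time order, plus a wrapper that times each scheduler operation) can be illustrated with a rough sketch. This is not the YARN or simulator code: it is written in Python for brevity, and `FifoScheduler`, `SchedulerWrapper`, `simulate`, and all parameters are hypothetical names made up for the illustration. It also assumes containers are never released, so it shows only the dispatch and timing idea.

```python
import heapq
import time
from collections import defaultdict


class FifoScheduler:
    """Toy stand-in for a real scheduler (hypothetical, not the YARN API)."""

    def __init__(self):
        self.pending = []  # (app_id, memory_mb) requests, oldest first

    def allocate(self, app_id, memory_mb):
        # An application asks for one container of the given size.
        self.pending.append((app_id, memory_mb))

    def handle(self, node_id, free_mb):
        # Node heartbeat: grant queued requests, in order, while they fit.
        granted = []
        while self.pending and self.pending[0][1] <= free_mb:
            app_id, memory_mb = self.pending.pop(0)
            free_mb -= memory_mb
            granted.append((app_id, node_id, memory_mb))
        return granted


class SchedulerWrapper:
    """Wraps a scheduler and records the wall-clock cost of each call."""

    def __init__(self, scheduler):
        self._scheduler = scheduler
        self.costs = defaultdict(list)  # operation name -> seconds per call

    def _timed(self, name, *args):
        start = time.perf_counter()
        try:
            return getattr(self._scheduler, name)(*args)
        finally:
            self.costs[name].append(time.perf_counter() - start)

    def allocate(self, app_id, memory_mb):
        return self._timed("allocate", app_id, memory_mb)

    def handle(self, node_id, free_mb):
        return self._timed("handle", node_id, free_mb)


def simulate(wrapper, nodes, requests, heartbeat_ms=1000, end_ms=3000):
    """Dispatch simulated AM requests and NM heartbeats in time order."""
    events = []  # (time_ms, sequence, kind, payload)
    seq = 0
    for t, app_id, memory_mb in requests:
        events.append((t, seq, "am", (app_id, memory_mb)))
        seq += 1
    for node_id in nodes:
        for t in range(heartbeat_ms, end_ms + 1, heartbeat_ms):
            events.append((t, seq, "nm", node_id))
            seq += 1
    heapq.heapify(events)

    free = dict(nodes)  # node_id -> free memory in MB
    trace = []          # (simulated time, app, node, memory) per grant
    while events:
        t, _, kind, payload = heapq.heappop(events)
        if kind == "am":
            wrapper.allocate(*payload)
        else:
            for app_id, node_id, memory_mb in wrapper.handle(payload, free[payload]):
                free[node_id] -= memory_mb
                trace.append((t, app_id, node_id, memory_mb))
    return trace


wrapper = SchedulerWrapper(FifoScheduler())
trace = simulate(
    wrapper,
    nodes={"n1": 2048, "n2": 1024},
    requests=[(0, "app1", 1024), (0, "app2", 1024), (500, "app3", 1024)],
)
for entry in trace:
    print(entry)
print({op: len(samples) for op, samples in wrapper.costs.items()})
```

The trace is keyed by simulated time rather than wall-clock time, while the wrapper measures real elapsed time per call; that split mirrors the two kinds of metrics listed in the description.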
