Thanks! That makes perfect sense.
john

From: Sandy Ryza [mailto:sandy.r...@cloudera.com]
Sent: Monday, September 09, 2013 4:17 AM
To: user@hadoop.apache.org
Subject: Re: Scheduler question

Hi John,

YARN schedulers handle this with the concept of "reservations". Scheduling decisions occur on node heartbeats. When a node that is full heartbeats, the next application that should be able to place a container on it gets to place a "reservation" on it. Each node has space for a single reservation. Containers for other applications will not be placed on the node until the reservation is fulfilled.

If you are using the Fair Scheduler (the Capacity Scheduler works similarly, but I'm not sure of the specifics), this means that app B would get its container well before app A completed, though not right away. After app A gets its 20 containers, it would get reservations on the nodes as well. After one of app A's containers finishes on a node, app A would get to place another container on that node to fulfill its reservation. Then app B would get a reservation on that node. After that, no containers would be placed on that node until app B is able to place one, which would be after both of app A's containers on that node finish.

It's also possible to configure the schedulers to use preemption to make this kind of thing go a lot faster.

Does that make some sense?

-Sandy

On Mon, Sep 9, 2013 at 7:21 AM, John Lilley <john.lil...@redpoint.net> wrote:

Do the Hadoop 2.0 YARN scheduler(s) deal with situations like the following?

Hadoop cluster of 10 nodes, with 8GB each available for containers. There is only one queue. Application A requests 100 4GB containers. It initially, or after a little while, gets 20 containers. Later, application B requests 1 8GB container. Suppose that App-A's containers each take a few minutes. At some point one will complete. When that happens, will the scheduler immediately allocate another 4GB container to App-A? If so, will App-B ever get its container before App-A is almost done?

Thanks
John
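
As a rough illustration of the reservation sequence described in Sandy's reply, here is a toy single-node model in Python. It is not YARN code: the Node class, the heartbeat function, and the order of the pending list are all hypothetical, and a real scheduler picks the next application using fair-share or capacity rules that this sketch ignores. It only shows how a single reservation per node lets the 8GB request eventually win the node.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Request = Tuple[str, int]  # (application name, container size in GB)


@dataclass
class Node:
    """Hypothetical stand-in for one node; not a YARN class."""
    capacity_gb: int
    running: List[Request] = field(default_factory=list)
    reservation: Optional[Request] = None  # at most one reservation per node

    def free_gb(self) -> int:
        return self.capacity_gb - sum(size for _, size in self.running)

    def finish_one(self, app: str) -> None:
        """Remove one running container that belongs to `app`."""
        for i, (owner, _) in enumerate(self.running):
            if owner == app:
                del self.running[i]
                return
        raise ValueError(f"no running container for {app}")


def heartbeat(node: Node, pending: List[Request]) -> str:
    """Handle one node heartbeat and describe what happened."""
    if node.reservation is not None:
        app, size = node.reservation
        if size > node.free_gb():
            # The node is reserved, so free space is held back for `app`.
            return f"blocked: waiting for {app} ({size}GB)"
        node.running.append(node.reservation)
        node.reservation = None
        return f"{app} fulfilled its reservation ({size}GB)"
    if not pending:
        return "idle"
    # Assumption: `pending` is already in the order the scheduler would pick.
    app, size = pending.pop(0)
    if size <= node.free_gb():
        node.running.append((app, size))
        return f"{app} placed a container ({size}GB)"
    node.reservation = (app, size)
    return f"{app} reserved the node ({size}GB)"


def simulate() -> List[str]:
    # One 8GB node running two of app A's 4GB containers; A already holds
    # the node's reservation when app B's 8GB request arrives.
    node = Node(capacity_gb=8, running=[("A", 4), ("A", 4)], reservation=("A", 4))
    pending: List[Request] = [("B", 8), ("A", 4)]
    events = []
    node.finish_one("A")
    events.append(heartbeat(node, pending))  # A uses the freed 4GB
    events.append(heartbeat(node, pending))  # node is full again, B reserves
    node.finish_one("A")
    events.append(heartbeat(node, pending))  # 4GB free, but held for B
    node.finish_one("A")
    events.append(heartbeat(node, pending))  # 8GB free, B gets its container
    return events


if __name__ == "__main__":
    for event in simulate():
        print(event)
```

In this sketch the third heartbeat is the interesting one: 4GB is free and app A still has a pending 4GB request, but nothing is placed because the node is reserved for app B.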