Hi everyone. I'm pretty much a Hadoop newbie and want to make sure I
understand things correctly.

I set up my Hadoop cluster with the fair scheduler and 3 queues, each with
the same weight. The goal is for 3 users to get the same share of the
cluster.

Preemption is enabled, but I'm seeing some non-intuitive behavior. If one
user submits a job A that takes up the entire cluster, and then another
user submits job B, 33% of job A's containers are preempted and their
capacity transferred to job B. I had expected it to be 50% - since the two
*active* queues have the same weight.
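
To make the arithmetic concrete, here is a small Python sketch (plain arithmetic, not Hadoop code; the function and queue names are made up for illustration). It assumes the 33% I see comes from splitting the cluster over all 3 configured queues, while my expected 50% comes from splitting it over only the 2 queues that have running jobs. I have not confirmed that this is what the scheduler does internally.

```python
from fractions import Fraction

def weighted_shares(weights, counted):
    """Split the cluster among the queues in `counted`, in proportion to weight."""
    total = sum(weights[q] for q in counted)
    return {q: Fraction(weights[q], total) for q in counted}

# Hypothetical queue names; all three queues have equal weight.
weights = {"queue1": 1, "queue2": 1, "queue3": 1}

# Share for job B's queue if every configured queue is counted
# (this matches the 33% I observe).
over_all = weighted_shares(weights, ["queue1", "queue2", "queue3"])["queue2"]

# Share for job B's queue if only the two queues with running jobs are
# counted (this is the 50% I expected).
over_active = weighted_shares(weights, ["queue1", "queue2"])["queue2"]

print(over_all, over_active)  # 1/3 1/2
```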

What seems problematic here is that if the two users submitted their jobs
at the same time, they would receive 50% each, right? It seems very strange
that the *stable* scheduling situation of long-running jobs would be
influenced by a race condition such as the exact submission time. Or in
other words, that the scheduling policy for allocating new/empty containers
is different from the scheduling policy for preempting already-running ones.

I do understand that this is how the fair scheduler works. I was wondering
if I'm missing something, or whether some other setup could provide my
expected behavior (perhaps with the capacity scheduler?).

Any input here would be greatly appreciated!

Shay
