Hi,

I'm curious about the interaction between the DRF (Dominant Resource
Fairness) algorithm and preemption in Hadoop. Let's say that a job enters
Hadoop by itself, so it can get all of the CPU in the whole cluster. Then
another job comes in and requests some CPU resources.

DRF then kicks in, and the first job potentially needs to be deallocated
from the machines it is running on, correct? However, if the first job is
close to finishing, or if it is operating on a huge dataset, then it might
actually be beneficial to wait for the first job to finish instead of
kicking it out.

How does the DRF algorithm implemented in Hadoop 2.0.x handle the above
situation? Hopefully my explanation is clear.

Thanks in advance,

-- 
~Hilfi Alkaff~
