I encountered an interesting situation today. I'm running Hadoop
0.17.1. What happened was that 3 jobs started simultaneously, which
is expected in my workflow, but then resources got very mixed up.
One of the jobs grabbed all the available reducers (5) and got one
map task in before the second job started taking all the map tasks.
This means the first job (the one with the reducers) was holding the
reducers and doing absolutely no work. The other job was mapping, but
was using its resources suboptimally, since it couldn't shuffle at the
same time as it mapped. (The third job was doing nothing at all.)
Does Hadoop not schedule jobs first-come, first-served? I'm pretty
confident that the jobs all had identical priority, since I haven't
set it differently anywhere. If it doesn't schedule jobs in
this manner, is there a reason why it doesn't? It seems like this
problem will decrease total throughput significantly in some situations.
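
For reference, the only place I know of to influence ordering is the job priority setting, which I have left at its default. A minimal sketch of what setting it would look like (assuming the classic mapred property name, `mapred.job.priority`, in a per-job configuration):

```xml
<!-- Hypothetical per-job override; the default is NORMAL.
     Valid values: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW. -->
<property>
  <name>mapred.job.priority</name>
  <value>HIGH</value>
</property>
```

Since none of my jobs set this, they should all be competing at NORMAL priority.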
-Bryan