I encountered an interesting situation today. I'm running Hadoop
0.17.1. What happened was that 3 jobs started simultaneously, which
is expected in my workflow, but then resources got very mixed up.
One of the jobs grabbed all the available reducers (5) and got one
map task in before the second job started taking all the map tasks.
This means the first job (the one with the reducers) was holding the
reducers and doing absolutely no work. The other job was mapping, but
was using its resources suboptimally, since it couldn't shuffle at the
same time as it mapped. (The third job was doing nothing at all.)
Does Hadoop not schedule jobs first-come, first-served? I'm pretty
confident that the jobs all had identical priority, since I haven't
set it differently anywhere. If it doesn't schedule jobs in
this manner, is there a reason why it doesn't? It seems like this
problem will decrease total throughput significantly in some situations.
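
For reference, the only place I know of to influence ordering is the job priority setting, which I have left at its default. A minimal sketch of what setting it would look like (assuming the classic mapred property name, `mapred.job.priority`, in a per-job configuration):

```xml
<!-- Hypothetical per-job override; the default is NORMAL.
     Valid values: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW. -->
<property>
  <name>mapred.job.priority</name>
  <value>HIGH</value>
</property>
```

Since none of my jobs set this, they should all be competing at NORMAL priority.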
-Bryan