I ran into an interesting situation today. I'm running Hadoop 0.17.1. Three jobs started simultaneously, which is expected in my workflow, but the cluster's task slots then got allocated very badly.

One of the jobs grabbed all five available reducers and got exactly one map task in before the second job started taking all the map slots. So the first job (the one holding the reducers) was tying up every reduce slot while doing absolutely no work. The second job was mapping, but it was using resources sub-optimally because it couldn't shuffle while it mapped. (The third job was doing nothing at all.)

Does Hadoop not schedule jobs first-come, first-served? I'm fairly confident the jobs all had identical priority, since I've never set a priority anywhere (see the sketch below). If it doesn't schedule jobs this way, is there a reason why not? It seems like this behavior could significantly reduce total throughput in some situations.
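For reference, if I'm reading the 0.17 JobConf API correctly, explicitly setting a priority would look roughly like the sketch below (PriorityCheck is just a throwaway name I made up). My point is that I never make any call like this anywhere, so all three jobs should be at the default:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobPriority;

    public class PriorityCheck {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // The default priority is NORMAL; I never call this in my jobs.
            conf.setJobPriority(JobPriority.NORMAL);
            // Equivalent property form, if I understand it right:
            // conf.set("mapred.job.priority", "NORMAL");
            System.out.println(conf.get("mapred.job.priority"));
        }
    }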

-Bryan
