Hadoop job scheduling issue

2008-09-24 Thread Bryan Duxbury
I encountered an interesting situation today while running Hadoop
0.17.1. Three jobs started simultaneously, which is expected in my
workflow, but then resource allocation got very mixed up.


One of the jobs grabbed all five available reducers and got one
map task in before the second job started taking all the map tasks.
As a result, the first job (the one with the reducers) was holding the
reducers and doing absolutely no work. The second job was mapping, but
it was using resources suboptimally since it wasn't shuffling at the
same time as it mapped. (The third job was doing nothing at all.)


Does Hadoop not schedule jobs first-come, first-served? I'm pretty
confident that the jobs all had identical priority, since I haven't
set it to anything different anywhere. If it doesn't schedule jobs in
this manner, is there a reason why not? It seems like this problem
will decrease total throughput significantly in some situations.


-Bryan


Re: Hadoop job scheduling issue

2008-09-24 Thread omalley
On 9/24/08, Bryan Duxbury [EMAIL PROTECTED] wrote:

> Does Hadoop not schedule jobs first-come, first-served?

Yes, Hadoop 0.17 schedules jobs FIFO. If it isn't doing so, that is a bug.
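
To make the expected behavior concrete, here is a toy model of strict FIFO slot assignment among jobs of equal priority. It is only a sketch for illustration, not Hadoop's actual JobTracker code; the function name `fifo_assign`, the job names, and the task and slot counts are all made up.

```python
def fifo_assign(jobs, free_slots):
    """Hand out free task slots in strict submission order.

    jobs: list of (job_name, pending_task_count), oldest submission first.
    free_slots: number of task slots currently free on the cluster.
    Returns a list of (job_name, slots_granted) for jobs that got slots.
    """
    grants = []
    for name, pending in jobs:
        if free_slots == 0:
            break  # nothing left for later jobs
        granted = min(pending, free_slots)
        if granted:
            grants.append((name, granted))
            free_slots -= granted
    return grants


# Three jobs submitted in order with equal priority (hypothetical numbers).
jobs = [("job1", 8), ("job2", 20), ("job3", 4)]
print(fifo_assign(jobs, free_slots=10))  # [('job1', 8), ('job2', 2)]
```

Under this model the oldest job's pending tasks are always served first, and a later job only receives whatever slots are left over. The situation described above, where the first job holds the reducers while the second job takes the map slots, would not occur.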

-- Owen