Hi people,
I have a cluster where around 40% of the nodes are low on disk space. The map
output is too big for mapred.local.dir on the nodes that are low on disk
space, and Hadoop often tries to flush the output to one of these nodes, fails,
and retries on another node until it finally flushes the data to one of the
nodes with plenty of disk space.

So my question is whether it's possible for Hadoop to select, or for us to
tell Hadoop about, the nodes that have more disk space, so that it doesn't
waste time on nodes with low disk space.

Many thanks

H


      
