Take a look at Arun's slide deck on Hadoop performance:

http://bit.ly/EDCg3

It is important to make io.sort.mb large enough, and io.sort.factor should be closer to 100 than 10. I'd also use large block sizes to reduce the number of maps. Please see the deck for other important factors.
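
As a rough sketch of what those settings might look like (this is not taken from the deck): the small Python script below prints Hadoop-style property entries and estimates how the block size changes the map count. Only the 100 for io.sort.factor comes from the advice above. The io.sort.mb value, the dfs.block.size property name and value, the 1 TB input, and the helper names are all assumptions for illustration, so check them against your Hadoop version and your task heap size.

```python
import math
from xml.sax.saxutils import escape

# Illustrative values only. 100 for io.sort.factor follows the advice above;
# the io.sort.mb and dfs.block.size figures are assumptions, not recommendations.
SETTINGS = {
    "io.sort.mb": 200,                    # assumed sort buffer size, in MB
    "io.sort.factor": 100,                # number of streams merged at once
    "dfs.block.size": 128 * 1024 * 1024,  # assumed block size, in bytes
}


def to_properties_xml(settings):
    """Render a dict of settings as Hadoop-style <property> elements."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


def estimated_maps(input_bytes, block_bytes):
    """Rough map count, assuming one map per block of a splittable input."""
    return math.ceil(input_bytes / block_bytes)


if __name__ == "__main__":
    print(to_properties_xml(SETTINGS))
    one_tb = 1024 ** 4  # hypothetical input size
    for mb in (64, 128, 256):
        maps = estimated_maps(one_tb, mb * 1024 * 1024)
        print(f"{mb} MB blocks -> about {maps} maps")
```

Under the one-map-per-block assumption, doubling the block size halves the number of maps for the same input, which is the effect the larger block size is meant to achieve.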

-- Owen
