Re: Reduce doesn't start until map finishes

Chris Douglas Tue, 03 Mar 2009 15:10:18 -0800

This is normal behavior. The Reducer is guaranteed to receive all the
results for its partition in sorted order. No reduce can start until
all the maps are completed, since any running map could emit a result
that would violate the order for the results it currently has.

-C
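
To illustrate the ordering argument, here is a minimal sketch in plain Python. It is not Hadoop code: the function names and the sample records are made up for illustration, and heapq.merge only stands in for the merge of sorted map outputs. It compares a reduce that starts before the last map has finished with one that waits for every map.

```python
import heapq


def reduce_input(map_outputs):
    """Merge per-map, key-sorted outputs into one key-sorted stream."""
    return list(heapq.merge(*map_outputs))


def is_sorted(records):
    """True if the records are in non-decreasing key order."""
    return all(a[0] <= b[0] for a, b in zip(records, records[1:]))


# Hypothetical sorted (key, value) outputs of two finished maps
# for a single partition.
finished = [[("b", 1), ("d", 1)], [("c", 1), ("e", 1)]]

# Hypothetical output of a map that is still running.
late = [("a", 1), ("d", 1)]

# Starting early: the reduce consumes what is available, and the
# late map's records arrive afterwards.
early_start = reduce_input(finished) + late

# Waiting: every map output is merged before the reduce sees a record.
waited = reduce_input(finished + [late])

print(is_sorted(early_start))
print(is_sorted(waited))
```

In the early-start case the key "a" arrives after "e" has already been consumed, so the sorted-order guarantee is broken. Merging only after all maps are done avoids that.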


On Mar 1, 2009, at 9:24 AM, Rasit OZDAS wrote:

Hi!

Whatever code I run on Hadoop, reduce starts a few seconds after map
finishes.
And worse, when I run 10 jobs in parallel (using threads and sending one
after another), all maps finish sequentially, then after 8-10 seconds
the reduces start.
I also use the reducer as the combiner. My cluster has 6 machines; the
namenode and jobtracker also run as slaves.
There were 44 maps and 6 reduces in the last example. I have never tried
a bigger job.
What can the problem be? I've read somewhere that this is not the normal
behaviour.
Replication factor is 3.
Thank you in advance for any pointers.

Rasit
