What is the reason for putting the output of one mapper task into one file?

Jeff Zhang Wed, 16 Jun 2010 19:54:02 -0700

Hi all,

I checked the source code of the mapper task, and it seems that each
mapper task writes its output as one data file and one index file. Each
reducer task then fetches its part of that output.
I am wondering why the mapper's output is not written to n files instead
(where n is the number of reducers), since the mapper task knows the
Partitioner and the logic would be much simpler. Is there a performance
consideration behind putting the output into one file? Thanks.
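
To make the layout I am describing concrete, here is a minimal sketch in
Python. It is only my reading of the behaviour, not Hadoop's actual code:
the function names, the tab-separated record format, and toy_partition
(a stand-in for the Partitioner) are all hypothetical.

```python
import io


def toy_partition(key, num_reducers):
    # Hypothetical stand-in for a Partitioner: deterministic, based on key bytes.
    return sum(key.encode("utf-8")) % num_reducers


def write_single_file(records, num_reducers, partition=toy_partition):
    """Write every partition into one buffer and return (data, index).

    index[r] holds the (offset, length) of reducer r's segment in data,
    which plays the role of the separate index file.
    """
    # Group records by the reducer that will consume them.
    buckets = [[] for _ in range(num_reducers)]
    for key, value in records:
        buckets[partition(key, num_reducers)].append((key, value))

    data = io.BytesIO()
    index = []
    for bucket in buckets:
        start = data.tell()
        # Each segment is sorted by key before being appended to the data file.
        for key, value in sorted(bucket):
            data.write(f"{key}\t{value}\n".encode("utf-8"))
        index.append((start, data.tell() - start))
    return data.getvalue(), index


def read_segment(data, index, reducer):
    # A reducer fetches only its own byte range, located via the index.
    offset, length = index[reducer]
    return data[offset:offset + length].decode("utf-8")


if __name__ == "__main__":
    records = [("a", 1), ("b", 2), ("c", 3), ("a", 4)]
    data, index = write_single_file(records, num_reducers=2)
    for r in range(2):
        print(r, index[r], repr(read_segment(data, index, r)))
```

The alternative I am asking about would write each bucket to its own
file, so no index would be needed and read_segment would simply read a
whole file.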



-- 
Best Regards

Jeff Zhang
