I think one of the issues is that the Reducer starts very late in the process, slowing the entire job significantly. Is there a way to let the reducer start earlier?
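One knob worth checking: some Hadoop releases expose the point at which reduce tasks are scheduled as a "slow start" threshold. The sketch below assumes your release supports mapred.reduce.slowstart.completed.maps (verify against your version's mapred-default.xml); lowering it schedules reducers, and therefore the shuffle, after fewer maps have finished. Note that only the copy phase starts earlier; the reduce() calls themselves still wait for all map output.

<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.05</value>
  <description>Fraction of the map tasks in the job that must be
  complete before any reduce tasks are scheduled. Lowering this
  (the assumed default is 0.05) lets reducers begin fetching map
  outputs earlier.
  </description>
</property>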
--- On Fri, 12/5/08, Songting Chen <[EMAIL PROTECTED]> wrote:

> From: Songting Chen <[EMAIL PROTECTED]>
> Subject: Re: slow shuffle
> To: core-user@hadoop.apache.org
> Date: Friday, December 5, 2008, 1:27 PM
>
> We have 4 testing data nodes with 3 reduce tasks. The
> parallel.copies parameter has been increased to 20, 30, even 50,
> but it doesn't really help...
>
> --- On Fri, 12/5/08, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>
> > From: Aaron Kimball <[EMAIL PROTECTED]>
> > Subject: Re: slow shuffle
> > To: core-user@hadoop.apache.org
> > Date: Friday, December 5, 2008, 12:28 PM
> >
> > How many reduce tasks do you have? Look into increasing
> > mapred.reduce.parallel.copies from the default of 5 to
> > something more like 20 or 30.
> >
> > - Aaron
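(For reference, Aaron's suggestion would go into mapred-site.xml, or be set per job through JobConf, roughly as below. The value 20 is illustrative only, and as noted above, raising it did not help in this case.)

<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>20</value>
  <description>The number of parallel transfers run by each reduce
  task while fetching map outputs (the assumed default is 5).
  </description>
</property>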
> > On Fri, Dec 5, 2008 at 10:00 PM, Songting Chen <[EMAIL PROTECTED]> wrote:
> >
> > > A little more information:
> > >
> > > We optimized our Map process quite a bit, so the Shuffle has now
> > > become the bottleneck.
> > >
> > > 1. There are 300 map tasks (128 MB block size), each taking about 13 sec.
> > > 2. The Reducer starts running at a very late stage (80% of the maps are done).
> > > 3. Copying the 300 map outputs (the shuffle) takes as long as the entire
> > >    map phase, although each map output is only about 50 KB.
> > >
> > > --- On Fri, 12/5/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
> > >
> > > > From: Alex Loddengaard <[EMAIL PROTECTED]>
> > > > Subject: Re: slow shuffle
> > > > To: core-user@hadoop.apache.org
> > > > Date: Friday, December 5, 2008, 11:43 AM
> > > >
> > > > These configuration options will be useful:
> > > >
> > > > <property>
> > > >   <name>mapred.job.shuffle.merge.percent</name>
> > > >   <value>0.66</value>
> > > >   <description>The usage threshold at which an in-memory merge
> > > >   will be initiated, expressed as a percentage of the total
> > > >   memory allocated to storing in-memory map outputs, as defined
> > > >   by mapred.job.shuffle.input.buffer.percent.
> > > >   </description>
> > > > </property>
> > > >
> > > > <property>
> > > >   <name>mapred.job.shuffle.input.buffer.percent</name>
> > > >   <value>0.70</value>
> > > >   <description>The percentage of memory to be allocated from the
> > > >   maximum heap size to storing map outputs during the shuffle.
> > > >   </description>
> > > > </property>
> > > >
> > > > <property>
> > > >   <name>mapred.job.reduce.input.buffer.percent</name>
> > > >   <value>0.0</value>
> > > >   <description>The percentage of memory, relative to the maximum
> > > >   heap size, to retain map outputs during the reduce. When the
> > > >   shuffle is concluded, any remaining map outputs in memory must
> > > >   consume less than this threshold before the reduce can begin.
> > > >   </description>
> > > > </property>
> > > >
> > > > How long did the shuffle take relative to the rest of the job?
> > > >
> > > > Alex
> > > >
> > > > On Fri, Dec 5, 2008 at 11:17 AM, Songting Chen <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > We encountered a bottleneck during the shuffle phase. However,
> > > > > there is not much data to be shuffled across the network at all,
> > > > > less than 10 MB in total (the combiner aggregated most of the data).
> > > > >
> > > > > Are there any parameters or anything we can tune to improve the
> > > > > shuffle performance?
> > > > >
> > > > > Thanks,
> > > > > -Songting
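Given Songting's note that the combiner shrinks the total shuffle data to under 10 MB, one more knob from Alex's list may be worth a try: raising mapred.job.reduce.input.buffer.percent above 0.0, so the small map outputs stay in the reducer's heap instead of being spilled to local disk before the reduce begins. A sketch, with 0.70 as an example value rather than a tested recommendation:

<property>
  <name>mapred.job.reduce.input.buffer.percent</name>
  <value>0.70</value>
  <description>Keep map outputs that fit within this fraction of the
  reducer heap in memory through the reduce, avoiding a write to and
  read from local disk. The value 0.70 is illustrative only.
  </description>
</property>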