How many reduce tasks do you have? Look into increasing
mapred.reduce.parallel.copies from the default of 5 to something more like
20 or 30.
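For example, a minimal override along these lines should do it (a sketch
only - put it in hadoop-site.xml or wherever your job configuration is set,
and treat the value 20 as a starting point to tune from):

<!-- Sketch: raise the number of parallel copier threads per reduce task -->
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>20</value>
</property>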

- Aaron

On Fri, Dec 5, 2008 at 10:00 PM, Songting Chen <[EMAIL PROTECTED]> wrote:

> A little more information:
>
> We optimized our Map process quite a bit, so the Shuffle has now become the
> bottleneck.
>
> 1. There are 300 map tasks (128MB block size), each taking about 13 sec.
> 2. The Reducers start running at a very late stage (when 80% of the maps are
> done).
> 3. Copying the 300 map outputs (the shuffle) takes as long as the entire map
> phase, even though each map output is only about 50 KBytes.
>
> --- On Fri, 12/5/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
>
> > From: Alex Loddengaard <[EMAIL PROTECTED]>
> > Subject: Re: slow shuffle
> > To: core-user@hadoop.apache.org
> > Date: Friday, December 5, 2008, 11:43 AM
> > These configuration options will be useful:
> >
> > > <property>
> > >   <name>mapred.job.shuffle.merge.percent</name>
> > >   <value>0.66</value>
> > >   <description>The usage threshold at which an in-memory merge will be
> > >   initiated, expressed as a percentage of the total memory allocated to
> > >   storing in-memory map outputs, as defined by
> > >   mapred.job.shuffle.input.buffer.percent.
> > >   </description>
> > > </property>
> > >
> > > <property>
> > >   <name>mapred.job.shuffle.input.buffer.percent</name>
> > >   <value>0.70</value>
> > >   <description>The percentage of memory to be allocated from the maximum
> > >   heap size to storing map outputs during the shuffle.
> > >   </description>
> > > </property>
> > >
> > > <property>
> > >   <name>mapred.job.reduce.input.buffer.percent</name>
> > >   <value>0.0</value>
> > >   <description>The percentage of memory - relative to the maximum heap
> > >   size - to retain map outputs during the reduce. When the shuffle is
> > >   concluded, any remaining map outputs in memory must consume less than
> > >   this threshold before the reduce can begin.
> > >   </description>
> > > </property>
> > >
> >
> > How long did the shuffle take relative to the rest of the
> > job?
> >
> > Alex
> >
> > On Fri, Dec 5, 2008 at 11:17 AM, Songting Chen
> > <[EMAIL PROTECTED]> wrote:
> >
> > > We encountered a bottleneck during the shuffle phase. However, there is
> > > not much data to be shuffled across the network at all - less than
> > > 10 MBytes in total (the combiner aggregated most of the data).
> > >
> > > Are there any parameters or anything we can tune to improve the shuffle
> > > performance?
> > >
> > > Thanks,
> > > -Songting
> > >
>
