Re: Lazy initialization of Reducers

Arun C Murthy Wed, 21 Jul 2010 15:24:56 -0700

Moving to mapreduce-user@, bcc gene...@. Please do not use the general@ list for project-specific discussions.


On Jul 21, 2010, at 10:15 AM, Syed Wasti wrote:

It says “:In M/R job Reducers are initialized with Mappers at the job initialization, but the reduce method is called in reduce phase when all the maps had been finished. So in large jobs where Reducer loads data (>100 MB for business logic) in-memory on initialization, the performance can be increased by lazily initializing Reducers i.e. loading data in reduce method controlled by an initialize flag variable which assures that it is loaded only once. By lazily initializing Reducers which require memory (for business logic) on initialization, number of maps can be increased.”

The part about 'loading data in reduce method controlled by an initialize flag variable which assures that it is loaded only once' makes no sense to me.

However, you can 'slowstart' reduces by ensuring sufficient maps are complete before _any_ reduces are launched... from mapred-default.xml:


<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.05</value>
  <description>Fraction of the number of maps in the job which should be
  complete before reduces are scheduled for the job.
  </description>
</property>
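
As a rough sketch of how that default could be overridden (the 0.80 here is an arbitrary value picked for illustration, not a recommendation), a job's own configuration or mapred-site.xml might carry something like:

```xml
<!-- Illustrative override; 0.80 is an assumed example value, tune per job. -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.80</value>
</property>
```

Going by the description above, reduces for such a job should not be scheduled until roughly 80% of its maps have completed.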

Arun
