Re: Hadoop MySQL database access

2011-12-29 Thread Praveen Sripati
Check the `mapreduce.job.reduce.slowstart.completedmaps` parameter. The
reducers cannot start processing the data from the mappers until all the
map tasks are complete, but the reducers can start fetching the data
from the nodes on which the map tasks have completed.
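
As a rough illustration of the threshold this parameter controls, here is a minimal sketch in plain Java. It assumes the value is a fraction between 0 and 1 of the job's map tasks that must complete before reducers are scheduled, and that the product is rounded up. The class and method names are hypothetical, and this is not Hadoop's own code. On 0.20.x/1.x releases the same setting is named `mapred.reduce.slowstart.completed.maps`.

```java
public class SlowstartSketch {
    // Hypothetical helper: number of map tasks that must finish before
    // reducers may be scheduled, assuming the fraction is rounded up.
    static int mapsBeforeReducersStart(int totalMaps, float slowstartFraction) {
        if (totalMaps < 0 || slowstartFraction < 0.0f || slowstartFraction > 1.0f) {
            throw new IllegalArgumentException(
                "expected totalMaps >= 0 and a fraction in [0, 1]");
        }
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    public static void main(String[] args) {
        int totalMaps = 10; // illustrative job size, not taken from the thread
        for (float fraction : new float[] {0.0f, 0.5f, 1.0f}) {
            System.out.println("fraction=" + fraction
                + " -> reducers may start after "
                + mapsBeforeReducersStart(totalMaps, fraction)
                + " of " + totalMaps + " maps");
        }
    }
}
```

With a fraction of 1.0 no reducer is scheduled until every map is done. Lower values let the fetch overlap with the map phase, but the reduce function itself still waits for all maps.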

Praveen

On Thu, Dec 29, 2011 at 12:44 AM, Prashant Kommireddi
prash1...@gmail.com wrote:

 By design reduce would start only after all the maps finish. There is
 no way for the reduce to begin grouping/merging by key unless all the
 maps have finished.




Re: Hadoop MySQL database access

2011-12-29 Thread JAGANADH G
@Praveen
Thanks. I got it.


-- 
**
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in


Hadoop MySQL database access

2011-12-28 Thread JAGANADH G
Hi All,

I wrote a MapReduce program to fetch data from MySQL and process the
data (word count).
The program executes successfully, but I noticed that the reduce task
starts only after the map task finishes.
Is there any way to run the map and reduce in parallel?

The program fetches data from MySQL and writes the processed output to
HDFS.
I am using Hadoop in pseudo-distributed mode.
-- 
**
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in


Re: Hadoop MySQL database access

2011-12-28 Thread Prashant Kommireddi
By design reduce would start only after all the maps finish. There is
no way for the reduce to begin grouping/merging by key unless all the
maps have finished.
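
To see why the grouping step needs every map's output, here is a minimal plain-Java sketch of a word-count reduce over the output of two hypothetical map tasks. The class name, method names, and sample data are illustrative assumptions, not Hadoop APIs. The point is only that the total for a key is not final until the last map's output has been merged.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupByKeySketch {
    // Hypothetical helper that builds one (word, count) pair.
    static Map.Entry<String, Integer> pair(String word, int count) {
        return new AbstractMap.SimpleEntry<>(word, count);
    }

    // Merge (word, count) pairs from every map task and sum the counts per word.
    static Map<String, Integer> reduceAll(
            List<List<Map.Entry<String, Integer>>> mapOutputs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> output : mapOutputs) {
            for (Map.Entry<String, Integer> p : output) {
                totals.merge(p.getKey(), p.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> map1 =
            Arrays.asList(pair("hadoop", 1), pair("mysql", 1));
        List<Map.Entry<String, Integer>> map2 =
            Arrays.asList(pair("hadoop", 1));

        // Reducing after only the first map gives an incomplete count.
        System.out.println(reduceAll(Collections.singletonList(map1)));
        // prints {hadoop=1, mysql=1}

        // Only once both maps are merged is the count for "hadoop" final.
        System.out.println(reduceAll(Arrays.asList(map1, map2)));
        // prints {hadoop=2, mysql=1}
    }
}
```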
