Re: Hadoop MySQL database access
Check the `mapreduce.job.reduce.slowstart.completedmaps` parameter. The reducers cannot start processing the data from the mappers until all the map tasks are complete, but they can start fetching data from the nodes on which map tasks have already completed.

Praveen

On Thu, Dec 29, 2011 at 12:44 AM, Prashant Kommireddi prash1...@gmail.com wrote:

> By design, reduce starts only after all the maps finish. There is no way for the reduce to begin grouping/merging by key unless all the maps have finished.
>
> Sent from my iPhone
>
> On Dec 28, 2011, at 8:53 AM, JAGANADH G jagana...@gmail.com wrote:
>
>> Hi All,
>>
>> I wrote a MapReduce program to fetch data from MySQL and process the data (word count). The program executes successfully, but I noticed that the reduce task starts only after the map task finishes. Is there any way to run the map and reduce in parallel? The program fetches data from MySQL and writes the processed output to HDFS. I am using Hadoop in pseudo-distributed mode.
>>
>> --
>> **
>> JAGANADH G
>> http://jaganadhg.in
>> *ILUGCBE*
>> http://ilugcbe.org.in
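To make the effect of `mapreduce.job.reduce.slowstart.completedmaps` a bit more concrete: the value is a fraction between 0 and 1 of the map tasks that must complete before reducers are scheduled and begin fetching map output. The following is only a rough sketch in plain Java, not Hadoop code. The class and method names are hypothetical, and rounding the fraction up to a whole number of tasks is an assumption about how the framework applies it.

```java
public class SlowstartSketch {

    // Hypothetical helper: how many map tasks must be complete before
    // reducers may be scheduled, for a given slowstart fraction.
    // Assumption: the fraction is rounded up to a whole number of tasks.
    static int mapsNeededBeforeReducers(int totalMaps, double slowstart) {
        if (slowstart < 0.0 || slowstart > 1.0) {
            throw new IllegalArgumentException("slowstart must be in [0, 1]");
        }
        return (int) Math.ceil(slowstart * totalMaps);
    }

    public static void main(String[] args) {
        int totalMaps = 20;
        double[] fractions = {0.05, 0.5, 1.0};
        for (double f : fractions) {
            // With 1.0, reducers are not even scheduled until every map is done.
            System.out.println("slowstart=" + f + " -> reducers may start after "
                    + mapsNeededBeforeReducers(totalMaps, f) + " of "
                    + totalMaps + " maps");
        }
    }
}
```

A lower value lets reducers start copying map output earlier, but the reduce function itself still runs only once all maps have finished. In a real job the parameter would be set in the job configuration rather than computed like this.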
Re: Hadoop MySQL database access
@Praveen Thanks. I got it.

--
**
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
Hadoop MySQL database access
Hi All,

I wrote a MapReduce program to fetch data from MySQL and process the data (word count). The program executes successfully, but I noticed that the reduce task starts only after the map task finishes. Is there any way to run the map and reduce in parallel? The program fetches data from MySQL and writes the processed output to HDFS. I am using Hadoop in pseudo-distributed mode.

--
**
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
Re: Hadoop MySQL database access
By design, reduce starts only after all the maps finish. There is no way for the reduce to begin grouping/merging by key unless all the maps have finished.

Sent from my iPhone

On Dec 28, 2011, at 8:53 AM, JAGANADH G jagana...@gmail.com wrote:

> Hi All,
>
> I wrote a MapReduce program to fetch data from MySQL and process the data (word count). The program executes successfully, but I noticed that the reduce task starts only after the map task finishes. Is there any way to run the map and reduce in parallel? The program fetches data from MySQL and writes the processed output to HDFS. I am using Hadoop in pseudo-distributed mode.
>
> --
> **
> JAGANADH G
> http://jaganadhg.in
> *ILUGCBE*
> http://ilugcbe.org.in
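The point about grouping by key can be illustrated with a toy word count. This is a minimal sketch in plain Java with no Hadoop involved; the class name, the `reduce` helper, and the sample words are all made up for illustration. It shows that reducing over the output of only some map tasks gives an incomplete count for any key that a later map task also emits.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartialReduceSketch {

    // Hypothetical in-memory stand-in for shuffle + reduce: sums the
    // occurrences of each word across the outputs of the given map tasks.
    static Map<String, Integer> reduce(List<List<String>> mapOutputs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> output : mapOutputs) {
            for (String word : output) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up output of two map tasks.
        List<String> map1 = Arrays.asList("hadoop", "mysql");
        List<String> map2 = Arrays.asList("hadoop", "hdfs");

        // Reducing after only the first map task undercounts "hadoop".
        System.out.println("after map1 only: " + reduce(Arrays.asList(map1)));

        // The count is final only once every map output has been seen.
        System.out.println("after all maps:  " + reduce(Arrays.asList(map1, map2)));
    }
}
```

Since any remaining map task may still emit a value for a given key, the reducer cannot produce a final result for that key until every map task has finished.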