Re: Ideal number of mappers and reducers to increase performance

2014-08-07 Thread Harsh J
Felix has already explained most of the characteristics that define the parallelism of MR jobs. How many mappers does your program run? Your parallel performance depends on how much parallelism your job actually runs with, aside from what the platform provides as a capability. Perhaps for your

Re: Ideal number of mappers and reducers to increase performance

2014-08-04 Thread Felix Chern
The mapper and reducer numbers really depend on what your program is trying to do. Without your actual query it’s really difficult to tell why you are having this problem. For example, if you tried to perform a global sum or count, Cascalog will only use one reducer since this is the only way
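As a rough illustration of why a global aggregate tends to funnel into a single reducer, here is a plain-Python sketch. It is not Cascalog or Hadoop code, and the function names `map_partial_sum` and `reduce_global_sum` are hypothetical. Each mapper can sum its own input split in parallel, but one final step has to see every partial result to produce the single total.

```python
from typing import Iterable, List


def map_partial_sum(split: Iterable[int]) -> int:
    """Hypothetical map-side step: sum one input split independently."""
    return sum(split)


def reduce_global_sum(partials: Iterable[int]) -> int:
    """Hypothetical reduce-side step: combine all partial sums in one place."""
    return sum(partials)


# Three made-up input splits, each handled by its own "mapper".
splits: List[List[int]] = [[1, 2, 3], [4, 5], [6]]
partials = [map_partial_sum(s) for s in splits]

# Only one "reducer" can produce the global answer, however many mappers ran.
total = reduce_global_sum(partials)
print(partials, total)
```

The map side scales with the number of splits, while the final combine step stays at one task regardless of the requested reducer count.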

Re: Ideal number of mappers and reducers to increase performance

2014-08-04 Thread Sindhu Hosamane
Thanks a lot for your explanation, Felix. My query is not using a global sort/count. But I am still unable to understand: even if I set mapped.reduce.tasks=4, when the Hadoop job runs I still see 14/08/03 15:01:48 INFO mapred.MapTask: numReduceTasks: 1 14/08/03 15:01:48 INFO mapred.MapTask:

Re: Ideal number of mappers and reducers to increase performance

2014-08-01 Thread Sindhu Hosamane
Thanks a ton for your help, Harsh. I am a newbie in Hadoop. If I have set mapred.tasktracker.map.tasks.maximum = 4 and mapred.tasktracker.reduce.tasks.maximum = 4, should I also set the values below, mapred.map.Tasks and mapred.reduce.Tasks? If yes, then what is the ideal value? On Fri, Aug

Re: Ideal number of mappers and reducers to increase performance

2014-08-01 Thread Nitin Pawar
The mapred.tasktracker.* settings control the maximum number of map or reduce tasks a TaskTracker can run. This can vary across machines: if you have multiple nodes, you can decide these values depending on each machine's config. If you set it to 4, it will basically mean that at
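To make the per-TaskTracker limits concrete, here is a minimal Python sketch of the arithmetic. It assumes a hypothetical two-node cluster with different map slot maximums, and the node names, task count, and helper functions are illustrative only. The per-node maximums add up to the cluster-wide number of tasks that can run at once, and a job with more tasks than slots runs in several waves.

```python
import math
from typing import Dict


def cluster_slots(per_node_max: Dict[str, int]) -> int:
    """Total concurrent task slots: the sum of each node's configured maximum."""
    return sum(per_node_max.values())


def waves(num_tasks: int, slots: int) -> int:
    """Number of rounds needed when only `slots` tasks can run at a time."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    return math.ceil(num_tasks / slots)


# Hypothetical cluster: values stand in for mapred.tasktracker.map.tasks.maximum
# as configured on each machine.
map_slots_by_node = {"node-a": 4, "node-b": 8}

total_map_slots = cluster_slots(map_slots_by_node)
map_waves = waves(30, total_map_slots)  # 30 map tasks is a made-up job size
print(total_map_slots, map_waves)
```

The same calculation applies to reduce slots with the reduce-side maximum.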

Ideal number of mappers and reducers to increase performance

2014-07-31 Thread Sindhu Hosamane
Hello friends, I am running my experiment on a server with 2 processors (4 cores each), that is, 2 processors and 8 cores in total. What would be the ideal values for mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to get maximum performance? I am running

Re: Ideal number of mappers and reducers to increase performance

2014-07-31 Thread Harsh J
You can perhaps start with a generic 4+4 configuration (which matches your cores), and tune your way upwards or downwards from there based on your results. On Thu, Jul 31, 2014 at 8:35 PM, Sindhu Hosamane sindh...@gmail.com wrote: Hello friends, If I am running my experiment on a server with
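The 4+4 starting point can be written out as the two TaskTracker properties named in this thread. The Python sketch below only renders the property XML. The helper name `tasktracker_slot_config` is hypothetical, and the assumption is that the output would be merged into the TaskTracker's mapred-site.xml, so check that against your Hadoop version.

```python
import xml.etree.ElementTree as ET


def tasktracker_slot_config(map_slots: int, reduce_slots: int) -> str:
    """Render the two slot-limit properties as a Hadoop-style <configuration> block."""
    configuration = ET.Element("configuration")
    for name, value in (
        ("mapred.tasktracker.map.tasks.maximum", map_slots),
        ("mapred.tasktracker.reduce.tasks.maximum", reduce_slots),
    ):
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(configuration, encoding="unicode")


# 4 map slots + 4 reduce slots for the 8-core server discussed above.
xml_text = tasktracker_slot_config(4, 4)
print(xml_text)
```

From there the two numbers can be raised or lowered independently and the job re-run to compare results.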