Hi,

You should definitely change mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum. If your tasks are CPU-bound, run
a number of tasks equal to the number of CPU cores; otherwise you can run
more tasks than cores. You can check CPU and memory usage by running the
"top" command on the datanodes.

You should also pay attention to the following configuration parameters to
get the best performance (a sample mapred-site.xml pulling them together
is sketched after the list):

- mapred.compress.map.output: faster data transfer from mappers to
  reducers, saves disk space, and faster disk writes, at the cost of extra
  CPU time for compression and decompression.

- io.sort.mb: if you have idle physical memory after all tasks are
  running, you can increase this value. Swap space should not be used,
  though, since swapping makes it slow.

- io.sort.factor: if your map tasks produce a large number of spills, you
  should increase this value. It also helps with merging at the reducers.

- mapred.job.reuse.jvm.num.tasks: the overhead of creating a JVM for each
  task is around one second, so for tasks that live for seconds or a few
  minutes and have lengthy initialization, this value can be increased to
  gain performance.

- mapred.reduce.parallel.copies: for large jobs (jobs in which the map
  output is very large), the value of this property can be increased,
  keeping in mind that it will also increase total CPU usage.

- mapred.map.tasks.speculative.execution and
  mapred.reduce.tasks.speculative.execution: set these to false to gain
  higher throughput.

- dfs.block.size, mapred.min.split.size, mapred.max.split.size: use these
  to control the number of maps.
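For concreteness, here is a minimal mapred-site.xml sketch that puts these
properties together. The values are illustrative assumptions only (an
8-core datanode is assumed), not recommendations; the right numbers depend
on your hardware and workload:

  <?xml version="1.0"?>
  <!-- mapred-site.xml: illustrative values, tune for your own cluster -->
  <configuration>
    <!-- assumed 8-core node: map slots near the core count, fewer
         reduce slots; CPU-bound jobs want total slots near the cores -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>8</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>4</value>
    </property>
    <!-- compress map output to speed up the mapper-to-reducer copy -->
    <property>
      <name>mapred.compress.map.output</name>
      <value>true</value>
    </property>
    <!-- larger sort buffer if RAM is idle (default is 100 MB) -->
    <property>
      <name>io.sort.mb</name>
      <value>200</value>
    </property>
    <!-- merge more spill files per pass (default is 10) -->
    <property>
      <name>io.sort.factor</name>
      <value>50</value>
    </property>
    <!-- -1 reuses the JVM for unlimited tasks of the same job -->
    <property>
      <name>mapred.job.reuse.jvm.num.tasks</name>
      <value>-1</value>
    </property>
    <!-- more parallel fetch threads per reducer (default is 5) -->
    <property>
      <name>mapred.reduce.parallel.copies</name>
      <value>10</value>
    </property>
    <!-- disable speculative execution for higher throughput -->
    <property>
      <name>mapred.map.tasks.speculative.execution</name>
      <value>false</value>
    </property>
    <property>
      <name>mapred.reduce.tasks.speculative.execution</name>
      <value>false</value>
    </property>
  </configuration>

The slot counts and buffer sizes in particular should come out of
benchmarking your own jobs, per Mat's point below.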
On Thu, Sep 10, 2009 at 8:06 AM, Mat Kelcey <matthew.kel...@gmail.com> wrote:

> > I've a cluster where every node is a multicore. From doing internet
> > searches I've figured out that I definitely need to change
> > mapred.tasktracker.tasks.maximum according to the number of clusters. But
> > there are definitely other things that I would like to change for example
> > mapred.map.tasks. Can someone point me out the list of things I should
> > change to get the best performance out of my cluster ?
>
> nothing will give you better results than benchmarking with some jobs
> indicative to your domain!

--
Thanks & Regards,
Chandra Prakash Bhagtani,