I ran a job, and in the jobtracker web interface I found 4 maps and 1 reduce running. This is not what I set in my configuration file (hadoop-site.xml).
My configuration file sets the following:

    mapred.map.tasks = 2
    mapred.reduce.tasks = 2

The descriptions of these properties do mention that the settings are ignored if mapred.job.tracker is set to 'local', but mine is set properly with an IP address and port number. Please note that the configuration above is from the conf/hadoop-site.xml file on the jobtracker node; the full property blocks are pasted at the end of this message for reference.

So, can anyone explain why the job executed 4 maps but only 1 reduce?

Here are some relevant entries from the job.xml of this job:

    name                                          value
    mapred.skip.reduce.max.skip.groups            0
    mapred.reduce.max.attempts                    4
    mapred.reduce.tasks                           1
    mapred.reduce.tasks.speculative.execution     true
    mapred.tasktracker.reduce.tasks.maximum       2
    dfs.replication                               2
    mapred.reduce.copy.backoff                    300
    mapred.task.cache.levels                      2
    mapred.max.tracker.failures                   4
    mapred.map.tasks                              4
    mapred.map.tasks.speculative.execution        true
    mapred.tasktracker.map.tasks.maximum          2

Please help.
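For reference, this is roughly what those two entries look like in my conf/hadoop-site.xml, written as standard Hadoop property blocks (only the two task-count properties are shown here; the rest of the file is omitted):

<configuration>
  <!-- Task-count settings described above; the values are the ones I set. -->
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
</configuration>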