Set "mapred.tasktracker.tasks.maximum" to N and each node will be able to process up to N tasks in total, map and/or reduce.
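As a sketch, assuming your Hadoop build supports the combined cap being discussed here, the per-node limit would go into the site configuration file (hadoop-site.xml on releases of that era) roughly like this; the value 4 is just an illustrative choice:

```xml
<!-- Sketch only: caps each TaskTracker at 4 tasks total (map and/or reduce).
     Whether this combined property is honored depends on the Hadoop build. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>4</value>
</property>
```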
Please note that once you set "mapred.tasktracker.tasks.maximum", the "mapred.tasktracker.map.tasks.maximum" and "mapred.tasktracker.reduce.tasks.maximum" settings will not take effect.

On Tue, Jun 17, 2008 at 1:46 PM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> Daniel Leffel wrote:
>
>> Why not just combine them? How do I do that?
>>
> Consider a case where the cluster (of n nodes) is configured to process
> just one task per node. Let there be (n-1) reducers. Let's assume that the
> map phase is complete and the reducers are shuffling. There will be (n-1)
> nodes with reducers. Now consider a case where the only node without a
> reducer is lost. The cluster needs slots to re-run the maps that were lost,
> since the reducers are waiting for those maps to finish. In such a case the
> job will get stuck. To avoid such cases, there are separate map and reduce
> task slots.
> Amar
>
>> The rationale is that our tasks are very balanced in load, but unbalanced
>> in timing. I've found that limiting the total number of threads is the
>> safest approach to not overloading the DFS daemon. To date, I've done
>> that just through intelligent scheduling of jobs to stagger maps and
>> reduces, but have I missed a setting that exists to simply limit the
>> number of tasks in total?
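Amar's deadlock scenario can be sketched with a toy slot-counting model. This is a hypothetical illustration, not Hadoop code; the function name and the simulation are assumptions made purely to show why combined slots can wedge the job:

```python
# Toy model of the scenario above: n nodes, the map phase is done,
# n-1 reducers are shuffling, and the one node without a reducer fails,
# losing the map output it held.  (Illustration only, not Hadoop code.)

def can_rerun_lost_maps(n, separate_slots):
    """Return True if the lost map tasks can find a slot after the failure."""
    if separate_slots:
        # Separate map and reduce slots: every surviving node still has a
        # free map slot even while its reduce slot is busy shuffling.
        free_map_slots = n - 1  # one per surviving node
    else:
        # Combined slots (one task per node): the n-1 reducers hold every
        # slot on the surviving nodes, and the only idle node just failed.
        free_map_slots = 0
    return free_map_slots > 0

n = 5  # example cluster size
print(can_rerun_lost_maps(n, separate_slots=False))  # False: job is stuck
print(can_rerun_lost_maps(n, separate_slots=True))   # True: maps can re-run
```

With combined slots the reducers hold everything while waiting on maps that can never be rescheduled, which is exactly the circular wait Amar describes.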