Set "mapred.tasktracker.tasks.maximum"
and each node will be able to process N number of tasks - map or/and reduce.

Please note that once you set "mapred.tasktracker.tasks.maximum", the
"mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" settings will not take effect.

On Tue, Jun 17, 2008 at 1:46 PM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> Daniel Leffel wrote:
>
>> Why not just combine them? How do I do that?
>>
>>
>>
> Consider a case where the cluster (of n nodes) is configured to process
> just one task per node. Let there be (n-1) reducers. Let's assume that the
> map phase is complete and the reducers are shuffling, so (n-1) nodes are
> occupied by reducers. Now suppose the only node without a reducer is lost.
> The cluster needs slots to re-run the maps whose output was lost, since the
> reducers are waiting for those maps to finish, but every remaining slot is
> held by a reducer. In such a case the job gets stuck. To avoid such
> deadlocks, there are separate map and reduce task slots.
> Amar
>
>  The rationale is that our tasks are very balanced in load, but unbalanced
>> in timing. I've found that limiting the total number of threads is the
>> safest approach to avoid overloading the DFS daemon. To date, I've done
>> that just through intelligent scheduling of jobs to stagger maps and
>> reduces, but have I missed a setting that simply limits the total number
>> of tasks?
>>
>>
>
>
