Vitaliy/Edward,
One thing to keep in mind is that overcommitting the number of cores can lead
to map timeouts unless the map task submits progress updates to the JobTracker.
I found that out the hard way with a few computationally expensive maps.
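
A minimal sketch of the related setting, assuming a 0.20-era cluster: the
timeout is governed by mapred.task.timeout in mapred-site.xml, given in
milliseconds (600000, i.e. 10 minutes, is the usual default). The value below
is purely illustrative, not a recommendation. Raising it only hides the
symptom; calling progress() on the Reporter from inside a long-running map is
generally the cleaner fix.

<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
  <description>illustrative value: milliseconds a task may run without
reporting progress before it is killed
  </description>
</property>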

Nick Jones

-----Original Message-----
From: Vitaliy Semochkin [mailto:vitaliy...@gmail.com] 
Sent: Thursday, July 08, 2010 5:15 AM
To: common-user@hadoop.apache.org
Subject: Re: How to control the number of map tasks for each nodes?

Hi,

In mapred-site.xml you should place

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
  <description>the number of available cores on the tasktracker machines
for map tasks
  </description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
  <description>the number of available cores on the tasktracker machines
for reduce tasks
  </description>
</property>

where 8 is the number of your CORES, not CPUs: if you have 8 dual-core
processors, put 16 there.
I found that setting the number of map tasks a bit higher than the number of
cores works better, because Hadoop sometimes waits on I/O operations and the
tasks do nothing in the meantime.
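
For the mixed cluster in the original question, a minimal sketch, assuming
each TaskTracker reads this value from its own local mapred-site.xml when it
starts (so the file may differ from node to node): the dual-core slaves could
carry something like the fragment below, while the i7 slaves keep 8. The value
2 simply mirrors the number asked for in the question, and a TaskTracker
restart is assumed to be needed for the change to take effect.

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>per-node setting for the dual-core slaves only
  </description>
</property>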

Regards,
Vitaliy S

On Thu, Jul 8, 2010 at 1:07 PM, edward choi <mp2...@gmail.com> wrote:

> Hi,
>
> I have a cluster consisting of 11 slaves and a single master.
>
> The thing is that 3 of my slaves have i7 CPUs, which means that they can run
> up to 8 simultaneous processes.
> But the other slaves only have dual-core CPUs.
>
> So I was wondering if I can specify the number of map tasks for each of my
> slaves.
> For example, I want to give 8 map tasks to the slaves that have i7 cpus and
> only two map tasks to the others.
>
> Is there a way to do this?
>
