Re: How can I control Number of Mappers of a job?

Gopal Gandhi Thu, 31 Jul 2008 12:31:07 -0700

Thank you, finally someone is interested in my question =)
My cluster contains more than one machine. Please don't get me wrong :-). I 
don't want to limit the total mappers on one node (by mapred.map.tasks). What I 
want is to limit the total mappers for one job. The motivation is that I have 2 
jobs to run at the same time. They have "the same input data in Hadoop". I 
found that one job has to wait until the other finishes its mapping. Because 
the 2 jobs are submitted by 2 different people, I don't want one job to 
starve. So I want to limit the first job's total mappers so that the 2 jobs 
can run simultaneously.
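
To make the starvation concern concrete, here is a toy model (not Hadoop code) of how map slots might be handed out when the first job is served before the second. The slot count of 4, the job names "A" and "B", and the function assign_slots are all hypothetical, and the strict first-come-first-served rule is an assumption about the scheduler, not something confirmed in this thread.

```python
def assign_slots(total_slots, pending_a, pending_b, cap_first_job=None):
    """Toy first-come-first-served slot assignment for two jobs.

    Job A was submitted first, so it is served first; job B only gets
    the slots that A leaves free. cap_first_job is a hypothetical
    per-job limit on A's concurrently running mappers.
    """
    a_limit = total_slots if cap_first_job is None else min(total_slots, cap_first_job)
    running_a = min(pending_a, a_limit)
    running_b = min(pending_b, total_slots - running_a)
    return {"A": running_a, "B": running_b}


# Hypothetical cluster with 4 map slots; each job has 4 pending map tasks.
uncapped = assign_slots(4, 4, 4)                 # A takes every slot, B waits
capped = assign_slots(4, 4, 4, cap_first_job=2)  # A is held to 2, B gets the rest
print(uncapped, capped)
```

In this model the uncapped run leaves job B with no slots until A's mappers finish, while capping A at 2 lets both jobs make progress at once, which is the behavior I am asking for.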

----- Original Message ----
From: "Goel, Ankur" <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Sent: Wednesday, July 30, 2008 10:17:53 PM
Subject: RE: How can I control Number of Mappers of a job?

How big is your cluster? Assuming you are running a single-node cluster,
hadoop-default.xml has a parameter 'mapred.map.tasks' that is set to 2.
So by default, no matter how many map tasks are calculated by the framework,
only 2 map tasks will execute on a single-node cluster.

-----Original Message-----
From: Gopal Gandhi [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 31, 2008 4:38 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: How can I control Number of Mappers of a job?

The motivation is to control the max # of mappers of a job. For example,
the input data is 246MB, which at 64MB per block is 4 blocks. By default,
4 mappers will be launched, one on each of the 4 blocks.
What I want is to set its max # of mappers to 2, so that 2 mappers are
launched first and, when they complete on the first 2 blocks, another 2
mappers start on the remaining 2 blocks. Does Hadoop provide a way?
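
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the desired schedule, not a Hadoop API: the function name plan_waves is hypothetical, and it assumes one mapper per 64MB block, as in the example (the real number of map tasks depends on how the input is split).

```python
import math

INPUT_MB = 246      # input size from the example
BLOCK_MB = 64       # block size from the example
MAX_CONCURRENT = 2  # desired cap on mappers running at once


def plan_waves(input_mb, block_mb, max_concurrent):
    """Group block indices into waves of at most max_concurrent mappers."""
    # A partially filled last block still needs its own mapper, so round up.
    num_blocks = math.ceil(input_mb / block_mb)
    blocks = list(range(num_blocks))
    return [blocks[i:i + max_concurrent]
            for i in range(0, num_blocks, max_concurrent)]


waves = plan_waves(INPUT_MB, BLOCK_MB, MAX_CONCURRENT)
for n, wave in enumerate(waves, start=1):
    print(f"wave {n}: mappers on blocks {wave}")
```

With these numbers there are 4 blocks and 2 waves: blocks 0 and 1 first, then blocks 2 and 3.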
