Thank you, finally someone is interested in my questions =)

My cluster contains more than one machine. Please don't get me wrong :-). I don't want to limit the total number of mappers on one node (via mapred.map.tasks). What I want is to limit the total number of mappers for one job. The motivation is that I have 2 jobs to run at the same time, and they have the same input data in Hadoop. I found that one job has to wait until the other finishes its map phase. Because the 2 jobs are submitted by 2 different people, I don't want one job to starve. So I want to limit the first job's total mappers so that the 2 jobs can run simultaneously.

----- Original Message ----
From: "Goel, Ankur" <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Sent: Wednesday, July 30, 2008 10:17:53 PM
Subject: RE: How can I control Number of Mappers of a job?

How big is your cluster? Assuming you are running a single-node cluster, hadoop-default.xml has a parameter 'mapred.map.tasks' that is set to 2. So by default, no matter how many map tasks the framework calculates, only 2 map tasks will execute on a single-node cluster.

-----Original Message-----
From: Gopal Gandhi [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 31, 2008 4:38 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: How can I control Number of Mappers of a job?

The motivation is to control the max # of mappers of a job. For example, the input data is 246 MB, which at 64 MB per block comes to 4 blocks. By default, 4 mappers will be launched on the 4 blocks. What I want is to set the max # of mappers to 2, so that 2 mappers are launched first, and when they complete on the first 2 blocks, another 2 mappers start on the remaining 2 blocks. Does Hadoop provide a way?
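
To make the arithmetic in the question above concrete, here is a minimal Python sketch. It is not Hadoop code: it assumes one mapper per 64 MB block (ignoring the details of how the framework actually computes splits) and that a per-job cap would simply run the mappers in waves. The function names num_map_tasks and waves are made up for illustration; whether Hadoop offers such a cap is exactly what the thread is asking.

```python
import math

def num_map_tasks(input_mb, block_mb=64):
    """Number of blocks, assuming one mapper per block (a simplification)."""
    return math.ceil(input_mb / block_mb)

def waves(total_mappers, max_concurrent):
    """Group mapper indices into waves of at most max_concurrent each."""
    return [list(range(start, min(start + max_concurrent, total_mappers)))
            for start in range(0, total_mappers, max_concurrent)]

mappers = num_map_tasks(246)   # 246 MB in 64 MB blocks
schedule = waves(mappers, 2)   # hypothetical per-job cap of 2 mappers
print(mappers, schedule)       # prints: 4 [[0, 1], [2, 3]]
```

With the numbers from the question, the 4 mappers would run as two waves of 2, which is the behaviour being asked for.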