When done, HADOOP-3387 would allow you to do that. In our implementation we can tell Hadoop the exact number of maps and it will group splits if necessary.
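A minimal sketch of that split-grouping idea (this is not the actual HADOOP-3387 patch; "capped.max.maps" is a made-up property name, and the old org.apache.hadoop.mapred API is assumed): merge adjacent splits of the same file until at most the requested number remain.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.mapred.FileSplit;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.TextInputFormat;

    // Sketch: cap the number of map tasks by merging adjacent FileSplits
    // of the same file until at most "capped.max.maps" splits remain.
    public class CappedMapsInputFormat extends TextInputFormat {

      @Override
      public InputSplit[] getSplits(JobConf job, int numSplits)
          throws IOException {
        // "capped.max.maps" is a hypothetical knob for this sketch.
        int maxMaps = job.getInt("capped.max.maps", numSplits);
        List<FileSplit> splits = new ArrayList<FileSplit>();
        for (InputSplit s : super.getSplits(job, numSplits)) {
          splits.add((FileSplit) s);
        }
        boolean merged = true;
        while (splits.size() > maxMaps && merged) {
          merged = false;
          for (int i = 0; i + 1 < splits.size() && splits.size() > maxMaps; i++) {
            FileSplit a = splits.get(i);
            FileSplit b = splits.get(i + 1);
            // Only contiguous byte ranges of the same file can be folded
            // into a single plain FileSplit. If nothing is contiguous
            // (e.g. many small files), the cap stays best-effort.
            if (a.getPath().equals(b.getPath())
                && a.getStart() + a.getLength() == b.getStart()) {
              splits.set(i, new FileSplit(a.getPath(), a.getStart(),
                  a.getLength() + b.getLength(), a.getLocations()));
              splits.remove(i + 1);
              merged = true;
            }
          }
        }
        return splits.toArray(new InputSplit[splits.size()]);
      }
    }

A job would opt in with conf.setInputFormat(CappedMapsInputFormat.class) and conf.setInt("capped.max.maps", 2). Note that a merged split spans several blocks, so some data locality is traded away for the lower map count.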
On Fri, Aug 1, 2008 at 5:25 AM, Andreas Kostyrka <[EMAIL PROTECTED]> wrote:
> Well, the only way I've found to reliably fix the number of map tasks is
> by using compressed input files; that forces Hadoop to assign one and only
> one file to a map task ;)
>
> Andreas
>
> On Thursday 31 July 2008 21:30:33 Gopal Gandhi wrote:
>> Thank you, finally someone has an interest in my questions =)
>> My cluster contains more than one machine. Please don't get me wrong :-).
>> I don't want to limit the total mappers on one node (by mapred.map.tasks).
>> What I want is to limit the total mappers for one job. The motivation is
>> that I have 2 jobs to run at the same time; they have the same input data
>> in Hadoop. I found that one job has to wait until the other finishes its
>> mapping. Because the 2 jobs are submitted by 2 different people, I don't
>> want one job to starve. So I want to limit the first job's total mappers
>> so that the 2 jobs will launch simultaneously.
>>
>> ----- Original Message ----
>> From: "Goel, Ankur" <[EMAIL PROTECTED]>
>> To: core-user@hadoop.apache.org
>> Cc: [EMAIL PROTECTED]
>> Sent: Wednesday, July 30, 2008 10:17:53 PM
>> Subject: RE: How can I control Number of Mappers of a job?
>>
>> How big is your cluster? Assuming you are running a single-node cluster:
>> hadoop-default.xml has a parameter 'mapred.map.tasks' that is set to 2,
>> so by default, no matter how many map tasks the framework calculates,
>> only 2 map tasks will execute on a single-node cluster.
>>
>> -----Original Message-----
>> From: Gopal Gandhi [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, July 31, 2008 4:38 AM
>> To: core-user@hadoop.apache.org
>> Cc: [EMAIL PROTECTED]
>> Subject: How can I control Number of Mappers of a job?
>>
>> The motivation is to control the maximum number of mappers of a job. For
>> example, the input data is 246 MB, which divided by the 64 MB block size
>> gives 4 blocks, so by default 4 mappers will be launched, one per block.
>> What I want is to set its maximum number of mappers to 2, so that 2
>> mappers are launched first and, when they complete on the first 2 blocks,
>> another 2 mappers start on the remaining 2 blocks. Does Hadoop provide a
>> way?
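For reference, a driver-side sketch (assuming the old org.apache.hadoop.mapred API of the 0.17/0.18 era) of the two knobs discussed above. mapred.map.tasks / setNumMapTasks() is only a hint; raising the minimum split size is what actually reduces the split count for FileInputFormat-based jobs. With 246 MB of input and 64 MB blocks, a 128 MB minimum split size yields 2 splits (128 MB + 118 MB) instead of 4. Note this gives 2 maps total, each covering two blocks, rather than the 2-at-a-time waves asked for:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Sketch of a driver that steers the map count; runs as an identity
    // job since no mapper/reducer classes are set.
    public class TwoMapperJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TwoMapperJob.class);
        conf.setJobName("two-mapper-job");

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Hint only: the framework may still create one map per block.
        conf.setNumMapTasks(2);

        // Effective lever: with 246 MB of input and 64 MB blocks, a
        // 128 MB minimum split size yields 2 splits instead of 4.
        conf.setLong("mapred.min.split.size", 128L * 1024 * 1024);

        JobClient.runJob(conf);
      }
    }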