Ah, in that case this should answer your question: http://wiki.apache.org/hadoop/HowManyMapsAndReduces
2010/11/25 Shai Erera <ser...@gmail.com>:
> I wasn't talking about how to configure the cluster to not invoke more than
> a certain # of Mappers simultaneously. Instead, I'd like to configure a
> (certain) job to invoke exactly N Mappers, where N is the number of cores in
> the cluster, regardless of the size of the data. This is not critical if
> it can't be done, but it can improve the performance of my job if it can be
> done.
>
> Thanks
> Shai
>
> On Thu, Nov 25, 2010 at 9:55 PM, Niels Basjes <ni...@basjes.nl> wrote:
>>
>> Hi,
>>
>> 2010/11/25 Shai Erera <ser...@gmail.com>:
>> > Is there a way to make MapReduce create exactly N Mappers? More
>> > specifically, if say my data can be split into 200 Mappers, and I have
>> > only 100 cores, how can I ensure only 100 Mappers will be created? The
>> > number of cores is not something I know in advance, so writing a special
>> > InputFormat might be tricky, unless I can query Hadoop for the available
>> > # of cores (in the entire cluster).
>>
>> You can configure, on a node-by-node basis, how many map and reduce
>> tasks the task tracker on that node may run concurrently.
>> This is done via conf/mapred-site.xml using these two settings:
>> mapred.tasktracker.{map|reduce}.tasks.maximum
>>
>> Have a look at this page for more information:
>> http://hadoop.apache.org/common/docs/current/cluster_setup.html
>>
>> --
>> Met vriendelijke groeten,
>>
>> Niels Basjes

--
Met vriendelijke groeten,

Niels Basjes
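For anyone finding this thread later: the per-node limits Niels mentions look like this in conf/mapred-site.xml. A minimal sketch — the value 8 is an assumption for a node with 8 cores; set it per machine to match that machine's hardware:

```xml
<!-- conf/mapred-site.xml on each task-tracker node.
     Caps how many map/reduce tasks the TaskTracker on THIS node
     will run concurrently (a cluster-wide cap, not a per-job one).
     The value 8 is a hypothetical 8-core example. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
</property>
```

Note this caps concurrency per node; it does not make a job create exactly N mappers. The number of map tasks is driven by the InputFormat's splits, and (as the wiki page above explains) the job-level mapper count is only a hint to the framework.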