Ah, in that case this should answer your question: http://wiki.apache.org/hadoop/HowManyMapsAndReduces
2010/11/25 Shai Erera <ser...@gmail.com>:
> I wasn't talking about how to configure the cluster to not invoke more than
> a certain # of Mappers simultaneously. Instead, I'd like to configure a
> (certain) job to invoke exactly N Mappers, where N is the number of cores in
> the cluster, regardless of the size of the data. This is not critical if
> it can't be done, but it can improve the performance of my job if it can be
> done.
>
> Thanks
> Shai
>
> On Thu, Nov 25, 2010 at 9:55 PM, Niels Basjes <ni...@basjes.nl> wrote:
>>
>> Hi,
>>
>> 2010/11/25 Shai Erera <ser...@gmail.com>:
>> > Is there a way to make MapReduce create exactly N Mappers? More
>> > specifically, if say my data can be split into 200 Mappers, and I have
>> > only 100 cores, how can I ensure only 100 Mappers will be created? The
>> > number of cores is not something I know in advance, so writing a special
>> > InputFormat might be tricky, unless I can query Hadoop for the available
>> > # of cores (in the entire cluster).
>>
>> You can configure, on a node-by-node basis, how many map and reduce
>> tasks the task tracker on that node may run concurrently.
>> This is done via conf/mapred-site.xml using these two settings:
>> mapred.tasktracker.{map|reduce}.tasks.maximum
>>
>> Have a look at this page for more information:
>> http://hadoop.apache.org/common/docs/current/cluster_setup.html
>>
>> --
>> Met vriendelijke groeten,
>>
>> Niels Basjes

--
Met vriendelijke groeten,

Niels Basjes
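For anyone finding this thread later: the per-node limits Niels mentions look like this in conf/mapred-site.xml. A minimal sketch — the value 8 is an assumption for a node with 8 cores; set it per machine to match that machine's hardware:

```xml
<!-- conf/mapred-site.xml on each task-tracker node.
     Caps how many map/reduce tasks the TaskTracker on THIS node
     will run concurrently (a cluster-wide cap, not a per-job one).
     The value 8 is a hypothetical 8-core example. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
</property>
```

Note this caps concurrency per node; it does not make a job create exactly N mappers. The number of map tasks is driven by the InputFormat's splits, and (as the wiki page above explains) the job-level mapper count is only a hint to the framework.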