Before setting the task limits, do take memory into account (the list archives have many posts on this). Also, your tasktracker and datanode daemons will run on that machine as well, so you might want to set aside some processing power for them.
Cheers!
Amogh

-----Original Message-----
From: Erik Forsberg [mailto:forsb...@opera.com]
Sent: Friday, September 04, 2009 11:55 AM
To: common-user@hadoop.apache.org
Subject: Re: multi core nodes

On Thu, 3 Sep 2009 13:20:16 -0700 (PDT)
ll_oz_ll <himanshu_cool...@yahoo.com> wrote:

>
> Hi,
> Is hadoop able to take into account multi core nodes, so that nodes
> which have multiple cores run multiple concurrent jobs ?
> Or does that need to be configured manually and if so can that be
> configured individually for each node ?

Yes, it has to be configured manually. You set the following two
configuration variables in hadoop-site.xml on each node depending on
the number of cores on the node:

mapred.tasktracker.map.tasks.maximum
mapred.tasktracker.reduce.tasks.maximum

According to the book "Hadoop - the definitive guide", a good rule of
thumb is to have between 1 and 2 tasks per processor, counting both
map and reduce tasks.

So, for example, if a machine has 8 cores, setting
mapred.tasktracker.map.tasks.maximum = 8 and
mapred.tasktracker.reduce.tasks.maximum = 8 probably makes sense, but
this also depends a bit on your load.

Cheers,
\EF
--
Erik Forsberg <forsb...@opera.com>
Developer, Opera Software - http://www.opera.com/
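
For anyone sizing nodes with different core counts, the rule of thumb quoted above (between 1 and 2 tasks per core, map and reduce counted together) can be sketched roughly as below. This is only an illustration: the function names, the even map/reduce split, and the `reserved_cores` parameter (cores held back for the tasktracker and datanode daemons) are assumptions, not anything Hadoop provides. It also ignores memory entirely, which still has to be checked separately.

```python
# Sketch only: suggest_task_limits and to_properties are hypothetical helpers,
# not part of Hadoop. Only the two property names come from the thread.

def suggest_task_limits(cores, tasks_per_core=2.0, reserved_cores=0):
    """Return (map_max, reduce_max) for one node.

    tasks_per_core follows the 1-to-2 rule of thumb, counting map and
    reduce tasks together. reserved_cores is an assumed allowance for
    the tasktracker and datanode daemons.
    """
    if not 1.0 <= tasks_per_core <= 2.0:
        raise ValueError("rule of thumb is between 1 and 2 tasks per core")
    usable = max(cores - reserved_cores, 1)
    # Keep at least one map slot and one reduce slot on the node.
    total = max(int(usable * tasks_per_core), 2)
    # Assumption: split evenly; an odd total gives the extra slot to maps.
    map_max = (total + 1) // 2
    reduce_max = total - map_max
    return map_max, reduce_max


def to_properties(map_max, reduce_max):
    """Render the two settings as <property> entries for hadoop-site.xml."""
    template = (
        "<property>\n"
        "  <name>{name}</name>\n"
        "  <value>{value}</value>\n"
        "</property>"
    )
    return "\n".join([
        template.format(name="mapred.tasktracker.map.tasks.maximum",
                        value=map_max),
        template.format(name="mapred.tasktracker.reduce.tasks.maximum",
                        value=reduce_max),
    ])


if __name__ == "__main__":
    # The 8-core example from the thread: 2 tasks per core, nothing reserved.
    print(to_properties(*suggest_task_limits(8)))
```

With 8 cores and the defaults this gives 8 map and 8 reduce slots, matching the example in the quoted message. Reserving a core and lowering the factor, e.g. `suggest_task_limits(8, 1.5, 1)`, gives a more conservative setting. The printed entries would go inside the `<configuration>` element of hadoop-site.xml on that node.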