Well, Map/Reduce and Hadoop by definition run maps in parallel. I think you're interested in the following two configuration settings:
  mapred.tasktracker.map.tasks.maximum
  mapred.tasktracker.reduce.tasks.maximum

These go in hadoop-site.xml and set the maximum number of map and reduce tasks that run concurrently on each tasktracker (node). Learn more here:

  http://hadoop.apache.org/core/docs/current/cluster_setup.html#Configuring+the+Hadoop+Daemons

Map tasks + reduce tasks should be slightly above the number of cores you have per node. So if you have 8 cores per node, setting map tasks to 6 and reduce tasks to 4 would probably be good.

Hope this helps,
Alex

On Thu, Dec 4, 2008 at 6:42 AM, Aayush Garg <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a 5-node cluster for Hadoop usage. All nodes are multi-core.
> I am running a shell command in the map function of my program, and this
> shell command takes one file as input. Many such files are copied into
> HDFS.
>
> So in summary the map function will run a command like ./run <file1>
> <outputfile1>
>
> Could you please suggest an optimized way to do this, e.g. whether I can
> use the multi-core processing of the nodes and run many such maps in
> parallel.
>
> Thanks,
> Aayush
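
P.S. For reference, the two settings mentioned above would look something like this in hadoop-site.xml (the values 6 and 4 are just the example numbers from the 8-core scenario; tune them for your own hardware):

```
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```

Each tasktracker reads these at startup, so you need to restart the tasktrackers (or the whole cluster) after changing them.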