Matt,

Thanks for your help! I think I get it now, but this part is a bit confusing:

> so: tasktracker/datanode and 6 slots left. How you break it up from
> there is your call but I would suggest either 4 mappers / 2 reducers
> or 5 mappers / 1 reducer.

If it's 2 processes per core, then it's:

4 Nodes * 4 Cores/Node * 2 Processes/Core = 32 Processes Total
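To check the arithmetic, here is a rough Python sketch of the per-node accounting as I read the quoted rule of thumb. It assumes the tasktracker and the datanode each take exactly one of the 8 processes on a node (my reading of "6 slots left"); the function and constant names are made up for illustration.

```python
# Slot accounting for the cluster described in this thread.
NODES = 4
CORES_PER_NODE = 4
PROCESSES_PER_CORE = 2  # rule of thumb quoted from Matt's reply
DAEMONS_PER_NODE = 2    # assumption: tasktracker + datanode, one process each


def slots_per_node(cores, processes_per_core, daemons):
    """Task slots left on one node after reserving processes for the daemons."""
    return cores * processes_per_core - daemons


def cluster_totals(nodes, mappers_per_node, reducers_per_node):
    """Cluster-wide (map, reduce) slot counts for a given per-node split."""
    return nodes * mappers_per_node, nodes * reducers_per_node


total_processes = NODES * CORES_PER_NODE * PROCESSES_PER_CORE
per_node = slots_per_node(CORES_PER_NODE, PROCESSES_PER_CORE, DAEMONS_PER_NODE)

print("processes in the cluster:", total_processes)
print("task slots per node:", per_node)

# The two per-node splits suggested in the quoted reply.
for mappers, reducers in [(4, 2), (5, 1)]:
    assert mappers + reducers == per_node
    totals = cluster_totals(NODES, mappers, reducers)
    print(mappers, "mappers /", reducers, "reducers per node ->", totals)
```

Under that assumption the 32 counts every process, daemons included, while the task slots come to 6 per node.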
So my mapred-site.xml configuration should include these properties:

<property>
  <name>mapred.map.tasks</name>
  <value>28</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>4</value>
</property>

Is that correct?

On Mon, Jun 27, 2011 at 4:59 PM, GOEKE, MATTHEW (AG/1000) <matthew.go...@monsanto.com> wrote:

> If you are running default configurations then you are only getting 2
> mappers and 1 reducer per node. The rule of thumb I have gone on (and
> backed up by the Definitive Guide) is 2 processes per core so:
> tasktracker/datanode and 6 slots left. How you break it up from there
> is your call but I would suggest either 4 mappers / 2 reducers or
> 5 mappers / 1 reducer.
>
> Check out the below configs for details on what you are *most likely*
> running currently:
> http://hadoop.apache.org/common/docs/r0.20.2/mapred-default.html
> http://hadoop.apache.org/common/docs/r0.20.2/hdfs-default.html
> http://hadoop.apache.org/common/docs/r0.20.2/core-default.html
>
> HTH,
> Matt
>
> -----Original Message-----
> From: Juan P. [mailto:gordoslo...@gmail.com]
> Sent: Monday, June 27, 2011 2:50 PM
> To: common-user@hadoop.apache.org
> Subject: Performance Tuning
>
> I'm trying to run a MapReduce task against a cluster of 4 DataNodes
> with 4 cores each.
> My input data is 4GB in size and it's split into 100MB files. Current
> configuration is default so block size is 64MB.
>
> If I understand it correctly Hadoop should be running 64 Mappers to
> process the data.
>
> I'm running a simple data counting MapReduce and it's taking about
> 30 minutes to complete. This seems like way too much, doesn't it?
> Is there any tuning you guys would recommend to try and see an
> improvement in performance?
>
> Thanks,
> Pony