On Sat, Mar 3, 2012 at 7:41 PM, Joey Echeverria <j...@cloudera.com> wrote:
> Sorry, I meant have you set the mapred.jobtracker.taskScheduler
> property in your mapred-site.xml file. If not, you're using the
> standard, FIFO scheduler. The default scheduler doesn't do data-local
> scheduling, but the fair scheduler and capacity scheduler do. You want
> to set mapred.jobtracker.taskScheduler to either
> org.apache.hadoop.mapred.FairScheduler (for the fair scheduler) or
> org.apache.hadoop.mapred.CapacityTaskScheduler (for the capacity
> scheduler) and then restart the JobTracker. You can read about the two
> schedulers here:
>
> http://hadoop.apache.org/common/docs/current/fair_scheduler.html
> http://hadoop.apache.org/common/docs/current/capacity_scheduler.html

I thought that by default tasks are scheduled on the nodes that hold the
data blocks. I thought that was inherent. In the fair scheduler link I
don't see anything about data-local scheduling.

> -Joey
>
> On Sat, Mar 3, 2012 at 6:32 PM, Hassen Riahi <hassen.ri...@cern.ch> wrote:
> > The jobtracker is running on another machine (node C)
> >
> > Hassen
> >
> >> Which scheduler are you using?
> >>
> >> -Joey
> >>
> >> On Mar 3, 2012, at 18:52, Hassen Riahi <hassen.ri...@cern.ch> wrote:
> >>
> >>> Hi all,
> >>>
> >>> We tried using mapreduce to execute a simple map code which reads a txt
> >>> file stored in HDFS and then writes the output.
> >>> The file to read is a very small one. It was not split and was written
> >>> entirely and only to a single datanode (node A). This node is also
> >>> configured as a tasktracker node.
> >>> While we were expecting the map to execute on node A
> >>> (since the input is stored there), from the log files we see that the
> >>> map was executed on another tasktracker (node B) of the cluster.
> >>> Am I missing something?
> >>>
> >>> Thanks for the help!
> >>> Hassen
> >>>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
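
For reference, the setting Joey describes would look roughly like the fragment below in mapred-site.xml. This is only a sketch: the property name and class name are the ones quoted above, and it assumes a JobTracker-based (MRv1) cluster where the fair scheduler jar is already on the JobTracker's classpath.

```xml
<!-- mapred-site.xml on the JobTracker node (sketch; merge into the
     existing <configuration> element rather than replacing the file) -->
<configuration>
  <property>
    <!-- Property and class names as quoted in the thread. To try the
         capacity scheduler instead, the thread gives
         org.apache.hadoop.mapred.CapacityTaskScheduler as the value. -->
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
</configuration>
```

As noted in the quoted message, the JobTracker has to be restarted before the change takes effect.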