Sandy/Shekhar Thank you very much for the helpful responses. One last question/clarification- the getFileBlockLocations(..) in the FileSystem API is the only file-->node mapping that I'm aware of, and it seems the only place its called is in the map/reduce client (FileInputFormat, MultiFileSplit).
Is this accurate: 1) Jobclient calls getFileBlockLocations(..) and serializes this info to a file on the DFS. 2) Scheduler reads this information for determining locality Or am I missing another call/method for determining node location for block locality? I haven't located the spot in source that reads the block location file for scheduler is why i ask. -bc ----- Original Message ----- > From: "Sandy Ryza" <sandy.r...@cloudera.com> > To: common-dev@hadoop.apache.org > Sent: Thursday, April 3, 2014 2:38:42 PM > Subject: Re: Yarn / mapreduce scheduling > > The equivalent code in the Fair Scheduler is in AppSchedulable.java, > under assignContainer(FSSchedulerNode node, boolean reserved). > > YARN uses delay scheduling ( > http://people.csail.mit.edu/matei/papers/2010/eurosys_delay_scheduling.pdf) > for achieving data-locality. > > -Sandy > > > On Thu, Apr 3, 2014 at 11:16 AM, Shekhar Gupta <shkhr...@gmail.com> wrote: > > > Hi Brad, > > > > YARN scheduling does take care of data locality. In YARN, tasks are not > > assigned based on capacity. Actually certain number of containers are > > allocated on every node based on node's capacity. Tasks are executed on > > those containers. While scheduling tasks on containers YARN scheduler > > satisfies data locality requirements. I am not very familiar with Fair > > Scheduler but if you check the source of FifoScheduler you will find a > > function 'assignContainersonNode' which looks like following > > > > private int assignContainersOnNode(FiCaSchedulerNode node, > > FiCaSchedulerApp application, Priority priority > > ) { > > // Data-local > > int nodeLocalContainers = > > assignNodeLocalContainers(node, application, priority); > > > > // Rack-local > > int rackLocalContainers = > > assignRackLocalContainers(node, application, priority); > > > > // Off-switch > > int offSwitchContainers = > > assignOffSwitchContainers(node, application, priority); > > > > > > LOG.debug("assignContainersOnNode:" + > > " node=" + node.getRMNode().getNodeAddress() + > > " application=" + application.getApplicationId().getId() + > > " priority=" + priority.getPriority() + > > " #assigned=" + > > (nodeLocalContainers + rackLocalContainers + offSwitchContainers)); > > > > > > return (nodeLocalContainers + rackLocalContainers + > > offSwitchContainers); > > } > > > > In this routine you will find that data-local tasks are scheduled first, > > then rack-local and in then off-switch. > > > > After this you may find similar function in fairScheduler too. > > > > I hope this helps. Let me know if you more questions or if something is > > wrong in my reasoning. > > > > Regards, > > Shekhar > > > > > > On Thu, Apr 3, 2014 at 10:56 AM, Brad Childs <b...@redhat.com> wrote: > > > > > Sorry if this is the wrong list, i am looking for deep technical/hadoop > > > source help :) > > > > > > How does job scheduling work on yarn framework for map reduce jobs? I > > see > > > the yarn scheduler discussed here: > > > > > https://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/YARN.htmlwhich > > leads me to believe tasks are scheduled based on node capacity and > > > not data locality. I've sifted through the fair scheduler and can't find > > > anything about data location or locality. > > > > > > Where does data locality play into the scheduling of map/reduce tasks on > > > yarn? Can someone point me to the hadoop 2.x source where the data block > > > location is used to calculate node/container/task assignment (if thats > > > still happening). > > > > > > > > > > > > -bc > > > > > > > > >