Looking at the source, it appears that in HDFS the Namenode serves this information directly to the client: it returns the block locations to the DFSClient, which is what the DistributedFileSystem uses.

    /**
     * @see ClientProtocol#getBlockLocations(String, long, long)
     */
    static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
        String src, long start, long length) throws IOException {
      try {
        return namenode.getBlockLocations(src, start, length);
      } catch (RemoteException re) {
        throw re.unwrapRemoteException(AccessControlException.class,
                                       FileNotFoundException.class,
                                       UnresolvedPathException.class);
      }
    }

On Tue, Jun 4, 2013 at 2:00 AM, Mahmood Naderan <nt_mahm...@yahoo.com> wrote:

> There are many instances of getFileBlockLocations in hadoop/fs. Can you
> explain which one is the main one?
>
> > It must be combined with a method of logically splitting the input data
> > along block boundaries, and of launching tasks on worker nodes that are
> > close to the data splits
>
> Is this a user-level task or a system-level task?
>
> Regards,
> Mahmood
>
> ------------------------------
> *From:* John Lilley <john.lil...@redpoint.net>
> *To:* "user@hadoop.apache.org" <user@hadoop.apache.org>; Mahmood Naderan <nt_mahm...@yahoo.com>
> *Sent:* Tuesday, June 4, 2013 3:28 AM
> *Subject:* RE: HDFS interfaces
>
> Mahmood,
>
> It is in the FileSystem interface.
> <http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations%28org.apache.hadoop.fs.Path,%20long,%20long%29>
>
> This by itself is not sufficient for application programmers to make good
> use of data locality. It must be combined with a method of logically
> splitting the input data along block boundaries, and of launching tasks on
> worker nodes that are close to the data splits. MapReduce does both of
> these things internally along with the file-format input classes. For an
> application to do so directly, see the new YARN-based interfaces
> ApplicationMaster and ResourceManager. These are, however, very new, and
> there is little documentation and few examples.
>
> john
>
> *From:* Mahmood Naderan [mailto:nt_mahm...@yahoo.com]
> *Sent:* Monday, June 03, 2013 12:09 PM
> *To:* user@hadoop.apache.org
> *Subject:* HDFS interfaces
>
> Hello,
> It is stated in the "HDFS architecture guide"
> (https://hadoop.apache.org/docs/r1.0.4/hdfs_design.html) that
>
> *HDFS provides interfaces for applications to move themselves closer to
> where the data is located.*
>
> What are these interfaces and where are they in the source code? Is
> there any manual for the interfaces?
>
> Regards,
> Mahmood

--
Jay Vyas
http://jayunit100.blogspot.com
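
As a rough illustration of the two steps John describes in the thread (splitting the input along block boundaries, then placing each task near its split), here is a minimal sketch in plain Java. It does not use the Hadoop API: Block, Assignment, and assign are hypothetical names invented for this example, with Block standing in for the per-block offset, length, and host list that getFileBlockLocations reports. The round-robin fallback for blocks with no local worker is also just an assumption for the sketch, not a description of how MapReduce or YARN actually schedule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only: none of these are Hadoop classes.
class LocalitySketch {

    /** One block of a file: its byte range plus the hosts holding a replica. */
    static final class Block {
        final long offset;
        final long length;
        final List<String> hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /** A unit of work: a byte range and the worker chosen to process it. */
    static final class Assignment {
        final long offset;
        final long length;
        final String worker;
        final boolean local; // true if the worker holds a replica of the block

        Assignment(long offset, long length, String worker, boolean local) {
            this.offset = offset;
            this.length = length;
            this.worker = worker;
            this.local = local;
        }
    }

    /**
     * Creates one split per block and assigns it to the first available worker
     * that holds a replica. Blocks with no local worker fall back to
     * round-robin over the workers (an assumption made for this sketch).
     */
    static List<Assignment> assign(List<Block> blocks, List<String> workers) {
        if (workers.isEmpty()) {
            throw new IllegalArgumentException("at least one worker is required");
        }
        List<Assignment> result = new ArrayList<>();
        int nextRemote = 0;
        for (Block b : blocks) {
            String chosen = null;
            for (String w : workers) {
                if (b.hosts.contains(w)) {
                    chosen = w;
                    break;
                }
            }
            boolean local = chosen != null;
            if (!local) {
                chosen = workers.get(nextRemote % workers.size());
                nextRemote++;
            }
            result.add(new Assignment(b.offset, b.length, chosen, local));
        }
        return result;
    }

    public static void main(String[] args) {
        long mb = 1L << 20;
        List<Block> blocks = Arrays.asList(
            new Block(0, 64 * mb, "node1", "node2"),
            new Block(64 * mb, 64 * mb, "node3", "node4"),
            new Block(128 * mb, 10 * mb, "node9"));
        List<String> workers = Arrays.asList("node2", "node3");
        for (Assignment a : assign(blocks, workers)) {
            System.out.println(a.offset + "+" + a.length + " -> " + a.worker
                + (a.local ? " (local)" : " (remote)"));
        }
    }
}
```

In a real application the block list would come from FileSystem.getFileBlockLocations rather than being hard-coded, and record-oriented formats would also have to deal with records that straddle a block boundary, which this sketch ignores.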