Hi,

Please do not use the general@ lists for development/usage questions. This list is meant for project-level discussions alone. Thanks! :)
I've moved this mail to hdfs-dev@hadoop.apache.org. When replying, please use this list instead, going forward.

My reply inline:

On Thu, Sep 6, 2012 at 3:12 PM, Adrian (Xinyu) Liu <adrian....@sas.com> wrote:
> Hi All,
>
> Nowadays, I am working with HDFS and implementing some functionalities base
> on HDFS API.
> As I knew, one specific file is divided into several blocks and distributed
> into different datanode with certain replication numbers.
> And I want to find out a series of HDFS API which can meet the following
> requirement:
>
>
> 1. Given a specific filename and related information that already
> uploaded into the HDFS, retrieve: how many blocks are there,
>
> each datanode contain which blocks, etc.

Not all of this is available if you're using only the simple public APIs.
FileSystem#getFileBlockLocations will tell you which hosts are carrying the
blocks of a file (a list of hosts for each block in the file). See
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path,%20long,%20long)

For the list of block IDs, you'd have to pull from a DFSClient instance,
which calls the (NameNode-side) ClientProtocol's getBlockLocations(…)
method. See the interface at
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java?view=markup

> 2. Given a specific filename, source datanode, specific block id,
> destination datanode and related information to transfer the block
>
> from source node to destination node.

This needs to be done via the DataTransferProtocol, specifically its
replaceBlock(…) method. See the interface at
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/DataTransferProtocol.java?view=markup

> I've read several materials and API reference about HDFS and can't find
> appropriate ones, the objective of this mail is to make sure
> if such API existed, and if so, what are they (especially the second one,
> transfer a specific block of a specific file from a certain datanode to
> another)
>
> There is a tool called Balancer already existed in HDFS package, I am reading
> the source code, but it's too intricate to track the line, can anyone help me?

In the Balancer sources, see the final replaceBlock(…) call made at L376 at
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java?view=markup,
and then trace backwards from there to see how it's built up to that point.

Feel free to send across any more questions you have!

--
Harsh J
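
P.S. On question 1: since getFileBlockLocations gives a list of hosts per block, the "which datanode holds which blocks" view is just that list inverted. Below is a minimal sketch of the inversion in plain Java, with no Hadoop dependency. The class name BlocksByHost, the invert helper, and the host names are all made up for illustration; in real code each row of the input would come from BlockLocation#getHosts(). Note that the numbers are block positions within the file (0, 1, 2, …), not HDFS block IDs, which the public API does not expose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hadoop: turns "hosts per block"
// into "block positions per host".
final class BlocksByHost {

    // hostsPerBlock[i] lists the hosts holding the i-th block of the file.
    // Assumption: with Hadoop, row i would be BlockLocation#getHosts()
    // for the i-th entry returned by FileSystem#getFileBlockLocations.
    static Map<String, List<Integer>> invert(String[][] hostsPerBlock) {
        // TreeMap keeps the hosts sorted by name for stable output.
        Map<String, List<Integer>> blocksByHost = new TreeMap<>();
        for (int block = 0; block < hostsPerBlock.length; block++) {
            for (String host : hostsPerBlock[block]) {
                blocksByHost.computeIfAbsent(host, h -> new ArrayList<>()).add(block);
            }
        }
        return blocksByHost;
    }

    public static void main(String[] args) {
        // Made-up sample: a three-block file with replication factor 2.
        String[][] sample = {
            {"dn1", "dn2"},
            {"dn2", "dn3"},
            {"dn1", "dn3"},
        };
        System.out.println("blocks: " + sample.length);
        invert(sample).forEach((host, blocks) ->
                System.out.println(host + " -> " + blocks));
    }
}
```

The number of blocks is simply the length of the outer array, and each map entry answers "which blocks does this datanode contain" for one host.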