Hi David, For what it's worth, you should be aware that you're calling internal APIs that have no guarantee of stability between versions. I can practically guarantee that your code will have to be modified for any HDFS upgrade you do. That's why these APIs are undocumented.
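For reference, here is a minimal sketch of the kind of supported route I have in mind, using only the public FileSystem API (the namenode URI and file path below are placeholders borrowed from the example later in this thread):

***************************************************************************
import java.net.URI;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlockLocations {
    public static void main(String[] args) throws Exception {
        // Placeholder namenode URI and file path; substitute your own.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://192.168.1.230:8020"), new Configuration());
        Path file = new Path("/user/hive/warehouse/sample_07/sample_07.csv");

        // One BlockLocation per block: its offset and length within the
        // file, plus the hosts carrying each replica.
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            System.out.println("offset=" + b.getOffset()
                    + " length=" + b.getLength()
                    + " hosts=" + Arrays.toString(b.getHosts()));
        }
        fs.close();
    }
}
***************************************************************************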
Perhaps you can explain what your high-level goal is here, and we can suggest a supported mechanism for achieving it.

-Todd

On Mon, Jan 9, 2012 at 9:56 AM, David Pavlis <david.pav...@javlin.eu> wrote:
> Hi Denny,
>
> Thanks a lot. I was able to make my code work.
>
> I am posting a small example below - in case somebody in the future has a
> similar need ;-) (it does not handle block replicas).
>
> David.
>
> ***************************************************************************
> import java.io.IOException;
> import java.net.InetSocketAddress;
> import java.net.Socket;
>
> import javax.net.SocketFactory;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.DFSClient;
> import org.apache.hadoop.hdfs.DFSClient.BlockReader;
> import org.apache.hadoop.hdfs.protocol.ClientProtocol;
> import org.apache.hadoop.hdfs.protocol.LocatedBlock;
> import org.apache.hadoop.hdfs.protocol.LocatedBlocks;
> import org.apache.hadoop.net.NetUtils;
>
> public class BlockReaderExample {
>
>     public static void main(String[] args) {
>         String filename = "/user/hive/warehouse/sample_07/sample_07.csv";
>         int DATANODE_PORT = 50010;
>         int NAMENODE_PORT = 8020;
>         String HOST_IP = "192.168.1.230";
>
>         byte[] buf = new byte[1000];
>
>         try {
>             // RPC proxy to the namenode (internal, unstable API)
>             ClientProtocol client = DFSClient.createNamenode(
>                     new InetSocketAddress(HOST_IP, NAMENODE_PORT),
>                     new Configuration());
>
>             // ask the namenode for the locations of every block of the file
>             LocatedBlocks located =
>                     client.getBlockLocations(filename, 0, Long.MAX_VALUE);
>
>             for (LocatedBlock block : located.getLocatedBlocks()) {
>                 // open a raw socket to the datanode serving this block
>                 Socket sock = SocketFactory.getDefault().createSocket();
>                 InetSocketAddress targetAddr =
>                         new InetSocketAddress(HOST_IP, DATANODE_PORT);
>                 NetUtils.connect(sock, targetAddr, 10000);
>                 sock.setSoTimeout(10000);
>
>                 // stream the whole block; the block token obtained from the
>                 // namenode authorizes the read
>                 BlockReader reader = BlockReader.newBlockReader(sock, filename,
>                         block.getBlock().getBlockId(),
>                         block.getBlockToken(),
>                         block.getBlock().getGenerationStamp(),
>                         0, block.getBlockSize(), 1000);
>
>                 // read() returns -1 at the end of the block; note that a
>                 // short read does not by itself mean end-of-block
>                 int length;
>                 while ((length = reader.read(buf, 0, 1000)) > 0) {
>                     // System.out.print(new String(buf, 0, length, "UTF-8"));
>                 }
>                 reader.close();
>                 sock.close();
>             }
>         } catch (IOException ex) {
>             ex.printStackTrace();
>         }
>     }
> }
> ***************************************************************************
>
> From: Denny Ye <denny...@gmail.com>
> Reply-To: <hdfs-user@hadoop.apache.org>
> Date: Mon, 9 Jan 2012 16:29:18 +0800
> To: <hdfs-user@hadoop.apache.org>
> Subject: Re: How-to use DFSClient's BlockReader from Java
>
> Hi David,
>
> Please refer to the method "DFSInputStream#blockSeekTo"; it has the same
> purpose as your code.
>
> ***************************************************************************
> LocatedBlock targetBlock = getBlockAt(target, true);
> assert (target==this.pos) : "Wrong postion " + pos + " expect " + target;
> long offsetIntoBlock = target - targetBlock.getStartOffset();
>
> DNAddrPair retval = chooseDataNode(targetBlock);
> chosenNode = retval.info;
> InetSocketAddress targetAddr = retval.addr;
>
> try {
>   s = socketFactory.createSocket();
>   NetUtils.connect(s, targetAddr, socketTimeout);
>   s.setSoTimeout(socketTimeout);
>   Block blk = targetBlock.getBlock();
>   Token<BlockTokenIdentifier> accessToken = targetBlock.getBlockToken();
>
>   blockReader = BlockReader.newBlockReader(s, src, blk.getBlockId(),
>       accessToken,
>       blk.getGenerationStamp(),
>       offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
>       buffersize, verifyChecksum, clientName);
> ***************************************************************************
>
> -Regards
> Denny Ye
>
> 2012/1/6 David Pavlis <david.pav...@javlin.eu>
>
> Hi,
>
> I am relatively new to Hadoop and I am trying to utilize HDFS for my own
> application, where I want to take advantage of the data partitioning HDFS
> performs.
>
> The idea is that I get the list of individual blocks - BlockLocations - of
> a particular file and then read those directly (going to the individual
> datanodes). So far I found org.apache.hadoop.hdfs.DFSClient.BlockReader to
> be the way to go.
>
> However I am struggling with instantiating the BlockReader class, namely
> creating the "Token<BlockTokenIdentifier>".
>
> Is there example Java code showing how to access the individual blocks of
> a particular file stored on HDFS?
>
> Thanks in advance,
>
> David.

--
Todd Lipcon
Software Engineer, Cloudera
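For readers who find this thread later: the block-by-block read above can also be done entirely through the public API, by seeking an ordinary FSDataInputStream to each block's offset; the client then chooses a datanode holding a replica and handles block tokens and checksums internally. A rough sketch, reusing the same placeholder host and path as the examples above:

***************************************************************************
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadBlockAligned {
    public static void main(String[] args) throws Exception {
        // Placeholder namenode URI and file path; substitute your own.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://192.168.1.230:8020"), new Configuration());
        Path file = new Path("/user/hive/warehouse/sample_07/sample_07.csv");
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());

        byte[] buf = new byte[64 * 1024];
        FSDataInputStream in = fs.open(file);
        for (BlockLocation b : blocks) {
            // Read exactly this block's byte range; block tokens and
            // checksum verification are handled by the client.
            in.seek(b.getOffset());
            long remaining = b.getLength();
            while (remaining > 0) {
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n < 0) break;
                remaining -= n;
                // process buf[0..n) here
            }
        }
        in.close();
        fs.close();
    }
}
***************************************************************************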