Thank you very much for your answers. I will probably go with searching for the blockID and then reading the block from the local file system directly. I need it for a specific purpose!
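The local-file-system route mentioned above can be sketched in plain Java with no Hadoop dependency: walk a DataNode data directory and look for the file named `blk_<blockId>`, which is how HDFS stores each finalized replica on disk. The data directory path in `main` is a placeholder; substitute your cluster's `dfs.datanode.data.dir`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class BlockFinder {
    // Search a DataNode data directory tree for the file backing a block ID.
    // Finalized replicas live under the configured dfs.datanode.data.dir,
    // typically at current/<block-pool>/current/finalized/subdir*/subdir*/blk_<id>.
    static Optional<Path> findBlockFile(Path dataDir, long blockId) throws IOException {
        String name = "blk_" + blockId;
        try (Stream<Path> files = Files.walk(dataDir)) {
            return files.filter(p -> p.getFileName().toString().equals(name))
                        .findFirst();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical data directory; adjust to your dfs.datanode.data.dir.
        Path dataDir = Paths.get("/data/hdfs/dn");
        if (Files.isDirectory(dataDir)) {
            findBlockFile(dataDir, 1073741825L)
                .ifPresent(p -> System.out.println("replica at " + p));
        } else {
            System.out.println("no such data dir: " + dataDir);
        }
    }
}
```

Each replica also has a companion `blk_<id>_<genstamp>.meta` checksum file beside it; matching on the `blk_<id>` prefix instead of exact equality would find both.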
Thank you very much for your answers!
- Thodoris

On Tue, 2018-04-24 at 05:54 +0000, Takanobu Asanuma wrote:
> In addition to others' comments, I think the fsck command below is the
> easiest way to find the block locations of the file.
>
> $ hdfs fsck /path/to/the/data -blocks -files -locations
>
> Thanks,
> - Takanobu
>
> -----Original Message-----
> From: Jim Clampffer [mailto:james.clampf...@gmail.com]
> Sent: Tuesday, April 24, 2018 10:42 AM
> To: Arpit Agarwal <aagar...@hortonworks.com>
> Cc: hdfs-dev@hadoop.apache.org
> Subject: Re: Read or Save specific blocks of a file
>
> If you want to read replicas from a specific DN after determining the
> block bounds via getFileBlockLocations, you could abuse the rack
> locality infrastructure by generating a dummy topology script so that
> the NN orders replicas such that the client tries to read from the DNs
> you prefer first. It is not going to guarantee a read from a specific
> DN, and it is a terrible idea in a multi-tenant/production cluster,
> but if you have a very specific goal in mind or want to learn more
> about the storage layer, it may be an interesting exercise.
>
> On Mon, Apr 23, 2018 at 9:14 PM, Arpit Agarwal <aagarwal@hortonworks.com> wrote:
> > Hi,
> >
> > Perhaps I missed something in the question.
> > FileSystem#getFileBlockLocations, followed by open, seek to the start
> > of the target block, and read. This will let you read the contents of
> > a specific block using public APIs.
> >
> > On 4/23/18, 5:26 PM, "Daniel Templeton" <dan...@cloudera.com> wrote:
> >
> >     I'm not aware of a way to work with blocks using the public APIs.
> >     The easiest way to do it is probably to retrieve the block IDs
> >     and then go grab those blocks from the DataNodes' local file
> >     systems directly.
> >
> >     Daniel
> >
> >     On 4/23/18 9:05 AM, Thodoris Zois wrote:
> >     > Hello list,
> >     >
> >     > I have a file on HDFS that is divided into 10 blocks
> >     > (partitions).
> >     >
> >     > Is there any way to retrieve data from a specific block?
> >     > (e.g.: using the blockID)
> >     >
> >     > Except that, is there any option to write the contents of each
> >     > block (or of one block) into separate files?
> >     >
> >     > Thank you very much,
> >     > Thodoris
> >     >
> >     > ---------------------------------------------------------------------
> >     > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >     > For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
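Arpit's open/seek/read approach, and the question about writing each block to its own file, can be sketched as below. This is a plain-Java illustration against a local file using RandomAccessFile; on an actual cluster you would open the path via Hadoop's FileSystem.open() and take each block's offset and length from the BlockLocation objects returned by getFileBlockLocations() rather than computing them from a fixed block size (dfs.blocksize, 128 MB by default). The `<file>.block<i>` output naming is invented for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BlockReader {
    // Byte offset where block index i starts, given the block size.
    static long blockStart(int i, long blockSize) {
        return i * blockSize;
    }

    // Number of bytes in block i; the last block is usually shorter.
    static long blockLength(int i, long blockSize, long fileLen) {
        long start = blockStart(i, blockSize);
        return Math.max(0, Math.min(blockSize, fileLen - start));
    }

    // Seek to the start of block i and read exactly its bytes.
    static byte[] readBlock(Path file, int i, long blockSize) throws IOException {
        long fileLen = Files.size(file);
        byte[] buf = new byte[(int) blockLength(i, blockSize, fileLen)];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(blockStart(i, blockSize));
            raf.readFully(buf);
        }
        return buf;
    }

    // Write each block's bytes to its own file: <file>.block0, <file>.block1, ...
    static void splitIntoBlockFiles(Path file, long blockSize) throws IOException {
        long fileLen = Files.size(file);
        int nBlocks = (int) ((fileLen + blockSize - 1) / blockSize);
        for (int i = 0; i < nBlocks; i++) {
            Files.write(Paths.get(file + ".block" + i), readBlock(file, i, blockSize));
        }
    }
}
```

Note that HDFS block boundaries are purely byte offsets; a record (e.g. a text line) can straddle two blocks, so a per-block dump like this may split records at the boundaries.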