In addition to others' comments, I think the fsck command below is the easiest
way to find the block locations of a file.

$ hdfs fsck /path/to/the/data -blocks -files -locations
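If you want to script on top of that report, the block IDs can be pulled out of
the fsck output with a quick regex. A minimal sketch — note the sample line
below only imitates the general shape of fsck's per-block output; the pool ID,
addresses, and lengths are made up:

```python
import re

# Illustrative line in roughly the shape fsck prints per block
# (block pool ID, block ID, generation stamp, length, replica locations).
sample = ("0. BP-1234567890-10.0.0.1-1500000000000:blk_1073741825_1001 "
          "len=134217728 Live_repl=3 [10.0.0.2:9866, 10.0.0.3:9866]")

# HDFS block file names carry the numeric ID after "blk_".
block_ids = re.findall(r"blk_(\d+)_", sample)
print(block_ids)  # ['1073741825']
```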

Thanks,
- Takanobu

-----Original Message-----
From: Jim Clampffer [mailto:james.clampf...@gmail.com] 
Sent: Tuesday, April 24, 2018 10:42 AM
To: Arpit Agarwal <aagar...@hortonworks.com>
Cc: hdfs-dev@hadoop.apache.org
Subject: Re: Read or Save specific blocks of a file

If you want to read replicas from a specific DN after determining the block
bounds via getFileBlockLocations, you could abuse the rack locality
infrastructure: generate a dummy topology script so the NN orders replicas
such that the client tries to read from the DNs you prefer first.
This won't guarantee a read from a specific DN, and it's a terrible idea in a
multi-tenant/production cluster, but if you have a very specific goal in mind
or want to learn more about the storage layer it may be an interesting
exercise.
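For anyone curious what that hack looks like: Hadoop invokes the script
configured as net.topology.script.file.name with host addresses as arguments
and expects one rack path per line, in order. A minimal sketch, where the
preferred DN addresses and rack names are made up for illustration:

```python
import sys

# Hypothetical DataNode addresses we want the client steered toward.
PREFERRED = {"10.0.0.2", "10.0.0.3"}

def resolve(hosts):
    # One rack path per input host, in order. Preferred nodes go on a
    # rack "near" the client; everything else lands on a distant one,
    # so the NN sorts the preferred replicas first.
    return ["/near/rack0" if h in PREFERRED else "/far/rack1" for h in hosts]

if __name__ == "__main__":
    for rack in resolve(sys.argv[1:]):
        print(rack)
```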

On Mon, Apr 23, 2018 at 9:14 PM, Arpit Agarwal <aagar...@hortonworks.com>
wrote:

> Hi,
>
> Perhaps I missed something in the question. Call
> FileSystem#getFileBlockLocations, then open the file, seek to the start of
> the target block, and read. This lets you read the contents of a specific
> block using public APIs.
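A sketch of that recipe against an in-memory stream, where io.BytesIO stands
in for the FSDataInputStream returned by FileSystem#open and the hard-coded
offsets stand in for what getFileBlockLocations would return on a real file:

```python
import io

BLOCK_SIZE = 8  # stand-in for a real HDFS block size such as 128 MB

def read_block(stream, block_starts, file_len, index):
    # getFileBlockLocations gives each block's starting offset; open the
    # file, seek to the target block's start, and read exactly one block.
    start = block_starts[index]
    end = block_starts[index + 1] if index + 1 < len(block_starts) else file_len
    stream.seek(start)
    return stream.read(end - start)

data = b"0123456789ABCDEFGHIJ"                  # a 20-byte "file"
starts = list(range(0, len(data), BLOCK_SIZE))  # block offsets: [0, 8, 16]
print(read_block(io.BytesIO(data), starts, len(data), 1))  # b'89ABCDEF'
```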
>
>
>
> On 4/23/18, 5:26 PM, "Daniel Templeton" <dan...@cloudera.com> wrote:
>
>     I'm not aware of a way to work with blocks using the public APIs. The
>     easiest way to do it is probably to retrieve the block IDs and then go
>     grab those blocks from the data nodes' local file systems directly.
>
>     Daniel
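A sketch of that direct approach, assuming the finalized-replica directory
layout used by recent HDFS versions (data_dir here is whatever
dfs.datanode.data.dir points at; older releases nest the directories
differently, so treat the glob pattern as an assumption):

```python
import glob
import os

def find_block_files(data_dir, block_id):
    # On a DataNode, finalized replicas live under
    #   <dfs.datanode.data.dir>/current/BP-*/current/finalized/subdir*/subdir*/
    # as files named blk_<id> (alongside a blk_<id>_<genstamp>.meta
    # checksum file). Return every matching block file path.
    pattern = os.path.join(data_dir, "current", "BP-*", "current",
                           "finalized", "subdir*", "subdir*",
                           "blk_%d" % block_id)
    return glob.glob(pattern)
```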
>
>     On 4/23/18 9:05 AM, Thodoris Zois wrote:
>     > Hello list,
>     >
>     > I have a file on HDFS that is divided into 10 blocks (partitions).
>     >
>     > Is there any way to retrieve data from a specific block (e.g., using
>     > the block ID)?
>     >
>     > Except that, is there any option to write the contents of each block
>     > (or of one block) into separate files?
>     >
>     > Thank you very much,
>     > Thodoris
>     >
>     >
>     >
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>     > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>     >
>
>