Yuduo,

Before you possibly end up duplicating work already done to improve
co-located client reads from DNs, I'd suggest looking at the JIRAs
https://issues.apache.org/jira/browse/HDFS-2246 and
https://issues.apache.org/jira/browse/HDFS-347

Regarding your last requirement, getting the path to the block
files: there's no public API available for that yet. At the moment
that info is held by the DataNode alone, and the DN does not expose
it directly (a client instead opens a transceiver and the DN does
the read work itself).
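
If block IDs (rather than on-disk paths) would do, fsck can also dump
each file's blocks along with their lengths and DN locations:

  hadoop fsck /path/to/file -files -blocks -locations

Its output is meant for humans rather than programs, though, so parse
it with care.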

On Tue, Oct 18, 2011 at 8:35 AM, Yuduo <yuduoz...@gmail.com> wrote:
> Thanks, Uma! I'll try to figure it out based on your directions.
>
> Best,
> Yuduo
> On 10/17/2011 10:51 PM, Uma Maheswara Rao G 72686 wrote:
>>
>> ----- Original Message -----
>> From: Yuduo Zhou<yuduoz...@gmail.com>
>> Date: Tuesday, October 18, 2011 6:30 am
>> Subject: About block name and location.
>> To: hdfs-user@hadoop.apache.org
>>
>>> Hi all,
>>>
>>> I'm a rookie to HDFS. Here is just a quick question: suppose I have
>>> a big file stored in HDFS, is there any way to generate a file
>>> containing all the information about the blocks belonging to this
>>> file? For example, a list of records in the format "block_id,
>>> length, offset, hosts[], local/path/to/this/block"?
>>>
>> FileSystem#getFileStatus(Path f) will give you some of the information.
>> A FileStatus contains the following fields:
>>
>> Path path;
>> long length;
>> boolean isdir;
>> short block_replication;
>> long blocksize;
>> long modification_time;
>> long access_time;
>> FsPermission permission;
>> String owner;
>> String group;
>> Path symlink;
>>
>> And to get the block locations and offsets you can use
>> FileSystem#getFileBlockLocations
>>
>> If you want it in exactly your format, I would suggest you write a
>> small wrapper in your app and format the output using the above APIs.
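>>
>> For example, a rough sketch (untested; the class name is just for
>> illustration):
>>
>> import java.util.Arrays;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.*;
>>
>> public class BlockList {
>>   public static void main(String[] args) throws Exception {
>>     FileSystem fs = FileSystem.get(new Configuration());
>>     FileStatus status = fs.getFileStatus(new Path(args[0]));
>>     // One BlockLocation per block: its offset in the file, its
>>     // length, and the hosts currently holding a replica. Block IDs
>>     // and on-disk paths are not exposed here.
>>     for (BlockLocation blk :
>>         fs.getFileBlockLocations(status, 0, status.getLen())) {
>>       System.out.println(blk.getOffset() + ", " + blk.getLength()
>>           + ", " + Arrays.toString(blk.getHosts()));
>>     }
>>   }
>> }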
>>
>>> The purpose is to enable programs to access only blocks on the same
>>> node, to take advantage of block locality.
>>>
>> Hadoop already supports this; for example, MapReduce uses these block
>> locations to schedule tasks on the nodes that hold the data.
>>>
>>> I can retrieve most of the information using getFileBlockLocations(),
>>> but I couldn't find how to get the block's local path.
>>>
>> AFAIK, local files are written as just normal files, so Hadoop will
>> not split local files into blocks. It does that only in the DFS case.
>>>
>>> Thanks,
>>> Yuduo
>
>



-- 
Harsh J
