----- Original Message -----
From: Yuduo Zhou <yuduoz...@gmail.com>
Date: Tuesday, October 18, 2011 6:30 am
Subject: About block name and location.
To: hdfs-user@hadoop.apache.org

> Hi all,
> 
> I'm a rookie to HDFS. Here is just a quick question, suppose I have 
> a big file stored in HDFS, is there any way to generate a file 
> containing all information about the blocks belonging to this file? 
> For example, a list of records in the format "block_id, length, 
> offset, hosts[], local/path/to/this/block"?
> 
FileSystem#getFileStatus(Path f) will give you some of this information. FileStatus 
exposes the following fields:

Path path;
long length;
boolean isdir;
short block_replication;
long blocksize;
long modification_time;
long access_time;
FsPermission permission;
String owner;
String group;
Path symlink;

And to get the block locations and offsets you can use 
FileSystem#getFileBlockLocations.

If you want the output in exactly your format, I would suggest writing a small 
wrapper in your app that formats the results of the above APIs.
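As a rough, untested sketch (the class name BlockInfoDump is just for 
illustration): note that the public BlockLocation API gives you the offset, 
length and hosts of each block, but not the block id or the datanode-local 
path.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockInfoDump {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path(args[0]);

        // File-level metadata: length, block size, replication, etc.
        FileStatus status = fs.getFileStatus(file);

        // Per-block metadata: offset, length, and the hosts holding replicas.
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());

        for (BlockLocation block : blocks) {
            StringBuilder hosts = new StringBuilder();
            for (String host : block.getHosts()) {
                hosts.append(host).append(' ');
            }
            System.out.println("offset=" + block.getOffset()
                + ", length=" + block.getLength()
                + ", hosts=[" + hosts.toString().trim() + "]");
        }
        fs.close();
    }
}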

> The purpose is to enable programs to only access blocks on the same 
> node, to utilize block locality.
> 
Hadoop already supports this: the MapReduce framework schedules map tasks on 
nodes that hold a replica of the input block whenever possible.
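For example, a map task can see which hosts held its split's data; here is a 
sketch against the new mapreduce API (the class name LocalityMapper is mine, 
and this assumes a FileInputFormat-based job):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class LocalityMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // The split's preferred locations come from getFileBlockLocations();
        // the scheduler tries to run this task on one of these hosts.
        FileSplit split = (FileSplit) context.getInputSplit();
        for (String host : split.getLocations()) {
            System.err.println("split hosted on: " + host);
        }
    }
}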
> I can retrieve most information using getFileBlockLocations() but I 
> couldn't find how to get the local path.
> 
AFAIK, local files are written as just normal files, so Hadoop will not 
split local files into blocks. It does that only in the DFS case.
> Thanks,
> Yuduo
