Hi Nick,

I think listStatus(Path) is really what I want.
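A minimal sketch of that approach (the class name DfsWalker and the starting path are illustrative assumptions, not anything from this thread): it walks the tree recursively with listStatus(Path) and prints the contents of any .dat file it finds through FileSystem.open().

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsWalker {

    // Recursively visit every entry under 'dir', descending into
    // subdirectories and dumping any .dat file that is found.
    static void walk(FileSystem fs, Path dir) throws IOException {
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isDir()) {
                walk(fs, status.getPath());
            } else if (status.getPath().getName().endsWith(".dat")) {
                System.out.println("Found: " + status.getPath());
                dump(fs, status.getPath());
            }
        }
    }

    // Open the file through the FileSystem and print it line by line.
    static void dump(FileSystem fs, Path file) throws IOException {
        FSDataInputStream in = fs.open(file);
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration(); // loads config files found on the classpath
        FileSystem fs = FileSystem.get(conf);
        walk(fs, new Path("/"));                  // start at the DFS root
    }
}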
Meanwhile, I also asked how to set the Configuration object when constructing the FileSystem object. As I understand it, in order to make Hadoop client programs run (like the ./hadoop fs -ls / command), the Hadoop configuration files, e.g. hadoop-default.xml and hadoop-site.xml, must be parsed to obtain the NameNode and DataNode information. So, if I'd like to run the directory traversal class as a standalone Java application on a machine outside the Hadoop cluster, do I need to copy the Hadoop configuration files to the client side and load them at runtime? (A configuration sketch is included after the quoted messages below.)

BR/anderson

-----Original Message-----
From: Nick Cen [mailto:cenyo...@gmail.com]
Sent: Tuesday, June 16, 2009 1:19 PM
To: core-user@hadoop.apache.org
Subject: Re: How to use DFS API to travel across the directory tree and retrieve content of a DFS file?

I think you can take a look at the following classes: FileSystem, Path, FileStatus, *and the listStatus(Path path)* method in FileSystem.

2009/6/16 Wenrui Guo <wenrui....@ericsson.com>

> Hi, all
>
> As I know, hadoop fs -ls / lists the files and directories under the root
> directory, so I am wondering how I could write a Java program to
> traverse the whole DFS directory structure.
>
> That is, suppose the directory structure currently looks like the following:
>
> /
> |
> +----home
>      |
>      +----anderson
>           |
>           +----samples.dat
>
> Is it possible to write a Java program that starts at the / directory,
> lists each subdirectory, and detects when it reaches a .dat file?
>
> Afterwards, how could I obtain the content of samples.dat? So far,
> I know the starting point is constructing a Configuration object;
> however, what information needs to be included in the Configuration
> object? Shall I specify hadoop-default.xml and hadoop-site.xml inside it?
>
> I'd appreciate it if a simple sample program could be provided.
>
> BR/anderson
>

--
http://daily.appspot.com/food/
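For the standalone-client question above, a minimal sketch of constructing the Configuration by hand; the local file path and the NameNode address hdfs://namenode-host:9000 are placeholders, not values taken from this thread.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StandaloneClient {
    public static void main(String[] args) throws Exception {
        // Picks up hadoop-default.xml / hadoop-site.xml if they are on the classpath.
        Configuration conf = new Configuration();

        // Option 1: load a copy of the cluster's hadoop-site.xml at runtime
        // from an explicit local path (placeholder path, for illustration only).
        conf.addResource(new Path("/path/to/hadoop-site.xml"));

        // Option 2: skip the XML files and point at the NameNode directly.
        // "namenode-host:9000" is a placeholder for the real host and port.
        conf.set("fs.default.name", "hdfs://namenode-host:9000");

        // Connects to the DFS named by fs.default.name.
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to " + fs.getUri());
    }
}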