Hi Nick,

I think listStatus(Path) is really what I want.
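A minimal sketch of that approach (the class name DfsWalker and the starting path are illustrative assumptions, not anything from this thread): it walks the tree recursively with listStatus(Path) and prints the contents of any .dat file it finds through FileSystem.open().

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsWalker {

    // Recursively visit every entry under 'dir', descending into
    // subdirectories and dumping any .dat file that is found.
    static void walk(FileSystem fs, Path dir) throws IOException {
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isDir()) {
                walk(fs, status.getPath());
            } else if (status.getPath().getName().endsWith(".dat")) {
                System.out.println("Found: " + status.getPath());
                dump(fs, status.getPath());
            }
        }
    }

    // Open the file through the FileSystem and print it line by line.
    static void dump(FileSystem fs, Path file) throws IOException {
        FSDataInputStream in = fs.open(file);
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration(); // loads config files found on the classpath
        FileSystem fs = FileSystem.get(conf);
        walk(fs, new Path("/"));                  // start at the DFS root
    }
}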
Meanwhile, I also asked how to set the Configuration object when constructing the FileSystem object. As I understand it, in order to make Hadoop client programs run (like the ./hadoop fs -ls / command), the Hadoop configuration files, e.g. hadoop-default.xml and hadoop-site.xml, must be parsed to obtain the NameNode and DataNode information. So, if I'd like to run the directory traversal class as a standalone Java application on a machine outside the Hadoop cluster, do I need to copy the Hadoop configuration files to the client side and load them at runtime? (A configuration sketch is included after the quoted messages below.)

BR/anderson

-----Original Message-----
From: Nick Cen [mailto:cenyo...@gmail.com]
Sent: Tuesday, June 16, 2009 1:19 PM
To: core-user@hadoop.apache.org
Subject: Re: How to use DFS API to travel across the directory tree and retrieve content of a DFS file?

I think you can take a look at the following classes: FileSystem, Path, FileStatus, *and the listStatus(Path path)* method in FileSystem.

2009/6/16 Wenrui Guo <wenrui....@ericsson.com>

> Hi, all
>
> As I know, hadoop fs -ls / lists the files and directories under the root
> directory, so I am wondering how I could write a Java program to
> traverse the whole DFS directory structure.
>
> That is, suppose the directory structure currently looks like the following:
>
> /
> |
> +----home
>      |
>      +----anderson
>           |
>           +----samples.dat
>
> Is it possible to write a Java program that starts at the / directory,
> lists each subdirectory, and detects when it reaches a .dat file?
>
> Afterwards, how could I obtain the content of samples.dat? So far,
> I know the starting point is constructing a Configuration object;
> however, what information needs to be included in the Configuration
> object? Shall I specify hadoop-default.xml and hadoop-site.xml inside it?
>
> I'd appreciate it if a simple sample program could be provided.
>
> BR/anderson
>

--
http://daily.appspot.com/food/
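For the standalone-client question above, a minimal sketch of constructing the Configuration by hand; the local file path and the NameNode address hdfs://namenode-host:9000 are placeholders, not values taken from this thread.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StandaloneClient {
    public static void main(String[] args) throws Exception {
        // Picks up hadoop-default.xml / hadoop-site.xml if they are on the classpath.
        Configuration conf = new Configuration();

        // Option 1: load a copy of the cluster's hadoop-site.xml at runtime
        // from an explicit local path (placeholder path, for illustration only).
        conf.addResource(new Path("/path/to/hadoop-site.xml"));

        // Option 2: skip the XML files and point at the NameNode directly.
        // "namenode-host:9000" is a placeholder for the real host and port.
        conf.set("fs.default.name", "hdfs://namenode-host:9000");

        // Connects to the DFS named by fs.default.name.
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to " + fs.getUri());
    }
}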