On Jun 14, 2009, at 11:01 PM, Sugandha Naolekar wrote:

Hello!

I have a 4-node cluster of Hadoop running. Now, there is a 5th machine which
is acting as a client of Hadoop. It's not a part of the Hadoop
cluster (master/slave config files). Now I have to write Java code that gets
executed on this client, which will simply put the client system's data into
HDFS (and get it replicated over 2 datanodes), and as per my requirement,
I can simply fetch it back on the client machine itself.

For this, I have done following things as of now::

***********************************
-> Among the 4 nodes, 2 are datanodes and the other 2 are the namenode and
jobtracker respectively.
***********************************

***********************************
-> Now, to make that code work on the client machine, I have designed a UI. Now
here on the client m/c, do I need to install hadoop?
***********************************

You will need to have the same version of Hadoop installed on any client that needs to communicate with the Hadoop cluster.


***********************************
-> I have installed hadoop on it, and in its config file I have specified
only 2 tags:
  1) fs.default.name -> value = namenode's address
  2) dfs.http.address -> value = namenode's address
***********************************

I'm assuming you mean that you have Hadoop installed on the client with a hadoop-site.xml (or core-site.xml) containing the correct fs.default.name. Correct?
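For reference, a minimal client-side hadoop-site.xml might look something like this (the host and port below are placeholders; use your namenode's actual address):

<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- placeholder: your namenode's host and RPC port -->
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>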


***********************************
Thus, if there is a file at /home/hadoop/test.java on the client machine, I
will have to get an instance of the HDFS filesystem via FileSystem.get. Right?
***********************************

Before you begin writing special FileSystem Java code, I would do a quick sanity check of the client configuration.

Can you run the command...

% bin/hadoop fs -ls

...without error?

Can you -put files onto HDFS from the client...

% bin/hadoop fs -put <src> <dst>

...without error?

* You should also check your firewall rules between the client and the NameNode.
* Make sure that the TCP port you specified in fs.default.name is open for connections from the client.
* Run "netstat -t -l" on the NameNode to make sure that it is running and listening on the TCP port you specified.

Only when you've ensured that the hadoop commandline works would I begin writing custom client code based on the FileSystem class.
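Once the command line works, a first FileSystem program can be very small. A minimal sketch (the listing path is just an example; the namenode URI is picked up from the hadoop-site.xml on the client's classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsLsCheck {
  public static void main(String[] args) throws Exception {
    // Reads fs.default.name from the config files on the classpath
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Java equivalent of "bin/hadoop fs -ls /user/hadoop"
    FileStatus[] listing = fs.listStatus(new Path("/user/hadoop"));
    if (listing != null) {
      for (FileStatus stat : listing) {
        System.out.println(stat.getPath());
      }
    }
  }
}

If this prints the same listing as the command line, the client configuration is good.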


***********************************
Then, by using FileSystem.util, I will have to simply specify the local fs as
src and HDFS as the destination, with the src path as /home/hadoop/test.java
and the destination as /user/hadoop/. Right??
So it should work ...!
***********************************
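Roughly, yes, though there is no "FileSystem.util" as such; the copy helper you want is FileSystem.copyFromLocalFile (there is also org.apache.hadoop.fs.FileUtil). A minimal sketch, with placeholder paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // src is resolved against the local filesystem,
    // dst against HDFS; both paths are examples
    fs.copyFromLocalFile(new Path("/home/hadoop/test.java"),
                         new Path("/user/hadoop/"));
  }
}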

***********************************
-> But, it gives me an error: "not able to find src path
/home/hadoop/test.java"

-> Will I have to use RPC classes and methods under the Hadoop API to do this?
***********************************

You should be able to just use the FileSystem class without needing any RPC classes. A likely cause of the "not able to find src path" error is that the source path was resolved against HDFS instead of the local filesystem; copyFromLocalFile (shown above) always treats its src argument as a local path.

FileSystem documentation:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/fs/FileSystem.html
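Fetching the file back to the client is the same idea in reverse, e.g. (paths are again placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GetFromHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Here src is resolved against HDFS and dst against the local disk
    fs.copyToLocalFile(new Path("/user/hadoop/test.java"),
                       new Path("/home/hadoop/test-copy.java"));
  }
}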



***********************************
Things don't seem to be working in any of these ways. Please help me out.
***********************************

Thanks!
