On Tue, 10 Aug 2010 09:39:17 -0700 Philip Zeyliger wrote:
> On Tue, Aug 10, 2010 at 5:06 AM, Bjoern Schiessle
> <bjo...@schiessle.org> wrote:
> > On Mon, 9 Aug 2010 16:35:07 -0700 Philip Zeyliger wrote:
> > > To give you an example of how this may be done, HUE, under the
> > > covers, pipes your data to 'bin/hadoop fs
> > > -Dhadoop.job.ugi=user,group put - path'. (That's from memory, but
> > > it's approximately right; the full Python code is at
> > > http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692 )
> >
> > Thank you! If I understand it correctly, this only works if my Python
> > app runs on the same server as Hadoop, right?
>
> It works only if your Python app has network connectivity to your
> namenode. You can access an explicitly specified HDFS by passing
> -Dfs.default.name=hdfs://<namenode>:<namenode_port>/
> (The default is read from hadoop-site.xml (or perhaps hdfs-site.xml),
> and, I think, defaults to file:///.)
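[Editor's note: for illustration, here is a minimal Python sketch of the subprocess pipe Philip describes. The hadoop binary path, user/group values, and target path are placeholders, not taken from HUE's actual code.]

```python
import subprocess

def build_put_command(dest_path, namenode_uri, user="hdfs",
                      group="supergroup", hadoop_bin="bin/hadoop"):
    """Build the 'hadoop fs ... put - <path>' argv described above.

    All defaults (binary path, user, group) are assumptions; adjust
    them to your installation.
    """
    return [hadoop_bin, "fs",
            "-Dhadoop.job.ugi=%s,%s" % (user, group),
            "-Dfs.default.name=%s" % namenode_uri,
            "-put", "-", dest_path]

def hdfs_put(data, dest_path, namenode_uri):
    """Pipe `data` (bytes) to stdin of 'hadoop fs -put -', so the
    client machine only needs network access to the namenode."""
    cmd = build_put_command(dest_path, namenode_uri)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    proc.communicate(data)
    return proc.returncode
```

Note that only the command-line flags come from the thread; whether your Hadoop version still honours hadoop.job.ugi depends on its security configuration.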
Thank you, this sounds really good! I tried it, but I still have a problem.

The namenode is defined in hadoop/conf/core-site.xml. On the namenode it looks like this:

  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoopserver:9000</value>
  </property>

I have now copied the whole Hadoop directory to the client where the Python app runs. If I run "hadoop fs -ls /" I get a message that it can't connect to the server, and Hadoop tries to connect again and again:

  10/08/11 12:06:34 INFO ipc.Client: Retrying connect to server: hadoopserver/129.69.216.55:9000. Already tried 0 time(s).
  10/08/11 12:06:35 INFO ipc.Client: Retrying connect to server: hadoopserver/129.69.216.55:9000. Already tried 1 time(s).

From the client I can access the web interface of the namenode (hadoopserver:50070). "Browse the file system" links to http://pcmoholynagy:50070/nn_browsedfscontent.jsp, but if I click the link I get redirected to http://localhost:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F which of course can't be reached from the client. If I replace "localhost" with "hadoopserver" it works. Maybe the wrong redirection also causes the problem when I call "bin/hadoop fs -ls /"?

I have tried to find something by reading the documentation and by googling, but I couldn't find a solution. Any ideas?

Thanks!
Björn
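[Editor's note: a quick way to narrow this down is to test from the client whether the namenode's IPC port is reachable at all, separately from Hadoop. A small sketch; the hostnames and ports are the ones from the thread, not verified:]

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True

# Run on the client:
# port_open("hadoopserver", 50070)  # namenode web UI (works per the thread)
# port_open("hadoopserver", 9000)   # namenode IPC (the failing connection)
```

If 50070 is reachable but 9000 is not, one common cause is the namenode binding port 9000 only to 127.0.0.1, e.g. because /etc/hosts on the server maps hadoopserver to 127.0.0.1; running netstat on the server and checking which address 9000 is listening on would confirm or rule that out. That could also explain the web UI redirecting to localhost.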