Hey Björn,

You also mention that your app will be accessing data stored in HBase. There's
a Python client for the Avro HBase gateway at http://github.com/hammer/pyhbase.
If you try it out, let me know how it goes.
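A rough sketch of what using it could look like -- note that the class and
method names (HBaseConnection, put, get) and the gateway host/port below are
assumptions on my part, so double-check them against the repo:

    # Sketch only: assumes pyhbase exposes HBaseConnection with put/get,
    # and that the Avro gateway is listening on hadoopserver:9090.
    from pyhbase.connection import HBaseConnection

    conn = HBaseConnection('hadoopserver', 9090)  # host/port are placeholders
    conn.put('mytable', 'row1', 'family:qualifier', 'some value')  # write one cell
    print conn.get('mytable', 'row1')             # read the row back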
Thanks,
Jeff

On Wed, Aug 11, 2010 at 4:39 AM, Bjoern Schiessle <bjo...@schiessle.org> wrote:

> On Tue, 10 Aug 2010 09:39:17 -0700 Philip Zeyliger wrote:
> > On Tue, Aug 10, 2010 at 5:06 AM, Bjoern Schiessle <bjo...@schiessle.org> wrote:
> > > On Mon, 9 Aug 2010 16:35:07 -0700 Philip Zeyliger wrote:
> > > > To give you an example of how this may be done, HUE, under the
> > > > covers, pipes your data to 'bin/hadoop fs
> > > > -Dhadoop.job.ugi=user,group put - path'. (That's from memory, but
> > > > it's approximately right; the full python code is at
> > > > http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692 )
> > >
> > > Thank you! If I understand it correctly, this only works if my python
> > > app runs on the same server as hadoop, right?
> >
> > It works only if your python app has network connectivity to your
> > namenode. You can access an explicitly specified HDFS by passing
> > -Dfs.default.name=hdfs://<namenode>:<namenode_port>/ .
> > (The default is read from hadoop-site.xml (or perhaps hdfs-site.xml),
> > and, I think, defaults to file:///).
>
> Thank you. This sounds really good! I tried it, but I still have a problem.
>
> The namenode is defined in hadoop/conf/core-site.xml. On the namenode it
> looks like this:
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://hadoopserver:9000</value>
> </property>
>
> I have now copied the whole hadoop directory to the client where the
> python app runs.
>
> If I run "hadoop fs -ls /", I get a message that it can't connect to the
> server, and hadoop tries to connect again and again:
>
> 10/08/11 12:06:34 INFO ipc.Client: Retrying connect to server:
> hadoopserver/129.69.216.55:9000. Already tried 0 time(s).
> 10/08/11 12:06:35 INFO ipc.Client: Retrying connect to server:
> hadoopserver/129.69.216.55:9000. Already tried 1 time(s).
>
> From the client I can access the web interface of the namenode
> (hadoopserver:50070). "Browse the file system" links to
> http://pcmoholynagy:50070/nn_browsedfscontent.jsp, but if I click the
> link I get redirected to
> http://localhost:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F
> which of course can't be accessed from the client. If I replace "localhost"
> with "hadoopserver", it works.
>
> Maybe the wrong redirection also causes the problem when I call
> "bin/hadoop fs -ls /"?
>
> I have tried to find something by reading the documentation and by
> googling, but I couldn't find a solution.
>
> Any ideas?
>
> Thanks!
> Björn
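P.S. In case it helps while you debug the connection issue: the pipe-to-'hadoop
fs' approach Philip describes above looks roughly like this from Python. This
is a sketch, not the actual HUE code, and the namenode host/port, user/group,
and paths are placeholders for your own cluster:

    # Sketch: shell out to the hadoop CLI, overriding fs.default.name so the
    # client talks to an explicitly specified namenode instead of whatever is
    # in its local *-site.xml files.
    import subprocess

    NAMENODE = "hdfs://hadoopserver:9000/"   # placeholder: your namenode

    def hdfs_ls(path="/"):
        # Equivalent to: hadoop fs -Dfs.default.name=... -ls /
        subprocess.check_call(["hadoop", "fs",
                               "-Dfs.default.name=" + NAMENODE,
                               "-ls", path])

    def hdfs_put(data, dest_path, user="someuser", group="somegroup"):
        # Pipe data on stdin to "hadoop fs ... -put - <dest_path>", the same
        # trick HUE's hadoopfs.py uses; hadoop.job.ugi sets the user/group.
        cmd = ["hadoop", "fs",
               "-Dfs.default.name=" + NAMENODE,
               "-Dhadoop.job.ugi=%s,%s" % (user, group),
               "-put", "-", dest_path]
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        proc.communicate(data)
        if proc.returncode != 0:
            raise IOError("hadoop fs -put exited with %d" % proc.returncode)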