Hi Bjoern,

To give you an example of how this may be done, HUE, under the covers, pipes your data to 'bin/hadoop fs -Dhadoop.job.ugi=user,group put - path'. (That's from memory, but it's approximately right; the full Python code is at http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692 )
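A minimal sketch of doing the same thing yourself from Python (untested; the function name, the user/group defaults, and the path to the hadoop binary below are placeholders for illustration, not HUE's actual code):

    import subprocess

    def write_to_hdfs(data, hdfs_path, user='hue', group='supergroup'):
        """Stream bytes into HDFS by piping them to the hadoop CLI."""
        cmd = ['bin/hadoop', 'fs',
               '-Dhadoop.job.ugi=%s,%s' % (user, group),
               '-put', '-', hdfs_path]  # '-' tells 'put' to read from stdin
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        proc.communicate(data)
        if proc.returncode != 0:
            raise IOError('hadoop fs -put exited with %d' % proc.returncode)

The nice property of this approach is that the data streams straight from your web process into HDFS, so you never have to stage the whole file on the web server's local disk.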
Cheers,

-- Philip

On Mon, Aug 9, 2010 at 9:18 AM, Bjoern Schiessle <bjo...@schiessle.org> wrote:

> Hi all,
>
> I develop a web application with Django (Python) which should access an
> HBase database and store large files to HDFS.
>
> I wonder what is the best way to write files to HDFS from my Django app?
> Basically I thought of two ways, but maybe you know a better option:
>
> 1. First store the file on the local file system and then move it to HDFS
> via the Thrift interface. (Downside: the web application server always
> needs enough free space for the file.)
>
> 2. Use hdfs-fuse to mount the HDFS file system and write the file
> directly to HDFS. (Downside: I don't know how well hdfs-fuse is
> supported, and I'm not sure it is a good idea to mount the file system
> and run large operations on it.)
>
> Since I'm new to HDFS and Hadoop in general, I'm not sure which way is
> the best and least error-prone.
>
> What would be your recommendation?
>
> Thanks a lot!
> Björn