RE: Storing millions of small files

2012-05-23 Thread Brendan cheng
might make sense to process many small files per mapper.
>
> On Tue, May 22, 2012 at 2:39 AM, Brendan cheng
> mailto:ccp...@hotmail.com>> wrote:
>
> Hi,
> I read HDFS architecture doc and it said HDFS is tuned for at storing
> large file, typically gigabyte

Storing millions of small files

2012-05-22 Thread Brendan cheng
Hi, I read the HDFS architecture doc, and it says HDFS is tuned for storing large files, typically gigabytes to terabytes. What is the downside of storing millions of small files, e.g. <10 MB? Or what HDFS settings are suitable for storing small files? Actually, I plan to find a distributed file syst
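
To put a rough number on the small-files question, here is a minimal sketch. It assumes the commonly quoted rule of thumb that each file and each block costs on the order of 150 bytes of NameNode memory; that figure is an approximation and is not taken from this thread. It also assumes each file under 10 MB fits in a single block. The function name and constant are hypothetical.

```python
# Rule-of-thumb assumption: roughly 150 bytes of NameNode heap per
# namespace object (one object per file, one per block). Approximate only.
BYTES_PER_OBJECT = 150


def namenode_heap_estimate(num_files, blocks_per_file=1):
    """Return an approximate NameNode memory footprint in bytes."""
    # Each file contributes one file object plus one object per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# Example: ten million small files, one block each.
estimate = namenode_heap_estimate(10_000_000)
print(f"{estimate / 1e9:.1f} GB of NameNode heap (rough estimate)")
```

The cost scales with the number of files rather than their total size, which is why many small files are harder on HDFS than a few large ones holding the same data.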

RE: namenode directory disappear after machines restart

2012-05-22 Thread Brendan cheng
p/data for example, of course, a directory
> > durable in time.
> > So, you should change your dfs.name.dir and your dfs.data.dir variable in
> > your hdfs-site.xml.
> >
> > Regards
> >
> >
> > On 05/21/2012 11:21 PM, Brendan cheng wrote:
>
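
The advice quoted above, setting dfs.name.dir and dfs.data.dir in hdfs-site.xml, can be sketched as follows. This is only an illustration: the helper name and the two /var/lib/hadoop paths are hypothetical placeholders, since the directory recommended in the reply is truncated in this listing. The script renders the properties in Hadoop's configuration XML layout so the result can be reviewed before it is copied into hdfs-site.xml.

```python
from xml.sax.saxutils import escape


def hdfs_site_xml(properties):
    """Render a dict of property names and values as Hadoop-style configuration XML."""
    lines = ["<configuration>"]
    for name, value in properties.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


# Hypothetical directories outside /tmp; substitute paths that exist on
# your machine and are writable by the user running Hadoop.
config_xml = hdfs_site_xml({
    "dfs.name.dir": "/var/lib/hadoop/dfs/name",
    "dfs.data.dir": "/var/lib/hadoop/dfs/data",
})
print(config_xml)
```

Any location that survives a reboot should serve the same purpose; the point is only to move both directories out of a temporary directory.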

webhdfs is inaccessible

2012-05-21 Thread Brendan cheng
Hi, I followed the single-node setup and installed successfully. The HDFS web interface runs on port 50070, but I can't curl it:
curl -i -X PUT "http://192.168.56.101:50070/webhdfs/v1/user/brendan/22m.png?op=CREATE"
HTTP/1.1 405 HTTP method PUT is not supported by this URL
Content-Length: 0
Server: Jetty(6.1.26)
I
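
For reference, the same request can be built in Python. This is a minimal sketch that only constructs the request and does not send it, so it says nothing about why the server answered 405. The helper name is hypothetical; the host, port, and path are the ones from the curl command above. One commonly reported cause of this response is WebHDFS not being enabled on the NameNode (the dfs.webhdfs.enabled property in hdfs-site.xml), but the truncated message does not confirm that this applies here.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_webhdfs_request(host, port, hdfs_path, op, method="PUT", **params):
    """Build (but do not send) a WebHDFS REST request for the given operation."""
    query = urlencode({"op": op, **params})
    url = f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"
    return Request(url, method=method)


# Mirrors the curl call from the message above.
req = build_webhdfs_request(
    "192.168.56.101", 50070, "/user/brendan/22m.png", "CREATE"
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a cluster.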

namenode directory disappear after machines restart

2012-05-21 Thread Brendan cheng
Hi, I'm not sure if there is a setting to prevent the NameNode directory from being removed when the machine hosting the NameNode restarts. I found that, after successfully installing single-node pseudo-distributed Hadoop following your website, the name node dir /tmp/hadoop-brendan/dfs/name is removed if the machine reboo