Hi Brendan,
The number of files that can be stored in HDFS is limited by the size of the NameNode's RAM. The downside of storing small files is that you would saturate the NameNode's RAM with a small data set (the sum of the sizes of all your small files). That said, you can store around 100 million files (at least) with 60 GB of RAM on the NameNode.

The downside of having a large namespace is that the NameNode might take up to an hour to recover from failures, but you can overcome this by using an HA NameNode. Are you planning to store more than 100 million files?

Regards,
Wasif Riaz Malik

On Tue, May 22, 2012 at 11:39 AM, Brendan cheng <ccp...@hotmail.com> wrote:
>
> Hi,
> I read the HDFS architecture doc, and it said HDFS is tuned for storing
> large files, typically gigabytes to terabytes. What is the downside of
> storing millions of small files like <10MB? Or what HDFS settings are
> suitable for storing small files?
> Actually, I plan to find a distributed file system for storing multiple
> millions of files.
> Brendan
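P.S. If it helps, here is a rough back-of-the-envelope sketch of the NameNode memory math. It assumes the commonly cited figure of roughly 150 bytes of NameNode heap per namespace object (file, directory, or block); that constant is a rule of thumb, not an exact number for any particular Hadoop version:

```python
# Rough NameNode heap estimate for a small-file workload.
# Assumption: each namespace object (file or block) costs about
# 150 bytes of NameNode heap -- a widely quoted rule of thumb.

BYTES_PER_OBJECT = 150  # assumed average heap cost per object

def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Estimate NameNode heap needed for num_files files.

    Small files (<10 MB, i.e. below one HDFS block size) occupy one
    block each, so each file contributes one file object plus one
    block object to the namespace.
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT

# 100 million small files, one block each:
heap_gb = namenode_heap_bytes(100_000_000) / (1024 ** 3)
print(f"~{heap_gb:.1f} GB of NameNode heap")
```

By this estimate, 100 million single-block files need on the order of 28 GB of heap, which is why a 60 GB NameNode handles them with headroom. Note the data size of each file is irrelevant here; only the object count matters, which is exactly why many small files hurt.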