Haystack is described here: http://www.facebook.com/note.php?note_id=76191543919

Regards
Ian

---
Ian Holsman
AOL Inc
ian.hols...@teamaol.com
(703) 879-3128 / AIM:ianholsman

it's just a technicality

On Feb 2, 2011, at 7:28 PM, "Dhodapkar, Chinmay" <chinm...@qualcomm.com> wrote:

> Hello,
>
> I have been following this thread for some time now. I am very comfortable
> with the advantages of hdfs, but still have lingering questions about the
> usage of hdfs for general purpose storage (no mapreduce/hbase etc).
>
> Can somebody shed light on what the limitations are on the number of files
> that can be stored? Is it limited in any way by the namenode? The use case I
> am interested in is to store a very large number of relatively small files
> (1MB to 25MB).
>
> Interestingly, I saw a facebook presentation on how they use hbase/hdfs
> internally. They seem to store all metadata in hbase and the actual
> images/files/etc in something called “haystack” (why not use hdfs since they
> already have it?). Anybody know what “haystack” is?
>
> Thanks!
>
> Chinmay
>
> From: Jeff Hammerbacher [mailto:ham...@cloudera.com]
> Sent: Wednesday, February 02, 2011 3:31 PM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: HDFS without Hadoop: Why?
>
> Large block size wastes space for small file. The minimum file size is 1
> block.
> That's incorrect. If a file is smaller than the block size, it will only
> consume as much space as there is data in the file.
>
> There are no hardlinks, softlinks, or quotas.
> That's incorrect; there are quotas and softlinks.
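
To put rough numbers on the block-size point quoted above, here is a minimal sketch in Python. It assumes a 64 MB block size and a replication factor of 3; neither value is stated in this thread, and the function name is made up for illustration. It compares the "every file takes a whole block" model with the "a file only consumes as much space as it has data" behaviour described in the reply, and it ignores any per-block metadata overhead.

```python
MB = 1024 * 1024


def hdfs_footprint(file_size, block_size=64 * MB, replication=3):
    """Return (block_count, raw_bytes) for a single file.

    Hypothetical helper. Assumes the last (or only) block occupies only
    as many bytes as it actually holds, as described in the thread.
    block_size and replication are assumed values, not taken from the thread.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Integer ceiling division: number of blocks needed to hold the file.
    block_count = (file_size + block_size - 1) // block_size
    # Each byte of data is stored once per replica.
    raw_bytes = file_size * replication
    return block_count, raw_bytes


def whole_block_footprint(file_size, block_size=64 * MB, replication=3):
    """Raw bytes under the (incorrect) model where every block is padded to full size."""
    block_count = (file_size + block_size - 1) // block_size
    return block_count * block_size * replication


if __name__ == "__main__":
    for size_mb in (1, 25):
        blocks, raw = hdfs_footprint(size_mb * MB)
        padded = whole_block_footprint(size_mb * MB)
        print(f"{size_mb} MB file: {blocks} block(s), "
              f"{raw // MB} MB raw vs {padded // MB} MB if blocks were padded")
```

Under these assumptions, a 1 MB file occupies one block entry but only about 3 MB of raw disk across its replicas, not 192 MB. The number of blocks still matters for the namenode question above, since each file and block is tracked there regardless of how small it is.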