Re: HDFS without Hadoop: Why?

Ian Holsman Wed, 02 Feb 2011 16:39:06 -0800

Haystack is described here
http://www.facebook.com/note.php?note_id=76191543919


Regards
Ian


--- 
Ian Holsman
AOL Inc
ian.hols...@teamaol.com
(703) 879-3128 / AIM:ianholsman 

it's just a technicality

On Feb 2, 2011, at 7:28 PM, "Dhodapkar, Chinmay" <chinm...@qualcomm.com> wrote:

> Hello,
>
> I have been following this thread for some time now. I am very comfortable 
> with the advantages of hdfs, but still have lingering questions about the 
> usage of hdfs for general purpose storage (no mapreduce/hbase etc).
>
> Can somebody shed light on the limits on the number of files that can be 
> stored? Is it limited in any way by the namenode? The use case I am 
> interested in is storing a very large number of relatively small files 
> (1MB to 25MB).
>
> Interestingly, I saw a Facebook presentation on how they use hbase/hdfs 
> internally. They seem to store all metadata in hbase and the actual 
> images/files/etc in something called “haystack” (why not use hdfs since they 
> already have it?). Does anybody know what “haystack” is?
>
> Thanks!
>
> Chinmay
>
> From: Jeff Hammerbacher [mailto:ham...@cloudera.com] 
> Sent: Wednesday, February 02, 2011 3:31 PM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: HDFS without Hadoop: Why?
>
> > Large block size wastes space for small files. The minimum file size is 1 
> > block.
>
> That's incorrect. If a file is smaller than the block size, it will only 
> consume as much space as there is data in the file.
>
> > There are no hardlinks, softlinks, or quotas.
>
> That's incorrect; there are quotas and softlinks.
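
For the small-files use case asked about in the quoted message, a rough back-of-the-envelope sketch in Python may help. It assumes a 64 MB block size, a replication factor of 3, and the commonly quoted rule of thumb of about 150 bytes of namenode heap per namespace object (file or block). None of these numbers come from this thread, and the function names are made up for illustration, so adjust everything to your own cluster.

```python
# Back-of-the-envelope HDFS sizing for many small files.
# All constants are assumptions, not measurements from a real cluster.

MB = 1024 * 1024

BLOCK_SIZE = 64 * MB      # assumed block size
REPLICATION = 3           # assumed replication factor
BYTES_PER_OBJECT = 150    # assumed namenode heap per file or block object


def blocks_for(file_size, block_size=BLOCK_SIZE):
    """Number of blocks a file of file_size bytes is split into."""
    return (file_size + block_size - 1) // block_size  # ceiling division


def raw_disk_bytes(file_size, replication=REPLICATION):
    """Datanode disk used by one file: actual data size times replication.

    A file smaller than the block size does not consume a whole block.
    """
    return file_size * replication


def namenode_heap_bytes(num_files, file_size,
                        block_size=BLOCK_SIZE,
                        bytes_per_object=BYTES_PER_OBJECT):
    """Rough namenode heap estimate: one object per file plus one per block."""
    objects = num_files * (1 + blocks_for(file_size, block_size))
    return objects * bytes_per_object


if __name__ == "__main__":
    size = 10 * MB
    print("blocks per file:", blocks_for(size))
    print("disk per file (MB):", raw_disk_bytes(size) // MB)
    print("heap for 100M files (bytes):",
          namenode_heap_bytes(100_000_000, size))
```

Under these assumptions a 10 MB file occupies one block and 30 MB of raw disk, while 100 million such files would need roughly 30 GB of namenode heap. In this estimate the practical limit on file count is namenode memory rather than datanode disk.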
