On Feb 2, 2011, at 6:42 PM, Konstantin Shvachko wrote:

> Thanks for the link Stu.
> More details on the limitations are here:
> http://www.usenix.org/publications/login/2010-04/openpdfs/shvachko.pdf
> 
> I think that Nathan raised an interesting question, and his assessment of HDFS
> use cases is generally right.
> Some assumptions, though, are outdated at this point,
> and people have mentioned this in the thread.
> We have an append implementation, which allows reopening files for updates.
> We also have symbolic links and quotas (space and name-space).
> The API to HDFS is not POSIX, true. But in addition to FUSE, people also use
> Thrift to access HDFS.
> Most of these features are explained in HDFS overview paper:
> http://storageconference.org/2010/Papers/MSST/Shvachko.pdf
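For anyone wanting standalone access without MapReduce, the plain Java FileSystem
client is another option besides FUSE and Thrift. A minimal sketch (the namenode
URI and file path below are made up for illustration):

import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsCat {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hypothetical namenode address; substitute your own
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
        InputStream in = fs.open(new Path("/data/example.dat")); // hypothetical path
        try {
            IOUtils.copyBytes(in, System.out, 4096, false); // stream the file to stdout
        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }
    }
}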
> 
> Stand-alone HDFS is actually used in several places. I like what
> Brian Bockelman at the University of Nebraska does. They store CERN data
> in their cluster, and physicists use Fortran, not map-reduce, to access
> the data, as I heard.
> http://storageconference.org/2010/Presentations/MSST/3.Bockelman.pdf
This doesn't seem to mention what storage they're using.
> 
> With respect to other distributed file systems: HDFS performance was compared
> to PVFS, GPFS, and Lustre, and the results were in favor of HDFS. See e.g.
PVFS
> http://www.cs.cmu.edu/~wtantisi/files/hadooppvfs-pdl08.pdf
> 

Some other references for those interested, HDFS vs.:
GPFS: "Cloud analytics: Do we really need to reinvent the storage stack?"
Lustre: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf
Ceph: www.usenix.org—maltzahn.pdf

These GPFS and Lustre papers were both favorable toward HDFS because they
missed a fundamental issue: for those file systems, network speed is critical.
HDFS (ideally) doesn't touch the network on reads, so it is immune to network
speed but also cannot take advantage of it. For slow networks (1 GigE) this
plays to HDFS's strength, but for fast networks (10 GigE, InfiniBand) the
balance tips the other way. (In my testing on a heavily loaded network, Lustre
reads were 3-4x faster. For writes the difference is even more extreme, around
10x, since HDFS has to hop all write data over the network twice.)
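(Back-of-envelope, assuming the default 3x replication with the writer running
on a datanode: the first replica lands on local disk, and the pipeline then
pushes the block over the wire to two more datanodes, so every byte written
crosses the network twice. Writing 1 TB therefore moves about 2 TB of traffic;
at roughly 110 MB/s of usable 1 GigE bandwidth that is on the order of 5 hours
of wire time, versus about 30 minutes on 10 GigE -- hence the asymmetry above.)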

Let me say clearly that your choice of FS should depend on which of many factors
are most important to you -- there is no "one size fits all", although that sadly
makes our decisions more complex. For those using Hadoop who put a high weight on
IO performance (as well as some of the other factors I listed in my original mail),
I suggest you at least think about spending money on a fast network and using a FS
that can utilize it.


> So I agree with Nathan that HDFS was designed and optimized as a storage layer
> for map-reduce type tasks, but it performs well as a general-purpose FS as well.
> 
> Thanks,
> --Konstantin
> 
> 
> 
> 
> On Wed, Feb 2, 2011 at 6:08 PM, Stuart Smith <stu24m...@yahoo.com> wrote:
> 
> This is the best coverage I've seen from a source that would know:
> 
> http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
> 
> One relevant quote:
> 
> To store 100 million files (referencing 200 million blocks), a name-node 
> should have at least 60 GB of RAM.
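(Rough arithmetic behind that figure, using the per-object estimates Dhruba gives
further down the thread: 100 million files at ~160 bytes each is ~16 GB, plus 200
million blocks at ~150 bytes each is ~30 GB, so roughly 46 GB of live namespace
objects; the rest of the 60 GB is presumably headroom for JVM and transient
overhead.)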
> 
> But, honestly, if you're just building out your cluster, you'll probably run 
> into a lot of other limits first: hard drive space, regionserver memory, the 
> infamous ulimit/xciever :), etc...
> 
> Take care,
>   -stu
> 
> --- On Wed, 2/2/11, Dhruba Borthakur <dhr...@gmail.com> wrote:
> 
> From: Dhruba Borthakur <dhr...@gmail.com>
> Subject: Re: HDFS without Hadoop: Why?
> To: hdfs-user@hadoop.apache.org
> Date: Wednesday, February 2, 2011, 9:00 PM
> 
> The Namenode uses around 160 bytes/file and 150 bytes/block in HDFS. This is 
> a very rough calculation.
> 
> dhruba
> 
> On Wed, Feb 2, 2011 at 5:11 PM, Dhodapkar, Chinmay <chinm...@qualcomm.com> 
> wrote:
> What you describe is pretty much my use case as well. Since I don't know how
> big the number of files could get, I am trying to figure out whether there is
> a theoretical design limitation in HDFS.
> 
>  
> From what I have read, the name node stores all file metadata in RAM. Assuming
> (in my case) that each file is smaller than the configured block size, there
> should be a very rough formula to calculate the max number of files that HDFS
> can serve based on the RAM configured on the name node?
> 
>  
> Can any of the implementers comment on this? Am I even thinking on the right 
> track…?
> 
>  
> Thanks Ian for the haystack link…very informative indeed.
> 
>  
> -Chinmay
> 
>  
>  
>  
> From: Stuart Smith [mailto:stu24m...@yahoo.com] 
> Sent: Wednesday, February 02, 2011 4:41 PM
> 
> 
> To: hdfs-user@hadoop.apache.org
> Subject: RE: HDFS without Hadoop: Why?
> 
>  
> Hello,
>    I'm actually using hbase/hadoop/hdfs for lots of small files (with a long 
> tail of larger files). Well, millions of small files - I don't know what you 
> mean by lots :) 
> 
> Facebook probably knows better, but what I do is:
> 
>   - store metadata in hbase
>   - store files smaller than 10 MB or so in hbase
>   - put larger files in an hdfs directory tree.
> 
> I started storing files of 64 MB (the chunk size) and smaller in hbase, but that
> caused issues with regionservers when running M/R jobs. This is related to the
> fact that I'm running a cobbled-together cluster & my region servers don't have
> that much memory. I would play with the size to see what works for you.
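A rough sketch of that routing logic, for anyone curious (the table name, column
families, HDFS path, and the 10 MB cutoff are all made up here, and the HBase
client API shown is the newer Connection/Table one; older releases used HTable
and Put.add instead):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class FileRouter {
    // cutoff below which the bytes live in an HBase cell rather than HDFS
    private static final long SMALL_FILE_LIMIT = 10L * 1024 * 1024;

    public static void store(String key, byte[] data) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("files"))) {
            Put put = new Put(Bytes.toBytes(key));
            put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("size"),
                          Bytes.toBytes(data.length));
            if (data.length <= SMALL_FILE_LIMIT) {
                // small file: keep the payload in the HBase row itself
                put.addColumn(Bytes.toBytes("content"), Bytes.toBytes("data"), data);
            } else {
                // large file: write it to HDFS and record the path in the metadata row
                FileSystem fs = FileSystem.get(conf);
                Path path = new Path("/blobs/" + key);
                try (FSDataOutputStream out = fs.create(path)) {
                    out.write(data);
                }
                put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("hdfs_path"),
                              Bytes.toBytes(path.toString()));
            }
            table.put(put);
        }
    }
}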
> 
> Take care, 
>    -stu
> 
> --- On Wed, 2/2/11, Dhodapkar, Chinmay <chinm...@qualcomm.com> wrote:
> 
> 
> From: Dhodapkar, Chinmay <chinm...@qualcomm.com>
> Subject: RE: HDFS without Hadoop: Why?
> To: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
> Date: Wednesday, February 2, 2011, 7:28 PM
> 
> Hello,
> 
>  
> I have been following this thread for some time now. I am very comfortable with
> the advantages of HDFS, but I still have lingering questions about the use of
> HDFS for general-purpose storage (no MapReduce/HBase, etc.).
> 
>  
> Can somebody shed light on the limitations on the number of files that can be
> stored? Is it limited in any way by the namenode? The use case I am interested
> in is storing a very large number of relatively small files (1 MB to 25 MB).
> 
>  
> Interestingly, I saw a Facebook presentation on how they use hbase/hdfs
> internally. They seem to store all metadata in hbase and the actual
> images/files/etc. in something called “haystack” (why not use hdfs, since they
> already have it?). Does anybody know what “haystack” is?
> 
>  
> Thanks!
> 
> Chinmay
> 
>  
>  
>  
> From: Jeff Hammerbacher [mailto:ham...@cloudera.com] 
> Sent: Wednesday, February 02, 2011 3:31 PM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: HDFS without Hadoop: Why?
> 
>  
> "Large block size wastes space for small files. The minimum file size is 1 block."
> That's incorrect. If a file is smaller than the block size, it will only consume
> as much space as there is data in the file.
> 
> "There are no hardlinks, softlinks, or quotas."
> That's incorrect; there are quotas and softlinks.
> 
>  
> 
> 
> 
> -- 
> Connect to me at http://www.facebook.com/dhruba
> 
> 
