On Mar 26, 2009, at 5:44 PM, Aaron Kimball wrote:

In general, Hadoop is unsuitable for the application you're suggesting.
Systems like FUSE HDFS do exist, though they're not widely used.

We use FUSE on a 270TB cluster to serve up physics data because the client (2.5M lines of C++) doesn't understand how to connect to HDFS directly.

Brian
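
For reference, "connecting to HDFS directly" means using Hadoop's Java
FileSystem API, roughly as in the minimal sketch below; the class name,
NameNode URI, and file path are made-up placeholders. A C++ client like
the one Brian describes has no such native path, hence the FUSE mount:

    // Minimal sketch of reading a file straight from HDFS with the Java
    // FileSystem API (2009-era "fs.default.name" config key). The class
    // name, NameNode URI, and file path are hypothetical placeholders.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsCat {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "hdfs://namenode:9000"); // hypothetical NameNode
            FileSystem fs = FileSystem.get(conf);
            FSDataInputStream in = fs.open(new Path("/data/sample.img")); // hypothetical path
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) > 0; ) {
                System.out.write(buf, 0, n); // stream the bytes to stdout
            }
            in.close();
            fs.close();
        }
    }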

I don't know of anyone trying to connect Hadoop with Apache httpd.

When you say that you have huge images, how big is "huge"? HDFS might be
a useful fit if the individual images are 1 GB or larger. But in general,
"huge" on Hadoop means tens of GBs up to TBs. If you have a large number
of moderately-sized files, you'll find that HDFS handles them very poorly:
the NameNode keeps every file's metadata in RAM, and the system is tuned
for streaming a few large files, not for random access to many small ones.

It sounds like glusterfs is a better fit for your needs.

- Aaron

On Thu, Mar 26, 2009 at 4:06 PM, phil cryer <p...@cryer.us> wrote:

This is somewhat of a noob question, I know, but after learning about
Hadoop, testing it in a small cluster, and running MapReduce jobs on
it, I'm still not sure whether Hadoop is the right distributed file system
to serve web requests.  In other words, can you (and should you) serve
images and data from HDFS, using something like FUSE to mount a
filesystem that Apache could serve the images from?  We have huge
images, thus the need for a distributed file system; they go in,
get stored with lots of metadata, and are kept redundant by Hadoop/HDFS -
but is it the right way to serve web content?

I looked at glusterfs before; they had an Apache module and a Lighttpd
module which made this simple.  Does HDFS have something like this? Do
people just use a FUSE option as I described, or is this not a good use
of Hadoop?

Thanks

P
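
As an illustration of the non-FUSE route Phil is asking about, one could
put a thin HTTP layer directly in front of HDFS instead of mounting it
for Apache. The sketch below uses the JDK 6 built-in HttpServer together
with Hadoop's FileSystem API; the port, the /images/ context, the
NameNode URI, and the one-to-one mapping of URL paths onto HDFS paths
are all assumptions for illustration, not a claim that this is how the
list members did it:

    // Rough sketch: streaming images out of HDFS over HTTP without FUSE,
    // using the JDK 6 built-in com.sun.net.httpserver classes plus Hadoop's
    // FileSystem API. Port, context path, NameNode URI, and the assumption
    // that the request path maps one-to-one onto an HDFS path are made up.
    import com.sun.net.httpserver.HttpExchange;
    import com.sun.net.httpserver.HttpHandler;
    import com.sun.net.httpserver.HttpServer;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsImageServer {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "hdfs://namenode:9000"); // hypothetical
            final FileSystem fs = FileSystem.get(conf);

            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/images/", new HttpHandler() {
                public void handle(HttpExchange ex) throws IOException {
                    // Treat the URL path as the HDFS path, e.g. /images/foo.jpg
                    Path p = new Path(ex.getRequestURI().getPath());
                    if (!fs.exists(p)) {
                        ex.sendResponseHeaders(404, -1); // -1 means "no body"
                        return;
                    }
                    ex.sendResponseHeaders(200, fs.getFileStatus(p).getLen());
                    InputStream in = fs.open(p);
                    OutputStream out = ex.getResponseBody();
                    byte[] buf = new byte[8192];
                    for (int n; (n = in.read(buf)) > 0; ) {
                        out.write(buf, 0, n); // copy HDFS bytes to the HTTP response
                    }
                    in.close();
                    out.close(); // closing the body completes the exchange
                }
            });
            server.start();
        }
    }

Note that this only relocates the problem: each request still pays
HDFS's open-and-stream overhead, which is exactly Aaron's point about
many moderately-sized files.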

