HDFS itself has some facilities for serving data over HTTP: https://issues.apache.org/jira/browse/HADOOP-5010. YMMV.
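For what it's worth, you can also drive reads over HDFS's built-in HTTP interface from the normal FileSystem API using the hftp:// scheme. A minimal sketch, assuming a namenode at a made-up hostname (50070 is the usual namenode web port; the file path is a placeholder too):

// Minimal sketch: reading a file over HDFS's read-only HTTP interface
// via the hftp:// scheme. Hostname and path are placeholders.
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HftpRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // The hftp:// scheme resolves to the read-only HTTP filesystem
    FileSystem fs = FileSystem.get(
        URI.create("hftp://namenode.example.com:50070/"), conf);
    InputStream in = fs.open(new Path("/images/sample.jpg"));
    try {
      // Copy the file to stdout in 4 KB chunks
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}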
On Thu, Mar 26, 2009 at 3:47 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:

> On Mar 26, 2009, at 8:55 PM, phil cryer wrote:
>
>>> When you say that you have huge images, how big is "huge"?
>>
>> Yes, we're looking at some images that are 100MB in size, but
>> nothing like what you're speaking of. This helps me understand
>> Hadoop's usage better; unfortunately it won't be the fit I was
>> hoping for.
>
> I wouldn't split hairs between 100MB and 1GB. However, if you want to
> serve the files via Apache, the extra FUSE layer may make it less
> reliable. It wouldn't be too bad to whip up a Tomcat webapp that goes
> through Hadoop...
>
> It really depends on your hardware level and redundancy. If you have
> the money to get the hardware necessary for a Lustre-based solution,
> do that. If you have enough money to load up your pre-existing
> cluster with lots of disk, HDFS might be better. Certainly HDFS will
> be outperformed by Lustre if you have lots of reliable hardware,
> especially in terms of latency.
>
> Brian
>
>>> You can use the API or the FUSE module to mount Hadoop, but that is
>>> not a direct goal of Hadoop. Hope that helps.
>>
>> Very interesting, and yes, that does indeed help. Not to veer off
>> thread too much, but does Sun's Lustre follow in the footsteps of
>> Gluster, then? I know Lustre requires kernel patches to install, so
>> it sits at a different level than the others, but I have seen some
>> articles about large-scale clusters built with Lustre and want to
>> look at that as another option.
>>
>> Again, thanks for the info. If anyone has general information on
>> cluster software, or knows of a more appropriate list, I'd appreciate
>> the advice.
>>
>> Thanks
>>
>> P
>>
>> On Thu, Mar 26, 2009 at 12:32 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>
>>> It is a little more natural to connect to HDFS from Apache Tomcat.
>>> This lets you skip the FUSE mounts and just use the HDFS API.
>>>
>>> I have modified this code to run inside Tomcat:
>>> http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample
>>>
>>> I will not testify to how well this setup performs under internet
>>> traffic, but it does work.
>>>
>>> GlusterFS is more like a traditional POSIX filesystem. It supports
>>> locking and appends, and you can do things like put the MySQL data
>>> directory on it.
>>>
>>> GlusterFS is geared toward storing data to be accessed with low
>>> latency. Nodes (bricks) are normally connected via GigE or
>>> InfiniBand. The GlusterFS volume is mounted directly on a Unix
>>> system.
>>>
>>> Hadoop is a user-space filesystem. The latency is higher. Nodes are
>>> connected by GigE. It is closely coupled with MapReduce.
>>>
>>> You can use the API or the FUSE module to mount Hadoop, but that is
>>> not a direct goal of Hadoop. Hope that helps.
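For anyone who wants to try the Tomcat route Edward describes above, here is a rough, untested sketch of a servlet that streams a file out of HDFS through the FileSystem API (no FUSE mount involved). The namenode address and the request-path-to-HDFS-path mapping are placeholders, not anything from the wiki example:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Illustrative servlet: maps the request path straight onto an HDFS
// path and streams the bytes back to the client.
public class HdfsImageServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    Configuration conf = new Configuration();
    // Placeholder namenode address
    conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
    FileSystem fs = FileSystem.get(conf);

    // e.g. GET /images/foo.jpg -> HDFS path /images/foo.jpg
    String pathInfo = req.getPathInfo();
    if (pathInfo == null) {
      resp.sendError(HttpServletResponse.SC_BAD_REQUEST);
      return;
    }
    Path file = new Path(pathInfo);
    if (!fs.exists(file)) {
      resp.sendError(HttpServletResponse.SC_NOT_FOUND);
      return;
    }

    resp.setContentType("application/octet-stream");
    InputStream in = fs.open(file);
    OutputStream out = resp.getOutputStream();
    try {
      // Stream in 4 KB chunks without buffering the whole file
      IOUtils.copyBytes(in, out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}

As Edward says, no promises about how this holds up under real internet traffic; fronting it with a caching proxy would probably matter more than the servlet itself.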