Keep in mind that HDFS is not POSIX-compliant, so you will have a hard time using it as a "real" fs. I know there is a FUSE driver for it, but I would not rely on it under heavy load. Also, HDFS is not really a good fit for random access at all.
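To make the random access point concrete, here is a minimal sketch of the kind of operation a POSIX fs gives you for free: overwriting a few bytes in the middle of an existing file. It only touches a local temp file and does not talk to HDFS at all; the helper names (`overwrite_in_place`, `read_at`) are made up for illustration. HDFS files are written once (with append at best), so there is no equivalent of the in-place write below, and an application that relies on it would have to rewrite the whole file.

```python
import os
import tempfile


def overwrite_in_place(path, offset, data):
    """Overwrite bytes at `offset`, leaving the rest of the file untouched."""
    # "r+b" opens an existing file for reading and writing without truncating it.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def read_at(path, offset, length):
    """Read `length` bytes starting at `offset` (random read)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Demo on a local temporary file (hypothetical data, local disk only).
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as f:
    f.write(b"0123456789")

overwrite_in_place(demo_path, 4, b"XY")  # patch two bytes in the middle
patched = read_at(demo_path, 0, 10)
os.remove(demo_path)
```

Databases, mail stores and most "normal" applications do this sort of thing all the time, which is why the missing POSIX semantics matter in practice.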
If you really need a POSIX fs, I would recommend you have a look at DRBD or GlusterFS.

Bye,
Norman

2011/9/15 Per Steffensen <st...@designware.dk>:
> David Rosenstrauch wrote:
>>
>> On 09/14/2011 02:02 PM, Per Steffensen wrote:
>>>
>>> Hi
>>>
>>> If my goal is to have multiple physical disks seem as one big disk with
>>> redundancy built in, why would I use a HDFS cluster among machines with
>>> one disk each, instead of using software RAID like md(adm) directly on
>>> top of the disks? I am looking for pros and cons on the two solutions.
>>> http://en.wikipedia.org/wiki/RAID#Software-based_RAID
>>> http://en.wikipedia.org/wiki/Mdadm
>>>
>>> Regards, Per Steffensen
>>
>> HDFS was never intended to be a general-purpose file system. It is a
>> system optimized for a) running map/reduce, and b) holding large files. It
>> should not be considered as a replacement for RAID.
>>
>> DR
>
> Thanks for your reply, David. Even though HDFS wasn't intended to be used for
> this, I guess it could be. So if we forget for a moment that it was not
> designed/optimized to be used as a general purpose file system (GPFS), what
> are the pros and cons of using it as a GPFS with built-in redundancy vs
> using software RAID? Is HDFS too slow for some kinds of file operations, or
> what will the problems (and benefits) be? Hope for some input - I need
> arguments for and against to be used in a discussion with a customer.
> Thanks!