HDFS does have a single point of failure, and there is no way around this in its current implementation. The namenode keeps track of an FS image and an edits log. It's common for these to be stored both on the local disk and on an NFS mount. If the namenode fails, a new machine can be provisioned as the namenode by loading the backed-up image and edits files.

Can you say more about how you'll use HDFS? It's not a low-latency filesystem, so it shouldn't be used to serve images, videos, etc. in a web environment. Its most common use is as the basis for batch Map/Reduce jobs.
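For the dual local-disk/NFS copy, something like the following in hadoop-site.xml is one way to do it (the paths here are just placeholders for your own directories):

  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/dfs/name,/mnt/nfs/namenode-backup</value>
    <description>Comma-separated list of directories. The namenode writes
    the FS image and edits log to every directory listed, so putting one
    on local disk and one on an NFS mount keeps an off-machine copy.
    </description>
  </property>

If the namenode box dies, you can bring a replacement up against the NFS copy (point its dfs.name.dir at the mount, or copy the files to its local disk) and restart the namenode daemon; just make sure the new machine answers to the hostname in fs.default.name, or update that setting on the datanodes and clients.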
Alex

On Thu, Nov 13, 2008 at 5:18 PM, S. L. <[EMAIL PROTECTED]> wrote:
> Hi list,
>
> I am kind of new to Hadoop but have some good background. I am seriously
> considering adopting Hadoop, and especially HDFS, first to be able to store
> various files (in the low hundreds of thousands at first) on a few nodes in
> a manner where I don't need a RAID system or a SAN. HDFS seems a perfect
> fit for the job...
>
> BUT
>
> from what I learned in the past couple of days, it seems that the single
> point of failure in HDFS is the NameNode. So I was wondering, for anyone on
> the list who did deploy HDFS in a production environment, what is their
> strategy for High Availability of the system? Having the NameNode
> unavailable basically brings the whole HDFS system offline. So what are the
> scripts or other techniques recommended to add H.A. to HDFS?
>
> Thanks!
>
> -- S.