Hi list, I am fairly new to Hadoop but have a good background. I am seriously considering adopting Hadoop, starting with HDFS, to store various files (in the low hundreds of thousands at first) across a few nodes without needing a RAID system or a SAN. HDFS seems a perfect fit for the job...
BUT from what I have learned in the past couple of days, it seems that the single point of failure in HDFS is the NameNode. So I was wondering whether anyone on the list who has deployed HDFS in a production environment could share their strategy for high availability of the system... If the NameNode is unavailable, the whole HDFS system is effectively offline. So what scripts or other techniques are recommended to add HA to HDFS? Thanks! -- S.