On Mon, Sep 21, 2009 at 7:50 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > Storing the only copy of the NN data into NFS would make the NFS server an
> > SPOF, and you still need to solve the problems of
>
> @Steve correct. It is hair splitting but Stas asked if there was an
> approach that did not use DRBD. Linux-HA + NFS, or Linux-HA plus SAN
> does not use DRBD. Implicitly, I think he meant is there any approach
> that does not rely on "shared storage", but DRBD and Linux-HA are
> separate entities although they are often employed together.

Well, if you want to look at it another way, DRBD is just shared storage that happens to work with a pair of nodes rather than an external device. It's still a shared block device that's synchronized, right?

When discussing HA it's easy to conflate the failover mechanism and the shared storage mechanism. Linux-HA is just a failover mechanism, with configuration that can determine which node gets to be the master, and hopefully enough magic that you won't have two of them (split brain syndrome). When the standby namenode needs to become master, it has to get the data somehow, and that's where you need some shared storage. As people above mentioned, DRBD is but one of several viable options.

Regarding vanilla NFS for the shared storage, I wouldn't consider it a SPOF - since the namenode can sync its edit log to multiple volumes, you can have it write to its local disk as well as the NFS server. If the NFS server goes down, the NN keeps running. If the NN goes down, the NFS server still has the edit log. It's only if both of them go down that you are out of luck. If both go down it's probably because your datacenter lost power, and then you're screwed anyway, to put it bluntly :)

-Todd
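To illustrate the point about syncing the edit log to multiple volumes, here is a toy Python model. It is not Hadoop code. The class name `RedundantEditLog`, the file name `edits`, and the temp directories standing in for a local disk and an NFS mount are all made up for the sketch. It only shows the idea: every record goes to every directory, a directory that fails is dropped, and the writer keeps going as long as one copy is still writable.

```python
import os
import shutil
import tempfile


class RedundantEditLog:
    """Toy model (hypothetical, not Hadoop's implementation):
    append every record to each configured directory."""

    def __init__(self, dirs):
        self.active = list(dirs)   # directories still accepting writes
        self.failed = []           # directories dropped after an I/O error

    def append(self, record):
        survivors = []
        for d in self.active:
            try:
                with open(os.path.join(d, "edits"), "a") as f:
                    f.write(record + "\n")
                    f.flush()
                    os.fsync(f.fileno())   # force the record to disk
                survivors.append(d)
            except OSError:
                # This copy is gone; remember it and carry on with the rest.
                self.failed.append(d)
        if not survivors:
            raise RuntimeError("no edit log directory is writable")
        self.active = survivors


def demo():
    local = tempfile.mkdtemp(prefix="local-")
    nfs = tempfile.mkdtemp(prefix="nfs-")    # stands in for an NFS mount
    log = RedundantEditLog([local, nfs])
    log.append("mkdir /a")
    shutil.rmtree(nfs)                       # simulate the NFS server going away
    log.append("mkdir /b")                   # still succeeds via the local copy
    with open(os.path.join(local, "edits")) as f:
        lines = f.read().splitlines()
    shutil.rmtree(local)                     # clean up the temp directory
    return len(log.active), len(log.failed), lines
```

Calling `demo()` writes one record to both copies, removes the stand-in NFS directory, and writes a second record that lands only in the local copy. The reverse case (local node dies, NFS copy survives) is the same picture with the roles swapped. In a real cluster this is a matter of listing more than one directory in the namenode's storage configuration rather than writing any code.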