On Mon, Sep 21, 2009 at 6:03 AM, Steve Loughran <ste...@apache.org> wrote:
> Edward Capriolo wrote:
>
>> Just for reference. Linux HA and some other tools deal with the split
>> brain decisions by requiring a quorum. A quorum involves having a
>> third party or having more than 50% of the nodes agree.
>>
>> An issue with linux-ha and hadoop is that linux-ha is only
>> supported/tested on clusters of up to 16 nodes.
>
> Usually odd numbers; stops a 50%-50% split.
>
>> That is not a hard
>> limit, but no one claims to have done it on 1000 or so nodes.
>
> If the voting algorithm requires communication with every node then there is
> an implicit limit.
>
>> You could just install linux HA on a random sampling of 10 nodes across
>> your network. That would in theory create an effective quorum.
>
>> There are other HA approaches that do not involve DRBD. One is to store
>> your name node table on a SAN or an NFS server. Terracotta is another
>> option that you might want to look at. But no, at the moment there is
>> no fail-over built into hadoop.
>
> Storing the only copy of the NN data into NFS would make the NFS server an
> SPOF, and you still need to solve the problems of
> - detecting NN failure and deciding who else is in charge
> - making another node the NN by giving it the same hostname/IPAddr as the one
>   that went down.
>
> That is what the linux HA stuff promises
>
> -steve
>> An issue with linux-ha and hadoop is that linux-ha is only
>> supported/tested on clusters of up to 16 nodes.
>
> Usually odd numbers; stops a 50%-50% split.

@Steve correct. I was getting at the fact that unless your HA cluster
manager runs on every node in the cluster, it may make a decision that is
correct for its own configuration but not the optimal decision for the
Hadoop cluster as a whole. The only way for Linux-HA to make the optimal
decision is to install it on every node in the Hadoop cluster. I had a
thread on the Linux-HA mailing list about running it on more than 16
nodes: 16 is not a hard limit, but no one has attempted much larger
clusters, and their target is definitely not in the thousands.

> Storing the only copy of the NN data into NFS would make the NFS server an
> SPOF, and you still need to solve the problems of

@Steve correct. It is hair-splitting, but Stas asked whether there was an
approach that did not use DRBD. Linux-HA + NFS, or Linux-HA + SAN, does not
use DRBD. Implicitly, I think he meant "is there any approach that does not
rely on shared storage?", but DRBD and Linux-HA are separate pieces of
software, although they are often employed together.
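
To make the quorum point concrete, here is a minimal sketch (plain Java,
not Linux-HA or Hadoop code; the class and method names are made up for
illustration) of the strict-majority rule: a partition may act as primary
only if it can reach more than half of the configured members, which is
also why an odd member count avoids a dead-even split.

// Hypothetical sketch of the strict-majority quorum rule.
public class QuorumCheck {
    private final int totalMembers;   // all nodes configured in the HA membership

    public QuorumCheck(int totalMembers) {
        this.totalMembers = totalMembers;
    }

    /** A partition may act as primary only if it holds a strict majority. */
    public boolean hasQuorum(int reachableMembers) {
        return reachableMembers > totalMembers / 2;
    }

    public static void main(String[] args) {
        // With 10 members, a 5/5 split leaves neither side with quorum: both stall.
        System.out.println(new QuorumCheck(10).hasQuorum(5)); // false
        // With 9 members, any split gives exactly one side a majority.
        System.out.println(new QuorumCheck(9).hasQuorum(5));  // true
    }
}

Installing the cluster manager on only a sampling of nodes, as suggested
above, just narrows the membership over which that majority is computed;
the arithmetic stays the same.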