Which Hadoop version are you using?
Sent from my iPhone5s

> On March 15, 2014, at 9:29, dlmarion <dlmar...@hotmail.com> wrote:
>
> Server 1: NN1 and ZKFC1
> Server 2: NN2 and ZKFC2
> Server 3: Journal1 and ZK1
> Server 4: Journal2 and ZK2
> Server 5: Journal3 and ZK3
> Server 6+: Datanode
>
> All in the same rack. I would expect the ZKFC from the active name node
> server to lose its lock and the other ZKFC to tell the standby namenode that
> it should become active (I’m assuming that’s how it works).
>
> - Dave
>
> From: Juan Carlos [mailto:juc...@gmail.com]
> Sent: Friday, March 14, 2014 9:12 PM
> To: user@hadoop.apache.org
> Subject: Re: HA NN Failover question
>
> Hi Dave,
> How many ZooKeeper servers do you have, and where are they?
>
> Juan Carlos Fernández Rodríguez
>
> On 15/03/2014, at 01:21, dlmarion <dlmar...@hotmail.com> wrote:
>
> I was doing some testing with HA NN today. I set up two NN with automatic
> failover (ZKFC) using sshfence. I tested that it's working on both NN by
> doing ‘kill -9 <pid>’ on the active NN. When I did this on the active node,
> the standby would become the active and everything seemed to work. Next, I
> logged onto the active NN and did a ‘service network stop’ to simulate a
> NIC/network failure. The standby did not become the active in this scenario.
> In fact, it remained in standby mode and complained in the log that it could
> not communicate with (what was) the active NN. I was unable to find anything
> relevant via searches in Google or in Jira. Does anyone have experience
> successfully testing this? I’m hoping that it is just a configuration
> problem.
>
> FWIW, when the network was restarted on the active NN, it failed over almost
> immediately.
>
> Thanks,
>
> Dave
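
For anyone following the thread, the setup Dave describes (automatic failover via ZKFC with sshfence) would typically correspond to hdfs-site.xml properties along these lines. This is only a sketch, not his actual configuration: the private key path is a hypothetical placeholder, and the ZooKeeper quorum (ha.zookeeper.quorum in core-site.xml) is not shown.

```xml
<!-- hdfs-site.xml fragment: illustrative only, not taken from Dave's cluster -->
<property>
  <!-- Lets the ZKFC daemons drive failover automatically -->
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Fencing methods are tried in the order listed; only sshfence here -->
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <!-- Hypothetical path: the key sshfence uses to log in to the other NN -->
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
```

Note that sshfence works by SSHing to the node being fenced, so it needs that node to be reachable over the network, which may be worth keeping in mind for the 'service network stop' test.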