Re: HA NN Failover question

Azuryy Fri, 14 Mar 2014 19:46:25 -0700

Which Hadoop version are you using?


> On March 15, 2014, at 9:29, dlmarion <dlmar...@hotmail.com> wrote:
> 
> Server 1: NN1 and ZKFC1
> Server 2: NN2 and ZKFC2
> Server 3: Journal1 and ZK1
> Server 4: Journal2 and ZK2
> Server 5: Journal3 and ZK3
> Server 6+: Datanode
>  
> All in the same rack. I would expect the ZKFC from the active name node 
> server to lose its lock and the other ZKFC to tell the standby namenode that 
> it should become active (I’m assuming that’s how it works).
>  
> - Dave
>  
> From: Juan Carlos [mailto:juc...@gmail.com] 
> Sent: Friday, March 14, 2014 9:12 PM
> To: user@hadoop.apache.org
> Subject: Re: HA NN Failover question
>  
> Hi Dave,
> How many ZooKeeper servers do you have, and where are they?
> 
> Juan Carlos Fernández Rodríguez
> 
> On 15/03/2014, at 01:21, dlmarion <dlmar...@hotmail.com> wrote:
> 
> I was doing some testing with HA NN today. I set up two NN with active 
> failover (ZKFC) using sshfence. I tested that it’s working on both NNs by doing 
> ‘kill -9 <pid>’ on the active NN. When I did this on the active node, the 
> standby would become the active and everything seemed to work. Next, I logged 
> onto the active NN and did a ‘service network stop’ to simulate a NIC/network 
> failure. The standby did not become the active in this scenario. In fact, it 
> remained in standby mode and complained in the log that it could not 
> communicate with (what was) the active NN. I was unable to find anything 
> relevant via searches on Google or in Jira. Does anyone have experience 
> successfully testing this? I’m hoping that it is just a configuration problem.
>  
> FWIW, when the network was restarted on the active NN, it failed over almost 
> immediately.
>  
> Thanks,
>  
> Dave
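
A note for anyone reproducing the test above: one possible explanation, not confirmed anywhere in this thread, is the fencing configuration. With sshfence as the only fencing method, the ZKFC on the standby has to SSH into the old active host to fence it, and that cannot succeed while that host's network is down. The standby is not promoted until some fencing method reports success, which would also be consistent with the failover completing as soon as the network came back. Below is a minimal hdfs-site.xml sketch that adds a fallback method. The key path and timeout are placeholder values, not settings from Dave's cluster.

```xml
<!-- hdfs-site.xml fragment: a sketch under assumptions, not a verified fix -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- Methods are listed one per line and tried in order until one succeeds.
       shell(/bin/true) always succeeds, so failover can continue when
       sshfence cannot reach the old active host. -->
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- Placeholder path: use the key your sshfence setup already relies on. -->
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <!-- Milliseconds before sshfence gives up and the next method is tried;
       30000 is an illustrative value. -->
  <value>30000</value>
</property>
```

Falling back to shell(/bin/true) means the old active NN is not actually fenced. With a JournalNode quorum like the one described above, only one NameNode can write to the edit log at a time, which is why this fallback is commonly considered acceptable in that layout. Whether it suits a given cluster is a judgment call.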
