Hello All,
   I would appreciate your help with a problem I am facing
   with Apache HA using Heartbeat (HB) and MON.

 
   I have been working on setting up a 2-node failover cluster for my
   web service. I have installed heartbeat 2.0.5 and MON on the 2 SUSE
   Linux servers. MON is monitoring the Apache webserver. I tested two
   methods of causing a failover and then a failback. I end up with a
   split brain in the cluster in Method 1.
 
  Method 1:
  
  SLAVENODE takes over all the resources if I stop heartbeat on
  MASTERNODE by running 'rcheartbeat stop'; this is expected.
  But if I then run 'rcheartbeat start' on MASTERNODE to restart
  heartbeat, MASTERNODE thinks SLAVENODE is dead and takes over
  the resources, ending up in an unrecoverable split brain.
 

 Method 2:

 Surprisingly, if I cause the failover by pulling out the network
 cable, then plug the cable back in and start heartbeat again on
 MASTERNODE, MASTERNODE senses SLAVENODE, SLAVENODE relinquishes
 the resources to MASTERNODE, and all seems fine.
 
 I do not understand why the Method 1 failover ends up in
 a split brain.
 
  My ha.cf and haresources are as below.
  
  debug 1
  logfile /var/log/ha-log
  keepalive 2
  warntime 30
  deadtime 80
  initdead 90
  node MASTERNODE
  node SLAVENODE
  bcast eth0
  udpport 694
  auto_failback on
  ping_group ping-cluster-test 10.10.10.1 10.10.10.151
  respawn hacluster /usr/lib/heartbeat/ipfail
  crm off
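
  As a sanity check on the timing directives in the ha.cf above, here is a
  minimal Python sketch that parses them and compares them against two rules
  of thumb: keepalive < warntime < deadtime, and initdead at least twice
  deadtime (the latter is the guidance given in the comments of Heartbeat's
  sample ha.cf). The function names parse_timers and check_timers are
  hypothetical, the embedded text is simply the config from this post, and
  the check is not claimed to explain the split brain; it only flags timer
  values that are worth a second look.

```python
# Sketch only: checks the ha.cf timers against common rules of thumb.
# parse_timers/check_timers are illustrative names, not Heartbeat tools.

HA_CF = """\
debug 1
logfile /var/log/ha-log
keepalive 2
warntime 30
deadtime 80
initdead 90
node MASTERNODE
node SLAVENODE
bcast eth0
udpport 694
auto_failback on
crm off
"""

TIMERS = ("keepalive", "warntime", "deadtime", "initdead")


def to_seconds(value):
    """Convert an ha.cf time value to seconds ('2' or '500ms')."""
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    return float(value)


def parse_timers(text):
    """Return the timing directives found in ha.cf text as {name: seconds}."""
    timers = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop comments, split fields
        if len(parts) == 2 and parts[0] in TIMERS:
            timers[parts[0]] = to_seconds(parts[1])
    return timers


def check_timers(t):
    """Return a list of warnings; assumes all four timers are present."""
    warnings = []
    if not t["keepalive"] < t["warntime"] < t["deadtime"]:
        warnings.append("expected keepalive < warntime < deadtime")
    if t["initdead"] < 2 * t["deadtime"]:
        warnings.append(
            "initdead (%g) is less than 2 x deadtime (%g)"
            % (t["initdead"], t["deadtime"])
        )
    return warnings


if __name__ == "__main__":
    for warning in check_timers(parse_timers(HA_CF)):
        print("WARNING:", warning)
```

  With the values above, only the initdead rule is flagged (90 is less than
  2 x 80); the keepalive/warntime/deadtime ordering passes.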
  
  Also attached are the master and slave dumps taken when the split brain
  occurs in Method 1.
  
  It would be great to get your solutions to this.
  
  
  Regards
  Shailesh P Shirali

