Hi, I just arrived at the office to continue with my tests and found
that my cluster is in a split-brain condition. The VLAN used for the
"cluster interconnect" had intermittent trouble for almost 3 hours :s

Obviously, both nodes wanted to become primary. The weird thing is that,
after the link was re-established, each node still displayed the other
one as OFFLINE. Isn't hb supposed to find the other node again?

Trying to shut down one of the nodes didn't work: the script never
returned to the prompt, so I had to kill the heartbeat process, and
restarting just one of them brought the cluster back to a normal state.

View from node1:
============
Last updated: Fri Jul  6 22:17:36 2007
Current DC: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5)
2 Nodes configured.
3 Resources configured.
============

Node: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5): online
Node: asusis-ope2 (9b4731c8-163c-482a-93f4-084a5067a414): OFFLINE

resource_ip_10_129_4_236        (heartbeat::ocf:IPaddr2):       Started asusis-ope1
Master/Slave Set: ms-hadata0
   hadata0:0   (heartbeat::ocf:drbd):  Master asusis-ope1
   hadata0:1   (heartbeat::ocf:drbd):  Stopped
fs_hadata       (heartbeat::ocf:Filesystem):    Started asusis-ope1


View from node2:
============
Last updated: Fri Jul  6 22:17:53 2007
Current DC: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5)
2 Nodes configured.
3 Resources configured.
============

Node: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5): OFFLINE
Node: asusis-ope2 (9b4731c8-163c-482a-93f4-084a5067a414): online

resource_ip_10_129_4_236        (heartbeat::ocf:IPaddr2):       Started asusis-ope2
Master/Slave Set: ms-hadata0
   hadata0:0   (heartbeat::ocf:drbd):  Stopped
   hadata0:1   (heartbeat::ocf:drbd):  Started asusis-ope2
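
For anyone who wants to check for this situation automatically, here is a
rough sketch in Python. It assumes the status output from each node has
been saved as text and only looks at the "Node: ..." lines shown above.
The names node_states and looks_like_split_brain are made up for this
example, and the rule it applies (each side sees some node online, but the
two sides agree on none) is my own simplification, not something heartbeat
itself provides.

```python
import re

# Matches lines such as:
#   Node: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5): online
NODE_LINE = re.compile(r"^Node:\s+(\S+)\s+\(([0-9a-f-]+)\):\s+(\S+)\s*$")

# Node lines copied from the two views in this message.
VIEW_NODE1 = """\
Node: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5): online
Node: asusis-ope2 (9b4731c8-163c-482a-93f4-084a5067a414): OFFLINE
"""

VIEW_NODE2 = """\
Node: asusis-ope1 (34b56337-2402-4e2d-a66a-f0f0c2cee1b5): OFFLINE
Node: asusis-ope2 (9b4731c8-163c-482a-93f4-084a5067a414): online
"""


def node_states(status_text):
    """Return {node_name: state} parsed from the 'Node: ...' lines."""
    states = {}
    for line in status_text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            name, _uuid, state = match.groups()
            # Normalise case, since the output mixes 'online' and 'OFFLINE'.
            states[name] = state.lower()
    return states


def looks_like_split_brain(view_a, view_b):
    """Heuristic (an assumption of this sketch): both views report at
    least one online node, yet no node is online in both views."""
    online_a = {n for n, s in node_states(view_a).items() if s == "online"}
    online_b = {n for n, s in node_states(view_b).items() if s == "online"}
    return bool(online_a) and bool(online_b) and online_a.isdisjoint(online_b)


if __name__ == "__main__":
    print(looks_like_split_brain(VIEW_NODE1, VIEW_NODE2))
```

This only compares what each node reports; it says nothing about why the
nodes did not rejoin after the link came back.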

Ciro