Hi all,
I have had a qmail mail relay server running in HA v2 (active/passive, no
shared disk) for several months now, but one node of the two-node cluster
crashed and I had to set up a new machine.
I took a copy of the node that was still running (in ESX), reconfigured the
hostname in /etc/hostname and /etc/hosts, and changed the necessary IP
addresses so the two nodes could see one another again.
The problem is that I don’t know what steps to follow in this scenario. Both
nodes had the same HA setup, so both nodes believe they are the DC. The moment
the link between the two came up, the newest machine showed a “never started”
status in the GUI.
Looking at the ha-log, I figured out that there was an issue with the contents
of /var/lib/heartbeat/hostcache. I therefore erased the contents of that file
on both nodes, while heartbeat was shut down on both nodes.
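For reference, that step can be sketched roughly as below. This is only an
illustration: the function name clear_hostcache is made up for this mail, and
it assumes heartbeat has already been stopped on the node (for example through
its init script, whose path may differ per distribution).

```shell
#!/bin/sh
# Sketch only: empty a heartbeat hostcache file in place.
# The default path is the one mentioned in this mail; heartbeat is assumed
# to be stopped on the node before this function is called.
clear_hostcache() {
    cache="${1:-/var/lib/heartbeat/hostcache}"
    if [ ! -f "$cache" ]; then
        echo "no such file: $cache" >&2
        return 1
    fi
    # Truncate rather than delete, so ownership and permissions are kept.
    : > "$cache"
}
```

Called without an argument it works on /var/lib/heartbeat/hostcache, so it
would be run once on each node before heartbeat is started again.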
When I restarted HA on both machines, I got the following flood of errors in
the ha-log:
heartbeat[12798]: 2010/02/18_12.50:24 WARN: nodename relay1 uuid changed to relay2
heartbeat[12798]: 2010/02/18_12.50:24 WARN: nodename relay1 uuid changed to relay2
heartbeat[12798]: 2010/02/18_12.50:24 ERROR: should_drop_message: attempted replay attack [relay1]? [gen = 1242831508, curgen = 1242831509]
heartbeat[12798]: 2010/02/18_12.50:25 ERROR: should_drop_message: attempted replay attack [relay1]? [gen = 1242831508, curgen = 1242831509]
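To see which node and generation numbers are involved without scrolling
through the flood, the messages can be reduced to one line per distinct
combination. A rough sketch, assuming the messages have exactly the format
shown above; the function name replay_generations is only a placeholder, and
the log file to scan has to be passed in:

```shell
#!/bin/sh
# Sketch only: list the distinct "node gen curgen" triples found in the
# "attempted replay attack" messages of a heartbeat log file.
# Assumes the message format is exactly the one quoted in this mail.
replay_generations() {
    # \1 = node name, \2 = gen, \3 = curgen; lines that do not match are dropped.
    sed -n 's/.*replay attack \[\([^]]*\)]? \[gen = \([0-9]*\), curgen = \([0-9]*\)].*/\1 \2 \3/p' "$1" |
        sort -u
}
```

For the two ERROR lines above this would give a single line:
relay1 1242831508 1242831509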
Does anybody know how to recover from this, and what steps to follow when
cloning a node?