Hi,

> I finally have the primary server back up (fspbro213.rchland.ibm.com), but
> I can't get the secondary server back up.  I get these messages in the
> /var/log/ha-log file of the secondary server (fspbro214.rchland.ibm.com):
> 
> 
> heartbeat[12894]: 2008/03/10_11:57:37 ERROR: should_drop_message:
> attempted replay attack [fspbro213.rchland.ibm.com]? [gen = 42, curgen =
> 44]

It seems that /var/lib/heartbeat/hb_generation contains a wrong number.
Try editing that file on fspbro213.rchland.ibm.com,
for instance changing 42 to a value greater than 44.
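
As a rough sketch of that edit, the Python below reads the generation
number and writes back a value above a given threshold. It assumes the file
holds a single decimal integer as text (possibly padded with whitespace);
I have not checked the exact on-disk format. The function name
bump_generation and the ".bak" backup suffix are made up for illustration.
Stop heartbeat on that node before touching the file.

```python
import shutil
from pathlib import Path


def bump_generation(path, must_exceed):
    """Rewrite the generation file so its value is greater than must_exceed.

    Assumption: the file contains one decimal integer, possibly padded
    with whitespace. Returns the value that ends up in the file.
    """
    gen_file = Path(path)
    current = int(gen_file.read_text().strip())
    if current > must_exceed:
        return current  # already high enough, leave the file alone
    # Keep a copy of the original next to it before changing anything.
    shutil.copy2(gen_file, gen_file.with_name(gen_file.name + ".bak"))
    new_value = must_exceed + 1
    gen_file.write_text(f"{new_value}\n")
    return new_value
```

For the log above, the call would be
bump_generation("/var/lib/heartbeat/hb_generation", 44), run as root on
fspbro213.rchland.ibm.com.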

> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro213.rchland.ibm.com uuid changed to fspbro214.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro214.rchland.ibm.com uuid changed to fspbro213.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro213.rchland.ibm.com uuid changed to fspbro214.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro214.rchland.ibm.com uuid changed to fspbro213.rchland.ibm.com

Have the nodes' UUIDs been swapped for some reason?
I think they are recorded in /var/lib/heartbeat/hostcache and hb_uuid.

- stop all heartbeat services
- remove /var/lib/heartbeat/hostcache
- start heartbeat
(hostcache will be created again automatically)
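
The steps above could be scripted roughly as follows for a single node.
This is only a sketch: the init script path /etc/init.d/heartbeat is an
assumption about your installation, reset_hostcache is a hypothetical
helper name, and the file is renamed to hostcache.old instead of being
deleted so that it can be put back.

```python
import subprocess
from pathlib import Path

# Assumed init script location; adjust for your distribution.
STOP_CMD = ["/etc/init.d/heartbeat", "stop"]
START_CMD = ["/etc/init.d/heartbeat", "start"]


def reset_hostcache(state_dir="/var/lib/heartbeat", run=subprocess.check_call):
    """Stop heartbeat, move hostcache aside, then start heartbeat again.

    The file is renamed to hostcache.old rather than deleted so it can be
    restored. Returns True if a hostcache file was found and moved.
    """
    run(STOP_CMD)  # raises if the stop command fails, before any file change
    hostcache = Path(state_dir) / "hostcache"
    moved = hostcache.exists()
    if moved:
        hostcache.replace(hostcache.with_name("hostcache.old"))
    run(START_CMD)
    return moved
```

If heartbeat has to be down on both nodes before the file is removed, run
the stop step on each node first and only then move the file and start
heartbeat again.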

This is a somewhat crude procedure, but it should work for now.
If someone knows a better way, please let me know!

Thanks,
Junko




_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
