Hi,

> I finally have the primary server back up (fspbro213.rchland.ibm.com), but
> I can't get the secondary server back up. I get these messages in the
> /var/log/ha-log file of the secondary server (fspbro214.rchland.ibm.com):
>
> heartbeat[12894]: 2008/03/10_11:57:37 ERROR: should_drop_message:
> attempted replay attack [fspbro213.rchland.ibm.com]? [gen = 42, curgen =
> 44]

It seems that /var/lib/heartbeat/hb_generation contains a wrong number.
Try editing that file on fspbro213.rchland.ibm.com, for instance changing
42 to a number greater than 44.

> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro213.rchland.ibm.com uuid changed to fspbro214.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro214.rchland.ibm.com uuid changed to fspbro213.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro213.rchland.ibm.com uuid changed to fspbro214.rchland.ibm.com
> heartbeat[12894]: 2008/03/10_11:57:38 WARN: nodename
> fspbro214.rchland.ibm.com uuid changed to fspbro213.rchland.ibm.com

Have the nodes' uuids been swapped for some reason? I think they are
recorded in /var/lib/heartbeat/hostcache and hb_uuid. Try the following:

- stop all heartbeat services
- remove /var/lib/heartbeat/hostcache
- start heartbeat (hostcache will be created again automatically)

This is a rather rough procedure, but it should work for now...
If someone knows a better way, please let me know!

Thanks,
Junko

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
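
P.S. The two workarounds above (raising the number in hb_generation and removing hostcache) could be scripted roughly as follows. This is only a sketch: it assumes that hb_generation holds a single decimal integer as text, which I have not verified against the heartbeat source, and the function names are made up for illustration. Run it only while heartbeat is stopped, and keep a copy of the files first.

```python
import os
from pathlib import Path

# Paths taken from the message above; the file format is an assumption.
HB_GENERATION = Path("/var/lib/heartbeat/hb_generation")
HOSTCACHE = Path("/var/lib/heartbeat/hostcache")


def read_generation(path):
    """Return the generation number, assuming the file holds one decimal integer."""
    return int(Path(path).read_text().strip())


def bump_generation(path, seen_by_peer):
    """Raise the stored generation above the value the peer remembers (curgen)."""
    current = read_generation(path)
    new_value = max(current, seen_by_peer + 1)
    if new_value != current:
        # Rewrite the file only when the number actually has to change.
        Path(path).write_text(f"{new_value}\n")
    return new_value


def remove_hostcache(path):
    """Delete hostcache if it exists; return True when a file was removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False
```

With the values from the log, calling bump_generation(HB_GENERATION, 44) on fspbro213 would change the stored 42 to 45, and remove_hostcache(HOSTCACHE) corresponds to the second step of the list. Stopping and starting heartbeat is deliberately left out, since how that is done depends on the distribution.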