Hi,
Sorry for my bad English; I'm French.

I'm using heartbeat to run a 3-node cluster with a VIP, and I have a big problem. In bcast mode, heartbeat brings the network down. I switched to the ucast method to work around that, and it works. But if the first server fails, the VIP moves to the second one, and the network becomes slow for the client application.

authkeys:

auth 1
1 sha1 net-cluster
2 md5 "cluster"
3 crc

ha.cf:

use_logd on
#debugfile /var/log/ha-debug
#logfile /var/log/ha-log
keepalive 1
warntime 2
deadtime 3
initdead 60
#bcast eth1
ucast eth1 192.168.1.1
ucast eth1 192.168.1.2
ucast eth1 192.168.1.3
udpport 694
node SVR01
node SVR02
node SVR03
auto_failback on
crm on

I had a power failure yesterday, and that is when this problem (slow network) appeared.

Some logs:

d22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: cl_malloc stats: 446/5809982 42328/20020 [pid22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: RealMalloc stats: 50916 total malloc bytes. pid [22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: Current arena value: 0
heartbeat[16404]: 2008/07/23_15:11:16 info: MSG stats: 0/0 ms age 1314784650 [pid22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: cl_malloc stats: 447/22018891 42412/20064 [pid22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: RealMalloc stats: 50584 total malloc bytes. pid [22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: Current arena value: 0
heartbeat[16404]: 2008/07/23_15:11:16 info: These are nothing to worry about.
heartbeat[16404]: 2008/07/23_15:13:23 WARN: 2 lost packet(s) for [svr03] [4233315:4233318]
heartbeat[16404]: 2008/07/23_15:13:23 WARN: Late heartbeat: Node svr03: interval 3000 ms
heartbeat[16404]: 2008/07/23_15:13:23 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_15:13:24 WARN: node svr02: is dead
heartbeat[16404]: 2008/07/23_15:13:24 info: Link svr02:eth1 dead.
heartbeat[16404]: 2008/07/23_15:13:24 CRIT: Cluster node svr02 returning after partition.
heartbeat[16404]: 2008/07/23_15:13:24 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[16404]: 2008/07/23_15:13:24 WARN: Deadtime value may be too small.
heartbeat[16404]: 2008/07/23_15:13:24 info: See FAQ for information on tuning deadtime.
heartbeat[16404]: 2008/07/23_15:13:24 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[16404]: 2008/07/23_15:13:24 info: Link svr02:eth1 up.
heartbeat[16404]: 2008/07/23_15:13:24 WARN: Late heartbeat: Node svr02: interval 4000 ms
heartbeat[16404]: 2008/07/23_15:13:24 info: Status update for node svr02: status active
heartbeat[16404]: 2008/07/23_15:13:26 WARN: 1 lost packet(s) for [svr03] [4233322:4233324]
heartbeat[16404]: 2008/07/23_15:13:26 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:06:24 WARN: 2 lost packet(s) for [svr03] [4236562:4236565]
heartbeat[16404]: 2008/07/23_16:06:24 WARN: Late heartbeat: Node svr03: interval 3000 ms
heartbeat[16404]: 2008/07/23_16:06:24 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:10:49 WARN: 1 lost packet(s) for [svr02] [4236937:4236939]
heartbeat[16404]: 2008/07/23_16:10:49 WARN: 1 lost packet(s) for [svr03] [4236828:4236830]
heartbeat[16404]: 2008/07/23_16:10:49 info: No pkts missing from svr02!
heartbeat[16404]: 2008/07/23_16:10:50 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:10:57 WARN: 1 lost packet(s) for [svr03] [4236835:4236837]

How should I resolve this problem? Do I need specific equipment to use the multicast method?

---
Reza ISSANY
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems