Hello,
On Thu, May 08, 2008 at 08:53:39AM -0400, Rob Morin wrote:
> Actually another question....
> 
> I would simply add eth1 to the heartbeat ha.cf then? And what's the 
> difference between using mcast vs. bcast? I am not sure I understand this.

mcast = multicast
If your routers support multicast routing, these packets can be routed.

bcast = broadcast
Broadcasts are NEVER routed.

ucast = unicast
That's my choice. Broadcasts are mostly noise on the net (imho), and I
always use non-bcast if I can. Unicast is also routed traffic.
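
As a rough sketch, this is how the three variants look in ha.cf. The
multicast group and the peer IP address below are made-up examples, not
taken from your setup, so adjust them to your network. You would normally
pick one method per link, not all three at once:

    # broadcast: listing eth1 next to eth0 gives heartbeat a second path
    bcast eth0 eth1

    # multicast: mcast <dev> <mcast group> <port> <ttl> <loop>
    # (225.0.0.1 is an example group, 694 matches your udpport)
    mcast eth0 225.0.0.1 694 1 0

    # unicast: ucast <dev> <peer ip>
    # (192.168.5.12 is a hypothetical address for the other node)
    ucast eth1 192.168.5.12

So yes, for bcast it should be enough to add eth1 to the existing line.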

Madd

> Thanks a bunch
> :)
> 
> 
> Rob Morin
> Dido Internet Inc.
> Montreal,Canada
> http://www.dido.ca
> 514-990-4444
> 
> 
> 
> Dominik Klein wrote:
> >Rob Morin wrote:
> >>I have not seen my original email get to the list yet... but after 
> >>looking through the logs I see this on each node...
> >>see below for log excerpts...
> >>
> >>My test involved bringing down eth0 only(heartbeat & replication), 
> >>should i have also brought down eth1 the public side of Joe(primary)
> >>
> >>my conf file is...
> >>
> >>logfacility     daemon        # This is deprecated
> >>keepalive 2                   # Interval between heartbeat (HB) packets.
> >>deadtime 60                   # How quickly HB determines a dead node.
> >>warntime 5                    # Time HB will issue a late HB.
> >>initdead 120                  # Time delay needed by HB to report a 
> >>dead node.
> >>udpport 694                   # UDP port HB uses to communicate 
> >>between nodes.
> >>#ping 192.168.5.1              # Ping VMware Server host to simulate 
> >>network resource.
> >>bcast eth0
> >
> >You only use one connection for heartbeat communication. That is a 
> >configuration error.
> >
> >As you unplugged that interface for testing, you forced a splitbrain 
> >situation. Read http://www.linux-ha.org/SplitBrain
> >
> >Dual split brain so to speak. Your drbd replication is also done over 
> >this link. So not only does heartbeat lose its connection, but so does 
> >drbd. In a standard setup, a disconnected secondary drbd device can 
> >be promoted regardless of the peer's drbd state.
> >
> >You might want to read about dopd, too: 
> >http://www.drbd.org/users-guide/s-heartbeat-dopd.html
> >It can prevent drbd splitbrain, but you need to have >1 network 
> >connection anyways.
> >
> >>#baud 115200
> >>#serial /dev/ttyS0              # Which interface to use for HB packets.
> >>coredumps true
> >>auto_failback on             # Auto promotion of primary node upon 
> >>return to cluster.
> >
> >Your comment answers your later question on what will happen when a 
> >rebooted (stonith'd) node rejoins the cluster.
> >
> >Regards
> >Dominik
> >
> >>node    joe      # Node name must be same as uname -n.
> >>node    stewie      # Node name must be same as uname -n.
> >>###
> >>###
> >>respawn hacluster /usr/lib/heartbeat/ipfail
> >># Specifies which programs to run at startup
> >># DO not use the below unless you use the 
> >>/var/lib/heartbeat/crm/cib/xml config file instead
> >>#crm on
> >>use_logd yes                  # Use system logging.
> >>logfile /var/log/hb.log       # Heartbeat logfile.
> >>debugfile /var/log/heartbeat-debug.log # Debugging logfile.
> >>
> >>
> >>Primary
> >>--------
> >>
> >>May  6 23:04:44 joe heartbeat: [4342]: WARN: node stewie: is dead
> >>May  6 23:04:44 joe heartbeat: [4342]: WARN: No STONITH device 
> >>configured.
> >>May  6 23:04:44 joe heartbeat: [4342]: WARN: Shared disks are not 
> >>protected.
> >>May  6 23:04:44 joe heartbeat: [4342]: info: Resources being acquired 
> >>from stewie.
> >>May  6 23:04:44 joe heartbeat: [4342]: info: Link stewie:eth0 dead.
> >>May  6 23:04:44 joe heartbeat: [4249]: debug: notify_world: setting 
> >>SIGCHLD Handler to SIG_DFL
> >>May  6 23:04:44 joe mach_down[4283]: [4328]: info: 
> >>/usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
> >>May  6 23:04:44 joe heartbeat: [4342]: info: mach_down takeover 
> >>complete.
> >>May  6 23:04:44 joe heartbeat: [4342]: debug: 
> >>StartNextRemoteRscReq(): child count 1
> >>May  6 23:04:44 joe heartbeat: [4250]: info: Local Resource 
> >>acquisition completed.
> >>
> >>
> >>Secondary
> >>-----------
> >>
> >>May  6 23:04:46 stewie heartbeat: [21820]: info: Resources being 
> >>acquired from joe.
> >>May  6 23:04:46 stewie heartbeat: [21820]: info: Link joe:eth0 dead.
> >>May  6 23:04:46 stewie heartbeat: [4946]: info: No local resources 
> >>[/usr/lib/heartbeat/ResourceManager listkeys stewie] to acquire.
> >>May  6 23:04:46 stewie heartbeat: [21825]: ERROR: MSG[4] : 
> >>[info=req_our_resources()]
> >>May  6 23:05:10 stewie mach_down[4953]: [6063]: info: 
> >>/usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
> >>May  6 23:05:10 stewie heartbeat: [21820]: info: mach_down takeover 
> >>complete.
> >>May  6 23:05:10 stewie heartbeat: [21825]: ERROR: MSG[2] : 
> >>[info=mach_down]
> >
> >_______________________________________________
> >Linux-HA mailing list
> >[email protected]
> >http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >See also: http://linux-ha.org/ReportingProblems
