Hi,

Vadym Chepkov wrote:
> On Oct 14, 2010, at 4:41 AM, Lars Ellenberg wrote:
>
>> If you happen to be somehow target locked on heartbeat, tell us why,
>> and what you are trying to achieve, and we figure something out.
>>
>
> Sorry for barge in, but I actually started with corosync, but had to
> "backout", so to speak.
> The major reason - lack of support for PPC architecture, it just doesn't work
> there.
> I was hoping since RedHat fully supports this platform things will get better
> with RHEL6,
> but to my unpleasant surprise instead of fixing it, they just decided to not
> build corosync on anything but Intel.
>
> Redundant rings also don't work in corosync yet and "bonding" suggested as
> workaround won't save you from a switch failure.
> I, personally, always add direct link between two modes for redundancy.
>

I've also noticed a failure in the rrp_mode:passive in corosync-1.2.7-1.1.el5.x86_64.rpm. The expected behavior as per the docs should be:
/RRP can have three modes (rrp_mode): if set to active, Corosync uses all interfaces actively. If set to passive, Corosync uses the second interface only if the first ring fails. If rrp_mode is set to none, RRP is disabled. With RRP, two physically separate networks are used for communication. In case one network fails, the cluster nodes can still communicate via the other network./

So the logic is: in passive mode it uses the first network (ringnumber 0) by default and, if that fails, it goes to the second one.

Now, the failure in my test was pulling the cable from the network card on the primary node. At that point it didn't switch to the second available ring. Instead it ended up in a situation where node 1 thinks it's primary and has the drbd partitions mounted, while node 2 thinks it's alone, switches to primary and tries to do the same with the drbd partitions (which are linked between the servers over the second network connection, ringnumber 1). That fails, however, since drbd is primary and mounted on the other node.

My solution was to switch to rrp_mode: active; when performing the same test again, it worked the way it should.

Regards,
Dan

> Vadym
>
>
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>

-- 
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania
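
For reference, here is a minimal sketch of what a two-ring totem section in corosync.conf with rrp_mode set to active might look like for the setup described in this message. It is not the configuration from the test itself: the bindnetaddr, mcastaddr and mcastport values are placeholders assumed for illustration and have to be replaced with the values for your own networks.

```
totem {
    version: 2

    # "active" uses both rings at once; "passive" is the mode that
    # did not fail over in the cable-pull test described above.
    rrp_mode: active

    # Ring 0: the primary cluster network (addresses are placeholders).
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }

    # Ring 1: the second, physically separate network, e.g. the direct
    # link that also carries the drbd traffic (addresses are placeholders).
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.0.0
        mcastaddr: 226.94.1.2
        mcastport: 5407
    }
}
```

After restarting corosync, running corosync-cfgtool -s on each node should print the status of both rings, which is a quick way to check that the second ring is actually in use before repeating a cable-pull test.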