Re: [Pacemaker] Pacemaker cluster: OpenAis communication channels
On 10/22/2009 09:48 AM, Steven Dake wrote:
> ftp://ftp% 40corosync.org:downlo...@corosync.org/presentations/corosync-roadmap.pdf

Unfortunately, this URL is not working for me.

Cheers,
Raoul
--
DI (FH) Raoul Bhatia M.Sc.
Technischer Leiter
IPAX - Aloy Bhatia Hava OEG
Barawitzkagasse 10/2/2/11
1190 Wien
FN 277995t HG Wien
email. r.bha...@ipax.at
web. http://www.ipax.at
email.off...@ipax.at
tel. +43 1 3670030
fax. +43 1 3670030 15
___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Re: [Pacemaker] Pacemaker cluster: OpenAis communication channels
Steve,

what has repeatedly come up is that RRP links don't auto-heal (see thread: http://oss.clusterlabs.org/pipermail/pacemaker/2009-May/001784.html), and that passive-mode RRP seems not to work at all (see thread: https://lists.linux-foundation.org/pipermail/openais/2009-October/013095.html; this was also heavily discussed on IRC, and the only approach that fixed the issue was to change rrp_mode to active).

Can you fill us in on the progress on these issues? Thanks!

Cheers,
Florian

On 10/22/2009 06:14 AM, Steven Dake wrote:
> You can run with one NIC (and switch), but then your NIC and switch become a SPOF (single point of failure). Vehicles have a spare tire for a reason :)
>
> If a NIC fails, it may be OK to switch a service to a different node. If a switch fails, the entire cluster becomes disabled until the switch returns to operation.
>
> Availability is a mathematical equation:
>
> A = MTTF / (MTTF + MTTR)
>
> Pacemaker improves availability (A) by reducing mean time to repair (MTTR) using failover while keeping the mean time to failure (MTTF) essentially the same (although it is generally a bit lower because of the other components required to introduce redundancy). Instead of a typical single-machine MTTR of 4 hours under a typical SLA, MTTR may be 5-10 seconds or less (the time to fail over the application and restart it).
>
> If MTTR is several days to service a switch, your availability may not meet your customer SLA obligations. When determining whether to use a redundant switch, the risks vs. cost have to be evaluated based upon your availability requirements.
>
> Regards
> -steve
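
To put rough numbers on the availability formula quoted above, here is a minimal Python sketch. The MTTR values of 4 hours and 10 seconds come from the quoted message; the MTTF of 10,000 hours and the reading of "several days" as 3 days are assumptions made purely for illustration, and the function names are hypothetical.

```python
# Illustrative sketch of A = MTTF / (MTTF + MTTR).
# MTTF_HOURS and the 3-day switch repair time are assumptions for this
# example; they are not figures from the thread.

HOURS_PER_YEAR = 24 * 365


def availability(mttf_hours, mttr_hours):
    """Return availability A = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_per_year_hours(a):
    """Expected hours of downtime per year for availability a."""
    return (1.0 - a) * HOURS_PER_YEAR


MTTF_HOURS = 10_000.0  # assumed mean time to failure

# MTTR per scenario, in hours.
scenarios = {
    "single machine, 4 h repair": 4.0,
    "failover, 10 s recovery": 10.0 / 3600.0,
    "switch outage, 3 days (assumed)": 72.0,
}

for label, mttr_hours in scenarios.items():
    a = availability(MTTF_HOURS, mttr_hours)
    print(f"{label}: A = {a:.6f}, "
          f"about {downtime_per_year_hours(a):.2f} h downtime/year")
```

With MTTF held fixed, shrinking MTTR from hours to seconds is what moves availability; a multi-day switch repair pulls it well below the single-machine case.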
Re: [Pacemaker] Pacemaker cluster: OpenAis communication channels
Florian,

I have checked the links you sent, and I am a bit confused by this RRP matter.

I want to use two machines (primary/secondary) and set up the following interfaces:
- eth0: direct link between primary and secondary
- bond0: bonding of eth1 and eth2 (redundant network with two switches)

Will there be any issues with using these two interfaces as OpenAis communication channels, especially with the bonding?

Thank you.