Re: [Pacemaker] Split-site cluster in two locations
On Tue, 2011-01-11 at 10:21 +0100, Christoph Herrmann wrote:
> -----Original Message-----
> From: Andrew Beekhof
> Sent: Tue 11.01.2011 09:01
> To: The Pacemaker cluster resource manager
> CC: Michael Schwartzkopff
> Subject: Re: [Pacemaker] Split-site cluster in two locations
>
> > On Tue, Dec 28, 2010 at 10:21 PM, Anton Altaparmakov wrote:
> > > Hi,
> > >
> > > On 28 Dec 2010, at 20:32, Michael Schwartzkopff wrote:
> > >> Hi,
> > >>
> > >> I have four nodes in a split-site scenario located in two computing
> > >> centers. STONITH is enabled.
> > >>
> > >> Is there any best practice for dealing with this setup? Does it make
> > >> sense to set expected-quorum-votes to "3" so that the whole setup keeps
> > >> running with one data center online? Is this possible at all?
> > >>
> > >> Is quorum needed with STONITH enabled?
> > >>
> > >> Is there a quorum server available already?
> > >
> > > I couldn't see a quorum server in Pacemaker, so I have installed a third,
> > > dummy node which is not allowed to run any resources (using location
> > > constraints and setting the cluster to not be symmetric) and which just
> > > acts as a third vote. I am hoping this effectively acts as a quorum
> > > server: a node that loses connectivity will lose quorum and shut down
> > > its services, whilst the other real node will retain connectivity and
> > > thus quorum, because the dummy node is still present.
> > >
> > > Obviously this is quite wasteful of servers, as you can only run a single
> > > Pacemaker instance on a server (as far as I know), so that is a lot of
> > > dummy servers when you run multiple Pacemaker clusters. Our solution is
> > > to use virtualization: one physical server with VMs, where each VM is a
> > > dummy node for one cluster.
> >
> > With recent 1.1.x builds it should be possible to run just the
> > corosync piece (no pacemaker).
>
> As long as you have only two computing centers, it doesn't matter whether you
> run a corosync-only piece or whatever on a physical or a virtual machine.
> The question is: how do you configure a four-node (or six-node, or any even
> number bigger than two) corosync/pacemaker cluster so that it continues
> services if you have a blackout in one computing center (you will always lose
> at least one half of your nodes), but shuts down everything if fewer than
> half of the nodes are available? Are there any best practices on how to deal
> with clusters in two computing centers? Anything like an external quorum node
> or a quorum partition? I'd like to set expected-quorum-votes to "3", but
> this is not possible (with corosync-1.2.6 and pacemaker-1.1.2 on SLES11 SP1).
> Does anybody know why? Currently, the only way I can figure out is to run the
> cluster with no-quorum-policy="ignore". But I don't like that. Any
> suggestions?
>
> Best regards
>
> Christoph

Hi,

I assume the only solution is to work with manual intervention, i.e. the
stonith meatware module. Whenever a site goes down, a human being has to
confirm that it is lost and pull the power cords or the inter-site links so
that it will not come back unintentionally, then confirm with meatclient on
the healthy site that the no-longer-reachable site can be considered gone.

In theory this can be configured with an additional meatware stonith resource
with lower priority. The intention is to let your regular stonith resources do
the work, with meatware as the last resort. However, I was not able to get
this running with the versions packaged with SLES11 SP1: the priority was not
honored and a lot of zombie meatware processes were left over. I found some
patches in the upstream repositories that seem to address these problems, but
I didn't follow up.
Regards
Holger

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
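The escalation Holger describes (regular stonith devices first, manual confirmation as the last resort) can be pictured with a small model. This is only an illustrative sketch in Python, not Pacemaker's API: `FenceMethod`, `fence`, the method names and the rule that a higher `priority` is tried first are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FenceMethod:
    """Hypothetical stand-in for one configured fencing device."""
    name: str
    priority: int                    # assumption: higher values are tried first
    attempt: Callable[[str], bool]   # returns True once the target is confirmed down


def fence(target: str, methods: List[FenceMethod]) -> Optional[str]:
    """Try each method in descending priority; return the name of the first
    one that confirms the target is down, or None if every method fails."""
    for method in sorted(methods, key=lambda m: m.priority, reverse=True):
        if method.attempt(target):
            return method.name
    return None


if __name__ == "__main__":
    # The remote power switch sits in the failed site, so it cannot answer.
    power_switch = FenceMethod("power-switch", 10, lambda target: False)
    # Last resort: an operator confirms by hand that the site is really gone.
    operator = FenceMethod("manual-confirmation", 1, lambda target: True)
    print(fence("site-b", [operator, power_switch]))
```

The point of the ordering is that the manual step is only reached when every automatic device has failed, which is exactly the situation after a whole site loses power.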
Re: [Pacemaker] Split-site cluster in two locations
-----Original message-----
To: The Pacemaker cluster resource manager
From: Christoph Herrmann
Sent: Tue 11-01-2011 10:24
Subject: Re: [Pacemaker] Split-site cluster in two locations

> As long as you have only two computing centers, it doesn't matter whether you
> run a corosync-only piece or whatever on a physical or a virtual machine.
> The question is: how do you configure a four-node (or six-node, or any even
> number bigger than two) corosync/pacemaker cluster so that it continues
> services if you have a blackout in one computing center (you will always lose
> at least one half of your nodes), but shuts down everything if fewer than
> half of the nodes are available? Are there any best practices on how to deal
> with clusters in two computing centers? Anything like an external quorum node
> or a quorum partition? I'd like to set expected-quorum-votes to "3", but
> this is not possible (with corosync-1.2.6 and pacemaker-1.1.2 on SLES11 SP1).
> Does anybody know why? Currently, the only way I can figure out is to run the
> cluster with no-quorum-policy="ignore". But I don't like that. Any
> suggestions?

Apart from the number of nodes in the datacenter: with two datacenters you
have another issue. How do you know which DC is reachable (from your clients'
point of view) when the communication between the DCs fails? The best fix for
this would be a node at a third DC, but you still run into problems with the
fencing devices. I doubt you can remotely power off the non-responding DC :-)
So a split-brain situation is likely to happen at some point. For 100% data
integrity I think it is best to let the cluster freeze itself...

Best Regards,
Robert van Leeuwen
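To make the trade-off behind "let the cluster freeze itself" concrete, the sketch below tabulates what a partition without quorum does under each value of Pacemaker's no-quorum-policy option. The four values are the documented ones; the three boolean columns are a paraphrase of the documentation as I understand it, and `PartitionBehaviour`, `behaviour_without_quorum` and `split_brain_possible` are names invented for this example. Check the documentation for the installed version before relying on it.

```python
from enum import Enum
from typing import NamedTuple


class NoQuorumPolicy(Enum):
    IGNORE = "ignore"
    FREEZE = "freeze"
    STOP = "stop"
    SUICIDE = "suicide"


class PartitionBehaviour(NamedTuple):
    keeps_running_resources: bool     # resources already active here stay up
    takes_over_from_lost_nodes: bool  # resources of unreachable nodes are started here
    fences_itself: bool               # nodes in this partition are fenced


# Paraphrased behaviour of a partition that has lost quorum (an assumption
# based on the Pacemaker documentation, not taken from the thread).
_BEHAVIOUR = {
    NoQuorumPolicy.IGNORE: PartitionBehaviour(True, True, False),
    NoQuorumPolicy.FREEZE: PartitionBehaviour(True, False, False),
    NoQuorumPolicy.STOP: PartitionBehaviour(False, False, False),
    NoQuorumPolicy.SUICIDE: PartitionBehaviour(False, False, True),
}


def behaviour_without_quorum(policy: NoQuorumPolicy) -> PartitionBehaviour:
    return _BEHAVIOUR[policy]


def split_brain_possible(policy: NoQuorumPolicy) -> bool:
    """After an even split both halves lack quorum; a split brain needs each
    half to start the resources that the other half is still running."""
    return behaviour_without_quorum(policy).takes_over_from_lost_nodes


if __name__ == "__main__":
    for policy in NoQuorumPolicy:
        print(policy.value, behaviour_without_quorum(policy),
              "split brain possible:", split_brain_possible(policy))
```

In this model only "ignore" lets both halves of a split take over each other's resources, while "freeze" keeps existing services up without starting duplicates.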
Re: [Pacemaker] Split-site cluster in two locations
-----Original Message-----
From: Andrew Beekhof
Sent: Tue 11.01.2011 09:01
To: The Pacemaker cluster resource manager
CC: Michael Schwartzkopff
Subject: Re: [Pacemaker] Split-site cluster in two locations

> On Tue, Dec 28, 2010 at 10:21 PM, Anton Altaparmakov wrote:
> > Hi,
> >
> > On 28 Dec 2010, at 20:32, Michael Schwartzkopff wrote:
> >> Hi,
> >>
> >> I have four nodes in a split-site scenario located in two computing
> >> centers. STONITH is enabled.
> >>
> >> Is there any best practice for dealing with this setup? Does it make
> >> sense to set expected-quorum-votes to "3" so that the whole setup keeps
> >> running with one data center online? Is this possible at all?
> >>
> >> Is quorum needed with STONITH enabled?
> >>
> >> Is there a quorum server available already?
> >
> > I couldn't see a quorum server in Pacemaker, so I have installed a third,
> > dummy node which is not allowed to run any resources (using location
> > constraints and setting the cluster to not be symmetric) and which just
> > acts as a third vote. I am hoping this effectively acts as a quorum
> > server: a node that loses connectivity will lose quorum and shut down its
> > services, whilst the other real node will retain connectivity and thus
> > quorum, because the dummy node is still present.
> >
> > Obviously this is quite wasteful of servers, as you can only run a single
> > Pacemaker instance on a server (as far as I know), so that is a lot of
> > dummy servers when you run multiple Pacemaker clusters. Our solution is
> > to use virtualization: one physical server with VMs, where each VM is a
> > dummy node for one cluster.
>
> With recent 1.1.x builds it should be possible to run just the
> corosync piece (no pacemaker).

As long as you have only two computing centers, it doesn't matter whether you
run a corosync-only piece or whatever on a physical or a virtual machine.
The question is: how do you configure a four-node (or six-node, or any even
number bigger than two) corosync/pacemaker cluster so that it continues
services if you have a blackout in one computing center (you will always lose
at least one half of your nodes), but shuts down everything if fewer than half
of the nodes are available? Are there any best practices on how to deal with
clusters in two computing centers? Anything like an external quorum node or a
quorum partition? I'd like to set expected-quorum-votes to "3", but this is
not possible (with corosync-1.2.6 and pacemaker-1.1.2 on SLES11 SP1). Does
anybody know why? Currently, the only way I can figure out is to run the
cluster with no-quorum-policy="ignore". But I don't like that. Any
suggestions?

Best regards

Christoph
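The arithmetic behind Christoph's dilemma can be worked through in a few lines. The sketch assumes the usual strict-majority rule (quorum is floor(expected votes / 2) + 1, one vote per node); whether corosync-1.2.6 and pacemaker-1.1.2 compute it exactly this way is not confirmed anywhere in the thread, and the function names are made up for the illustration.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if a partition holding votes_present votes has quorum."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    site_votes = 2  # four nodes, two per computing center, one vote each

    # Honest count: the surviving site holds 2 of 4 votes and is not quorate.
    print(quorum_threshold(4), is_quorate(site_votes, 4))  # 3 False

    # Understated count of 3: the surviving site is quorate after a blackout...
    print(quorum_threshold(3), is_quorate(site_votes, 3))  # 2 True

    # ...but if only the inter-site link fails, BOTH sites hold 2 votes and
    # both pass the same check, so each may run the services at once.
```

Under this model, lowering the expected votes to "3" keeps one site running after a blackout but cannot tell a blackout from a link failure, which is why the replies in this thread point to a third vote or to manual confirmation.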
Re: [Pacemaker] Split-site cluster in two locations
On Tue, Dec 28, 2010 at 10:21 PM, Anton Altaparmakov wrote:
> Hi,
>
> On 28 Dec 2010, at 20:32, Michael Schwartzkopff wrote:
>> Hi,
>>
>> I have four nodes in a split-site scenario located in two computing
>> centers. STONITH is enabled.
>>
>> Is there any best practice for dealing with this setup? Does it make sense
>> to set expected-quorum-votes to "3" so that the whole setup keeps running
>> with one data center online? Is this possible at all?
>>
>> Is quorum needed with STONITH enabled?
>>
>> Is there a quorum server available already?
>
> I couldn't see a quorum server in Pacemaker, so I have installed a third,
> dummy node which is not allowed to run any resources (using location
> constraints and setting the cluster to not be symmetric) and which just acts
> as a third vote. I am hoping this effectively acts as a quorum server: a
> node that loses connectivity will lose quorum and shut down its services,
> whilst the other real node will retain connectivity and thus quorum, because
> the dummy node is still present.
>
> Obviously this is quite wasteful of servers, as you can only run a single
> Pacemaker instance on a server (as far as I know), so that is a lot of dummy
> servers when you run multiple Pacemaker clusters. Our solution is to use
> virtualization: one physical server with VMs, where each VM is a dummy node
> for one cluster.

With recent 1.1.x builds it should be possible to run just the
corosync piece (no pacemaker).

> Best regards,
>
> Anton
Re: [Pacemaker] Split-site cluster in two locations
Hi,

On 28 Dec 2010, at 20:32, Michael Schwartzkopff wrote:
> Hi,
>
> I have four nodes in a split-site scenario located in two computing
> centers. STONITH is enabled.
>
> Is there any best practice for dealing with this setup? Does it make sense
> to set expected-quorum-votes to "3" so that the whole setup keeps running
> with one data center online? Is this possible at all?
>
> Is quorum needed with STONITH enabled?
>
> Is there a quorum server available already?

I couldn't see a quorum server in Pacemaker, so I have installed a third,
dummy node which is not allowed to run any resources (using location
constraints and setting the cluster to not be symmetric) and which just acts
as a third vote. I am hoping this effectively acts as a quorum server: a node
that loses connectivity will lose quorum and shut down its services, whilst
the other real node will retain connectivity and thus quorum, because the
dummy node is still present.

Obviously this is quite wasteful of servers, as you can only run a single
Pacemaker instance on a server (as far as I know), so that is a lot of dummy
servers when you run multiple Pacemaker clusters. Our solution is to use
virtualization: one physical server with VMs, where each VM is a dummy node
for one cluster.

> Thanks for any hints,
>
> Greetings!

Best regards,

Anton
--
Anton Altaparmakov (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/
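Anton's dummy-node arrangement can be checked with a small vote count. This is a sketch under the assumption that quorum means a strict majority of all configured votes and that every node, including the dummy, carries one vote; the node names and the `has_quorum` helper are hypothetical.

```python
from typing import Dict, Iterable


def has_quorum(partition: Iterable[str], votes: Dict[str, int]) -> bool:
    """True if the nodes in `partition` hold a strict majority of all votes."""
    total = sum(votes.values())
    present = sum(votes[node] for node in partition)
    return 2 * present > total


# Two real nodes plus one dummy node that only contributes a vote.
VOTES = {"node-a": 1, "node-b": 1, "dummy": 1}

if __name__ == "__main__":
    # node-b loses connectivity; node-a can still see the dummy node.
    print(has_quorum({"node-a", "dummy"}, VOTES))  # True
    print(has_quorum({"node-b"}, VOTES))           # False
```

The same count carries over to the four-node case from the original question: with a fifth vote at a third location, the site that still reaches the tie-breaker holds 3 of 5 votes and stays quorate, while the isolated site holds 2 of 5 and does not.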
[Pacemaker] Split-site cluster in two locations
Hi,

I have four nodes in a split-site scenario located in two computing centers.
STONITH is enabled.

Is there any best practice for dealing with this setup? Does it make sense to
set expected-quorum-votes to "3" so that the whole setup keeps running with
one data center online? Is this possible at all?

Is quorum needed with STONITH enabled?

Is there a quorum server available already?

Thanks for any hints,

Greetings!

--
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98