Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)
On Fri, Aug 6, 2021 at 3:47 PM Ulrich Windl wrote:
>
> >>> Antony Stone wrote on 06.08.2021 at 14:41 in message
> <202108061441.59936.antony.st...@ha.open.source.it>:
> ...
> > location pref_A GroupA rule -inf: site ne cityA
> > location pref_B GroupB rule -inf: site ne cityB
>
> I'm wondering whether the first is equivalent to
>
> location pref_A GroupA rule inf: site eq cityA

No, it is not. The original constraint prohibits running the resources anywhere except cityA, even if cityA is not available; your version allows them to run elsewhere if cityA is not available.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
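To make the difference concrete, here are both forms side by side in the crm syntax used in this thread ("pref_A_alt" is an invented name for Ulrich's variant):

```
# -inf (opt-out): every node where site != cityA is forbidden outright,
# so GroupA stops entirely when no cityA node is available.
location pref_A GroupA rule -inf: site ne cityA

# +inf (opt-in): cityA nodes are strongly preferred, but if none is
# available the resources may still start on a non-cityA node.
location pref_A_alt GroupA rule inf: site eq cityA
```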
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)
On Friday 06 August 2021 at 14:47:03, Ulrich Windl wrote:

> Antony Stone wrote on 06.08.2021 at 14:41
> > > location pref_A GroupA rule -inf: site ne cityA
> > > location pref_B GroupB rule -inf: site ne cityB
>
> I'm wondering whether the first is equivalent to
> location pref_A GroupA rule inf: site eq cityA

I certainly believe it is.

> In that case I think it's more clear (avoiding double negation).

Fair point :)

Antony.

--
3 logicians walk into a bar. The bartender asks "Do you all want a drink?"
The first logician says "I don't know."
The second logician says "I don't know."
The third logician says "Yes!"

Please reply to the list; please *don't* CC me.
[ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)
>>> Antony Stone wrote on 06.08.2021 at 14:41 in message <202108061441.59936.antony.st...@ha.open.source.it>:
...
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB

I'm wondering whether the first is equivalent to

location pref_A GroupA rule inf: site eq cityA

In that case I think it's more clear (avoiding double negation).

...

Regards,
Ulrich
[ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)
Hi!

Nice to hear. What could be "interesting" is how stable the WAN-type of corosync communication works. If it's not that stable, the cluster could try to fence nodes rather frequently. OK, you disabled fencing; maybe it works without. Did you tune the parameters?

Regards,
Ulrich

>>> Antony Stone wrote on 05.08.2021 at 14:44 in message <202108051444.39919.antony.st...@ha.open.source.it>:
> On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:
>
>> On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
>>>
>>> Have you ever tried to find out why this happens? (Talking about logs)
>>
>> Not in detail, no, but just in case there's a chance of getting this
>> working as suggested simply using location constraints, I shall look
>> further.
>
> I now have a working solution - thank you to everyone who has helped.
>
> The answer to the problem above was simple - with a 6-node cluster, 3 votes
> is not quorum.
>
> I added a 7th node (in "city C") and adjusted the location constraints to
> ensure that cluster A resources run in city A, cluster B resources run in
> city B, and the "anywhere" resource runs in either city A or city B.
>
> I've even added a colocation constraint to ensure that the "anywhere"
> resource runs on the same machine in either city A or city B as is running
> the local resources there (which wasn't a strict requirement, but is very
> useful).
>
> For anyone interested in the detail of how to do this (without needing
> booth), here is my cluster.conf file, as in "crm configure load replace
> cluster.conf":
>
> node tom attribute site=cityA
> node dick attribute site=cityA
> node harry attribute site=cityA
>
> node fred attribute site=cityB
> node george attribute site=cityB
> node ron attribute site=cityB
>
> primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> on-fail=restart
> primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> on-fail=restart
> primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60
> op monitor interval=5 timeout=20 on-fail=restart
>
> group GroupA A-float meta resource-stickiness=100
> group GroupB B-float meta resource-stickiness=100
> group Anywhere Asterisk meta resource-stickiness=100
>
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
>
> colocation Ast 100: Anywhere [ GroupA GroupB ]
>
> property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop
> start-failure-is-fatal=false cluster-recheck-interval=60s
>
> Of course, the group definitions are not needed for single resources, but I
> shall in practice be using multiple resources which do need groups, so I
> wanted to ensure I was creating something which would work with that.
>
> I have tested it by:
>
> - bringing up one node at a time: as soon as any 4 nodes are running, all
> possible resources are running
>
> - bringing up 5 or more nodes: all resources run
>
> - taking down one node at a time to a maximum of three nodes offline: if at
> least one node in a given city is running, the resources at that city are
> running
>
> - turning off (using "halt", so that corosync dies nicely) all three nodes
> in a city simultaneously: that city's resources stop running, the other
> city continues working, as well as the "anywhere" resource
>
> - causing a network failure at one city (so it simply disappears without
> stopping corosync neatly): the other city continues its resources (plus
> the "anywhere" resource), the isolated city stops
>
> For me, this is the solution I wanted, and in fact it's even slightly
> better than the previous two isolated 3-node clusters I had, because I can
> now have resources running on a single active node in cityA (provided it
> can see at least 3 other nodes in cityB or cityC), which wasn't possible
> before.
>
> Once again, thanks to everyone who has helped me to achieve this result :)
>
> Antony.
>
> --
> "The future is already here. It's just not evenly distributed yet."
>
>  - William Gibson
>
> Please reply to the list; please *don't* CC me.
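Regarding Ulrich's question above about tuning the parameters: corosync's default totem timings assume LAN latency, so a WAN link between cities usually warrants longer timeouts to avoid spurious membership changes (and, with fencing enabled, spurious fencing). A sketch of the relevant corosync.conf knobs — the values here are illustrative starting points, not recommendations:

```
totem {
    version: 2
    cluster_name: geocluster

    # Default token timeout is 1000 ms; over a WAN, a larger value
    # reduces false node-failure detections at the cost of slower
    # real-failure detection.
    token: 10000

    # Extra token time added per node beyond the first two.
    token_coefficient: 650

    # How many token retransmits are attempted before a node is
    # declared lost.
    token_retransmits_before_loss_const: 10

    # Must be larger than token (default is 1.2 * token).
    consensus: 12000
}
```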
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On 05/08/2021 00:11, Frank D. Engel, Jr. wrote:

> In theory if you could have an independent voting infrastructure among the three clusters which serves to effectively create a second cluster infrastructure interconnecting them to support resource D, you could

Yes. It's called booth.

> have D running on one of the clusters so long as at least two of them can communicate with each other. In other words, give each cluster one vote, then as long as two of them can communicate there are two votes which makes quorum, thus resource D can run on one of those two clusters. If all three clusters lose contact with each other, then D still cannot safely run.
>
> To keep the remaining resources working when contact is lost between the clusters, the vote for this would need to be independent of the vote within each individual cluster, effectively meaning that each node would belong to two clusters at once: its own local cluster (A/B/C) plus a "global" cluster spread across the three locations. I don't know offhand if that is readily possible to support with the current software.
>
> On 8/4/21 5:01 PM, Antony Stone wrote:
> > On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote:
> > > There is no safe way to do what you are trying to do. If the resource is on cluster A and contact is lost between clusters A and B due to a network failure, how does cluster B know if the resource is still running on cluster A or not? It has no way of knowing if cluster A is even up and running. In that situation it cannot safely start the resource.
> >
> > I am perfectly happy to have an additional machine at a third location in order to avoid this split-brain between two clusters. However, what I cannot have is for the resources which should be running on cluster A to get started on cluster B. If cluster A is down, then its resources should simply not run - as happens right now with two independent clusters. Suppose for a moment I had three clusters at three locations: A, B and C.
Is there a method by which I can have: 1. Cluster A resources running on cluster A if cluster A is functional and not running anywhere if cluster A is non-functional. 2. Cluster B resources running on cluster B if cluster B is functional and not running anywhere if cluster B is non-functional. 3. Cluster C resources running on cluster C if cluster C is functional and not running anywhere if cluster C is non-functional. 4. Resource D running _somewhere_ on clusters A, B or C, but only a single instance of D at a single location at any time. Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. Requirement 4 is the one I'm stuck with how to implement. If the three nodes comprising cluster A can manage resources such that they run on only one of the three nodes at any time, surely there must be a way of doing the same thing with a resource running on one of three clusters? Antony.
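To make the "second cluster infrastructure" idea concrete: booth does exactly this with a ticket that at most one site holds at a time. A minimal sketch of a booth.conf for the A/B/C layout — the addresses and ticket name are invented for illustration:

```
transport = UDP
port = 9929

# One booth "site" per Pacemaker cluster, plus a lightweight arbitrator
# at the third location to break ties (it runs booth only, not Pacemaker).
site = 192.168.32.100
site = 192.168.42.100
arbitrator = 192.168.52.10

ticket = "ticket-D"
```

Resource D is then tied to the ticket inside each cluster with a ticket constraint, e.g. `crm configure rsc_ticket D-with-ticket ticket-D: ResourceD loss-policy=stop`, so D runs only on the cluster currently holding the ticket.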
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Thursday 05 August 2021 at 07:43:30, Andrei Borzenkov wrote: > On 05.08.2021 00:01, Antony Stone wrote: > > > > Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. > > > > Requirement 4 is the one I'm stuck with how to implement. > > You either have single cluster and define appropriate location > constraints or you have multiple clusters and configure geo-cluster on > top of them. But you already have been told it multiple times. > > > If the three nodes comprising cluster A can manage resources such that > > they run on only one of the three nodes at any time, surely there must > > be a way of doing the same thing with a resource running on one of three > > clusters? > > You need something that coordinates resources between three clusters and > that is booth. Indeed: On Wednesday 04 August 2021 at 12:48:37, Antony Stone wrote: > I'm going to look into booth as suggested by others. Thanks, Antony. -- +++ Divide By Cucumber Error. Please Reinstall Universe And Reboot +++ Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On 05.08.2021 00:01, Antony Stone wrote: > On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote: > >> There is no safe way to do what you are trying to do. >> >> If the resource is on cluster A and contact is lost between clusters A >> and B due to a network failure, how does cluster B know if the resource >> is still running on cluster A or not? >> >> It has no way of knowing if cluster A is even up and running. >> >> In that situation it cannot safely start the resource. > > I am perfectly happy to have an additional machine at a third location in > order to avoid this split-brain between two clusters. > > However, what I cannot have is for the resources which should be running on > cluster A to get started on cluster B. > > If cluster A is down, then its resources should simply not run - as happens > right now with two independent clusters. > > Suppose for a moment I had three clusters at three locations: A, B and C. > > Is there a method by which I can have: > > 1. Cluster A resources running on cluster A if cluster A is functional and > not > running anywhere if cluster A is non-functional. > > 2. Cluster B resources running on cluster B if cluster B is functional and > not > running anywhere if cluster B is non-functional. > > 3. Cluster C resources running on cluster C if cluster C is functional and > not > running anywhere if cluster C is non-functional. > > 4. Resource D running _somewhere_ on clusters A, B or C, but only a single > instance of D at a single location at any time. > > Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. > > Requirement 4 is the one I'm stuck with how to implement. > You either have single cluster and define appropriate location constraints or you have multiple clusters and configure geo-cluster on top of them. But you already have been told it multiple times. 
> If the three nodes comprising cluster A can manage resources such that they > run on only one of the three nodes at any time, surely there must be a way of > doing the same thing with a resource running on one of three clusters? > > You need something that coordinates resources between three clusters and that is booth.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
I still can't understand why the whole cluster will fail when only 3 nodes are down and a qdisk is used.

CityA -> 3 nodes to run packageA -> 3 votes
CityB -> 3 nodes to run packageB -> 3 votes
CityC -> 1 node which cannot run any package (qdisk) -> 1 vote

Max votes: 7
Quorum: 4

As long as one city is up + qdisk -> your cluster will be working. Then you just configure that packageA cannot run in CityB, packageB cannot run in CityA. If all nodes in a city die, the relevant package will be down. Last, you configure your last resource without any location constraint.

PS: by package consider either a resource group or a single resource.

Best Regards,
Strahil Nikolov
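Strahil's vote arithmetic can be checked with a few lines of Python (this is just the standard majority-quorum math corosync uses, not a corosync API):

```python
def quorum(total_votes: int) -> int:
    """Majority quorum: strictly more than half of all votes."""
    return total_votes // 2 + 1

# 3 votes in CityA + 3 in CityB + 1 qdisk/arbitrator vote in CityC
total = 3 + 3 + 1
print(quorum(total))            # quorum is 4 of 7

# One whole city down: the surviving city plus the qdisk still has 4 votes.
print(3 + 1 >= quorum(total))   # True: the cluster keeps running

# Without the 7th vote (the original 3+3 layout), losing a city is fatal:
print(3 >= quorum(6))           # False: 3 of 6 is not quorum
```

This is exactly why the 6-node version of the cluster stopped everything when one city went down, and why adding the seventh vote fixed it.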
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
In theory if you could have an independent voting infrastructure among the three clusters which serves to effectively create a second cluster infrastructure interconnecting them to support resource D, you could have D running on one of the clusters so long as at least two of them can communicate with each other. In other words, give each cluster one vote, then as long as two of them can communicate there are two votes which makes quorum, thus resource D can run on one of those two clusters. If all three clusters lose contact with each other, then D still cannot safely run. To keep the remaining resources working when contact is lost between the clusters, the vote for this would need to be independent of the vote within each individual cluster, effectively meaning that each node would belong to two clusters at once: its own local cluster (A/B/C) plus a "global" cluster spread across the three locations. I don't know offhand if that is readily possible to support with the current software. On 8/4/21 5:01 PM, Antony Stone wrote: On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote: There is no safe way to do what you are trying to do. If the resource is on cluster A and contact is lost between clusters A and B due to a network failure, how does cluster B know if the resource is still running on cluster A or not? It has no way of knowing if cluster A is even up and running. In that situation it cannot safely start the resource. I am perfectly happy to have an additional machine at a third location in order to avoid this split-brain between two clusters. However, what I cannot have is for the resources which should be running on cluster A to get started on cluster B. If cluster A is down, then its resources should simply not run - as happens right now with two independent clusters. Suppose for a moment I had three clusters at three locations: A, B and C. Is there a method by which I can have: 1. 
Cluster A resources running on cluster A if cluster A is functional and not running anywhere if cluster A is non-functional. 2. Cluster B resources running on cluster B if cluster B is functional and not running anywhere if cluster B is non-functional. 3. Cluster C resources running on cluster C if cluster C is functional and not running anywhere if cluster C is non-functional. 4. Resource D running _somewhere_ on clusters A, B or C, but only a single instance of D at a single location at any time. Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. Requirement 4 is the one I'm stuck with how to implement. If the three nodes comprising cluster A can manage resources such that they run on only one of the three nodes at any time, surely there must be a way of doing the same thing with a resource running on one of three clusters? Antony.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote: > There is no safe way to do what you are trying to do. > > If the resource is on cluster A and contact is lost between clusters A > and B due to a network failure, how does cluster B know if the resource > is still running on cluster A or not? > > It has no way of knowing if cluster A is even up and running. > > In that situation it cannot safely start the resource. I am perfectly happy to have an additional machine at a third location in order to avoid this split-brain between two clusters. However, what I cannot have is for the resources which should be running on cluster A to get started on cluster B. If cluster A is down, then its resources should simply not run - as happens right now with two independent clusters. Suppose for a moment I had three clusters at three locations: A, B and C. Is there a method by which I can have: 1. Cluster A resources running on cluster A if cluster A is functional and not running anywhere if cluster A is non-functional. 2. Cluster B resources running on cluster B if cluster B is functional and not running anywhere if cluster B is non-functional. 3. Cluster C resources running on cluster C if cluster C is functional and not running anywhere if cluster C is non-functional. 4. Resource D running _somewhere_ on clusters A, B or C, but only a single instance of D at a single location at any time. Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. Requirement 4 is the one I'm stuck with how to implement. If the three nodes comprising cluster A can manage resources such that they run on only one of the three nodes at any time, surely there must be a way of doing the same thing with a resource running on one of three clusters? Antony. -- I don't know, maybe if we all waited then cosmic rays would write all our software for us. Of course it might take a while. 
- Ron Minnich, Los Alamos National Laboratory Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
There is no safe way to do what you are trying to do. If the resource is on cluster A and contact is lost between clusters A and B due to a network failure, how does cluster B know if the resource is still running on cluster A or not? It has no way of knowing if cluster A is even up and running. In that situation it cannot safely start the resource. If the network is down and both clusters come up at the same time, without being able to contact each other, neither knows if the other is running the resource, so neither can safely start it. On 8/4/21 3:27 PM, Antony Stone wrote: On Wednesday 04 August 2021 at 20:57:49, Strahil Nikolov wrote: That's why you need a qdisk at a 3-rd location, so you will have 7 votes in total. When 3 nodes in cityA die, all resources will be started on the remaining 3 nodes. I think I have not explained this properly. I have three nodes in city A which run resources which have to run in city A. They are based on IP addresses which are only valid on the network in city A. I have three nodes in city B which run resources which have to run in city B. They are based on IP addresses which are only valid on the network in city B. I have redundant routing between my upstream provider, and cities A and B, so that I only _need_ resources to be running in one of the two cities for everything to work as required. City A can go completely offline and not run its resources, and everything I need continues to work via city B. I now have an additional requirement to run a single resource at either city A or city B but not both. As soon as I connect the clusters at city A and city B, and apply the location constraints and weighting rules you have suggested: 1. everything works, including the single resource at either city A or city B, so long as both clusters are operational. 2. as soon as one cluster fails (all three of its nodes become unavailable), then the other cluster stops running all its resources as well. This is even with quorum=2.
This means I have lost the redundancy between my two clusters, which is based on the expectation that only one cluster will fail at a time. If the failure of one automatically _causes_ the failure of the other, I have no high availability any more. What I require is for cluster A to continue running its own resources, plus the single resource which can run anywhere, in the event that cluster B fails. In other words, I need the exact same outcome as I have at present if cluster B fails (its resources stop, cluster A is unaffected), except that cluster A continues to run the single resource which I need just a single instance of. It is impossible for the nodes at city A to run the resources which should be running at city B, partly because some of them are identical ("Asterisk" as a resource, for example, is already running at city A), and partly because some of them are bound to the networking arrangements (I cannot set a floating IP address which belongs in city A on a machine which exists in city B - it just doesn't work). Therefore if adding a seventh node at a third location would try to start _all_ resources in city A if city B goes down, it is not a working solution. If city B goes down then I simply do not want its resources to be running anywhere, just the same as I have now with the two independent clusters. Thanks, Antony.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wednesday 04 August 2021 at 20:57:49, Strahil Nikolov wrote: > That's why you need a qdisk at a 3-rd location, so you will have 7 votes in > total. When 3 nodes in cityA die, all resources will be started on the > remaining 3 nodes. I think I have not explained this properly. I have three nodes in city A which run resources which have to run in city A. They are based on IP addresses which are only valid on the network in city A. I have three nodes in city B which run resources which have to run in city B. They are based on IP addresses which are only valid on the network in city B. I have redundant routing between my upstream provider, and cities A and B, so that I only _need_ resources to be running in one of the two cities for everything to work as required. City A can go completely offline and not run its resources, and everything I need continues to work via city B. I now have an additional requirement to run a single resource at either city A or city B but not both. As soon as I connect the clusters at city A and city B, and apply the location constraints and weighting rules you have suggested: 1. everything works, including the single resource at either city A or city B, so long as both clusters are operational. 2. as soon as one cluster fails (all three of its nodes become unavailable), then the other cluster stops running all its resources as well. This is even with quorum=2. This means I have lost the redundancy between my two clusters, which is based on the expectation that only one cluster will fail at a time. If the failure of one automatically _causes_ the failure of the other, I have no high availability any more. What I require is for cluster A to continue running its own resources, plus the single resource which can run anywhere, in the event that cluster B fails.
In other words, I need the exact same outcome as I have at present if cluster B fails (its resources stop, cluster A is unaffected), except that cluster A continues to run the single resource which I need just a single instance of. It is impossible for the nodes at city A to run the resources which should be running at city B, partly because some of them are identical ("Asterisk" as a resource, for example, is already running at city A), and partly because some of them are bound to the networking arrangements (I cannot set a floating IP address which belongs in city A on a machine which exists in city B - it just doesn't work). Therefore if adding a seventh node at a third location would try to start _all_ resources in city A if city B goes down, it is not a working solution. If city B goes down then I simply do not want its resources to be running anywhere, just the same as I have now with the two independent clusters. Thanks, Antony. -- "In fact I wanted to be John Cleese and it took me some time to realise that the job was already taken." - Douglas Adams Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
That's why you need a qdisk at a 3-rd location, so you will have 7 votes in total. When 3 nodes in cityA die, all resources will be started on the remaining 3 nodes.

Best Regards,
Strahil Nikolov

On Wed, Aug 4, 2021 at 17:23, Antony Stone wrote: On Wednesday 04 August 2021 at 16:07:39, Andrei Borzenkov wrote: > On Wed, Aug 4, 2021 at 5:03 PM Antony Stone wrote: > > On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > > > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > > > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users > > > > wrote: > > > > > Won't something like this work ? Each node in LA will have same > > > > > score of 5000, while other cities will be -5000. > > > > > > > > > > pcs constraint location DummyRes1 rule score=5000 city eq LA > > > > > pcs constraint location DummyRes1 rule score=-5000 city ne LA > > > > > stickiness -> 1 > > > > > > > > Thanks for the idea, but no difference. > > > > > > > > Basically, as soon as zero nodes in one city are available, all > > > > resources, including those running perfectly at the other city, stop. > > > > > > That is not what you originally said. > > > > > > You said you have 6 node cluster (3 + 3) and 2 nodes are not available. > > > > No, I don't think I said that? > > "With the new setup, if two machines in city A fail, then _both_ > clusters stop working" Ah, apologies - that was a typo. "With the new setup, if the machines in city A fail, then _both_ clusters stop working". So, basically what I'm saying is that with two separate clusters, if one fails, the other keeps going (as one would expect). Joining the two clusters together so that I can have a single floating resource which can run anywhere (as well as the exact same location-specific resources as before) results in one cluster failure taking the other cluster down too. I need one fully-working 3-node cluster to keep going, no matter what the other cluster does. Antony.

--
It is also possible that putting the birds in a laboratory setting inadvertently renders them relatively incompetent.

 - Daniel C Dennett

Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wednesday 04 August 2021 at 16:07:39, Andrei Borzenkov wrote: > On Wed, Aug 4, 2021 at 5:03 PM Antony Stone wrote: > > On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > > > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > > > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users > > > > wrote: > > > > > Won't something like this work ? Each node in LA will have same > > > > > score of 5000, while other cities will be -5000. > > > > > > > > > > pcs constraint location DummyRes1 rule score=5000 city eq LA > > > > > pcs constraint location DummyRes1 rule score=-5000 city ne LA > > > > > stickiness -> 1 > > > > > > > > Thanks for the idea, but no difference. > > > > > > > > Basically, as soon as zero nodes in one city are available, all > > > > resources, including those running perfectly at the other city, stop. > > > > > > That is not what you originally said. > > > > > > You said you have 6 node cluster (3 + 3) and 2 nodes are not available. > > > > No, I don't think I said that? > > "With the new setup, if two machines in city A fail, then _both_ > clusters stop working" Ah, apologies - that was a typo. "With the new setup, if the machines in city A fail, then _both_ clusters stop working". So, basically what I'm saying is that with two separate clusters, if one fails, the other keeps going (as one would expect). Joining the two clusters together so that I can have a single floating resource which can run anywhere (as well as the exact same location-specific resources as before) results in one cluster failure taking the other cluster down too. I need one fully-working 3-node cluster to keep going, no matter what the other cluster does. Antony. -- It is also possible that putting the birds in a laboratory setting inadvertently renders them relatively incompetent. - Daniel C Dennett Please reply to the list; please *don't* CC me. 
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wed, Aug 4, 2021 at 5:03 PM Antony Stone wrote: > > On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > > > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote: > > > > Won't something like this work ? Each node in LA will have same score > > > > of 5000, while other cities will be -5000. > > > > > > > > pcs constraint location DummyRes1 rule score=5000 city eq LA > > > > pcs constraint location DummyRes1 rule score=-5000 city ne LA > > > > stickiness -> 1 > > > > > > Thanks for the idea, but no difference. > > > > > > Basically, as soon as zero nodes in one city are available, all > > > resources, including those running perfectly at the other city, stop. > > > > That is not what you originally said. > > > > You said you have 6 node cluster (3 + 3) and 2 nodes are not available. > > No, I don't think I said that? > "With the new setup, if two machines in city A fail, then _both_ clusters stop working"
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote:

> On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote:
> > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote:
> > > Won't something like this work ? Each node in LA will have same score
> > > of 5000, while other cities will be -5000.
> > >
> > > pcs constraint location DummyRes1 rule score=5000 city eq LA
> > > pcs constraint location DummyRes1 rule score=-5000 city ne LA
> > > stickiness -> 1
> >
> > Thanks for the idea, but no difference.
> >
> > Basically, as soon as zero nodes in one city are available, all
> > resources, including those running perfectly at the other city, stop.
>
> That is not what you originally said.
>
> You said you have 6 node cluster (3 + 3) and 2 nodes are not available.

No, I don't think I said that?

With the new setup, if 2 nodes are not available, everything carries on
working; it doesn't matter whether the two nodes are in the same or
different locations. That's fine.

My problem is that with the new setup, if three nodes at one location go
down, then *everything* stops, including the resources I want to carry on
running at the other location.

Under my previous, working arrangement with two separate clusters, one data
centre going down does not affect the other, therefore I have a fully
working system (since the two data centres provide identical services with
redundant routing).

A failure of one data centre taking down working services in the other data
centre is not the high availability solution I'm looking for - it's more
like high unavailability :)

Antony.

--
BASIC is to computer languages what Roman numerals are to arithmetic.

Please reply to the list;
please *don't* CC me.
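[Editor's note: the behaviour described above is ordinary quorum loss in a six-node cluster (3 remaining votes out of 6 is not a majority). One commonly used mitigation, not discussed in this thread, is a corosync quorum device on a host at a third location, so that whichever site can still reach the arbiter retains quorum when the other site goes down entirely. A minimal sketch follows; the hostname `qhost` is invented for illustration.]

```shell
# On the third-site arbiter host: install and start the qdevice daemon
# (package names vary by distribution, e.g. corosync-qnetd).
pcs qdevice setup model net --enable --start

# On one node of the (joined) cluster: register the quorum device.
# The ffsplit algorithm gives its vote to the partition holding
# a fifty-fifty split, letting one whole site survive the loss of the other.
pcs quorum device add model net host=qhost algorithm=ffsplit
```

This keeps a single six-node cluster quorate after losing one site, but note it still makes the two sites one cluster; booth (discussed below in the thread) is the usual answer when genuinely independent clusters are wanted.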
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote:
>
> On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote:
> > Won't something like this work ? Each node in LA will have same score of
> > 5000, while other cities will be -5000.
> >
> > pcs constraint location DummyRes1 rule score=5000 city eq LA
> > pcs constraint location DummyRes1 rule score=-5000 city ne LA
> > stickiness -> 1
>
> Thanks for the idea, but no difference.
>
> Basically, as soon as zero nodes in one city are available, all resources,
> including those running perfectly at the other city, stop.

That is not what you originally said.

You said you have a 6 node cluster (3 + 3) and 2 nodes are not available.

If you lose half of the nodes and do not have working fencing, then this is
expected behavior (in the default configuration). You may configure the
cluster to keep running resources, but you cannot configure the cluster to
take over resources without fencing (well, you can, but ...)

> I'm going to look into booth as suggested by others.
>
> Thanks,
>
> Antony.
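[Editor's note: the "keep running resources" option Andrei refers to corresponds to Pacemaker's `no-quorum-policy` cluster property. A minimal sketch of that setting, shown in both pcs and crmsh forms since both tools appear in this thread:]

```shell
# With no-quorum-policy=freeze, a partition that loses quorum keeps the
# resources it is already running but will not start or take over any
# others. The default ("stop") stops everything, which is the behaviour
# Antony observed.
pcs property set no-quorum-policy=freeze

# crmsh equivalent:
crm configure property no-quorum-policy=freeze
```

As Andrei notes, this only preserves what is already running; failover across the partition boundary still requires working fencing.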
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote:

> Won't something like this work ? Each node in LA will have same score of
> 5000, while other cities will be -5000.
>
> pcs constraint location DummyRes1 rule score=5000 city eq LA
> pcs constraint location DummyRes1 rule score=-5000 city ne LA
> stickiness -> 1

Thanks for the idea, but no difference.

Basically, as soon as zero nodes in one city are available, all resources,
including those running perfectly at the other city, stop.

I'm going to look into booth as suggested by others.

Thanks,

Antony.

--
Atheism is a non-prophet-making organisation.

Please reply to the list;
please *don't* CC me.
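[Editor's note: since booth is the direction suggested here, a rough sketch of what a booth geo-cluster configuration might look like for this setup: two independent three-node clusters plus a lightweight arbitrator at a third location, with a ticket controlling the single floating resource. All addresses and names below are invented for illustration.]

```
# /etc/booth/booth.conf (hypothetical) - same file on both sites
# and the arbitrator
transport = UDP
port = 9929
site = 10.0.1.100        # cluster A (city A)
site = 10.0.2.100        # cluster B (city B)
arbitrator = 10.0.3.100  # third location, not a cluster member
ticket = "floating-service"
```

The floating resource would then be tied to the ticket on each cluster, e.g. with `pcs constraint ticket add floating-service FloatingGroup loss-policy=stop` (crmsh: `rsc_ticket`), where `FloatingGroup` is a placeholder resource name. Each cluster keeps its own quorum, so losing one site no longer affects the other's local resources.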
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
Won't something like this work ? Each node in LA will have same score of
5000, while other cities will be -5000.

pcs constraint location DummyRes1 rule score=5000 city eq LA
pcs constraint location DummyRes1 rule score=-5000 city ne LA
stickiness -> 1

Best Regards,
Strahil Nikolov

Out of curiosity: Could one write a rule that demands that a resource
migration should (not) happen within the same city? ("should" means
"preferably when there are alternatives")
[ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
>>> Antony Stone schrieb am 03.08.2021 um 10:40 in Nachricht
<202108031040.28312.antony.st...@ha.open.source.it>:

> On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:
>
>> Here is the example I had promised:
>>
>> pcs node attribute server1 city=LA
>> pcs node attribute server2 city=NY
>>
>> # Don't run on any node that is not in LA
>> pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
>>
>> # Don't run on any node that is not in NY
>> pcs constraint location DummyRes2 rule score=-INFINITY city ne NY

Hi!

Out of curiosity: Could one write a rule that demands that a resource
migration should (not) happen within the same city? ("should" means
"preferably when there are alternatives")

Regards,
Ulrich

>> The idea is that if you add a node and you forget to specify the
>> attribute with the name 'city', DummyRes1 & DummyRes2 won't be started
>> on it.
>>
>> For resources that do not have a constraint based on the city -> they
>> will run everywhere unless you specify a colocation constraint between
>> the resources.
>
> Excellent - thanks. I happen to use crmsh rather than pcs, but I've
> adapted the above and got it working.
>
> Unfortunately, there is a problem.
>
> My current setup is:
>
> One 3-machine cluster in city A running a bunch of resources between
> them, the most important of which for this discussion is Asterisk
> telephony.
>
> One 3-machine cluster in city B doing exactly the same thing.
>
> The two clusters have no knowledge of each other.
>
> I have high-availability routing between my clusters and my upstream
> telephony provider, such that a call can be handled by Cluster A or
> Cluster B, and if one is unavailable, the call gets routed to the other.
>
> Thus, a total failure of Cluster A means I still get phone calls, via
> Cluster B.
>
> To implement the above "one resource which can run anywhere, but only a
> single instance", I joined together clusters A and B, and placed the
> corresponding location constraints on the resources I want only at A and
> the ones I want only at B. I then added the resource with no location
> constraint, and it runs anywhere, just once.
>
> So far, so good.
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working.
> That means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters
> stop working and I have no functional resources anywhere.
>
> So, my question now is:
>
> How can I have a 3-machine Cluster A running local resources, and a
> 3-machine Cluster B running local resources, plus one resource running on
> either Cluster A or Cluster B, but without a failure of one cluster
> causing _everything_ to stop?

Kind of stupid idea: Set up an IP address on the cluster that runs your
special resource (colocation) and make the other cluster monitor that IP
address: if the IP is down for some time, launch your resource locally.
Probably you want two different IP addresses in each cluster.

The other question is: How harmful is it if both clusters run the resource
for a short time? Basically you need a kind of "cross-cluster semaphore"
that keeps the state whether your resource is running somewhere.

Regards,
Ulrich

> Thanks,
>
> Antony.
>
> --
> One tequila, two tequila, three tequila, floor.
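[Editor's note: Antony mentions adapting Strahil's pcs commands to crmsh. For anyone doing the same, a rough sketch of the crmsh equivalents; the constraint ids are invented, and node/resource names are copied from Strahil's example:]

```shell
# Set the 'city' node attribute (crmsh form of "pcs node attribute ...")
crm node attribute server1 set city LA
crm node attribute server2 set city NY

# Pin each resource away from nodes that are not in its city,
# mirroring the score=-INFINITY pcs rules above
crm configure location loc-DummyRes1-LA DummyRes1 rule -inf: city ne LA
crm configure location loc-DummyRes2-NY DummyRes2 rule -inf: city ne NY
```

As in the pcs version, a node without a `city` attribute matches neither rule's exception, so neither resource will start there; a resource with no such constraint remains free to run on any node.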