Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread kgaillot
On Fri, 2021-08-06 at 15:48 +0200, Ulrich Windl wrote:
> Andrei Borzenkov wrote on 06.08.2021 at 15:14:
> > On Fri, Aug 6, 2021 at 3:47 PM Ulrich Windl wrote:
> > > Antony Stone wrote on 06.08.2021 at 14:41 in message
> > > <202108061441.59936.antony.st...@ha.open.source.it>:
> > > ...
> > > >   location pref_A GroupA rule -inf: site ne cityA
> > > >   location pref_B GroupB rule -inf: site ne cityB
> > > 
> > > I'm wondering whether the first is equivalent to
> > > location pref_A GroupA rule inf: site eq cityA
> > > 
> > 
> > No, it is not. The original constraint prohibits running resources
> > anywhere except cityA even if cityA is not available; your version
> > allows it if cityA is not available.
> 
> ?? If a resource must run on "cityA" and cityA is unavailable, then will it
> run elsewhere?

-inf = must not
+inf != must
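
To spell out the difference, here is a minimal Python sketch of how the two
rules behave. It is a toy model, not Pacemaker's scheduler: it assumes a
symmetric (opt-out) cluster where every node starts at score 0 and a node with
a negative score may not run the resource. The node names and the place()
helper are made up for illustration.

```python
INFINITY = 1_000_000  # Pacemaker represents "infinite" scores as +/-1,000,000

# Hypothetical nodes; "site" mirrors the node attribute used in the rules.
NODES = {"a1": "cityA", "a2": "cityA", "b1": "cityB", "b2": "cityB"}


def ban_elsewhere(site):
    # Models: location pref_A GroupA rule -inf: site ne cityA
    return -INFINITY if site != "cityA" else 0


def prefer_city_a(site):
    # Models: location pref_A GroupA rule inf: site eq cityA
    return INFINITY if site == "cityA" else 0


def place(rule, online):
    """Return the best-scoring online node, or None if all of them are banned."""
    scores = {node: rule(NODES[node]) for node in sorted(online)}
    # Assumption of this model: a negative score means "may not run here".
    allowed = {node: score for node, score in scores.items() if score >= 0}
    if not allowed:
        return None
    return max(allowed, key=allowed.get)


everything_up = {"a1", "a2", "b1", "b2"}
city_a_down = {"b1", "b2"}

print(place(ban_elsewhere, everything_up))  # a cityA node
print(place(prefer_city_a, everything_up))  # a cityA node
print(place(ban_elsewhere, city_a_down))    # None: the resource stays stopped
print(place(prefer_city_a, city_a_down))    # a cityB node: it falls back
```

With everything up, both rules put GroupA in cityA. They only differ once
cityA is gone: the -inf rule leaves the resource stopped, the +inf rule lets it
move.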

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Ulrich Windl
Andrei Borzenkov wrote on 06.08.2021 at 15:14:
> On Fri, Aug 6, 2021 at 3:47 PM Ulrich Windl wrote:
>>
>> Antony Stone wrote on 06.08.2021 at 14:41 in message
>> <202108061441.59936.antony.st...@ha.open.source.it>:
>> ...
>> >   location pref_A GroupA rule -inf: site ne cityA
>> >   location pref_B GroupB rule -inf: site ne cityB
>>
>> I'm wondering whether the first is equivalent to
>> location pref_A GroupA rule inf: site eq cityA
>>
> 
> No, it is not. The original constraint prohibits running resources
> anywhere except cityA even if cityA is not available; your version
> allows it if cityA is not available.

?? If a resource must run on "cityA" and cityA is unavailable, then will it
run elsewhere?



Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:

> Antony Stone wrote on 04.08.2021 at 21:27:
> > 
> > As soon as I connect the clusters at city A and city B, and apply the
> > location constraints and weighting rules you have suggested:
> > 
> > 1. everything works, including the single resource at either city A or
> > city B, so long as both clusters are operational.
> > 
> > 2. as soon as one cluster fails (all three of its nodes become
> > unavailable), then the other cluster stops running all its resources as
> > well. This is even with quorum=2.
> 
> Have you ever tried to find out why this happens? (Talking about logs)

Not in detail, no, but just in case there's a chance of getting this working 
as suggested simply using location constraints, I shall look further.

Thanks,


Antony.

-- 
This sentence contains exacly three erors.

   Please reply to the list;
 please *don't* CC me.


[ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Ulrich Windl
"Frank D. Engel, Jr." wrote on 05.08.2021 at 00:11:
> In theory if you could have an independent voting infrastructure among 
> the three clusters which serves to effectively create a second cluster 
> infrastructure interconnecting them to support resource D, you could 
> have D running on one of the clusters so long as at least two of them 
> can communicate with each other.

Hi!

That's what I thought too, BUT:
If you have some common (NAS) storage where each cluster sends a "proof of 
life" periodically, what will happen if the network is down?
Each cluster will think the other is dead, so no help.
Maybe it's time for an independent communication channel.
Maybe quorum-voting via SMS or packet radio? ;-)
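
To make the one-vote-per-cluster idea quoted below concrete, here is a small
Python sketch. It only models the voting logic, not any real inter-cluster
transport; the function names and the way links are represented are
assumptions made for illustration. It also shows the failure case described
above: once every link is down, no group holds a majority, so D runs nowhere.

```python
def components(members, links):
    """Group members into sets that can still reach each other over `links`."""
    remaining = set(members)
    groups = []
    while remaining:
        start = remaining.pop()
        group, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            for a, b in links:
                if a == current:
                    other = b
                elif b == current:
                    other = a
                else:
                    continue
                if other in remaining:
                    remaining.discard(other)
                    group.add(other)
                    frontier.append(other)
        groups.append(group)
    return groups


def where_may_d_run(clusters, links):
    """Return the clusters that jointly hold a majority of votes, else an empty set."""
    needed = len(clusters) // 2 + 1  # one vote per cluster, strict majority
    for group in components(clusters, links):
        if len(group) >= needed:
            return group
    return set()


clusters = ["A", "B", "C"]
print(where_may_d_run(clusters, [("A", "B")]))  # A and B still talk: D may run there
print(where_may_d_run(clusters, []))            # all links down: set(), D runs nowhere
```

At most one group can ever hold a majority, which is what guarantees a single
instance of D.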

Regards,
Ulrich
> 
> 
> In other words, give each cluster one vote, then as long as two of them 
> can communicate there are two votes which makes quorum, thus resource D 
> can run on one of those two clusters.
> 
> If all three clusters lose contact with each other, then D still cannot 
> safely run.
> 
> 
> To keep the remaining resources working when contact is lost between the 
> clusters, the vote for this would need to be independent of the vote 
> within each individual cluster, effectively meaning that each node would 
> belong to two clusters at once: its own local cluster (A/B/C) plus a 
> "global" cluster spread across the three locations.  I don't know 
> offhand if that is readily possible to support with the current software.
> 
> 
> On 8/4/21 5:01 PM, Antony Stone wrote:
>> On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote:
>>
>>> There is no safe way to do what you are trying to do.
>>>
>>> If the resource is on cluster A and contact is lost between clusters A
>>> and B due to a network failure, how does cluster B know if the resource
>>> is still running on cluster A or not?
>>>
>>> It has no way of knowing if cluster A is even up and running.
>>>
>>> In that situation it cannot safely start the resource.
>> I am perfectly happy to have an additional machine at a third location in
>> order to avoid this split-brain between two clusters.
>>
>> However, what I cannot have is for the resources which should be running on
>> cluster A to get started on cluster B.
>>
>> If cluster A is down, then its resources should simply not run - as happens
>> right now with two independent clusters.
>>
>> Suppose for a moment I had three clusters at three locations: A, B and C.
>>
>> Is there a method by which I can have:
>>
>> 1. Cluster A resources running on cluster A if cluster A is functional and
>> not running anywhere if cluster A is non-functional.
>>
>> 2. Cluster B resources running on cluster B if cluster B is functional and
>> not running anywhere if cluster B is non-functional.
>>
>> 3. Cluster C resources running on cluster C if cluster C is functional and
>> not running anywhere if cluster C is non-functional.
>>
>> 4. Resource D running _somewhere_ on clusters A, B or C, but only a single
>> instance of D at a single location at any time.
>>
>> Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters.
>>
>> Requirement 4 is the one I'm stuck with how to implement.
>>
>> If the three nodes comprising cluster A can manage resources such that they
>> run on only one of the three nodes at any time, surely there must be a way of
>> doing the same thing with a resource running on one of three clusters?
>>
>>
>> Antony.
>>
> 


[ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Ulrich Windl
Antony Stone wrote on 04.08.2021 at 23:01 in message
<202108042301.19895.antony.st...@ha.open.source.it>:
> On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote:
> 
>> There is no safe way to do what you are trying to do.
>> 
>> If the resource is on cluster A and contact is lost between clusters A
>> and B due to a network failure, how does cluster B know if the resource
>> is still running on cluster A or not?
>>
>> It has no way of knowing if cluster A is even up and running.
>> 
>> In that situation it cannot safely start the resource.
> 
> I am perfectly happy to have an additional machine at a third location in
> order to avoid this split-brain between two clusters.
> 
> However, what I cannot have is for the resources which should be running on
> cluster A to get started on cluster B.
> 
> If cluster A is down, then its resources should simply not run - as happens
> right now with two independent clusters.
> 
> Suppose for a moment I had three clusters at three locations: A, B and C.
> 
> Is there a method by which I can have:
> 
> 1. Cluster A resources running on cluster A if cluster A is functional and
> not running anywhere if cluster A is non-functional.

If cluster A is non-functional, no resources of cluster A will run.

> 
> 2. Cluster B resources running on cluster B if cluster B is functional and
> not running anywhere if cluster B is non-functional.

Likewise for cluster B.

> 
> 3. Cluster C resources running on cluster C if cluster C is functional and
> not running anywhere if cluster C is non-functional.

Same here.

> 
> 4. Resource D running _somewhere_ on clusters A, B or C, but only a single 
> instance of D at a single location at any time.

Part of the problem is your description: Actually you do not have a resource
D, but you have three resources like D_A, D_B, and D_C.

Maybe things would be easier if it were all one big cluster with location
constraints.


> 
> Requirements 1, 2 and 3 are easy to achieve ‑ don't connect the clusters.
> 
> Requirement 4 is the one I'm stuck with how to implement.
> 
> If the three nodes comprising cluster A can manage resources such that they
> run on only one of the three nodes at any time, surely there must be a way of
> doing the same thing with a resource running on one of three clusters?
> 
> 
> Antony.
> 
> ‑‑ 
> I don't know, maybe if we all waited then cosmic rays would write all our 
> software for us. Of course it might take a while.
> 
>  ‑ Ron Minnich, Los Alamos National Laboratory
> 
>    Please reply to the list;
>  please *don't* CC me.


[ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Ulrich Windl
Antony Stone wrote on 04.08.2021 at 21:27 in message
<202108042127.43916.antony.st...@ha.open.source.it>:
> On Wednesday 04 August 2021 at 20:57:49, Strahil Nikolov wrote:
> 
>> That's why you need a qdisk at a 3rd location, so you will have 7 votes in
>> total. When 3 nodes in cityA die, all resources will be started on the
>> remaining 3 nodes.
> 
> I think I have not explained this properly.
> 
> I have three nodes in city A which run resources which have to run in city A.
> They are based on IP addresses which are only valid on the network in city A.
> 
> I have three nodes in city B which run resources which have to run in city B.
> They are based on IP addresses which are only valid on the network in city B.
> 
> I have redundant routing between my upstream provider, and cities A and B, so
> that I only _need_ resources to be running in one of the two cities for
> everything to work as required.  City A can go completely offline and not run
> its resources, and everything I need continues to work via city B.
> 
> I now have an additional requirement to run a single resource at either city A
> or city B but not both.
> 
> As soon as I connect the clusters at city A and city B, and apply the location
> constraints and weighting rules you have suggested:
> 
> 1. everything works, including the single resource at either city A or city B,
> so long as both clusters are operational.
> 
> 2. as soon as one cluster fails (all three of its nodes become unavailable),
> then the other cluster stops running all its resources as well.
> This is even with quorum=2.

Have you ever tried to find out why this happens? (Talking about logs)
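
Alongside the logs, the vote arithmetic is worth a quick check. The sketch
below assumes one vote per node and a plain strict-majority quorum rule;
neither is confirmed in this thread, and it is not clear what "quorum=2" refers
to, so this is only an illustration of why a 3+3 layout can stop entirely while
the 7-vote layout with a qdisk would not. The function names are made up.

```python
def quorum_threshold(total_votes):
    """Strict majority: more than half of all configured votes."""
    return total_votes // 2 + 1


def is_quorate(votes_present, total_votes):
    """True if the surviving partition holds a strict majority."""
    return votes_present >= quorum_threshold(total_votes)


# Two cities with three nodes each, one vote per node (assumed).
print(quorum_threshold(6), is_quorate(3, 6))      # city B gone, city A alone

# Same layout plus a one-vote tie-breaker at a third location (qdisk).
print(quorum_threshold(7), is_quorate(3 + 1, 7))  # city A plus the tie-breaker
```

Under these assumptions three surviving nodes out of six are exactly half, not
a majority, so the survivors would stop their resources too.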

> 
> This means I have lost the redundancy between my two clusters, which is based
> on the expectation that only one cluster will fail at a time.  If the failure
> of one automatically _causes_ the failure of the other, I have no high
> availability any more.
> 
> What I require is for cluster A to continue running its own resources, plus
> the single resource which can run anywhere, in the event that cluster B fails.
> 
> In other words, I need the exact same outcome as I have at present if cluster
> B fails (its resources stop, cluster A is unaffected), except that cluster A
> continues to run the single resource which I need just a single instance of.
> 
> It is impossible for the nodes at city A to run the resources which should be
> running at city B, partly because some of them are identical ("Asterisk" as a
> resource, for example, is already running at city A), and partly because some
> of them are bound to the networking arrangements (I cannot set a floating IP
> address which belongs in city A on a machine which exists in city B - it just
> doesn't work).
> 
> Therefore if adding a seventh node at a third location would try to start
> _all_ resources in city A if city B goes down, it is not a working solution.
> 
> If city B goes down then I simply do not want its resources to be running
> anywhere, just the same as I have now with the two independent clusters.
> 
> 
> Thanks,
> 
> 
> Antony.
> 
> ‑‑ 
> "In fact I wanted to be John Cleese and it took me some time to realise that
> the job was already taken."
> 
>  - Douglas Adams
> 
>    Please reply to the list;
>  please *don't* CC me.