Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Ken Gaillot
On Wed, 2017-10-18 at 16:58 +0200, Gerard Garcia wrote:
> I'm using version 1.1.15-11.el7_3.2-e174ec8. As far as I know the
> latest stable version in CentOS 7.3
>
> Gerard
Interesting ... this was an undetected bug that was coincidentally fixed by the recent fail-count work released in 1.1.17.

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Gerard Garcia
I'm using version 1.1.15-11.el7_3.2-e174ec8. As far as I know, it is the latest stable version in CentOS 7.3.
Gerard
On Wed, Oct 18, 2017 at 4:42 PM, Ken Gaillot wrote:
> On Wed, 2017-10-18 at 14:25 +0200, Gerard Garcia wrote:
> > So I think I found the problem. The two resources

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Ken Gaillot
On Wed, 2017-10-18 at 14:25 +0200, Gerard Garcia wrote:
> So I think I found the problem. The two resources are named forwarder
> and bgpforwarder. It doesn't matter if bgpforwarder exists. It is
> just that when I set the failcount to INFINITY to a resource named
> bgpforwarder (crm_failcount -r

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Gerard Garcia
So I think I found the problem. The two resources are named forwarder and bgpforwarder. It doesn't matter if bgpforwarder exists. It is just that when I set the failcount to INFINITY to a resource named bgpforwarder (crm_failcount -r bgpforwarder -v INFINITY) it directly affects the forwarder
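The thread does not show the faulty code, so the following is only a hypothetical sketch of how two similarly named resources could end up sharing a fail count. It assumes, purely for illustration, that fail counts are stored under attribute names of the form `fail-count-<resource>` and that a lookup matches names loosely (here, by suffix) instead of exactly. The functions `failcount_loose` and `failcount_exact` are invented for this example and are not Pacemaker APIs.

```python
# Hypothetical illustration only: this is NOT Pacemaker's implementation.
# Assumption: fail counts live under attribute names "fail-count-<resource>",
# and the buggy lookup matches the resource name by suffix, not exactly.

INFINITY = 1_000_000  # stand-in for Pacemaker's INFINITY score
PREFIX = "fail-count-"


def failcount_loose(attrs, resource):
    """Sum every fail-count attribute whose resource name ends with `resource`."""
    return sum(
        value
        for name, value in attrs.items()
        if name.startswith(PREFIX) and name[len(PREFIX):].endswith(resource)
    )


def failcount_exact(attrs, resource):
    """Return the fail count stored for exactly this resource name."""
    return attrs.get(PREFIX + resource, 0)


# Only bgpforwarder has a fail count, as after
# `crm_failcount -r bgpforwarder -v INFINITY`.
attrs = {"fail-count-bgpforwarder": INFINITY}

loose = failcount_loose(attrs, "forwarder")   # picks up bgpforwarder's count
exact = failcount_exact(attrs, "forwarder")   # forwarder itself never failed
print(loose, exact)
```

Under these assumptions the loose lookup attributes bgpforwarder's fail count to forwarder, which would be enough to ban forwarder on every node, while the exact lookup leaves it untouched.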

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Gerard Garcia
That makes sense. I've tried copying the anything resource and changing its name and id (which I guess should be enough to make Pacemaker think they are different), but I still have the same problem. After more debugging I have reduced the problem to this:
* First cloned resource running fine
*

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Ken Gaillot
On Tue, 2017-10-17 at 11:47 +0200, Gerard Garcia wrote:
> Thanks Ken. Yes, inspecting the logs, it seems that the failcount of the
> correctly running resource reaches the maximum number of allowed
> failures and gets banned in all nodes.
>
> What is weird is that I just see how the failcount for the

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Gerard Garcia
Thanks Ken. Yes, inspecting the logs, it seems that the failcount of the correctly running resource reaches the maximum number of allowed failures and gets banned in all nodes. What is weird is that I just see how the failcount for the first resource gets updated; it is like the failcounts are being

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote:
> Hi,
>
> I have a cluster with two ocf:heartbeat:anything resources each one
> running as a clone in all nodes of the cluster. For some reason when
> one of them fails to start the other one stops. There is not any
> constraint configured or

[ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Gerard Garcia
Hi, I have a cluster with two ocf:heartbeat:anything resources, each one running as a clone on all nodes of the cluster. For some reason, when one of them fails to start, the other one stops. There is no constraint configured, nor any kind of relation between them. Is it possible that there is