You are absolutely right about the RA return code. It is the LVM-activate agent (as supplied with resource-agents-4.1.1 in CentOS 8), with a slight modification to allow sanlock instead of DLM, which is missing from CentOS 8.


So, this agent erroneously returns OCF_ERR_CONFIGURED in many cases where the problem is with the local configuration on the node. In my case, it should return OCF_ERR_INSTALLED instead. Many thanks for the analysis!
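For illustration, here is a minimal sketch of the distinction between the two codes (the lvmlockd check and the function name are hypothetical, not taken from the actual LVM-activate code):

```shell
#!/bin/sh
# OCF return codes, as defined by the OCF standard:
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5   # local problem: required tools/setup missing on THIS node
OCF_ERR_CONFIGURED=6  # fatal problem: the resource definition itself is invalid

# Hypothetical node-local check; the real agent's checks differ.
check_local_setup() {
    if ! pgrep -x lvmlockd >/dev/null 2>&1; then
        # A per-node problem: OCF_ERR_INSTALLED makes Pacemaker avoid only
        # this node. OCF_ERR_CONFIGURED would instead stop every instance
        # of the clone cluster-wide, which is the behaviour seen below.
        return "$OCF_ERR_INSTALLED"
    fi
    return "$OCF_SUCCESS"
}
```

The point is only the choice of return code: OCF_ERR_CONFIGURED tells Pacemaker the resource definition itself is broken everywhere, while OCF_ERR_INSTALLED marks a node-local problem.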


--

Pavel


10.12.2020 12:21, Reid Wahl:
On Thu, Dec 10, 2020 at 1:13 AM Reid Wahl <nw...@redhat.com> wrote:
On Thu, Dec 10, 2020 at 1:08 AM Reid Wahl <nw...@redhat.com> wrote:
Thanks. I see it's only reproducible with stonith-enabled=false.
That's the step I was skipping previously, as I always have stonith
enabled in my clusters.

I'm not sure whether that's expected behavior for some reason when
stonith is disabled. Maybe someone else (e.g., Ken) can weigh in.
Never mind. This was a mistake on my part: I didn't re-add the stonith
**device** configuration when I re-enabled stonith.

So the behavior is the same regardless of whether stonith is enabled
or not. I attribute it to the OCF_ERR_CONFIGURED error.

Why exactly is this behavior unexpected, from your point of view?

Ref:
   - https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Administration/#_how_are_ocf_return_codes_interpreted
Going back to your original email, I think I understand. What type of
resource is vg.sanlock in your main cluster? I presume that it isn't
an ocf:pacemaker:Dummy resource like it is in the state4.xml file.

It seems that your real concern is with the behavior of one or more
resource agents. When a resource agent returns OCF_ERR_CONFIGURED,
Pacemaker stops all instances of that resource and prevents it from
starting again. However, the place to address it is in the resource
agent. Pacemaker is doing exactly what the resource agent is telling
it to do.

I also noticed that the state4.xml file has a return code of 6 for the
resource's start operation. That's an OCF_ERR_CONFIGURED, which is a
fatal error. At least for primitive resources, this type of error
prevents the resource from starting anywhere. So I'm somewhat
surprised that the clone instances don't stop on all nodes even when
fencing **is** enabled.
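For reference, a failed start with that code appears in the CIB status section roughly like this (a sketch with most attributes trimmed, not a verbatim excerpt from state4.xml):

```xml
<lrm_rsc_op id="vg.sanlock_last_failure_0" operation="start"
            rc-code="6" op-status="0" ... />
```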


Without stonith:

Allocation scores:
pcmk__clone_allocate: vg.bv_sanlock-clone allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock-clone allocation score on node2: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:0 allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:0 allocation score on node2: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:1 allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:1 allocation score on node2: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:0 allocation score on node1: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:0 allocation score on node2: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:1 allocation score on node1: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:1 allocation score on node2: -INFINITY

Transition Summary:
  * Stop       vg.bv_sanlock:0     ( node2 )   due to node availability
  * Stop       vg.bv_sanlock:1     ( node1 )   due to node availability

Executing cluster transition:
  * Pseudo action:   vg.bv_sanlock-clone_stop_0
  * Resource action: vg.bv_sanlock   stop on node2
  * Resource action: vg.bv_sanlock   stop on node1
  * Pseudo action:   vg.bv_sanlock-clone_stopped_0



With stonith:

Allocation scores:
pcmk__clone_allocate: vg.bv_sanlock-clone allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock-clone allocation score on node2: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:0 allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:0 allocation score on node2: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:1 allocation score on node1: -INFINITY
pcmk__clone_allocate: vg.bv_sanlock:1 allocation score on node2: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:0 allocation score on node1: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:0 allocation score on node2: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:1 allocation score on node1: -INFINITY
pcmk__native_allocate: vg.bv_sanlock:1 allocation score on node2: -INFINITY

Transition Summary:

Executing cluster transition:

On Wed, Dec 9, 2020 at 10:33 PM Pavel Levshin <l...@581.spb.su> wrote:

See the file attached. This one has been produced and tested with
pacemaker 1.1.16 (RHEL 7).


--

Pavel


08.12.2020 10:14, Reid Wahl:
Can you provide the state4.xml file that you're using? I'm unable to
reproduce this issue by making the clone instance fail on one node.

Might need some logs as well.

On Mon, Dec 7, 2020 at 10:40 PM Pavel Levshin <l...@581.spb.su> wrote:
Hello.


Despite many years of Pacemaker use, it never stops fooling me...


This time, I have faced a trivial problem. In my new setup, the cluster 
consists of several identical nodes. A clone resource (vg.sanlock) is started 
on every node, ensuring it has access to SAN storage. Almost all other 
resources are colocated and ordered after vg.sanlock.


Today, I started a node, and vg.sanlock failed to start. The cluster then 
decided to stop all the clone instances "due to node availability", taking down 
all other resources through their dependencies. This seems illogical to me. When a 
clone fails, I would prefer to see it stop on the one affected node only. How do I 
do this properly?


I've tried this config with Pacemaker 2.0.3 and 1.1.16, the behaviour stays the 
same.


Reduced test config here:


pcs cluster auth test-pcmk0 test-pcmk1 <>/dev/tty
pcs cluster setup --name test-pcmk test-pcmk0 test-pcmk1 --transport udpu \
    --auto_tie_breaker 1
pcs cluster start --all --wait=60
pcs cluster cib tmp-cib.xml
cp tmp-cib.xml tmp-cib.xml.deltasrc
pcs -f tmp-cib.xml property set stonith-enabled=false
pcs -f tmp-cib.xml resource defaults resource-stickiness=100
pcs -f tmp-cib.xml resource create vg.sanlock ocf:pacemaker:Dummy \
    op monitor interval=10 timeout=20 start interval=0s stop interval=0s \
    timeout=20
pcs -f tmp-cib.xml resource clone vg.sanlock interleave=true
pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.xml.deltasrc



And here is the cluster's reaction to the failure:


# crm_simulate -x state4.xml -S

Current cluster status:
Online: [ test-pcmk0 test-pcmk1 ]

 Clone Set: vg.sanlock-clone [vg.sanlock]
     vg.sanlock      (ocf::pacemaker:Dummy): FAILED test-pcmk0
     Started: [ test-pcmk1 ]

Transition Summary:
 * Stop       vg.sanlock:0     ( test-pcmk1 )   due to node availability
 * Stop       vg.sanlock:1     ( test-pcmk0 )   due to node availability

Executing cluster transition:
 * Pseudo action:   vg.sanlock-clone_stop_0
 * Resource action: vg.sanlock   stop on test-pcmk1
 * Resource action: vg.sanlock   stop on test-pcmk0
 * Pseudo action:   vg.sanlock-clone_stopped_0
 * Pseudo action:   all_stopped

Revised cluster status:
Online: [ test-pcmk0 test-pcmk1 ]

 Clone Set: vg.sanlock-clone [vg.sanlock]
     Stopped: [ test-pcmk0 test-pcmk1 ]


As a side note, if I make these clones globally unique, they seem to behave 
properly. But I have found no reference to this as a solution anywhere. In general, 
globally-unique clones are mentioned only where the resource agent makes a 
distinction between clone instances, which is not the case here.
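For what it's worth, the workaround amounts to setting the globally-unique clone meta-attribute; with the pcs 0.9 syntax used in the test config above, something like:

```shell
# globally-unique is a clone meta-attribute; whether it is the right fix
# here is exactly the open question in this thread.
pcs -f tmp-cib.xml resource clone vg.sanlock globally-unique=true \
    interleave=true
pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.xml.deltasrc
```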


--

Thanks,

Pavel



_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



--
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA




