Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Artem
Andrei and Klaus, thanks for the prompt reply and clarification!
As I understand it, the design and behavior of Pacemaker are tightly coupled
with the stonith concept. But isn't that too rigid?

Is there a way to leverage self-monitoring or pingd rules to trigger an
isolated node to unmount its FS? Like the vSphere High Availability host
isolation response.
Can resource-stickiness=off (auto-failback) decrease the risk of corruption
from an unresponsive node coming back online?
Is there a quorum feature not for the cluster but for resource start/stop?
Got the lock - welcome to mount; unable to refresh the lease - forced to
unmount.
Can on-fail=ignore break manual failover logic (a stop will be considered
failed and thus ignored)?
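
For concreteness, the knobs I have in mind are roughly these (an untested
sketch for OST4; I'm not sure the op syntax or the semantics are right):

pcs resource update OST4 op stop interval=0s timeout=60s on-fail=ignore
pcs resource defaults update resource-stickiness=0

i.e. ignore a failed stop instead of blocking the resource, and drop
stickiness so resources automatically fail back to their preferred node.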

best regards,
Artem

On Tue, 19 Dec 2023 at 17:03, Klaus Wenninger  wrote:

>
>
> On Tue, Dec 19, 2023 at 10:00 AM Andrei Borzenkov 
> wrote:
>
>> On Tue, Dec 19, 2023 at 10:41 AM Artem  wrote:
>> ...
>> > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]
>> (update_resource_action_runnable)  warning: OST4_stop_0 on lustre4 is
>> unrunnable (node is offline)
>> > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]
>> (recurring_op_for_active)  info: Start 20s-interval monitor for OST4 on
>> lustre3
>> > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]
>> (log_list_item)  notice: Actions: Stop  OST4  ( lustre4
>> )  blocked
>>
>> This is the default for a failed stop operation. The only way
>> pacemaker can resolve a failure to stop a resource is to fence the node
>> where this resource was active. If that is not possible (and IIRC you
>> refuse to use stonith), pacemaker has no choice other than to block it.
>> If you insist, you can of course set on-fail=ignore, but this means an
>> unreachable node will continue to run resources. Whether it can lead
>> to some corruption in your case I cannot guess.
>>
>
> Don't know if I'm reading that correctly, but I understand from what you
> wrote above that you are trying to trigger the failover by stopping the
> VM (lustre4) without an ordered shutdown.
> With fencing disabled, what we are seeing is exactly what we would expect:
> the state of the resource is unknown - pacemaker tries to stop it - that
> doesn't work because the node is offline - no fencing is configured - so
> all it can do is wait until there is info on whether the resource is up
> or not.
> I guess the strange output below is because fencing is disabled - quite an
> unusual, and not recommended, configuration - so this might not have shown
> up too often in that way.
>
> Klaus
>
>>
>> > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]
>> (pcmk__create_graph) crit: Cannot fence lustre4 because of OST4:
>> blocked (OST4_stop_0)
>>
>> That is a rather strange phrase. The resource is blocked because
>> pacemaker could not fence the node, not the other way round.


Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-18 Thread Artem
.ntslab.ru pacemaker-schedulerd[785107]
(unpack_rsc_op_failure)  warning: Unexpected result (error: Action was
pending when executor connection was dropped) was recorded for monitor of
OST4 on lustre4 at Dec 19 09:55:27 2023 | exit-status=1
id=OST4_last_failure_0
Dec 19 10:28:31 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]
(pcmk__unassign_resource)  info: Unassigning OST4



Sorry for so many log lines, but I don't understand what's going on.


best regards,
Artem

On Tue, 19 Dec 2023 at 00:13, Ken Gaillot  wrote:

> On Mon, 2023-12-18 at 23:39 +0300, Artem wrote:
> > Hello experts.
> >
> > I previously played with a dummy resource and it worked as expected.
> > Now I'm switching to a Lustre OST resource and cannot make it work,
> > nor can I understand why.
> >
> >
> > ### Initial setup:
> > # pcs resource defaults update resource-stickiness=110
> > # for i in {1..4}; do pcs cluster node add-remote lustre$i
> > reconnect_interval=60; done
> > # for i in {1..4}; do pcs constraint location lustre$i prefers
> > lustre-mgs lustre-mds1 lustre-mds2; done
> > # pcs resource create OST3 ocf:lustre:Lustre target=/dev/disk/by-
> > id/wwn-0x6000c291b7f7147f826bb95153e2eaca mountpoint=/lustre/oss3
> > # pcs resource create OST4 ocf:lustre:Lustre target=/dev/disk/by-
> > id/wwn-0x6000c292c41eaae60bccdd3a752913b3 mountpoint=/lustre/oss4
> > (I also tried ocf:heartbeat:Filesystem device=... directory=...
> > fstype=lustre force_unmount=safe --> same behavior)
> >
> > # pcs constraint location OST3 prefers lustre3=100
> > # pcs constraint location OST3 prefers lustre4=100
> > # pcs constraint location OST4 prefers lustre3=100
> > # pcs constraint location OST4 prefers lustre4=100
> > # for i in lustre-mgs lustre-mds1 lustre-mds2 lustre{1..2}; do pcs
> > constraint location OST3 avoids $i; done
> > # for i in lustre-mgs lustre-mds1 lustre-mds2 lustre{1..2}; do pcs
> > constraint location OST4 avoids $i; done
> >
> > ### Checking all is good
> > # crm_simulate --simulate --live-check --show-scores
> > pcmk__primitive_assign: OST4 allocation score on lustre3: 100
> > pcmk__primitive_assign: OST4 allocation score on lustre4: 210
> > # pcs status
> >   * OST3  (ocf::lustre:Lustre):  Started lustre3
> >   * OST4  (ocf::lustre:Lustre):  Started lustre4
> >
> > ### VM with lustre4 (OST4) is OFF
> >
> > # crm_simulate --simulate --live-check --show-scores
> > pcmk__primitive_assign: OST4 allocation score on lustre3: 100
> > pcmk__primitive_assign: OST4 allocation score on lustre4: 100
> > Start  OST4  ( lustre3 )
> > Resource action: OST4  start on lustre3
> > Resource action: OST4  monitor=2 on lustre3
> > # pcs status
> >   * OST3  (ocf::lustre:Lustre):  Started lustre3
> >   * OST4  (ocf::lustre:Lustre):  Stopped
> >
> > 1) I see crm_simulate guessed that it has to restart the failed OST4 on
> > lustre3. After making that decision, I suspect it evaluates the 100:100
> > scores of both lustre3 and lustre4, but lustre3 is already running a
> > service. So it decides to run OST4 again on lustre4, which has failed.
> > Thus it cannot restart on the surviving node. Right?
>
> No. I'd start with figuring out this case. There's no reason, given the
> configuration above, why OST4 would be stopped. In fact the simulation
> shows it should be started, so that suggests that maybe the actual
> start failed.
>
> Do the logs show any errors around this time?
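>
> For example, something along these lines (a rough check, assuming the
> default log location and pcs tooling):
>
> # grep -E 'OST4.*(error|failed)' /var/log/pacemaker/pacemaker.log
> # pcs resource failcount show OST4
>
> A failed start would normally show up as a non-zero failcount plus
> scheduler/executor error messages.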
>
> > 2) OK, let's try not to give a specific score - nothing changed, see
> > below:
> > ### removed old constraints; cleared all resources; cleaned up all
> > resources; cluster stop; cluster start
> >
> > # pcs constraint location OST3 prefers lustre3 lustre4
> > # pcs constraint location OST4 prefers lustre3 lustre4
> > # for i in lustre-mgs lustre-mds1 lustre-mds2 lustre{1..2}; do pcs
> > constraint location OST3 avoids $i; done
> > # for i in lustre-mgs lustre-mds1 lustre-mds2 lustre{1..2}; do pcs
> > constraint location OST4 avoids $i; done
> > # crm_simulate --simulate --live-check --show-scores
> > pcmk__primitive_assign: OST4 allocation score on lustre3: INFINITY
> > pcmk__primitive_assign: OST4 allocation score on lustre4: INFINITY
> > # pcs status
> >   * OST3  (ocf::lustre:Lustre):  Started lustre3
> >   * OST4  (ocf::lustre:Lustre):  Started lustre4
> >
> > ### VM with lustre4 (OST4) is OFF
> >
> > # crm_simulate --simulate --live-check --show-scores
> > pcmk__primitive_assign: OST4 allocation score on lustre3: INFINITY
> > pcmk

[ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-18 Thread Artem
_primitive_assign: OST4 allocation score on lustre4: 210
# pcs status
  * OST3  (ocf::lustre:Lustre):  Started lustre3
  * OST4  (ocf::lustre:Lustre):  Started lustre4

### VM with lustre4 (OST4) is OFF.

# crm_simulate --simulate --live-check --show-scores
pcmk__primitive_assign: OST4 allocation score on lustre3: 90
pcmk__primitive_assign: OST4 allocation score on lustre4: 100
Start  OST4  ( lustre3 )
Resource action: OST4  start on lustre3
Resource action: OST4  monitor=2 on lustre3
# pcs status
  * OST3  (ocf::lustre:Lustre):  Started lustre3
  * OST4  (ocf::lustre:Lustre):  Stopped

Again lustre3 seems unable to take over because of its lower score, and pingd
DOESN'T help at all!


4) Can I make a reliable HA failover without pingd, to keep things as simple
as possible?
5) Pings might help influence cluster decisions when the GW is lost, but it's
not working the way all the guides say it should. Why?


Thanks in advance,
Artem


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-12 Thread Artem
Hi Andrei. pingd==0 won't satisfy both statements. It would if I used GTE,
but I used GT.
pingd lt 1 --> [0]
pingd gt 0 --> [1,2,3,...]

On Tue, 12 Dec 2023 at 17:21, Andrei Borzenkov  wrote:

> On Tue, Dec 12, 2023 at 4:47 PM Artem  wrote:
> >> > pcs constraint location FAKE3 rule score=0 pingd lt 1 or not_defined
> pingd
> >> > pcs constraint location FAKE3 rule score=125 pingd gt 0 or defined
> pingd
> > Are they really contradicting?
>
> Yes. pingd == 0 will satisfy both rules. My use of "always" was
> incorrect, it does not happen for all possible values of pingd, but it
> does happen for some.
>

Maybe defined/not_defined should be put in front of lt/gt? Is it possible
that when the VM goes down and pingd becomes not_defined, the rule evaluates
"lt 1" first, hits an error, and doesn't evaluate the next part (after the OR)?
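
Something like this is what I mean (an untested sketch, same resources as
before):

pcs constraint location FAKE3 rule score=125 defined pingd and pingd gt 0
pcs constraint location FAKE3 rule score=0 not_defined pingd or pingd lt 1

i.e. test defined/not_defined first and only then compare the value.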


[ClusterLabs] resource fails manual failover

2023-12-12 Thread Artem
Is there a detailed explanation for resource monitor and start timeouts and
intervals with examples, for dummies?

My resource is configured as follows:
[root@lustre-mds1 ~]# pcs resource show MDT00
Warning: This command is deprecated and will be removed. Please use 'pcs
resource config' instead.
Resource: MDT00 (class=ocf provider=heartbeat type=Filesystem)
  Attributes: MDT00-instance_attributes
device=/dev/mapper/mds00
directory=/lustre/mds00
force_unmount=safe
fstype=lustre
  Operations:
monitor: MDT00-monitor-interval-20s
  interval=20s
  timeout=40s
start: MDT00-start-interval-0s
  interval=0s
  timeout=60s
stop: MDT00-stop-interval-0s
  interval=0s
  timeout=60s

I issued a manual failover with the following command:
crm_resource --move -r MDT00 -H lustre-mds1

The resource tried to move but came back, with entries like these in
pacemaker.log:
Dec 12 15:53:23  Filesystem(MDT00)[1886100]:  INFO: Running start for
/dev/mapper/mds00 on /lustre/mds00
Dec 12 15:53:45  Filesystem(MDT00)[1886100]:  ERROR: Couldn't mount
device [/dev/mapper/mds00] as /lustre/mds00

tried again with the same result:
Dec 12 16:11:04  Filesystem(MDT00)[1891333]:  INFO: Running start for
/dev/mapper/mds00 on /lustre/mds00
Dec 12 16:11:26  Filesystem(MDT00)[1891333]:  ERROR: Couldn't mount
device [/dev/mapper/mds00] as /lustre/mds00

Why can't it move?

Does this ~20-second gap (between start and error) have anything to do
with the monitor interval settings?
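
One thing I plan to try, to rule out the cluster timeouts, is mounting by
hand on the target node (same device and mountpoint as above):

[root@lustre-mds1 ~]# mount -t lustre /dev/mapper/mds00 /lustre/mds00

If that also fails after ~20 seconds, the problem is on the Lustre/device
side rather than in the monitor or start timeout settings.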

[root@lustre-mgs ~]# pcs constraint show --full
Location Constraints:
  Resource: MDT00
Enabled on:
  Node: lustre-mds1 (score:100) (id:location-MDT00-lustre-mds1-100)
  Node: lustre-mds2 (score:100) (id:location-MDT00-lustre-mds2-100)
Disabled on:
  Node: lustre-mgs (score:-INFINITY)
(id:location-MDT00-lustre-mgs--INFINITY)
  Node: lustre1 (score:-INFINITY) (id:location-MDT00-lustre1--INFINITY)
  Node: lustre2 (score:-INFINITY) (id:location-MDT00-lustre2--INFINITY)
  Node: lustre3 (score:-INFINITY) (id:location-MDT00-lustre3--INFINITY)
  Node: lustre4 (score:-INFINITY) (id:location-MDT00-lustre4--INFINITY)
Ordering Constraints:
  start MGT then start MDT00 (kind:Optional) (id:order-MGT-MDT00-Optional)
  start MDT00 then start OST1 (kind:Optional) (id:order-MDT00-OST1-Optional)
  start MDT00 then start OST2 (kind:Optional) (id:order-MDT00-OST2-Optional)

With regard to the ordering constraints: OST1 and OST2 are already started
while I'm exercising MDT00 failover.


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-12 Thread Artem
On Tue, 12 Dec 2023 at 16:17, Andrei Borzenkov  wrote:

> On Fri, Dec 8, 2023 at 5:44 PM Artem  wrote:
> > pcs constraint location FAKE3 rule score=0 pingd lt 1 or not_defined
> pingd
> > pcs constraint location FAKE4 rule score=0 pingd lt 1 or not_defined
> pingd
> > pcs constraint location FAKE3 rule score=125 pingd gt 0 or defined pingd
> > pcs constraint location FAKE4 rule score=125 pingd gt 0 or defined pingd
> >
>
> These rules are contradicting. You set the score to 125 if pingd is
> defined and at the same time set it to 0 if the score is less than 1.
> To be "less than 1" it must be defined to start with so both rules
> will always apply. I do not know how the rules are ordered. Either you
> get random behavior, or one pair of these rules is effectively
> ignored.
>

"pingd lt 1 or not_defined pingd" means to me ==0 or not_defined, that is
ping fails to ping GW or fails to report to corosync/pacemaker. Am I wrong?
"pingd gt 0 or defined pingd" means to me that ping gets reply from GW and
reports it to cluster.
Are they really contradicting?
I read this article and tried to do in a similar way:
https://habr.com/ru/articles/118925/


>
> > Question #1) Why can't I see the accumulated score from pingd in
> > crm_simulate output? Only the location score and stickiness.
> > pcmk__primitive_assign: FAKE3 allocation score on lustre3: 210
> > pcmk__primitive_assign: FAKE3 allocation score on lustre4: 90
> > pcmk__primitive_assign: FAKE4 allocation score on lustre3: 90
> > pcmk__primitive_assign: FAKE4 allocation score on lustre4: 210
> > Either when all is OK or when the VM is down, the score from pingd is not
> > added to the total score of the RA
> >
> >
> > Question #2) I shut lustre3 VM down and leave it like that. pcs status:
> >   * FAKE3   (ocf::pacemaker:Dummy):  Stopped
> >   * FAKE4   (ocf::pacemaker:Dummy):  Started lustre4
> >   * Clone Set: ping-clone [ping]:
> > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> lustre4 ] << lustre3 missing
> > OK for now
> > VM boots up. pcs status:
> >   * FAKE3   (ocf::pacemaker:Dummy):  FAILED (blocked) [ lustre3
> lustre4 ]  << what is it?
> >   * Clone Set: ping-clone [ping]:
> > * ping  (ocf::pacemaker:ping):   FAILED lustre3 (blocked)<<
> why not started?
> > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> lustre4 ]
>
> If this is the full pcs status output, the stonith resource is missing.
>
> I have "pcs property set stonith-enabled=false" and don't plan to use it.
I want simple active-passive cluster, like Veritas or ServiceGuard with
most duties automated. And our production servers have their iBMC in a
locked network segment


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-12 Thread Artem
Dear Ken and other experts.

How can I leverage pingd to speed up failover? Or maybe it is useless and
we should rely on monitor/start timeouts and
migration-threshold/failure-timeout?

I have preferences like this for normal operation:
> pcs constraint location FAKE3 prefers lustre3=100
> pcs constraint location FAKE3 prefers lustre4=90
> pcs constraint location FAKE4 prefers lustre3=90
> pcs constraint location FAKE4 prefers lustre4=100
> pcs resource defaults update resource-stickiness=110
And I need a rule to decrease the preference score on a node where pingd
fails. I'm testing this by powering the VM off, with pacemaker unaware of it
(no agents on ESXi/vCenter).
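
What I have in mind is something along these lines (an untested sketch,
same resources as above):

pcs constraint location FAKE3 rule score=-INFINITY not_defined pingd or pingd lt 1
pcs constraint location FAKE4 rule score=-INFINITY not_defined pingd or pingd lt 1

i.e. instead of adding a bonus where ping works, ban the resources from any
node whose pingd attribute is missing or zero.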

On Mon, 11 Dec 2023 at 19:00, Ken Gaillot  wrote:

> > Question #1) Why can't I see the accumulated score from pingd in
> > crm_simulate output? Only the location score and stickiness.
>
> ping scores aren't added to resource scores, they're just set as node
> attribute values. Location constraint rules map those values to
> resource scores (in this case any defined ping score gets mapped to
> 125).
> --
> Ken Gaillot 
>
>
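
For what it's worth, the raw attribute values can also be checked per node
(assuming the default attribute name pingd used in my setup):

# attrd_updater --query --name pingd --node lustre3
# crm_mon -A1

so I should be able to confirm whether the attribute really drops when the
VM is powered off.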


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-11 Thread Artem
Hi Ken,

On Mon, 11 Dec 2023 at 19:00, Ken Gaillot  wrote:

> > Question #2) I shut lustre3 VM down and leave it like that


> How did you shut it down? Outside cluster control, or with something
> like pcs resource disable?
>
I did it outside of the cluster to simulate a failure. I turned off this
VM from vCenter. The cluster is unaware of anything below the OS.


> >   * FAKE3   (ocf::pacemaker:Dummy):  Stopped
> >   * FAKE4   (ocf::pacemaker:Dummy):  Started lustre4
> >   * Clone Set: ping-clone [ping]:
> > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> > lustre4 ] << lustre3 missing
> > OK for now
> > VM boots up. pcs status:
> >   * FAKE3   (ocf::pacemaker:Dummy):  FAILED (blocked) [ lustre3
> > lustre4 ]  << what is it?
> >   * Clone Set: ping-clone [ping]:
> > * ping  (ocf::pacemaker:ping):   FAILED lustre3 (blocked)
> > << why not started?
> > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> > lustre4 ]
> > I checked server processes manually and found that lustre4 runs
> > "/usr/lib/ocf/resource.d/pacemaker/ping monitor" while lustre3
> > doesn't
> > All is according to documentation but results are strange.
> > Then I tried to add meta target-role="started" to pcs resource create
> > ping and this time ping started after node rebooted. Can I expect
> > that it was just missing from official setup documentation, and now
> > everything will work fine?
>
>
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] ocf:pacemaker:ping works strange

2023-12-08 Thread Artem
Hello experts.

I use pacemaker for a Lustre cluster, but for simplicity and exploration I
am using a Dummy resource. I didn't like how the resource performed failover
and failback. When I shut down a VM running a remote agent, pacemaker tries
to restart it. According to pcs status it marks the resource (not the RA)
Online for some time while the VM stays down.

OK, I wanted to improve its behavior and set up a ping monitor. I tuned the
scores like this:
pcs resource create FAKE3 ocf:pacemaker:Dummy
pcs resource create FAKE4 ocf:pacemaker:Dummy
pcs constraint location FAKE3 prefers lustre3=100
pcs constraint location FAKE3 prefers lustre4=90
pcs constraint location FAKE4 prefers lustre3=90
pcs constraint location FAKE4 prefers lustre4=100
pcs resource defaults update resource-stickiness=110
pcs resource create ping ocf:pacemaker:ping dampen=5s host_list=local op
monitor interval=3s timeout=7s clone meta target-role="started"
for i in lustre{1..4}; do pcs constraint location ping-clone prefers $i;
done
pcs constraint location FAKE3 rule score=0 pingd lt 1 or not_defined pingd
pcs constraint location FAKE4 rule score=0 pingd lt 1 or not_defined pingd
pcs constraint location FAKE3 rule score=125 pingd gt 0 or defined pingd
pcs constraint location FAKE4 rule score=125 pingd gt 0 or defined pingd


Question #1) Why can't I see the accumulated score from pingd in crm_simulate
output? Only the location score and stickiness.
pcmk__primitive_assign: FAKE3 allocation score on lustre3: 210
pcmk__primitive_assign: FAKE3 allocation score on lustre4: 90
pcmk__primitive_assign: FAKE4 allocation score on lustre3: 90
pcmk__primitive_assign: FAKE4 allocation score on lustre4: 210
Either when all is OK or when the VM is down, the score from pingd is not
added to the total score of the RA.


Question #2) I shut the lustre3 VM down and leave it like that. pcs status:
  * FAKE3   (ocf::pacemaker:Dummy):  Stopped
  * FAKE4   (ocf::pacemaker:Dummy):  Started lustre4
  * Clone Set: ping-clone [ping]:
* Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2 lustre4
] << lustre3 missing
OK for now
VM boots up. pcs status:
  * FAKE3   (ocf::pacemaker:Dummy):  FAILED (blocked) [ lustre3 lustre4
]  << what is it?
  * Clone Set: ping-clone [ping]:
* ping  (ocf::pacemaker:ping):   FAILED lustre3 (blocked)<< why
not started?
* Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2 lustre4
]
I checked server processes manually and found that lustre4 runs
"/usr/lib/ocf/resource.d/pacemaker/ping monitor" while lustre3 doesn't.
Everything is according to the documentation, but the results are strange.
Then I tried adding meta target-role="started" to the pcs resource create
ping command, and this time ping started after the node rebooted. Can I
expect that this was just missing from the official setup documentation, and
that now everything will work fine?


Re: [ClusterLabs] RemoteOFFLINE status, permanently

2023-12-04 Thread Artem
Thank you very much, Ken! I missed this step. Now I clearly see it in
Morrone_LUG2017.pdf.
I added the constraint and the RA came online.
What bugs me is the following: I destroyed and recreated the cluster with
the same settings on the designated hosts and nothing worked - always
RemoteOFFLINE. But when I repeated these steps on a fresh install of 3 VMs
on my laptop, it worked out of the box (the RA was Online).

On Mon, 4 Dec 2023 at 23:21, Ken Gaillot  wrote:

> Hi,
>

> An asymmetric cluster requires that all resources be enabled on
> particular nodes with location constraints. Since you don't have any
> for your remote connections, they can't start anywhere.


[ClusterLabs] RemoteOFFLINE status, permanently

2023-11-29 Thread Artem
Hello,

I deployed a Lustre cluster with 3 nodes (metadata) running
pacemaker/corosync and 4 nodes as Remote Agents (for data). Initially all
went well: I set up MGS and MDS resources, checked failover and failback,
and the remote agents were online.

Then I tried to create resources for the OSTs on two nodes which are remote
agents. I also set a location constraint preference for them, a colocation
constraint (OST1 and OST2, score=-50), and an ordering constraint (MDS then
OST[12]). Then I read that colocation and ordering constraints should not be
used for RAs, so I deleted these constraints. At some stage I used
reconnect_interval=5s, but then found a bug report advising to set it
higher, so I reverted to the default.

Only then did I check pcs status and notice that the RAs were Offline.
I tried to remove the RAs and add them again, restart the cluster, destroy
and recreate it, reboot the nodes - nothing helped: from the very beginning
of cluster setup the agents were persistently RemoteOFFLINE, even before
creating the OST resources and locating them preferably on the RAs (lustre1
and lustre2). I found nothing helpful in /var/log/pacemaker/pacemaker.log.
Please help me investigate and fix it.


[root@lustre-mgs ~]# rpm -qa | grep -E "corosync|pacemaker|pcs"
pacemaker-cli-2.1.6-8.el8.x86_64
pacemaker-schemas-2.1.6-8.el8.noarch
pcs-0.10.17-2.el8.x86_64
pacemaker-libs-2.1.6-8.el8.x86_64
corosync-3.1.7-1.el8.x86_64
pacemaker-cluster-libs-2.1.6-8.el8.x86_64
pacemaker-2.1.6-8.el8.x86_64
corosynclib-3.1.7-1.el8.x86_64

[root@lustre-mgs ~]# ssh lustre1 "rpm -qa | grep resource-agents"
resource-agents-4.9.0-49.el8.x86_64

[root@lustre-mgs ~]# pcs status
Cluster name: cl-lustre
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: lustre-mds1 (version 2.1.6-8.el8-6fdc9deea29) - partition
with quorum
  * Last updated: Wed Nov 29 12:40:37 2023 on lustre-mgs
  * Last change:  Wed Nov 29 12:11:21 2023 by root via cibadmin on
lustre-mgs
  * 7 nodes configured
  * 6 resource instances configured
Node List:
  * Online: [ lustre-mds1 lustre-mds2 lustre-mgs ]
  * RemoteOFFLINE: [ lustre1 lustre2 lustre3 lustre4 ]
Full List of Resources:
  * lustre2 (ocf::pacemaker:remote): Stopped
  * lustre3 (ocf::pacemaker:remote): Stopped
  * lustre4 (ocf::pacemaker:remote): Stopped
  * lustre1 (ocf::pacemaker:remote): Stopped
  * MGT (ocf::heartbeat:Filesystem): Started lustre-mgs
  * MDT00   (ocf::heartbeat:Filesystem): Started lustre-mds1
Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@lustre-mgs ~]# pcs cluster verify --full
[root@lustre-mgs ~]#

[root@lustre-mgs ~]# pcs constraint show --full
Warning: This command is deprecated and will be removed. Please use 'pcs
constraint config' instead.
Location Constraints:
  Resource: MDT00
Enabled on:
  Node: lustre-mds1 (score:100) (id:location-MDT00-lustre-mds1-100)
  Node: lustre-mds2 (score:100) (id:location-MDT00-lustre-mds2-100)
  Resource: MGT
Enabled on:
  Node: lustre-mgs (score:100) (id:location-MGT-lustre-mgs-100)
  Node: lustre-mds2 (score:50) (id:location-MGT-lustre-mds2-50)
Ordering Constraints:
  start MGT then start MDT00 (kind:Optional) (id:order-MGT-MDT00-Optional)
Colocation Constraints:
Ticket Constraints:

[root@lustre-mgs ~]# pcs resource show lustre1
Warning: This command is deprecated and will be removed. Please use 'pcs
resource config' instead.
Resource: lustre1 (class=ocf provider=pacemaker type=remote)
  Attributes: lustre1-instance_attributes
server=lustre1
  Operations:
migrate_from: lustre1-migrate_from-interval-0s
  interval=0s
  timeout=60s
migrate_to: lustre1-migrate_to-interval-0s
  interval=0s
  timeout=60s
monitor: lustre1-monitor-interval-60s
  interval=60s
  timeout=30s
reload: lustre1-reload-interval-0s
  interval=0s
  timeout=60s
reload-agent: lustre1-reload-agent-interval-0s
  interval=0s
  timeout=60s
start: lustre1-start-interval-0s
  interval=0s
  timeout=60s
stop: lustre1-stop-interval-0s
  interval=0s
  timeout=60s

I also changed some properties:
pcs property set stonith-enabled=false
pcs property set symmetric-cluster=false
pcs property set batch-limit=100
pcs resource defaults update resource-stickiness=1000
pcs cluster config update

[root@lustre-mgs ~]# ssh lustre1 "systemctl status pcsd pacemaker-remote
resource-agents-deps.target"
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor
preset: disabled)
   Active: active (running) since Tue 2023-11-28 19:01:49 MSK; 17h ago
 Docs: man:pcsd(8)
   man:pcs(8)
 Main PID: 1752 (pcsd)
Tasks: 1 (limit: 408641)
   Memory: 28.0M
   CGroup: /system.slice/pcsd.service
   └─1752 /usr/libexec/platform-python -Es /usr/sbin/pcsd
Nov 28 19:01:49 lustre1.ntslab.ru systemd[1]: Starting PCS GUI and remote
configuration interface...
Nov 28