[ClusterLabs] kind=Optional order constraint not working at startup
Hi,

in my cluster setup I have a couple of resources, some of which I need to start in a specific order. Basically, I have two cloned resources that should start after mounting a DRBD filesystem on all nodes, plus one resource that starts after the clone sets. It is important that this only impacts the startup procedure. Once the system is running, stopping or starting one of the clone resources should not impact the other resource's state. From reading the manual, this should be what a local constraint with kind=Optional implements. However, when I start the cluster, the filesystem is started after the other resources, ignoring the ordering constraint.

My cluster configuration:
pcs cluster setup --name MDA1PFP MDA1PFP-PCS01,MDA1PFP-S01 MDA1PFP-PCS02,MDA1PFP-S02
pcs cluster start --all
sleep 5
crm_attribute --type nodes --node MDA1PFP-PCS01 --name ServerRole --update PRIME
crm_attribute --type nodes --node MDA1PFP-PCS02 --name ServerRole --update BACKUP
pcs property set stonith-enabled=false
pcs resource defaults resource-stickiness=100

rm -f mda; pcs cluster cib mda
pcs -f mda property set no-quorum-policy=ignore

pcs -f mda resource create mda-ip ocf:heartbeat:IPaddr2 ip=192.168.120.20 cidr_netmask=24 nic=bond0 op monitor interval=1s
pcs -f mda constraint location mda-ip prefers MDA1PFP-PCS01=50
pcs -f mda resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000 host_list=pf-pep-dev-1 params timeout=1 attempts=3 op monitor interval=1 --clone
pcs -f mda constraint location mda-ip rule score=-INFINITY pingd lt 1 or not_defined pingd

pcs -f mda resource create ACTIVE ocf:heartbeat:dummy
pcs -f mda constraint colocation add ACTIVE with mda-ip score=INFINITY

pcs -f mda resource create drbd1 ocf:linbit:drbd drbd_resource=shared_fs op monitor interval=60s
pcs -f mda resource master drbd1_sync drbd1 master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f mda constraint colocation add master drbd1_sync with mda-ip score=INFINITY

pcs -f mda resource create shared_fs Filesystem device="/dev/drbd1" directory=/shared_fs fstype="xfs"
pcs -f mda constraint order promote drbd1_sync then start shared_fs
pcs -f mda constraint colocation add shared_fs with master drbd1_sync score=INFINITY

pcs -f mda resource create supervisor ocf:pfpep:supervisor params config="/shared_fs/pfpep.ini" --clone
pcs -f mda resource create snmpAgent ocf:pfpep:snmpAgent params config="/shared_fs/pfpep.ini" --clone
pcs -f mda resource create clusterSwitchNotification ocf:pfpep:clusterSwitch params config="/shared_fs/pfpep.ini"

pcs -f mda constraint order start shared_fs then snmpAgent-clone kind=Optional
pcs -f mda constraint order start shared_fs then supervisor-clone kind=Optional
pcs -f mda constraint order start snmpAgent-clone then supervisor-clone kind=Optional
pcs -f mda constraint order start supervisor-clone then clusterSwitchNotification kind=Optional
pcs -f mda constraint colocation add clusterSwitchNotification with shared_fs score=INFINITY

pcs cluster cib-push mda

The order of resource startup in the log file is:
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]: notice: Operation snmpAgent_start_0: ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=82, confirmed=true)
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]: notice: Operation drbd1_start_0: ok (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=83, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]: notice: Operation ping_start_0: ok (node=MDA1PFP-PCS01, call=38, rc=0, cib-update=85, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]: notice: Operation supervisor_start_0: ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=88, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]: notice: Operation ACTIVE_start_0: ok (node=MDA1PFP-PCS01, call=48, rc=0, cib-update=94, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]: notice: Operation mda-ip_start_0: ok (node=MDA1PFP-PCS01, call=47, rc=0, cib-update=96, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]: notice: Operation clusterSwitchNotification_start_0: ok (node=MDA1PFP-PCS01, call=50, rc=0, cib-update=98, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]: notice: Operation shared_fs_start_0: ok (node=MDA1PFP-PCS01, call=57, rc=0, cib-update=101, confirmed=true)

Why is the shared file system started after the other resources?

Best wishes,
Jens

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
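For anyone reproducing the question, the start order can be read from the crmd log with a small filter. This is only a sketch: it assumes log lines in exactly the format quoted in the post, and the function name start_order is made up for illustration. The order it prints is the order in which the log reports start operations as completed, which is not necessarily the order in which they were initiated.

```shell
#!/bin/sh
# start_order (hypothetical helper): reads crmd log text on stdin and prints
# the resource name of every "Operation <rsc>_start_0: ok" line, in log order.
start_order() {
  sed -n 's/.*Operation \([^ ]*\)_start_0: ok.*/\1/p'
}

# Example input: the first and last lines of the log excerpt in the post.
start_order <<'EOF'
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]: notice: Operation snmpAgent_start_0: ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=82, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]: notice: Operation shared_fs_start_0: ok (node=MDA1PFP-PCS01, call=57, rc=0, cib-update=101, confirmed=true)
EOF
```

On a real system the function would be fed the cluster log file instead of the inline sample; the log file location depends on the installation and is not given in the thread.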
Re: [ClusterLabs] kind=Optional order constraint not working at startup
Hi,

could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)? It sounds similar.

Cheers,
Jens

--
Jens Auer | CGI | Software-Engineer
CGI (Germany) GmbH & Co. KG
Rheinstraße 95 | 64295 Darmstadt | Germany
T: +49 6151 36860 154
jens.a...@cgi.com
Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be found at de.cgi.com/pflichtangaben.

From: Auer, Jens [jens.a...@cgi.com]
Sent: Wednesday, 21 September 2016 15:10
To: users@clusterlabs.org
Subject: [ClusterLabs] kind=Optional order constraint not working at startup
Re: [ClusterLabs] kind=Optional order constraint not working at startup
On 09/21/2016 09:00 AM, Auer, Jens wrote:
> Hi,
>
> could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)?
> It sounds similar.

Correct -- "Optional" means honor the constraint only if both resources are starting *in the same transition*.

shared_fs has to wait for the DRBD promotion, but the other resources have no such limitation, so they are free to start before shared_fs.

The problem is "... only impacts the startup procedure". Pacemaker doesn't distinguish start-up from any other state of the cluster. Nodes (and entire partitions of nodes) can come and go at any time, and any or all resources can be stopped and started again at any time, so "start-up" is not really as meaningful as it sounds.

Maybe try an optional constraint of the other resources on the DRBD promotion. That would make it more likely that all the resources end up starting in the same transition.
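The suggestion in this reply, optional constraints that order the other resources after the DRBD promotion, might be written as follows. This is a sketch that reuses the resource names and the offline CIB file mda from the original post. The run wrapper and the DRY_RUN variable are not part of pcs; they are added here so the commands are only printed unless DRY_RUN=0 is set. As the reply says, this only makes it more likely that everything starts in one transition; it does not guarantee the order.

```shell
#!/bin/sh
# run (hypothetical wrapper): print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

CIB_FILE=mda   # offline CIB file name used in the original post

# Order each dependent resource after the DRBD promotion, as Optional.
for rsc in snmpAgent-clone supervisor-clone clusterSwitchNotification; do
  run pcs -f "$CIB_FILE" constraint order promote drbd1_sync then start "$rsc" kind=Optional
done
```

The constraints would take effect only after the file is pushed with pcs cluster cib-push mda, as in the original configuration.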
Re: [ClusterLabs] kind=Optional order constraint not working at startup
ry)
> start shared_fs then start snmpAgent-clone (kind:Optional) (id:order-shared_fs-snmpAgent-clone-Optional)
> start shared_fs then start supervisor-clone (kind:Optional) (id:order-shared_fs-supervisor-clone-Optional)
> start shared_fs then start clusterSwitchNotification (kind:Mandatory) (id:order-shared_fs-clusterSwitchNotification-mandatory)
> start snmpAgent-clone then start supervisor-clone (kind:Optional) (id:order-snmpAgent-clone-supervisor-clone-Optional)
> start supervisor-clone then start clusterSwitchNotification (kind:Optional) (id:order-supervisor-clone-clusterSwitchNotification-Optional)
> promote drbd1_sync then start supervisor-clone (kind:Optional) (id:order-drbd1_sync-supervisor-clone-Optional)
> promote drbd1_sync then start clusterSwitchNotification (kind:Optional) (id:order-drbd1_sync-clusterSwitchNotification-Optional)
> promote drbd1_sync then start snmpAgent-clone (kind:Optional) (id:order-drbd1_sync-snmpAgent-clone-Optional)
> Colocation Constraints:
> ACTIVE with mda-ip (score:INFINITY) (id:colocation-ACTIVE-mda-ip-INFINITY)
> drbd1_sync with mda-ip (score:INFINITY) (rsc-role:Master) (with-rsc-role:Started) (id:colocation-drbd1_sync-mda-ip-INFINITY)
> shared_fs with drbd1_sync (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-shared_fs-drbd1_sync-INFINITY)
> clusterSwitchNotification with shared_fs (score:INFINITY) (id:colocation-clusterSwitchNotification-shared_fs-INFINITY)
>
> but it still starts in the wrong order:
> Sep 21 14:45:59 MDA1PFP-S01 crmd[3635]: notice: Operation snmpAgent_start_0: ok (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=45, confirmed=true)
> Sep 21 14:45:59 MDA1PFP-S01 crmd[3635]: notice: Operation drbd1_start_0: ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=46, confirmed=true)
> Sep 21 14:46:01 MDA1PFP-S01 crmd[3635]: notice: Operation ping_start_0: ok (node=MDA1PFP-PCS01, call=38, rc=0, cib-update=48, confirmed=true)
> Sep 21 14:46:01 MDA1PFP-S01 crmd[3635]: notice: Operation supervisor_start_0: ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=51, confirmed=true)
> Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]: notice: Operation ACTIVE_start_0: ok (node=MDA1PFP-PCS01, call=48, rc=0, cib-update=57, confirmed=true)
> Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]: notice: Operation mda-ip_start_0: ok (node=MDA1PFP-PCS01, call=47, rc=0, cib-update=59, confirmed=true)
> Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]: notice: Operation shared_fs_start_0: ok (node=MDA1PFP-PCS01, call=55, rc=0, cib-update=62, confirmed=true)
> Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]: notice: Operation clusterSwitchNotification_start_0: ok (node=MDA1PFP-PCS01, call=57, rc=0, cib-update=64, confirmed=true)
>
> Best wishes,
> Jens
>
> CONFIDENTIALITY NOTICE: Proprietary/Confidential information belonging to CGI Group Inc. and its affiliates may be contained in this message. If you are not a recipient indicated or intended in this message (or responsible for delivery of this message to such person), or you think for any reason that this message may have been addressed to you in error, you may not use or copy or deliver this message to anyone else. In such case, you should destroy this message and are asked to notify the sender by reply e-mail.
>
> From: Ken Gaillot [kgail...@redhat.com]
> Sent: Wednesday, 21 September 2016 16:30
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] kind=Optional order constraint not working at startup
Re: [ClusterLabs] kind=Optional order constraint not working at startup
Hi,

> >> shared_fs has to wait for the DRBD promotion, but the other resources
> >> have no such limitation, so they are free to start before shared_fs.
> > Isn't there an implicit limitation by the ordering constraint? I have
> > drbd_promote < shared_fs < snmpAgent-clone, and I would expect this to be a
> > transitive relationship.
>
> Yes, but shared fs < snmpAgent-Clone is optional, so snmpAgent-Clone is free to
> start without it.

I was probably confused by the description in the manual. It says "* Optional - Only applies if both resources are starting and/or stopping." (from the Red Hat HA documentation). I assumed this means, for example, that the constraint holds when all resources are started because I start the cluster.

> > What is the meaning of "transition"? Is there any way I can force resource actions
> > into transitions?
>
> A transition is simply the cluster's response to the current cluster state, as directed
> by the configuration. The easiest way to think of it is as the "steps" as described
> above.
>
> If the configuration says a service should be running, but the service is not currently
> running, then the cluster will schedule a start action (if possible considering
> constraints, etc.). All such actions that may be scheduled together at one time is a
> "transition".
>
> You can't really control transitions; you can only control the configuration, and
> transitions result from configuration+state.
>
> The only way to force actions to take place in a certain order is to use mandatory
> constraints.
>
> The problem here is that you want the constraint to be mandatory only at "start-up".
> But there really is no such thing. Consider the case where the cluster stays up,
> and for whatever maintenance purpose, you stop all the resources, then start them
> again later. Is that the same as start-up or not? What if you restart all but one
> resource?

I think start-up is just a special case of what I consider a dependency for starting a resource. My current understanding is that a mandatory constraint means "If you start/stop resource A, then you have to start/stop resource B". An optional constraint says that the constraint only holds when you start/stop two resources together in a single transition. What I want to express is more like a dependency: "don't start resource A before resource B has been started at all. State changes of resource B should not impact resource A". I realize this is kind of odd, but if A can tolerate outages of its dependency B, e.g. by reconnecting, this makes sense. In principle this is what an optional constraint does, but not restricted to a single transition.

> I can imagine one possible (but convoluted) way to do something like this, using
> node attributes and rules:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm140521751827232
>
> With a rule, you can specify a location constraint that applies, not to a particular
> node, but to any node with a particular value of a particular node attribute.
>
> You would need a custom resource agent that sets a node attribute. Let's say it
> takes three parameters, the node attribute name, the value to set when starting (or
> do nothing), and the value to set when stopping (or do nothing). (That might
> actually be a good idea for a new ocf:pacemaker: agent.)
>
> You'd have an instance of this resource grouped with shared-fs, that would set the
> attribute to some magic value when started (say, "1").
> You'd have another instance grouped with snmpAgent-clone that would set it
> differently when stopped ("0"). Then, you'd have a location constraint for
> snmpAgent-clone with a rule that says it is only allowed on nodes with the attribute
> set to "1".
>
> With that, snmpAgent-clone would be unable to start until shared-fs had started at
> least once. shared-fs could stop without affecting snmpAgent-clone. If
> snmpAgent-clone stopped, it would reset, so it would require shared-fs again.
>
> I haven't thought through all possible scenarios, but I think it would give the
> behavior you want.

That sounds interesting... I think we will explore a solution in which we can accept restarting our resources. We only used the cloned resource set because we want our processes up and running to minimize the outage when doing a failover. Currently, the second server is a passive backup which has everything up and running, ready to take over. After the fs switches, it resyncs and then is ready to go. We can probably accept the additional time needed to start the resources completely, but we have to explore this.

Thanks,
Jens
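The attribute-and-rule idea quoted in this message could be sketched roughly as below. Several things here are assumptions and not from the thread: the attribute name shared_fs_started, the run wrapper with its DRY_RUN variable (which only prints the commands unless DRY_RUN=0 is set), and the use of attrd_updater to stand in for what the custom resource agent would do on start and stop. The agent itself is not shown. The location rule follows the same pcs rule syntax as the pingd rule in the original configuration.

```shell
#!/bin/sh
# run (hypothetical wrapper): print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

ATTR=shared_fs_started   # hypothetical node attribute name

# What the agent instance grouped with shared_fs would do when it starts.
run attrd_updater -n "$ATTR" -U 1

# What the agent instance paired with snmpAgent-clone would do when it stops.
run attrd_updater -n "$ATTR" -U 0

# Location rule: ban snmpAgent-clone from nodes where the attribute is
# missing or not equal to 1.
run pcs -f mda constraint location snmpAgent-clone rule score=-INFINITY "$ATTR" ne 1 or not_defined "$ATTR"
```

Node attributes are per node, so how this interacts with a clone that runs on both the active and the passive node would need testing; the quoted text itself notes that not all scenarios have been thought through.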
Re: [ClusterLabs] kind=Optional order constraint not working at startup
22.09.2016 11:19, Auer, Jens wrote:
...
> I think start-up is just a special case of what I think is a dependency for
> starting a resource.
> My current understanding is that a mandatory constraint means "If you
> start/stop resource A then you have to start/stop resource B". An optional
> constraint says that the constraint only holds when you start/stop two
> resources together in a single transition. What I want to express is more like
> a dependency "don't start resource A before resource B has been started at
> all. State changes of resource B should not impact resource A". I realize this
> is kind of odd, but if A can tolerate outages of its dependency B, e.g.
> reconnect, this makes sense. In principle this is what an optional constraint
> does, but not restricted to a single transition.
>

But if A can tolerate an outage of B, why does it matter whether A is started before or after B? By the same logic, it should be able to reconnect once B is up. At least that is what I'd expect.
Re: [ClusterLabs] kind=Optional order constraint not working at startup
Hi,

> But if A can tolerate outage of B, why does it matter whether A is started before or
> after B? By the same logic it should be able to reconnect once B is up? At least that
> is what I'd expect.

In our case, B is the file system resource that stores the configuration file for resource A. Resource A is a cloned resource that is started on both servers in our cluster. On the active node, A should read the config file from the shared file system. On the passive node, it reads a default file. After that, the config file is not read anymore, and thus the shared filesystem can go down and up again without disturbing the other resource.

After moving the filesystem to the passive node for failover, the process updates itself by reading the configuration from the ini file that is now available. This requires that the shared filesystem is started on the node, but I don't want to restart the process for internal reasons.

I could start the processes before the shared filesystem is started and then always re-sync. However, this will confuse the users because they don't expect this to happen.

In the end we will probably not go with cloned resources and will just start them cleanly after the shared filesystem is started on a node. This is much simpler and will solve the ordering problems here. It should also be possible to put everything in a group, as they are additionally co-located.

Cheers,
Jens
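The group idea at the end of this message might look like the sketch below. It assumes that supervisor and snmpAgent are recreated without --clone, which the thread does not show, and the group name mda-services is made up for illustration. The run wrapper and DRY_RUN variable are also additions, so the command is only printed unless DRY_RUN=0 is set. Members of a Pacemaker group are started in the listed order and kept on the same node.

```shell
#!/bin/sh
# run (hypothetical wrapper): print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

GROUP=mda-services   # hypothetical group name

# Members are listed in the start order the original post asked for:
# filesystem first, then snmpAgent, supervisor, clusterSwitchNotification.
run pcs -f mda resource group add "$GROUP" shared_fs snmpAgent supervisor clusterSwitchNotification
```

Unlike kind=Optional, group ordering is mandatory, so stopping shared_fs would also stop the members listed after it. The existing order and colocation constraints between shared_fs and drbd1_sync would probably need to be reviewed as well; that part is not covered here.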