Hi Ulrich, thanks for the answer. As Ken explained to me, there isn't any way to prevent earlier members of a group from running if a later member has no available node. If no node is available for the failed member, it will just remain stopped, and the earlier members will stay active where they are. I really hoped there was a solution or workaround for this, but as Ken clarified, Pacemaker can't handle this exception.
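For anyone who needs the all-or-nothing behaviour anyway, one possible approach is an external check that runs outside Pacemaker. The sketch below is only an illustration under stated assumptions, not something proposed in this thread: it assumes the status text looks like the output quoted further down, and the function names (group_states, partially_running, enforce) are hypothetical. It parses the group block, detects a group that is only partly started, and then runs "pcs resource disable" on the group.

```python
import re
import subprocess

# Matches "name (agent): State", also when the mail client wrapped the line.
MEMBER = re.compile(r"(\S+)\s+\(([^)\s]+)\):\s+(\w+)")

def group_states(status_text, group):
    """Return {member: state} for one 'Resource Group:' block.

    Assumes the older status layout quoted in this thread; newer pcs
    versions print the block differently and would need another pattern.
    """
    header = re.search(r"Resource Group: " + re.escape(group) + r"[ \t]*\n",
                       status_text)
    if header is None:
        return {}
    # The group block ends at the first blank line after the header.
    block = re.split(r"\n\s*\n", status_text[header.end():], maxsplit=1)[0]
    return {name: state for name, _agent, state in MEMBER.findall(block)}

def partially_running(states):
    """True when some members are Started and at least one is not."""
    started = [state == "Started" for state in states.values()]
    return any(started) and not all(started)

def enforce(group):
    """Hypothetical helper: disable the whole group if it is only partly up."""
    status = subprocess.run(["pcs", "status"], capture_output=True,
                            text=True, check=True).stdout
    if partially_running(group_states(status, group)):
        # Sets target-role=Stopped on the group; it stays stopped until
        # someone runs "pcs resource enable" on it again.
        subprocess.run(["pcs", "resource", "disable", group], check=True)
```

Calling enforce("LTA_SINGLE_RESOURCES") from a timer would be one way to use it. A check like this can race with a start that is still in progress, so it would need a grace period before acting.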
Many thanks for your quick and effective support.

Have a good evening!

Damiano

On Thu, 28 Jan 2021 at 11:15, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:

> >>> damiano giuliani <damianogiulian...@gmail.com> wrote on 27.01.2021 at 19:25
> in message
> <CAG=zYNOx-R=wKbhtm=4N7qaoYKE=oforvq7ja0jr17oyjgq...@mail.gmail.com>:
> > Hi Andrei, thanks for your help.
> > If one of the resources in the group fails or the primary node goes down
> > (in my case acspcmk-02), the probe notices it and pacemaker tries to
> > restart the whole resource group on the second node.
> > If the second node can't run one of my grouped resources, it tries to stop
> > them.
>
> And what exactly is it that you want? The behavior described is how the
> cluster handles it normally.
>
> > I attached my cluster status; my primary node (acspcmk-02) fails and the
> > resource group tries to restart on acspcmk-01. I keep the resource
> > "lta-subscription-backend-ope-s3" broken on purpose and, as you can see,
> > some grouped resources are still started.
> > I would like to know how to achieve a condition where the resource group
> > must start properly for every resource and, if it does not, the whole
> > group is stopped without some services still up and running.
> >
> > 2 nodes configured
> > 28 resources configured
> >
> > Online: [ acspcmk-01 ]
> > OFFLINE: [ acspcmk-02 ]
> >
> > Full list of resources:
> >
> >  Clone Set: lta-odata-frontend-ope-s1-clone [lta-odata-frontend-ope-s1]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: lta-odata-frontend-ope-s2-clone [lta-odata-frontend-ope-s2]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: lta-odata-frontend-ope-s3-clone [lta-odata-frontend-ope-s3]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s1ltaestimationtime-clone [s1ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s2ltaestimationtime-clone [s2ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: s3ltaestimationtime-clone [s3ltaestimationtime]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Clone Set: openresty-clone [openresty]
> >      Started: [ acspcmk-01 ]
> >      Stopped: [ acspcmk-02 ]
> >  Resource Group: LTA_SINGLE_RESOURCES
> >      VIP (ocf::heartbeat:IPaddr2): Started acspcmk-01
> >      lta-subscription-backend-ope-s1 (systemd:lta-subscription-backend-ope-s1): Started acspcmk-01
> >      lta-subscription-backend-ope-s2 (systemd:lta-subscription-backend-ope-s2): Started acspcmk-01
> >      lta-subscription-backend-ope-s3 (systemd:lta-subscription-backend-ope-s3): Stopped
> >      s1ltaquotaservice (systemd:s1ltaquotaservice): Stopped
> >      s2ltaquotaservice (systemd:s2ltaquotaservice): Stopped
> >      s3ltaquotaservice (systemd:s3ltaquotaservice): Stopped
> >      s1ltarolling (systemd:s1ltarolling): Stopped
> >      s2ltarolling (systemd:s2ltarolling): Stopped
> >      s3ltarolling (systemd:s3ltarolling): Stopped
> >      s1srvnotificationdispatcher (systemd:s1srvnotificationdispatcher): Stopped
> >      s2srvnotificationdispatcher (systemd:s2srvnotificationdispatcher): Stopped
> >      s3srvnotificationdispatcher (systemd:s3srvnotificationdispatcher): Stopped
> >
> > Failed Resource Actions:
> > * lta-subscription-backend-ope-s3_start_0 on acspcmk-01 'unknown error' (1): call=466, status=complete, exitreason='',
> >     last-rc-change='Wed Jan 27 13:00:21 2021', queued=0ms, exec=2128ms
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/enabled
> >   sbd: active/enabled
> >
> > I hope I explained my problem as best I could.
> >
> > Thanks for your time and help.
> >
> > Good evening,
> >
> > Damiano
> >
> > On Wed, 27 Jan 2021 at 19:03, Andrei Borzenkov <arvidj...@gmail.com> wrote:
> >
> >> 27.01.2021 19:06, damiano giuliani wrote:
> >> > Hi all, I'm pretty new to clusters and I'm struggling to configure a
> >> > bunch of resources and test how they fail over. My need is to start and
> >> > manage a group of resources as one (to achieve this, a resource group
> >> > has been created). If one of them can't run and keeps failing, the
> >> > cluster should try to restart the resource group on the secondary node;
> >> > if it can't run all the resources together there, it should disable the
> >> > whole resource group.
> >> > I would like to know if there is a way to set the cluster to disable all
> >> > the resources of the group (or the group itself) if it can't run all
> >> > the resources somewhere.
> >> >
> >>
> >> That's what pacemaker group does. I am not sure what you mean with
> >> "disable all resources". If resource fail count on a node exceeds
> >> threshold, this node is banned from running resource. If resource failed
> >> on every node, no node can run it until you clear fail count.
> >>
> >> "Disable resource" in pacemaker would mean setting its target-role to
> >> stopped. That does not happen automatically (at least I am not aware of
> >> it).
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/