On Fri, Apr 2, 2021 at 2:04 PM Strahil Nikolov <hunter86...@yahoo.com> wrote:
> Hi Reid,
>
> I will check it out on Monday, but I'm pretty sure I created an order set
> that first stops the topology and only then stops the nfs-active.
>
> Yet, I made the stupid decision to prevent ocf:heartbeat:Filesystem from
> killing those 2 SAP processes (and set a huge timeout for the stop
> operation), which led to an 'I can't umount, giving up'-like notification
> and of course fenced the entire cluster :D .
>
> Note taken: stonith now has different delays, and Filesystem can kill the
> processes.
>
> As per the SAP note from Andrei, these could really be 'fast restart'
> mechanisms in HANA 2.0, and it looks safe to kill them (will check with
> SAP about that).
>
> P.S.: Is there a way to remove a whole set in pcs? It's really irritating
> when the stupid command wipes the resource from multiple order
> constraints.

If you mean a whole constraint set, then yes -- run `pcs constraint --full`
to get a list of all constraints with their constraint IDs. Then run
`pcs constraint remove <constraint_id>` to remove a particular constraint.
This can include set constraints.
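For example (the set members and both IDs below are made up for
illustration; use whatever IDs your `pcs constraint --full` output actually
shows for your cluster):

# pcs constraint --full
...
Ordering Constraints:
  Resource Sets:
    set hana_nfs1_active-clone SAPHanaTopology_<SID>_<instance_num>-clone (id:example_set) (id:example_order_set)
...
# pcs constraint remove example_order_set

That deletes the entire set constraint by its ID, without touching any
other order constraints that reference the same resources.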
> Best Regards,
> Strahil Nikolov
>
> On Fri, Apr 2, 2021 at 23:44, Reid Wahl
> <nw...@redhat.com> wrote:
>
> Hi, Strahil.
>
> Based on the constraints documented in the article you're following (RH KB
> solution 5423971), I think I see what's happening.
>
> The SAPHanaTopology resource requires the appropriate nfs-active attribute
> in order to run. That means that if the nfs-active attribute is set to
> false, the SAPHanaTopology resource must stop.
>
> However, there's no rule saying SAPHanaTopology must finish stopping
> before the nfs-active attribute resource stops. In fact, it's quite the
> opposite: the SAPHanaTopology resource stops only after the nfs-active
> resource stops.
>
> At the same time, the NFS resources are allowed to stop after the
> nfs-active attribute resource has stopped. So the NFS resources are
> stopping while the SAPHana* resources are likely still active.
>
> Try something like this:
>
> # pcs constraint order hana_nfs1_active-clone then SAPHanaTopology_<SID>_<instance_num>-clone kind=Optional
> # pcs constraint order hana_nfs2_active-clone then SAPHanaTopology_<SID>_<instance_num>-clone kind=Optional
>
> This says: "if both hana_nfs1_active and SAPHanaTopology are scheduled to
> start, then make hana_nfs1_active start first. If both are scheduled to
> stop, then make SAPHanaTopology stop first."
>
> "kind=Optional" means there's no order dependency unless both resources
> are already going to be scheduled for the action. I'm using kind=Optional
> here even though kind=Mandatory (the default) would make sense, because
> IIRC there were some unexpected interactions with ordering constraints for
> clones, where events on one node had unwanted effects on other nodes.
>
> I'm not able to test right now, since setting up an environment for this
> even with dummy resources is non-trivial -- but you're welcome to try this
> both with and without kind=Optional if you'd like.
>
> Please let us know how this goes.
>
> On Fri, Apr 2, 2021 at 2:20 AM Strahil Nikolov <hunter86...@yahoo.com>
> wrote:
>
> > Hello All,
> >
> > I am testing the newly built HANA (scale-out) cluster, and it seems that
> > neither SAPHanaController nor SAPHanaTopology stops HANA when I put the
> > nodes (same DC = same HANA) in standby. This of course leads to a
> > situation where the NFS share cannot be unmounted and, despite the stop
> > timeout, leads to fencing (on-fail=fence).
> >
> > I thought that the Controller resource agent stops HANA, and that the
> > slave role should not be 'stopped' before that. Maybe my expectations
> > are wrong?
> >
> > Best Regards,
> > Strahil Nikolov
>
> --
> Regards,
>
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
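One more note on the ordering suggestion quoted above: after adding the two
constraints, you can sanity-check them with something like the following
(resource names are the placeholders from the KB article; the constraint IDs
that pcs auto-generates will differ on your cluster):

# pcs constraint order --full
Ordering Constraints:
  start hana_nfs1_active-clone then start SAPHanaTopology_<SID>_<instance_num>-clone (kind:Optional) (id:...)
  start hana_nfs2_active-clone then start SAPHanaTopology_<SID>_<instance_num>-clone (kind:Optional) (id:...)

Then you can exercise the stop path by putting one node in standby (e.g.,
`pcs node standby <node>`) while watching `crm_mon`, and confirm that
SAPHanaTopology stops before the NFS stack is torn down.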
--
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/