Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> Hi,
>
> On Tue, Feb 09, 2016 at 05:15:15PM +0300, Vladislav Bogdanov wrote:
>> 09.02.2016 16:31, Kristoffer Grönlund wrote:
>>> Vladislav Bogdanov <bub...@hoster-ok.com> writes:
>>>
>>>> Hi,
>>>>
>>>> when performing a delete operation, crmsh (2.2.0) with -F tries
>>>> to stop the passed op arguments and then waits for the DC to
>>>> become idle.
>>>
>>> Hi again,
>>>
>>> I have pushed a fix that only waits for the DC if any resources were
>>> actually stopped:
>>> https://github.com/ClusterLabs/crmsh/commit/164aa48
>>
>> Great!
>>
>>>> More, it may be worth checking the stop-orphan-resources property
>>>> and passing the stop work to pacemaker if it is set to true.
>>>
>>> I am a bit concerned that this might not be 100% reliable. I found
>>> an older discussion regarding this, and the recommendation from
>>> David Vossel then was to always make sure resources were stopped
>>> before removing them, and not to rely on stop-orphan-resources to
>>> clean things up correctly. His example of when this might not work
>>> well is when removing a group, as the group members might get
>>> stopped out of order.
>>
>> OK, I agree. That was just an idea.
>>
>>> At the same time, I have thought before that the current
>>> functionality is not great. Having to stop resources before removing
>>> them is, if nothing else, annoying! I have a tentative change
>>> proposal for this where crmsh would stop the resources even if
>>> --force is not set, and there would be a flag to pass to stop to
>>> make it ignore whether resources are running, since that may be
>>> useful if the resource is misconfigured and the stop action doesn't
>>> work.
>>
>> That should result in fencing, no? I think it is an RA issue if that
>> happens.
>
> Right. Unfortunately, this case often gets too little attention;
> people typically test with good and working configurations only.
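For context, the workflow being discussed might look like the following crmsh session (illustrative only, not runnable outside a Pacemaker cluster; "myresource" is a placeholder name): stop the resource first, confirm it is stopped, then delete it, per David Vossel's recommendation above.

```shell
# Stop the resource and wait until it is reported as not running.
crm resource stop myresource
crm resource status myresource

# Only then remove it from the configuration.
crm configure delete myresource

# The cluster property mentioned above; the thread advises against
# relying on it to clean up still-running (orphaned) resources.
crm configure property stop-orphan-resources=true
```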
> The first time we hear about it is from some annoyed user whose
> node got fenced for no good reason. Even worse, with some bad
> configurations, it can happen that the nodes get fenced in a
> round-robin fashion, which certainly won't make your time very
> productive.
>
>> Particularly, imho RAs should not run validate_all on the stop
>> action.
>
> I'd disagree here. If the environment is no good (bad
> installation, missing configuration, and similar), then the stop
> operation probably won't do much good. Ultimately, it may depend
> on how the resource is managed. In ocf-rarun, validate_all is
> run, but the operation is not carried out if the environment
> is invalid. In particular, the resource is considered to be
> stopped, and the stop operation exits with success. One of the
> most common cases is when the software resides on shared
> non-parallel storage.
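The pattern Dejan describes can be sketched as a minimal stop handler (assuming standard OCF exit codes; the function names are illustrative, not ocf-rarun's actual API, and the failing validation is simulated):

```shell
#!/bin/sh
# Standard OCF exit codes.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

# Hypothetical validation step: a real RA would check binaries,
# configuration files, mounted storage, etc. Here we simulate an
# invalid environment (e.g. software on shared storage that is not
# mounted on this node).
my_validate_all() {
    return $OCF_ERR_INSTALLED
}

my_stop() {
    if ! my_validate_all; then
        # Environment is invalid, so the resource cannot be running
        # here: report it as stopped rather than failing, since a
        # failed stop would escalate to fencing.
        echo "environment invalid; treating resource as stopped"
        return $OCF_SUCCESS
    fi
    # ... real stop logic would go here ...
    return $OCF_SUCCESS
}

my_stop
echo "stop rc=$?"
```

Run against a broken environment, this exits the stop operation with success instead of triggering the round-robin fencing scenario described above.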
Well, I'd reword that. Generally, an RA should not exit with an error if validation fails on stop. Is that better?

> BTW, handling the stop and monitor/probe operations was the
> primary motivation to develop ocf-rarun. It's often quite
> difficult to get these things right.
>
> Cheers,
>
> Dejan
>
>> Best,
>> Vladislav

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org