On Tue, 2020-08-18 at 12:30 -0500, Ken Gaillot wrote:
> On Tue, 2020-08-18 at 16:47 +0200, Lentes, Bernd wrote:
> >
> > ----- On Aug 17, 2020, at 5:09 PM, kgaillot kgail...@redhat.com
> > wrote:
> >
> > > > I checked all relevant pe-files in this time period.
> > > > This is what i found out (i just write the important entries):
> > > >
> > > > Executing cluster transition:
> > > >  * Resource action: vm_nextcloud stop on ha-idg-2
> > > > Revised cluster status:
> > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped
> > > >
> > > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-3118 -G transition-4516.xml -D transition-4516.dot
> > > > Current cluster status:
> > > > Node ha-idg-1 (1084777482): standby
> > > > Online: [ ha-idg-2 ]
> > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <============== vm_nextcloud is stopped
> > > > Transition Summary:
> > > >  * Shutdown ha-idg-1
> > > > Executing cluster transition:
> > > >  * Resource action: vm_nextcloud stop on ha-idg-1 <==== why stop ? It is already stopped
> > >
> > > I'm not sure, I'd have to see the pe input.
> >
> > You find it here:
> > https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29
>
> This appears to be a scheduler bug.

The fix is in the master branch and will land in 2.0.5, expected at the
end of the year:
https://github.com/ClusterLabs/pacemaker/pull/2146

> The scheduler considers a migration to be "dangling" if it has a record
> of a failed migrate_to on the source node, but no migrate_from on the
> target node (and no migrate_from or start on the source node, which
> would indicate a later full restart or reverse migration).
>
> In this case, any migrate_from on the target has since been superseded
> by a failed start and a successful stop, so there is no longer a record
> of it. Therefore the migration is considered dangling, which requires a
> full stop on the source node.
>
> However in this case we already have a successful stop on the source
> node after the failed migrate_to, and I believe that should be
> sufficient to consider it no longer dangling.
>
> > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= vm_nextcloud is stopped
> > > > Transition Summary:
> > > >  * Fence (Off) ha-idg-1 'resource actions are unrunnable'
> > > > Executing cluster transition:
> > > >  * Fencing ha-idg-1 (Off)
> > > >  * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is already stopped ?
> > > > Revised cluster status:
> > > > Node ha-idg-1 (1084777482): OFFLINE (standby)
> > > > Online: [ ha-idg-2 ]
> > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped
> > > >
> > > > I don't understand why the cluster tries to stop a resource which is
> > > > already stopped.
> >
> > Bernd
> > Helmholtz Zentrum München

--
Ken Gaillot <kgail...@redhat.com>
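
For anyone following the "dangling migration" reasoning quoted above, here is a minimal Python sketch of the rule as it is described in this thread. It is not Pacemaker's scheduler code (which is written in C), and it does not claim to reproduce what the pull request actually changes. The `Op` record, the `seq` ordering counter, the `is_dangling` function, the `stop_clears` switch, and the node names `node-a` and `node-b` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Op:
    """One recorded resource action (hypothetical, simplified model)."""
    node: str     # node the action ran on
    action: str   # "migrate_to", "migrate_from", "start" or "stop"
    ok: bool      # True if the action succeeded
    seq: int      # assumed ordering counter; a larger value means later


def is_dangling(history: Sequence[Op], source: str, target: str,
                stop_clears: bool = False) -> bool:
    """Apply the rule from the thread to a simplified operation history.

    With stop_clears=True, a successful stop on the source after the
    failed migrate_to also clears the dangling state, as suggested above.
    """
    failed = [op.seq for op in history
              if op.node == source and op.action == "migrate_to" and not op.ok]
    if not failed:
        return False  # no failed migrate_to on the source, nothing dangles
    last_failed = max(failed)

    # A migrate_from record on the target means the migration is accounted for.
    if any(op.node == target and op.action == "migrate_from" for op in history):
        return False

    # A later migrate_from or start on the source indicates a reverse
    # migration or a full restart.
    if any(op.node == source and op.action in ("migrate_from", "start")
           and op.seq > last_failed for op in history):
        return False

    # Suggested extra condition: a later successful stop on the source.
    if stop_clears and any(op.node == source and op.action == "stop"
                           and op.ok and op.seq > last_failed
                           for op in history):
        return False

    return True


# History shaped like the case discussed here: the migrate_from record on
# the target has been superseded, so it is simply absent.
history = [
    Op("node-a", "migrate_to", False, 1),  # failed migrate_to on the source
    Op("node-a", "stop", True, 2),         # successful stop on the source
    Op("node-b", "start", False, 3),       # failed start on the target
    Op("node-b", "stop", True, 4),         # successful stop on the target
]
dangling_as_described = is_dangling(history, "node-a", "node-b")
dangling_with_stop_check = is_dangling(history, "node-a", "node-b",
                                       stop_clears=True)
```

With this history the rule as described reports the migration as dangling, which matches the extra stop seen in the transitions, while the variant with `stop_clears=True` does not. The real scheduler works from the CIB operation history rather than a flat list with a counter, so treat this only as a reading aid.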