Hi,

This is the current behavior of a cancelled migration under Pacemaker 1.1.16, with a resource implementing push migration:

# /usr/sbin/crm_resource --ban -r vm-conv-4
vhbl03 crmd[10017]: notice: State transition S_IDLE -> S_POLICY_ENGINE
vhbl03 pengine[10016]: notice: Migrate vm-conv-4#011(Started vhbl07 -> vhbl04)
vhbl03 crmd[10017]: notice: Initiating migrate_to operation vm-conv-4_migrate_to_0 on vhbl07
vhbl03 pengine[10016]: notice: Calculated transition 4633, saving inputs in /var/lib/pacemaker/pengine/pe-input-1069.bz2
[...]

At this point, with the migration still ongoing, I wanted to get rid of the constraint:

# /usr/sbin/crm_resource --clear -r vm-conv-4
vhbl03 crmd[10017]: notice: Transition aborted by deletion of rsc_location[@id='cli-ban-vm-conv-4-on-vhbl07']: Configuration change
vhbl07 crmd[10233]: notice: Result of migrate_to operation for vm-conv-4 on vhbl07: 0 (ok)
vhbl03 crmd[10017]: notice: Transition 4633 (Complete=6, Pending=0, Fired=0, Skipped=1, Incomplete=6, Source=/var/lib/pacemaker/pengine/pe-input-1069.bz2): Stopped
vhbl03 pengine[10016]: notice: Resource vm-conv-4 can no longer migrate to vhbl04. Stopping on vhbl07 too
vhbl03 pengine[10016]: notice: Reload vm-conv-4#011(Started vhbl07)
vhbl03 pengine[10016]: notice: Calculated transition 4634, saving inputs in /var/lib/pacemaker/pengine/pe-input-1070.bz2
vhbl03 crmd[10017]: notice: Initiating stop operation vm-conv-4_stop_0 on vhbl07
vhbl03 crmd[10017]: notice: Initiating stop operation vm-conv-4_stop_0 on vhbl04
vhbl03 crmd[10017]: notice: Initiating reload operation vm-conv-4_reload_0 on vhbl04

This recovery was entirely unnecessary, as the resource successfully migrated to vhbl04 (the migrate_from operation does nothing). Pacemaker does not know this, but is there a way to educate it?

I think in this special case it is possible to redesign the agent so that migrate_to is a no-op and everything happens in migrate_from. That would significantly reduce the window between the start points of the two "halves", but I'm not sure it would help in the end: Pacemaker could still decide to do an unnecessary stop+start recovery. Would it?

I failed to find any documentation on recovery from aborted migration transitions. I don't expect on-fail (for the migrate_* ops, not me) to apply here, does it?

Side question: why initiate a reload in any case, like above?

Even more side question: could you please consider using a space instead of a TAB in syslog messages? (Actually, I wouldn't mind getting rid of TABs altogether in any output.)

--
Thanks,
Feri