On 13 Feb 2015, at 8:38 pm, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de
wrote:
Hello!
I have some questions on pacemaker's resource migration. We have a Xen host
that has some problems (still to be investigated) that cause some VM disks
not to be ready for use.
When trying to migrate a VM from the bad host to a good host through
pacemaker, the migration seemed to hang. At some point the source VM was no
longer present on the bad host (Unable to find domain 'v09'), but pacemaker
still tried a migration:
crmd[6779]: notice: te_rsc_command: Initiating action 100: migrate_from
prm_xen_v09_migrate_from_0 on h05
Only after the timeout did CRM realize that there was a problem:
crmd[6779]: warning: status_from_rc: Action 100 (prm_xen_v09_migrate_from_0)
on h05 failed (target: 0 vs. rc: 1): Error
After that CRM still tried a stop on the source host (h10) (and on the
destination host):
crmd[6779]: notice: te_rsc_command: Initiating action 98: stop
prm_xen_v09_stop_0 on h10
crmd[6779]: notice: te_rsc_command: Initiating action 26: stop
prm_xen_v09_stop_0 on h05
Q1: Is this the way it should work?
Mostly, but the agent should have detected the condition earlier and returned
an error (instead of timing out).
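To make "detect the condition earlier" concrete, here is a rough sketch of a wait loop for a migrate_from-style action that gives up as soon as the domain is gone from the source without having arrived on the target. This is not the Xen agent's code: wait_for_migration and the two probe callables are hypothetical, and how the agent would actually query either host is left open. Only the exit codes (0 for success, 1 for a generic error) come from the OCF conventions.

```python
import time
from typing import Callable

OCF_SUCCESS = 0      # OCF exit code: action succeeded
OCF_ERR_GENERIC = 1  # OCF exit code: generic failure


def wait_for_migration(
    running_on_target: Callable[[], bool],  # hypothetical probe of the target host
    exists_on_source: Callable[[], bool],   # hypothetical probe of the source host
    timeout_s: float,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Wait for a migrating domain, failing fast if it has vanished."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if running_on_target():
            return OCF_SUCCESS
        if not exists_on_source():
            # The domain may have arrived between the two probes, so look
            # at the target once more before declaring a failure.
            if running_on_target():
                return OCF_SUCCESS
            # Gone from the source and not on the target: there is nothing
            # left to migrate, so report the error without waiting further.
            return OCF_ERR_GENERIC
        sleep(poll_s)
    # Still not on the target when the deadline passed.
    return OCF_ERR_GENERIC
```

With a check like this the cluster would see rc 1 after the first poll rather than after the full action timeout.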
Before that we had the same situation (the bad host had been set to
standby) when someone, tired of waiting so long, destroyed the affected Xen
VMs on the source host while the cluster was migrating. Eventually the VMs
came up (restarted instead of being live migrated) on the good hosts.
Then we shut down OpenAIS on the bad host, installed updates and rebooted the
bad host (during reboot OpenAIS was started (still standby)).
To my surprise pacemaker thought the VMs were still running on the bad host
and initiated a migration.
That would be coming from the resource agent.
As there were no source VMs on the bad host, but all the affected VMs were
running on some good host, CRM shut down the VMs on the good hosts, just to
restart them.
Q2: Is this expected behavior? I can hardly believe it!
Nope, fix the agent :)
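As a sketch of what the fixed behavior could look like: a probe should report "not running" (OCF exit code 7) whenever the domain is absent from the local hypervisor's list, so a freshly rebooted host never claims VMs it does not have. The probe function and its local_domains argument are hypothetical; how the agent obtains the list of local domains is deliberately left out.

```python
from typing import Iterable

OCF_SUCCESS = 0      # OCF exit code: resource is running here
OCF_NOT_RUNNING = 7  # OCF exit code: resource is cleanly stopped here


def probe(domain: str, local_domains: Iterable[str]) -> int:
    """Report whether `domain` is active on this host.

    `local_domains` stands in for whatever the agent reads from the local
    hypervisor; it is an assumption of this sketch, not a real API.
    """
    return OCF_SUCCESS if domain in set(local_domains) else OCF_NOT_RUNNING
```

On the rebooted host the list would not contain v09, so the probe would return 7 and the cluster would have no reason to plan a migration from there.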
Software is SLES11 SP3 with pacemaker-1.1.11-0.7.53 (and related) on all
hosts.
Regards,
Ulrich
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems