----- On Apr 25, 2017, at 1:37 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
>>>> "Lentes, Bernd" <bernd.len...@helmholtz-muenchen.de> wrote on 25.04.2017 at
>>>> 11:02 in message
>>>> <406563603.26964612.1493110931994.javamail.zim...@helmholtz-muenchen.de>:
>>
>> ----- On Apr 25, 2017, at 8:08 AM, Ulrich Windl
>> <ulrich.wi...@rz.uni-regensburg.de> wrote:
>>
>>> Bernd,
>>>
>>> you have been on this list long enough to know that the reason for your
>>> failure is most likely to be found in the logs, which you did not provide.
>>> Couldn't you find that out yourself from the logs?
>>>
>>> Regards,
>>> Ulrich
>>
>> Hi Ulrich,
>>
>> if I had found something in the log, I would not have asked.
>> From what I understand from Ken, the problem is the IPaddr resource, which
>> by default is not able to live-migrate.
>>
>> Just a few minutes ago I tried again to live-migrate the VirtualDomain
>> resource, and again it was shut down on one node and rebooted on the other.
>>
>> Here is the respective excerpt from the log. Maybe you can point out to me
>> where I can find the reason for the problem:
>
> Usually a kind of action summary is logged before the first action is
> executed. If any of these actions fails, the outcome can differ from what was
> intended. In your case there does not seem to be an error in any action, so
> the outcome is what was planned (by crm). So (as we learned) the plans have
> to be changed.
> I see a migration of prim_vnc_ip_mausdb via restart, and some operation with
> prim_vnc_ip_mausdb is already in progress...
>
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 52: stop prim_vnc_ip_mausdb_stop_0 on ha-idg-1
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 53: start prim_vnc_ip_mausdb_start_0 on ha-idg-2 (local)
>> Apr 25 10:54:18 ha-idg-2 IPaddr(prim_vnc_ip_mausdb)[25724]: INFO: Using
>> calculated netmask for 146.107.235.161: 255.255.255.0
>> Apr 25 10:54:18 ha-idg-2 IPaddr(prim_vnc_ip_mausdb)[25724]: INFO: eval
>> ifconfig br0:0 146.107.235.161 netmask 255.255.255.0 broadcast
>> 146.107.235.255
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vnc_ip_mausdb_start_0: ok (node=ha-idg-2, call=283, rc=0,
>> cib-update=1567, confirmed=true)
>> Apr 25 10:54:18 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 55: start prim_vm_mausdb_start_0 on ha-idg-2 (local)
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.652325] device vnet0 entered
>> promiscuous mode
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.718044] br0: port 2(vnet0) entering
>> forwarding state
>> Apr 25 10:54:19 ha-idg-2 kernel: [583994.718049] br0: port 2(vnet0) entering
>> forwarding state
>> Apr 25 10:54:20 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_TRANSITION_ENGINE
>> Apr 25 10:54:21 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_TRANSITION_ENGINE
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vm_mausdb_start_0: ok (node=ha-idg-2, call=284, rc=0, cib-update=1568,
>> confirmed=true)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: te_rsc_command: Initiating
>> action 56: monitor prim_vm_mausdb_monitor_30000 on ha-idg-2 (local)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: process_lrm_event: Operation
>> prim_vm_mausdb_monitor_30000: ok (node=ha-idg-2, call=285, rc=0,
>> cib-update=1569, confirmed=false)
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: run_graph: Transition 817
>> (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>> Source=/var/lib/pacemaker/pengine/pe-input-1601.bz2): Complete
>> Apr 25 10:54:22 ha-idg-2 crmd[8587]: notice: do_state_transition: State
>> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
>> cause=C_FSA_INTERNAL origin=notify_crmd ]
>> Apr 25 10:54:24 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_IDLE
>> Apr 25 10:54:25 ha-idg-2 crmd[8587]: notice: handle_request: Current ping
>> state: S_IDLE
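(A side note on reading such logs: the run_graph line names the pe-input file
the policy engine worked from, and that file can be replayed offline to see
which actions were planned, i.e. that a plain stop/start of prim_vnc_ip_mausdb
was intended instead of a migrate_to/migrate_from pair. A minimal sketch,
assuming the file is still present on ha-idg-2:

ha-idg-2:~ # crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-1601.bz2

The transition summary in the output lists the planned action for each
resource, which is the quickest way to confirm whether the cluster ever
intended a live migration at all.)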
For the sake of completeness: I changed the RA, but at first I still couldn't
live-migrate the complete group. It turned out that I had mapped the start/stop
of the IPaddr primitive to the wrong migration actions. First I mapped
migrate_to to ip_start and migrate_from to ip_stop, but then I could only
live-migrate in one direction, not vice versa. When I mapped migrate_to to
ip_stop and migrate_from to ip_start, everything worked fine.

I had also forgotten to set the monitor operation in the definition of the
resource. I thought it is added by default. It isn't! And it is very
important :-)

This is now my resource:

primitive prim_vnc_ip_mausdb ocf:lentes:IPaddr \
        params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
        op migrate_from interval=0 timeout=30 \
        op migrate_to interval=0 timeout=30 \
        op monitor interval=10 timeout=20 \
        meta allow-migrate=true is-managed=true

And here are my changes to the RA:

ha-idg-1:~ # diff /usr/lib/ocf/resource.d/lentes/IPaddr /usr/lib/ocf/resource.d/heartbeat/IPaddr
5d4
< # modified by Bernd Lentes, 25042017, Livemigration added (migrate_to, migrate_from)
41c40
< USAGE="usage: $0 {start|stop|status|monitor|migrate_to|migrate_from|validate-all|meta-data}";
---
> USAGE="usage: $0 {start|stop|status|monitor|validate-all|meta-data}";
70d68
< Live-Migration added Bernd Lentes 25042017
202,203d199
< <action name="migrate_to" timeout="20s" />
< <action name="migrate_from" timeout="20s" />
889,890d884
< migrate_to)    ip_stop;;
< migrate_from)  ip_validate_all && ip_start;;

I can now live-migrate the group of a VirtualDomain and an IPaddr resource in
both directions (two-node cluster); a sketch of the dispatch logic and of the
group configuration is appended below.

Thanks for any help.

Bernd
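For reference, a minimal, self-contained sketch of the dispatch pattern the
two added branches hook into. Only the migrate_to/migrate_from lines are
verbatim from the diff above; the stub functions and the other branches stand
in for the real helpers of the heartbeat IPaddr agent and are not its actual
code:

#!/bin/sh
# Sketch only: in the real agent ip_start/ip_stop/ip_validate_all do the
# actual address handling; here they are stubs to show where the new
# branches sit in the case statement.
ip_start()        { echo "bring up the address"; }
ip_stop()         { echo "take down the address"; }
ip_validate_all() { echo "check parameters"; }

case $1 in
    start)          ip_start;;
    stop)           ip_stop;;
    # added branches: a live migration of the group is mapped to
    # "stop the IP on the source node" (migrate_to) and
    # "start the IP on the target node" (migrate_from),
    # since the address itself carries no state worth transferring
    migrate_to)     ip_stop;;
    migrate_from)   ip_validate_all && ip_start;;
    *)              echo "usage: $0 {start|stop|migrate_to|migrate_from}"; exit 1;;
esac

The group itself is then just the two primitives in crm shell; the group name
and member order here are made up, and prim_vm_mausdb (the VirtualDomain
resource from the log) is assumed to carry meta allow-migrate=true as well:

group grp_mausdb prim_vnc_ip_mausdb prim_vm_mausdb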