Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-08 Thread Boyan Ikonomov
Mystery solved:

Never put
    [ "${2}" = release ] && crm resource stop VMA_${1}
inside /etc/libvirt/hooks/qemu.

That was a very bad decision.
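For context, here is a minimal sketch of what such a hook looks like, reconstructed
around the single line above (the argument handling follows libvirt's documented
hook interface; the variable names are mine):

    #!/bin/sh
    # /etc/libvirt/hooks/qemu -- libvirt invokes this hook as:
    #   qemu <guest_name> <operation> <sub_operation> ...
    guest="$1"
    operation="$2"

    # "release" fires whenever libvirt releases the domain's resources,
    # which also happens on the *source* node after a successful live
    # migration. "crm resource stop" then writes target-role="Stopped"
    # into the CIB, which is exactly why the resource "suddenly" ends up
    # stopped cluster-wide.
    [ "${operation}" = release ] && crm resource stop "VMA_${guest}"

One way to confirm the mechanism is to check the meta attribute right after a
migration, e.g.

    crm_resource --resource VMA_VM1 --meta --get-parameter target-role

(or "crm configure show VMA_VM1"); if the hook fired, target-role will read "Stopped".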

On Monday 07 December 2015 16:49:01 emmanuel segura wrote:
> Next time, show your full config, unless your config has something
> special that you can't show.
> 
> 2015-12-07 9:08 GMT+01:00 Klechomir :
> > Hi,
> > Sorry, I didn't get your point.
> > 
> > The XML of the VM is on an active-active DRBD device with an OCFS2 filesystem on it and
> > is visible from both nodes.
> > The live migration is always successful.
> > 
> > On 4.12.2015 19:30, emmanuel segura wrote:
> >> I think the XML of your VM needs to be available on both nodes, but you're
> >> using a failover resource, Filesystem_CDrive1. Pacemaker
> >> monitors resources on both nodes to check whether they are running on
> >> multiple nodes.
> >> 
> >> 2015-12-04 18:06 GMT+01:00 Ken Gaillot :
> >>> On 12/04/2015 10:22 AM, Klechomir wrote:
>  Hi list,
>  My issue is the following:
>  
>  I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
>  (I observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3).
> 
>  I bumped into this issue when I started playing with VirtualDomain resources,
>  but it seems to be unrelated to the RA.
> 
>  The problem is that, for no apparent reason, a resource gets
>  target-role="Stopped". This happens after a (successful) migration,
>  after a failover, or after a VM restart.
> 
>  My tests showed that changing the resource name fixes the problem, but
>  that is only a temporary workaround.
>  
>  The resource configuration is:
>  primitive VMA_VM1 ocf:heartbeat:VirtualDomain \
>  
>   params config="/NFSvolumes/CDrive1/VM1/VM1.xml"
>  
>  hypervisor="qemu:///system" migration_transport="tcp" \
>  
>   meta allow-migrate="true" target-role="Started" \
>   op start interval="0" timeout="120s" \
>   op stop interval="0" timeout="120s" \
>   op monitor interval="10" timeout="30" depth="0" \
>   utilization cpu="1" hv_memory="925"
>  
>  order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1
>  
>  Here is the log from one such stop, after successful migration with
>  "crm
>  migrate resource VMA_VM1":
>  
>  Dec 04 15:18:22 [3818929] CLUSTER-1   crmd:debug: cancel_op:
>  Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)
>  Dec 04 15:18:22 [4434] CLUSTER-1   lrmd: info:
>  cancel_recurring_action: Cancelling operation
>  VMA_VM1_monitor_1
>  Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: cancel_op:
>  Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelled
>  Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug:
>  do_lrm_rsc_op:Performing
>  key=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56
>  op=VMA_VM1_migrate_to_0
>  VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 DEBUG:
>  Virtual domain VM1 is currently running.
>  VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 INFO: VM1:
>  Starting live migration to CLUSTER-2 (using virsh
>  --connect=qemu:///system --quiet migrate --live  VM1
>  qemu+tcp://CLUSTER-2/system ).
>  Dec 04 15:18:24 [3818929] CLUSTER-1   crmd: info:
>  process_lrm_event:LRM operation VMA_VM1_monitor_1 (call=5564,
>  status=1, cib-update=0, confirmed=false) Cancelled
>  Dec 04 15:18:24 [3818929] CLUSTER-1   crmd:debug:
>  update_history_cache: Updating history for 'VMA_VM1' with
>  monitor op
>  VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:26 INFO: VM1:
>  live migration to CLUSTER-2 succeeded.
>  Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:debug:
>  operation_finished:  VMA_VM1_migrate_to_0:1797698 - exited with
>  rc=0
>  Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
>  operation_finished:  VMA_VM1_migrate_to_0:1797698 [
>  2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2
>  (using virsh --connect=qemu:///system --quiet migrate --live  VM1
>  qemu+tcp://CLUSTER-2/system ). ]
>  Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
>  operation_finished:  VMA_VM1_migrate_to_0:1797698 [
>  2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]
>  Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
>  create_operation_update:  do_update_resource: Updating resouce
>  VMA_VM1 after complete migrate_to op (interval=0)
>  Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:   notice:
>  process_lrm_event:LRM operation VMA_VM1_migrate_to_0 (call=5697,
>  rc=0, cib-update=89, confirmed=true) ok
>  Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
>  update_history_cache:  

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-07 Thread Klechomir

Hi Ken,
My comments are inline below.

On 4.12.2015 19:06, Ken Gaillot wrote:

On 12/04/2015 10:22 AM, Klechomir wrote:

Hi list,
My issue is the following:

I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
(I observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3).

I bumped into this issue when I started playing with VirtualDomain resources,
but it seems to be unrelated to the RA.

The problem is that, for no apparent reason, a resource gets
target-role="Stopped". This happens after a (successful) migration,
after a failover, or after a VM restart.

My tests showed that changing the resource name fixes the problem, but
that is only a temporary workaround.

The resource configuration is:
primitive VMA_VM1 ocf:heartbeat:VirtualDomain \
 params config="/NFSvolumes/CDrive1/VM1/VM1.xml"
hypervisor="qemu:///system" migration_transport="tcp" \
 meta allow-migrate="true" target-role="Started" \
 op start interval="0" timeout="120s" \
 op stop interval="0" timeout="120s" \
 op monitor interval="10" timeout="30" depth="0" \
 utilization cpu="1" hv_memory="925"
order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1

Here is the log from one such stop, after successful migration with "crm
migrate resource VMA_VM1":

Dec 04 15:18:22 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)
Dec 04 15:18:22 [4434] CLUSTER-1   lrmd: info:
cancel_recurring_action: Cancelling operation VMA_VM1_monitor_1
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelled
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug:
do_lrm_rsc_op:Performing
key=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_migrate_to_0
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 DEBUG:
Virtual domain VM1 is currently running.
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 INFO: VM1:
Starting live migration to CLUSTER-2 (using virsh
--connect=qemu:///system --quiet migrate --live  VM1
qemu+tcp://CLUSTER-2/system ).
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd: info:
process_lrm_event:LRM operation VMA_VM1_monitor_1 (call=5564,
status=1, cib-update=0, confirmed=false) Cancelled
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd:debug:
update_history_cache: Updating history for 'VMA_VM1' with
monitor op
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:26 INFO: VM1:
live migration to CLUSTER-2 succeeded.
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:debug:
operation_finished:  VMA_VM1_migrate_to_0:1797698 - exited with rc=0
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_migrate_to_0:1797698 [
2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2
(using virsh --connect=qemu:///system --quiet migrate --live  VM1
qemu+tcp://CLUSTER-2/system ). ]
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_migrate_to_0:1797698 [
2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
create_operation_update:  do_update_resource: Updating resouce
VMA_VM1 after complete migrate_to op (interval=0)
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:   notice:
process_lrm_event:LRM operation VMA_VM1_migrate_to_0 (call=5697,
rc=0, cib-update=89, confirmed=true) ok
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
update_history_cache: Updating history for 'VMA_VM1' with
migrate_to op
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Operation VMA_VM1:5564 already cancelled
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug:
do_lrm_rsc_op:Performing
key=225:200:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_stop_0
VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 DEBUG:
Virtual domain VM1 is not running:  failed to get domain 'vm1' domain
not found: no domain with matching name 'vm1'

This looks like the problem. Configuration error?


As far as I can tell, this is a harmless bug in the VirtualDomain RA. It
downcases the output of the "virsh" domain info command so it can parse the
status easily, which then prevents the domain name from matching.
In any case, this error doesn't affect the RA's functionality; here it just
finds out that the resource is already stopped. My big concern is why the
resource is stopped in the first place.
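The pattern described above looks roughly like this (an illustrative sketch, not
the actual RA code; the exact virsh subcommand and state strings may differ):

    # The agent lowercases everything virsh prints (stderr included) so that
    # state strings such as "shut off" match regardless of case:
    status=$(virsh --connect=qemu:///system domstate VM1 2>&1 | tr 'A-Z' 'a-z')

    # If the domain does not exist, $status now holds the *lowercased* error,
    # e.g. "... no domain with matching name 'vm1'", so the original name
    # "VM1" no longer appears verbatim -- harmless for the state logic, but it
    # is what produces the odd-looking log line quoted above.
    case "$status" in
        running|paused|idle|blocked|"in shutdown")
            echo "domain is running" ;;
        "shut off"|*"domain not found"*)
            echo "domain is not running" ;;
    esac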



VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 INFO: Domain
VM1 already stopped.
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:debug:
operation_finished:  VMA_VM1_stop_0:1798719 - exited with rc=0
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_stop_0:1798719 [ 2015/12/04_15:18:31
INFO: Domain VM1 already stopped. ]
Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:debug:

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-07 Thread Klechomir

Hi,
Sorry, I didn't get your point.

The XML of the VM is on an active-active DRBD device with an OCFS2 filesystem
on it and is visible from both nodes.

The live migration is always successful.


On 4.12.2015 19:30, emmanuel segura wrote:

I think the XML of your VM needs to be available on both nodes, but you're
using a failover resource, Filesystem_CDrive1. Pacemaker
monitors resources on both nodes to check whether they are running on
multiple nodes.

2015-12-04 18:06 GMT+01:00 Ken Gaillot :

On 12/04/2015 10:22 AM, Klechomir wrote:

Hi list,
My issue is the following:

I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
(I observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3).

I bumped into this issue when I started playing with VirtualDomain resources,
but it seems to be unrelated to the RA.

The problem is that, for no apparent reason, a resource gets
target-role="Stopped". This happens after a (successful) migration,
after a failover, or after a VM restart.

My tests showed that changing the resource name fixes the problem, but
that is only a temporary workaround.

The resource configuration is:
primitive VMA_VM1 ocf:heartbeat:VirtualDomain \
 params config="/NFSvolumes/CDrive1/VM1/VM1.xml"
hypervisor="qemu:///system" migration_transport="tcp" \
 meta allow-migrate="true" target-role="Started" \
 op start interval="0" timeout="120s" \
 op stop interval="0" timeout="120s" \
 op monitor interval="10" timeout="30" depth="0" \
 utilization cpu="1" hv_memory="925"
order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1

Here is the log from one such stop, after successful migration with "crm
migrate resource VMA_VM1":

Dec 04 15:18:22 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)
Dec 04 15:18:22 [4434] CLUSTER-1   lrmd: info:
cancel_recurring_action: Cancelling operation VMA_VM1_monitor_1
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelled
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug:
do_lrm_rsc_op:Performing
key=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_migrate_to_0
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 DEBUG:
Virtual domain VM1 is currently running.
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 INFO: VM1:
Starting live migration to CLUSTER-2 (using virsh
--connect=qemu:///system --quiet migrate --live  VM1
qemu+tcp://CLUSTER-2/system ).
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd: info:
process_lrm_event:LRM operation VMA_VM1_monitor_1 (call=5564,
status=1, cib-update=0, confirmed=false) Cancelled
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd:debug:
update_history_cache: Updating history for 'VMA_VM1' with
monitor op
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:26 INFO: VM1:
live migration to CLUSTER-2 succeeded.
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:debug:
operation_finished:  VMA_VM1_migrate_to_0:1797698 - exited with rc=0
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_migrate_to_0:1797698 [
2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2
(using virsh --connect=qemu:///system --quiet migrate --live  VM1
qemu+tcp://CLUSTER-2/system ). ]
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_migrate_to_0:1797698 [
2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
create_operation_update:  do_update_resource: Updating resouce
VMA_VM1 after complete migrate_to op (interval=0)
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:   notice:
process_lrm_event:LRM operation VMA_VM1_migrate_to_0 (call=5697,
rc=0, cib-update=89, confirmed=true) ok
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
update_history_cache: Updating history for 'VMA_VM1' with
migrate_to op
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Operation VMA_VM1:5564 already cancelled
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug:
do_lrm_rsc_op:Performing
key=225:200:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_stop_0
VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 DEBUG:
Virtual domain VM1 is not running:  failed to get domain 'vm1' domain
not found: no domain with matching name 'vm1'

This looks like the problem. Configuration error?


VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 INFO: Domain
VM1 already stopped.
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:debug:
operation_finished:  VMA_VM1_stop_0:1798719 - exited with rc=0
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:   notice:
operation_finished:  VMA_VM1_stop_0:1798719 [ 2015/12/04_15:18:31
INFO: Domain VM1 already stopped. ]
Dec 04 15:18:32 [3818929] CLUSTER-1 

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-04 Thread Ken Gaillot
On 12/04/2015 10:22 AM, Klechomir wrote:
> Hi list,
> My issue is the following:
> 
> I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
> (I observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3).
> 
> I bumped into this issue when I started playing with VirtualDomain resources,
> but it seems to be unrelated to the RA.
> 
> The problem is that, for no apparent reason, a resource gets
> target-role="Stopped". This happens after a (successful) migration,
> after a failover, or after a VM restart.
> 
> My tests showed that changing the resource name fixes the problem, but
> that is only a temporary workaround.
> 
> The resource configuration is:
> primitive VMA_VM1 ocf:heartbeat:VirtualDomain \
> params config="/NFSvolumes/CDrive1/VM1/VM1.xml"
> hypervisor="qemu:///system" migration_transport="tcp" \
> meta allow-migrate="true" target-role="Started" \
> op start interval="0" timeout="120s" \
> op stop interval="0" timeout="120s" \
> op monitor interval="10" timeout="30" depth="0" \
> utilization cpu="1" hv_memory="925"
> order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1
> 
> Here is the log from one such stop, after successful migration with "crm
> migrate resource VMA_VM1":
> 
> Dec 04 15:18:22 [3818929] CLUSTER-1   crmd:debug: cancel_op:   
> Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)
> Dec 04 15:18:22 [4434] CLUSTER-1   lrmd: info:
> cancel_recurring_action: Cancelling operation VMA_VM1_monitor_1
> Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: cancel_op:   
> Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelled
> Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug:
> do_lrm_rsc_op:Performing
> key=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_migrate_to_0
> VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 DEBUG:
> Virtual domain VM1 is currently running.
> VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 INFO: VM1:
> Starting live migration to CLUSTER-2 (using virsh
> --connect=qemu:///system --quiet migrate --live  VM1
> qemu+tcp://CLUSTER-2/system ).
> Dec 04 15:18:24 [3818929] CLUSTER-1   crmd: info:
> process_lrm_event:LRM operation VMA_VM1_monitor_1 (call=5564,
> status=1, cib-update=0, confirmed=false) Cancelled
> Dec 04 15:18:24 [3818929] CLUSTER-1   crmd:debug:
> update_history_cache: Updating history for 'VMA_VM1' with
> monitor op
> VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:26 INFO: VM1:
> live migration to CLUSTER-2 succeeded.
> Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:debug:
> operation_finished:  VMA_VM1_migrate_to_0:1797698 - exited with rc=0
> Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
> operation_finished:  VMA_VM1_migrate_to_0:1797698 [
> 2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2
> (using virsh --connect=qemu:///system --quiet migrate --live  VM1
> qemu+tcp://CLUSTER-2/system ). ]
> Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice:
> operation_finished:  VMA_VM1_migrate_to_0:1797698 [
> 2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]
> Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
> create_operation_update:  do_update_resource: Updating resouce
> VMA_VM1 after complete migrate_to op (interval=0)
> Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:   notice:
> process_lrm_event:LRM operation VMA_VM1_migrate_to_0 (call=5697,
> rc=0, cib-update=89, confirmed=true) ok
> Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug:
> update_history_cache: Updating history for 'VMA_VM1' with
> migrate_to op
> Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug: cancel_op:   
> Operation VMA_VM1:5564 already cancelled
> Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug:
> do_lrm_rsc_op:Performing
> key=225:200:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_stop_0
> VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 DEBUG:
> Virtual domain VM1 is not running:  failed to get domain 'vm1' domain
> not found: no domain with matching name 'vm1'

This looks like the problem. Configuration error?

> VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 INFO: Domain
> VM1 already stopped.
> Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:debug:
> operation_finished:  VMA_VM1_stop_0:1798719 - exited with rc=0
> Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:   notice:
> operation_finished:  VMA_VM1_stop_0:1798719 [ 2015/12/04_15:18:31
> INFO: Domain VM1 already stopped. ]
> Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:debug:
> create_operation_update:  do_update_resource: Updating resouce
> VMA_VM1 after complete stop op (interval=0)
> Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:   notice:
> process_lrm_event:LRM operation VMA_VM1_stop_0 (call=5701, rc=0,
> cib-update=90, confirmed=true) ok
> Dec 04 15:18:32 

[ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-04 Thread Klechomir

Hi list,
My issue is the following:

I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
(I observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3).

I bumped into this issue when I started playing with VirtualDomain resources,
but it seems to be unrelated to the RA.

The problem is that, for no apparent reason, a resource gets
target-role="Stopped". This happens after a (successful) migration,
after a failover, or after a VM restart.

My tests showed that changing the resource name fixes the problem, but
that is only a temporary workaround.


The resource configuration is:
primitive VMA_VM1 ocf:heartbeat:VirtualDomain \
params config="/NFSvolumes/CDrive1/VM1/VM1.xml" \
hypervisor="qemu:///system" migration_transport="tcp" \
meta allow-migrate="true" target-role="Started" \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s" \
op monitor interval="10" timeout="30" depth="0" \
utilization cpu="1" hv_memory="925"
order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1

Here is the log from one such stop, after successful migration with "crm 
migrate resource VMA_VM1":


Dec 04 15:18:22 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)
Dec 04 15:18:22 [4434] CLUSTER-1   lrmd: info: 
cancel_recurring_action: Cancelling operation VMA_VM1_monitor_1
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelled
Dec 04 15:18:23 [3818929] CLUSTER-1   crmd:debug: 
do_lrm_rsc_op:Performing 
key=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_migrate_to_0
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 DEBUG: 
Virtual domain VM1 is currently running.
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:23 INFO: VM1: 
Starting live migration to CLUSTER-2 (using virsh 
--connect=qemu:///system --quiet migrate --live  VM1 
qemu+tcp://CLUSTER-2/system ).
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd: info: 
process_lrm_event:LRM operation VMA_VM1_monitor_1 (call=5564, 
status=1, cib-update=0, confirmed=false) Cancelled
Dec 04 15:18:24 [3818929] CLUSTER-1   crmd:debug: 
update_history_cache: Updating history for 'VMA_VM1' with monitor op
VirtualDomain(VMA_VM1)[1797698]:2015/12/04_15:18:26 INFO: VM1: 
live migration to CLUSTER-2 succeeded.
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:debug: 
operation_finished:  VMA_VM1_migrate_to_0:1797698 - exited with rc=0
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice: 
operation_finished:  VMA_VM1_migrate_to_0:1797698 [ 
2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2 
(using virsh --connect=qemu:///system --quiet migrate --live  VM1 
qemu+tcp://CLUSTER-2/system ). ]
Dec 04 15:18:26 [4434] CLUSTER-1   lrmd:   notice: 
operation_finished:  VMA_VM1_migrate_to_0:1797698 [ 
2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug: 
create_operation_update:  do_update_resource: Updating resouce 
VMA_VM1 after complete migrate_to op (interval=0)
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:   notice: 
process_lrm_event:LRM operation VMA_VM1_migrate_to_0 (call=5697, 
rc=0, cib-update=89, confirmed=true) ok
Dec 04 15:18:27 [3818929] CLUSTER-1   crmd:debug: 
update_history_cache: Updating history for 'VMA_VM1' with 
migrate_to op
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug: cancel_op:
Operation VMA_VM1:5564 already cancelled
Dec 04 15:18:31 [3818929] CLUSTER-1   crmd:debug: 
do_lrm_rsc_op:Performing 
key=225:200:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_stop_0
VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 DEBUG: 
Virtual domain VM1 is not running:  failed to get domain 'vm1' domain 
not found: no domain with matching name 'vm1'
VirtualDomain(VMA_VM1)[1798719]:2015/12/04_15:18:31 INFO: Domain 
VM1 already stopped.
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:debug: 
operation_finished:  VMA_VM1_stop_0:1798719 - exited with rc=0
Dec 04 15:18:31 [4434] CLUSTER-1   lrmd:   notice: 
operation_finished:  VMA_VM1_stop_0:1798719 [ 2015/12/04_15:18:31 
INFO: Domain VM1 already stopped. ]
Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:debug: 
create_operation_update:  do_update_resource: Updating resouce 
VMA_VM1 after complete stop op (interval=0)
Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:   notice: 
process_lrm_event:LRM operation VMA_VM1_stop_0 (call=5701, rc=0, 
cib-update=90, confirmed=true) ok
Dec 04 15:18:32 [3818929] CLUSTER-1   crmd:debug: 
update_history_cache: Updating history for 'VMA_VM1' with stop op
Dec 04 15:20:58 [3818929] CLUSTER-1   crmd:debug: 
create_operation_update:  build_active_RAs: Updating