Re: [ClusterLabs] Early VM resource migration
Hi Ken,

I've tried with and without colocation. The rule was:

    colocation bla2 inf: VM_VM1 AA_Filesystem_CDrive1

In both cases VM_VM1 tries to live-migrate back to the node coming out of standby while the cloned AA_Filesystem_CDrive1 isn't up on it yet. Same result with pacemaker 1.1.14-rc2.

Regards,

On 16.12.2015 11:08:35 Ken Gaillot wrote:
> On 12/16/2015 10:30 AM, Klechomir wrote:
> > On 16.12.2015 17:52, Ken Gaillot wrote:
> >> On 12/16/2015 02:09 AM, Klechomir wrote:
> >>> Hi list,
> >>> I have a cluster with VM resources on a cloned active-active storage.
> >>>
> >>> VirtualDomain resource migrates properly during failover (node standby),
> >>> but tries to migrate back too early, during failback, ignoring the
> >>> "order" constraint telling it to start when the cloned storage is up.
> >>> This causes an unnecessary VM restart.
> >>>
> >>> Is there any way to make it wait until its storage resource is up?
> >>
> >> Hi Klecho,
> >>
> >> If you have an order constraint, the cluster will not try to start the
> >> VM until the storage resource agent returns success for its start. If
> >> the storage isn't fully up at that point, then the agent is faulty, and
> >> should be modified to wait until the storage is truly available before
> >> returning success.
> >>
> >> If you post all your constraints, I can look for anything that might
> >> affect the behavior.
> >
> > Thanks for the reply, Ken
> >
> > Seems to me that the constraints for cloned resources act a bit
> > differently.
> >
> > Here is my config:
> >
> > primitive p_AA_Filesystem_CDrive1 ocf:heartbeat:Filesystem \
> >   params device="/dev/CSD_CDrive1/AA_CDrive1" \
> >   directory="/volumes/AA_CDrive1" fstype="ocfs2" options="rw,noatime"
> > primitive VM_VM1 ocf:heartbeat:VirtualDomain \
> >   params config="/volumes/AA_CDrive1/VM_VM1/VM1.xml" \
> >   hypervisor="qemu:///system" migration_transport="tcp" \
> >   meta allow-migrate="true" target-role="Started"
> > clone AA_Filesystem_CDrive1 p_AA_Filesystem_CDrive1 \
> >   meta interleave="true" resource-stickiness="0" target-role="Started"
> > order VM_VM1_after_AA_Filesystem_CDrive1 inf: AA_Filesystem_CDrive1 VM_VM1
> >
> > Every time a node comes back from standby, the VM tries to live-migrate
> > to it long before the filesystem is up.
>
> In most cases (including this one), when you have an order constraint,
> you also need a colocation constraint.
>
> colocation = two resources must run on the same node
>
> order = one resource must be started/stopped/whatever before another
>
> Or you could use a group, which is essentially a shortcut for specifying
> colocation and order constraints for a sequence of resources.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
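Ken's advice above (pair the order constraint with a colocation constraint), applied to the resource names in this thread, would look roughly like the following in crm shell syntax. This is a sketch, not a tested configuration; the constraint names are illustrative.

```
# Keep the VM only on nodes where an instance of the clone is active:
colocation VM_VM1_with_AA_Filesystem_CDrive1 inf: VM_VM1 AA_Filesystem_CDrive1
# And start/stop it in the right order relative to the clone:
order VM_VM1_after_AA_Filesystem_CDrive1 inf: AA_Filesystem_CDrive1 VM_VM1
```

With both constraints in place, the scheduler should treat a returning node as a valid target for the VM only once the local clone instance has started.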
Re: [ClusterLabs] Antw: Re: Early VM resource migration
Hi Ulrich,

This is only the part of the config which concerns the problem. Even with dummy resources the behaviour is identical, so I don't think that a dlm/clvmd resource config will help solve the problem.

Regards,
Klecho

On 17.12.2015 08:19:43 Ulrich Windl wrote:
> >>> Klechomir wrote on 16.12.2015 at 17:30 in message
> <5671918e.40...@gmail.com>:
> > [...]
>
> Hi!
>
> To me your config looks rather incomplete: What about DLM, O2CB, cLVM, etc.?
[ClusterLabs] Antw: Re: Antw: Re: Early VM resource migration
>>> Klechomir wrote on 17.12.2015 at 14:16 in message <2102747.TPh6pTdk8c@bobo>:
> Hi Ulrich,
> This is only a part of the config, which concerns the problem.
> Even with dummy resources, the behaviour will be identical, so I don't think
> that a dlm/clvmd resource config will help solve the problem.

You could send logs with the actual startup sequence then.

> Regards,
> Klecho
>
> [...]
Re: [ClusterLabs] Antw: Re: Antw: Re: Early VM resource migration
Here is what pacemaker says right after node1 comes back from standby:

Dec 16 16:11:41 [4512] CLUSTER-2 pengine: debug: native_assign_node: All nodes for resource VM_VM1 are unavailable, unclean or shutting down (CLUSTER-1: 1, -100)
Dec 16 16:11:41 [4512] CLUSTER-2 pengine: debug: native_assign_node: Could not allocate a node for VM_VM1
Dec 16 16:11:41 [4512] CLUSTER-2 pengine: debug: native_assign_node: Processing VM_VM1_monitor_1
Dec 16 16:11:41 [4512] CLUSTER-2 pengine: info: native_color: Resource VM_VM1 cannot run anywhere

VM_VM1 gets stopped immediately as soon as node1 reappears, and stays down until its "order/colocation AA resource" comes up on node1. The curious part is that in the opposite case (node2 coming back from standby) the failback is OK.

Regards,

On 17.12.2015 14:51:21 Ulrich Windl wrote:
> >>> Klechomir wrote on 17.12.2015 at 14:16 in message
> <2102747.TPh6pTdk8c@bobo>:
> > Hi Ulrich,
> > This is only a part of the config, which concerns the problem.
> > Even with dummy resources, the behaviour will be identical, so I don't
> > think that a dlm/clvmd resource config will help solve the problem.
>
> You could send logs with the actual startup sequence then.
>
> [...]
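The "cannot run anywhere" decision in the log excerpt above can be replayed offline against the saved scheduler input, which shows the placement scores that led to it. A sketch, assuming the default pengine directory and an illustrative file number (take the real one from the pengine log lines for that transition):

```
# Replay a saved scheduler input and show per-node placement scores:
crm_simulate --show-scores --xml-file /var/lib/pacemaker/pengine/pe-input-100.bz2
```

The score output should reveal which constraint (or stickiness value) is driving VM_VM1 off the surviving node the moment node1 reappears.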
[ClusterLabs] successful ipmi stonith still times out
I have a customer (running SLE 11 SP4 HAE) who is seeing the following stonith behavior running the ipmi stonith plugin.

Dec 15 14:21:43 test4 pengine[24002]: warning: pe_fence_node: Node test3 will be fenced because termination was requested
Dec 15 14:21:43 test4 pengine[24002]: warning: determine_online_status: Node test3 is unclean
Dec 15 14:21:43 test4 pengine[24002]: warning: stage6: Scheduling Node test3 for STONITH

... it issues the reset and it is noted ...
Dec 15 14:21:45 test4 external/ipmi(STONITH-test3)[177184]: [177197]: debug: ipmitool output: Chassis Power Control: Reset
Dec 15 14:21:46 test4 stonith-ng[23999]: notice: log_operation: Operation 'reboot' [177179] (call 2 from crmd.24003) for host 'test3' with device 'STONITH-test3' returned: 0 (OK)

... test3 does go down ...
Dec 15 14:22:21 test4 kernel: [90153.906461] Cell 2 (test3) left the membership

... but the stonith operation times out (it said OK earlier) ...
Dec 15 14:22:56 test4 stonith-ng[23999]: notice: remote_op_timeout: Action reboot (a399a8cb-541a-455e-8d7c-9072d48667d1) for test3 (crmd.24003) timed out
Dec 15 14:23:05 test4 external/ipmi(STONITH-test3)[177667]: [177678]: debug: ipmitool output: Chassis Power is on

Dec 15 14:23:56 test4 crmd[24003]: error: stonith_async_timeout_handler: Async call 2 timed out after 132000ms
Dec 15 14:23:56 test4 crmd[24003]: notice: tengine_stonith_callback: Stonith operation 2/51:100:0:f43dc87c-faf0-4034-8b51-be0c13c95656: Timer expired (-62)
Dec 15 14:23:56 test4 crmd[24003]: notice: tengine_stonith_callback: Stonith operation 2 for test3 failed (Timer expired): aborting transition.
Dec 15 14:23:56 test4 crmd[24003]: notice: abort_transition_graph: Transition aborted: Stonith failed (source=tengine_stonith_callback:697, 0)

This looks like a bug, but a quick search did not turn up anything. Does anyone recognize this problem?

--
Ron Kerry
Re: [ClusterLabs] successful ipmi stonith still times out
On 12/17/2015 10:32 AM, Ron Kerry wrote:
> I have a customer (running SLE 11 SP4 HAE) who is seeing the following
> stonith behavior running the ipmi stonith plugin.
>
> [log excerpt trimmed]
>
> This looks like a bug but a quick search did not turn up anything. Does
> anyone recognize this problem?

Fence timeouts can be tricky to troubleshoot because there are multiple timeouts involved. The process goes like this:

1. crmd asks the local stonithd to do the fence.
2. The local stonithd queries all stonithd's to ensure it has the latest status of all fence devices.
3. The local stonithd chooses a fence device (or possibly devices, if topology is involved) and picks the best stonithd (or stonithd's) to actually execute the fencing.
4. The chosen stonithd (or stonithd's) runs the fence agent to do the actual fencing, then replies to the original stonithd, which replies to the original requester.

So the crmd can time out waiting for a reply from stonithd, the local stonithd can time out waiting for query replies from all stonithd's, the local stonithd can time out waiting for a reply from one or more executing stonithd's, or an executing stonithd can time out waiting for a reply from the fence device.

Another factor is that some reboots can be remapped to off-then-on. This will happen, for example, if the fence device doesn't have a reboot command, or if it's in a fence topology level with other devices. So in that case, there's the possibility of a timeout for the off command and for the on command.

In this case, one thing that's odd is that the "Async call 2 timed out" message is the timeout for the crmd waiting for a reply from stonithd. The crmd timeout is always a minute longer than stonithd's timeout, which should be more than enough time for stonithd to reply. I'm not sure what's going on there.

I'd look closely at the entire fence configuration (is topology involved? what are the configured timeouts? are the configuration options correct?), and trace through the logs to see which step or steps are actually timing out. I do see here that the reboot times out before the "Chassis Power is on" message, so it's possible the reboot timeout is too short to account for a full power cycle. But I'm not sure why it would report OK before that, unless maybe that was for one step of the larger process.
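If the reboot timeout is indeed too short for a full power cycle, lengthening the per-device and cluster-wide fencing timeouts is one thing to try. A sketch in crm shell syntax — the device parameters, addresses, and values below are assumptions for illustration, not taken from the customer's configuration:

```
# Per-device: allow the external/ipmi agent more time for a full reboot cycle
primitive STONITH-test3 stonith:external/ipmi \
    params hostname="test3" ipaddr="192.168.1.13" userid="admin" passwd="secret" \
           pcmk_reboot_timeout="120s"
# Cluster-wide ceiling for a complete fencing operation:
property stonith-timeout="180s"
```

The cluster-wide stonith-timeout should stay comfortably above the per-device reboot timeout, since (as described above) the overall operation wraps the query, execution, and reply steps.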