[ClusterLabs] VirtualDomain restart caused fencing.

Matthew Schumacher Wed, 30 Jun 2021 08:41:04 -0700

Hello,

I'm not sure how to fix this, but calling 'crm resource restart vm-name' this 
morning caused an entire node to get fenced, kicking the stool out from under a 
number of VMs.


Looking at the VirtualDomain agent, it appears to default to a 90s timeout: 
if it can't gracefully shut down the VM with 'virsh shutdown' within 85s, it 
calls 'virsh destroy'.  For whatever reason, that's not what happened.
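
To make the behavior I expected concrete, here is a rough sketch of that 
graceful-then-forced stop logic as I understand it.  This is not the agent's 
actual code: the function name and the three command arguments are 
hypothetical stand-ins (for 'virsh shutdown', a domain state check, and 
'virsh destroy'), and the 5 second margin is only inferred from the 90s and 
85s figures above.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the VirtualDomain agent's real code.
# The three command arguments stand in for 'virsh shutdown',
# a "is the domain still running?" check, and 'virsh destroy'.

stop_with_fallback() {
    timeout_s=$1        # overall timeout in seconds
    graceful_cmd=$2     # asks the guest to shut down
    is_running_cmd=$3   # exits 0 while the guest is still running
    force_cmd=$4        # hard stop

    # Assumed margin: reserve 5 seconds for the forced stop (90s -> 85s).
    grace_s=$((timeout_s - 5))

    $graceful_cmd
    waited=0
    while [ "$waited" -lt "$grace_s" ]; do
        if ! $is_running_cmd; then
            echo "stopped gracefully after ${waited}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Grace period used up: fall back to the forced stop.
    echo "grace period of ${grace_s}s expired, forcing"
    $force_cmd
}
```

With a guest that ignores the shutdown request, the loop should run out its 
grace period and then fall through to the forced stop, which is the path I 
expected the restart to take.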

I created a mockup where I moved a test VM to its own node (in case it got 
fenced), then loaded something that would ignore ACPI shutdown, then called 
restart.  This time it worked.  The logs show:

Jun 30 15:32:11  VirtualDomain(vm-testvm)[13047]:    INFO: Issuing graceful shutdown request for domain testvm.
Jun 30 15:32:26  VirtualDomain(vm-testvm)[13047]:    INFO: Issuing forced shutdown (destroy) request for domain testvm.

I don't have the logs from the original failure because my node isn't 
persistent, but I wonder if anyone else has run into this.

Here is my resource configuration in case it reveals the issue:

crm configure primitive vm-testvm2 VirtualDomain \
    params config="/datastore/vm/testvm/testvm.xml" migration_transport=ssh \
    meta allow-migrate=true target-role=Started \
    op monitor timeout=30 interval=30
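
Note that the configuration above only defines a monitor operation, so the 
stop timeout is whatever default applies.  For comparison, a variant that 
states the stop timeout explicitly might look like the following.  This is an 
untested sketch: the 90s value is just the default I mentioned above, and I 
don't know whether setting it changes the behavior.

```shell
# Untested variant: same primitive, with an explicit stop operation.
# timeout=90 is an assumed value taken from the default discussed above.
crm configure primitive vm-testvm2 VirtualDomain \
    params config="/datastore/vm/testvm/testvm.xml" migration_transport=ssh \
    meta allow-migrate=true target-role=Started \
    op monitor timeout=30 interval=30 \
    op stop timeout=90 interval=0
```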

Oh, one last question:  Can I disable fencing for a specific resource for 
testing purposes?  I'd love to watch this break without fear of fencing.

Matt

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
