Hi!

I just had an "interesting error" with VirtualDomain on SLES15 SP2:

On node h18 the configuration file for the VM was missing, yet the VM could
still be live-migrated to h18 a few days ago.
When the VM was to be restarted today, the shutdown worked correctly, but the
start attempt failed with this interesting error message:

Mar 10 10:49:26 h18 pacemaker-execd[1741]:  notice: executing - rsc:prm_xen_v07 action:start call_id:240
Mar 10 10:49:27 h18 VirtualDomain(prm_xen_rksapv07)[18248]: ERROR: Failed to start virtual domain v07.
Mar 10 10:49:27 h18 pacemaker-execd[1741]:  notice: prm_xen_v07_start_0[18015] error output [ error: Failed to create domain from /etc/xen/vm/v07.xml ]
Mar 10 10:49:27 h18 pacemaker-execd[1741]:  notice: prm_xen_v07_start_0[18015] error output [ error: Requested operation is not valid: domain 'v07' is already active ]
Mar 10 10:49:27 h18 pacemaker-execd[1741]:  notice: prm_xen_v07_start_0[18015] error output [ ocf-exit-reason:Failed to start virtual domain v07. ]
Mar 10 10:49:27 h18 pacemaker-execd[1741]:  notice: prm_xen_v07 start (call 240, PID 18015) exited with status 1 (execution time 1846ms, queue time 0ms)
Mar 10 10:49:27 h18 pacemaker-controld[1744]:  notice: Result of start operation for prm_xen_v07 on h18: error
Mar 10 10:49:27 h18 pacemaker-controld[1744]:  notice: h18-prm_xen_v07_start_0:240 [ error: Failed to create domain from /etc/xen/vm/v07.xml\nerror: Requested operation is not valid: domain 'v07' is already active\nocf-exit-reason:Failed to start virtual domain v07.\n ]

Of course v07 was not active on h18 (as per "virsh list").

Maybe this error was triggered by the fact that the restart was caused by a
change in the location of the configuration file. Is the RA or the CRM using
the wrong (old instead of new) parameter for the start?

The "safety-stop" operation of the cluster logged:
Mar 10 10:49:28 h18 VirtualDomain(prm_xen_v07)[18284]: INFO: Configuration file /etc/libvirt/libxl/v07.xml not readable, resource considered stopped.
Mar 10 10:49:28 h18 VirtualDomain(prm_xen_v07)[18290]: INFO: environment is invalid, resource considered stopped
Mar 10 10:49:28 h18 pacemaker-execd[1741]:  notice: prm_xen_v07 stop (call 241, PID 18254) exited with status 0 (execution time 286ms, queue time 0ms)
Mar 10 10:49:28 h18 pacemaker-controld[1744]:  notice: Result of stop operation for prm_xen_v07 on h18: ok
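
To make the mismatch easier to see: the start log refers to
/etc/xen/vm/v07.xml, while the stop log refers to /etc/libvirt/libxl/v07.xml.
Below is a minimal Python sketch that pulls the .xml paths out of one line from
each excerpt and compares them. The helper name config_paths and the regular
expression are my own assumptions for illustration; nothing here is part of
pacemaker or the VirtualDomain RA.

```python
import re

# Hypothetical helper (not part of pacemaker or the VirtualDomain RA):
# match any absolute path ending in ".xml" inside a piece of log text.
XML_PATH = re.compile(r"/[\w./-]+\.xml")

def config_paths(log_text):
    """Return the set of .xml paths mentioned in the given log text."""
    return set(XML_PATH.findall(log_text))

# One message from the start attempt and one from the stop, copied from the logs.
start_log = "error output [ error: Failed to create domain from /etc/xen/vm/v07.xml ]"
stop_log = ("INFO: Configuration file /etc/libvirt/libxl/v07.xml not readable, "
            "resource considered stopped.")

start_paths = config_paths(start_log)
stop_paths = config_paths(stop_log)

# Report the paths only when start and stop disagree about the config file.
if start_paths != stop_paths:
    print("start used:", sorted(start_paths))
    print("stop used: ", sorted(stop_paths))
```

This only shows that start and stop ran with different configuration file
paths; it does not tell which component handed over the stale value.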

Regards,
Ulrich


