Hi, We had a failure for test 006_migrations.migrate_vm on master.
There was a libvirt disruption in the migration src and after that vdsm reported the migration as failed because the vm already exists which makes me suspect a split bran case. The patch reported has nothing to do with this issue and I think we simply stumbled on a race condition which can cause a split brain. Please note that there are several metrics related issues reported in vdsm logs as well. ** *Link and headline of suspected patches: * * *Not related * Link to Job: * **http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/** * * *Link to all logs:* * * **http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/** * * *(Relevant) error snippet from the log: * * *<error>* ** **Engine: * * *2017-12-05 06:26:48,546-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-385) [] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: vm0, Source: lago- basic-suite-master-host-1, Destination: lago-basic-suite-master-host-0). * *dst:** ** *2017-12-05 06:26:46,615-0500 WARN (jsonrpc/6) [vds] vm b3962e5c-08e3-444e-910e-ea675fa1a5c7 already exists (API:179) 2017-12-05 06:26:46,615-0500 ERROR (jsonrpc/6) [api] FINISH create error=Virtual machine already exists (api:124) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 180, in create raise exception.VMExists() VMExists: Virtual machine already exists 2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt] FINISH create return={'status': {'message': 'Virtual machine already exists', 'code': 4}} from=::ffff:192.168.201.3,50394 (api:52) 2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt] FINISH migrationCreate return={'status': {'message': 'Virtual machine already exists', 'code': 4}} from=::ffff:192.168.201.3,50394 (api:52) 2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call VM.migrationCreate failed (error 4) in 0.03 seconds (__init__:573) 2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:192.168.201.3,50394 (api:46) 2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967) 2017-12-05 06:26:46,625-0500 WARN (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') trying to set state to Powering down when already Down (vm:575) 2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection (guestagent:435) 2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') _destroyVmGraceful attempt #0 (vm:5004) 2017-12-05 06:26:46,626-0500 WARN (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') VM 'b3962e5c-08e3-444e-910e-ea675fa1a5c7' couldn't be destroyed in libvirt: Requested operation is not valid: domain is not running (vm :5025) 2017-12-05 06:26:46,627-0500 INFO (jsonrpc/2) [vdsm.api] START teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70', spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261', imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None) from =::ffff:192.168.201.3,50394, task_id=2da93725-5533-4354-a369-751eb44f9ef2 (api:46) * **scr * * *2017-12-05 06:26:46,623-0500 ERROR (migsrc/b3962e5c) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') migration destination error: Virtual machine already exists (migration:290) * **disruption on **dst**: * * **2017-12-05 06:20:04,662-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.shutdown succeeded in 0.00 seconds (__init__:573) 2017-12-05 06:20:04,676-0500 ERROR (Thread-1) [root] Shutdown by QEMU Guest Agent failed (vm:5097) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5088, in qemuGuestAgentShutdown self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT) File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f ret = attr(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 512, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2403, in shutdownFlags if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed', dom=self) libvirtError: Guest agent is not responding: QEMU guest agent is not connected 2017-12-05 06:20:04,697-0500 INFO (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') block threshold 1 exceeded on 'vda' (/rhev/data-center/mnt/blockSD/952bb427-b88c-4fbe-99ef-49970d3aaf70/images/e6eadbae-ec7a-48f4- 8832-64a622a12bef/1f063c73-f5cd-44e8-a84f-7810857f82df) (drivemonitor:162) 2017-12-05 06:20:04,698-0500 INFO (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') drive 'vda' threshold exceeded (storage:872) 2017-12-05 06:20:05,889-0500 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::1,41118 (api:46) 2017-12-05 06:20:05,891-0500 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,41118 (api:52) 2017-12-05 06:20:05,892-0500 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:573) 2017-12-05 06:20:07,466-0500 INFO (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') underlying process disconnected (vm:1024) 2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967) 2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection (guestagent:435) 2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [vdsm.api] START teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70', spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261', imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None) from=internal, task_id=9efedf46-d3be-4e41-b7f9-a074ed6344f6 (api:46) * * *</error>* **
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel