On Mon, Aug 9, 2021 at 2:42 PM Yedidyah Bar David <d...@redhat.com> wrote:
>
> On Mon, Aug 9, 2021 at 1:43 PM Nir Soffer <nsof...@redhat.com> wrote:
> >
> > On Mon, Aug 9, 2021 at 10:35 AM Yedidyah Bar David <d...@redhat.com> wrote:
> > >
> > > On Sun, Aug 8, 2021 at 5:42 PM Code Review <ger...@ovirt.org> wrote:
> > > >
> > > > From Jenkins CI <jenk...@ovirt.org>:
> > > >
> > > > Jenkins CI has posted comments on this change.
> > > > ( https://gerrit.ovirt.org/c/ovirt-system-tests/+/115392 )
> > > >
> > > > Change subject: HE: Use node image
> > > > ......................................................................
> > > >
> > > > Patch Set 13: Continuous-Integration-1
> > > >
> > > > Build Failed
> > >
> > > While trying to deactivate a host, the engine wanted to migrate a VM
> > > (vm0) from host-0 to host-1. The vdsm log of host-0 says:
> > >
> > > 2021-08-08 14:31:10,076+0000 ERROR (migsrc/cde311f9) [virt.vm]
> > > (vmId='cde311f9-9a33-4eb9-8338-fa22ff49edc2') Failed to migrate (migration:503)
> > > Traceback (most recent call last):
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 477, in _regular_run
> > >     time.time(), machineParams
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 578, in _startUnderlyingMigration
> > >     self._perform_with_conv_schedule(duri, muri)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 667, in _perform_with_conv_schedule
> > >     self._perform_migration(duri, muri)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 596, in _perform_migration
> > >     self._migration_flags)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 159, in call
> > >     return getattr(self._vm._dom, name)(*a, **kw)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
> > >     ret = attr(*args, **kwargs)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
> > >     ret = f(*args, **kwargs)
> > >   File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
> > >     return func(inst, *args, **kwargs)
> > >   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2126, in migrateToURI3
> > >     raise libvirtError('virDomainMigrateToURI3() failed')
> > > libvirt.libvirtError: Unsafe migration: Migration without shared storage is unsafe
> >
> > Please share the vm xml:
> >
> >     sudo virsh -r dumpxml vm-name
>
> I think you should be able to find a dump of it in vdsm.log:
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/18650/artifact/check-patch.he-basic_suite_master.el8.x86_64/test_logs/ost-he-basic-suite-master-host-0/var/log/vdsm/vdsm.log
>
> I think the first line of starting a migration is:
>
> 2021-08-08 14:31:08,350+0000 DEBUG (jsonrpc/4) [jsonrpc.JsonRpcServer] Calling 'VM.migrate' in bridge with {'vmID': 'cde311f9-9a33-4eb9-8338-fa22ff49edc2', 'params':
>
> A few lines later:
>
> 2021-08-08 14:31:08,387+0000 DEBUG (migsrc/cde311f9) [virt.metadata.Descriptor] dumped metadata for cde311f9-9a33-4eb9-8338-fa22ff49edc2: <?xml version='1.0' encoding='utf-8'?>
> <vm>
>     <balloonTarget type="int">98304</balloonTarget>
This is not the vm xml but the metadata xml.

Looking at the logs on both hosts:

[nsoffer@sparse ost]$ head -1 *vdsm.log
==> host0-vdsm.log <==
2021-08-08 13:16:04,676+0000 INFO (MainThread) [vds] (PID: 65169) I am the actual vdsm 4.40.80.3.12.git6d67b935b ost-he-basic-suite-master-host-0 (4.18.0-326.el8.x86_64) (vdsmd:162)

==> host1-vdsm.log <==
2021-08-08 15:40:54,367+0200 INFO (MainThread) [vds] (PID: 23005) I am the actual vdsm 4.40.80.4.5.git4309a3949 ost-he-basic-suite-master-host-1 (4.18.0-326.el8.x86_64) (vdsmd:162)

- The hosts' clocks are using different time zones (+0000 vs +0200) - is this intended?
- You are testing different versions of vdsm - is this intended?

We have about 60 errors:

$ grep 'Migration without shared storage is unsafe' host0-vdsm.log | wc -l
60

Looking at the first migration that failed, the vm xml is here:

2021-08-08 14:20:34,127+0000 INFO (vm/cde311f9) [virt.vm] (vmId='cde311f9-9a33-4eb9-8338-fa22ff49edc2') <?xml version='1.0' encoding='utf-8'?>
<domain xmlns:ns0="http://libvirt.org/schemas/domain/qemu/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" type="kvm">
    <name>vm0</name>
    <uuid>cde311f9-9a33-4eb9-8338-fa22ff49edc2</uuid>
...

The relevant parts for storage are:

    <disk device="disk" snapshot="no" type="file">
        <driver cache="none" error_policy="stop" io="threads" iothread="1" name="qemu" type="qcow2" />
        <source file="/rhev/data-center/7d97ea80-f849-11eb-ac79-5452d501341a/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87">
            <seclabel model="dac" relabel="no" />
        </source>
        <target bus="virtio" dev="vda" />
        <serial>20002ad2-4a97-4d2f-b3fc-c103477b5b91</serial>
        <boot order="1" />
        <alias name="ua-20002ad2-4a97-4d2f-b3fc-c103477b5b91" />
        <address bus="0x05" domain="0x0000" function="0x0" slot="0x00" type="pci" />
    </disk>

    <disk device="disk" snapshot="no" type="block">
        <driver cache="none" error_policy="stop" io="native" name="qemu" type="raw" />
        <source dev="/dev/mapper/36001405bc9d94e4419b4b80a2f702e2f">
            <seclabel model="dac" relabel="no" />
        </source>
        <target bus="scsi" dev="sda" />
        <serial>738c8486-8929-44ec-9083-69327bde9c65</serial>
        <alias name="ua-738c8486-8929-44ec-9083-69327bde9c65" />
        <address bus="0" controller="0" target="0" type="drive" unit="0" />
    </disk>

So we have one qcow2 disk on file storage, and one direct LUN.

On the destination, the first migration attempt is here:

2021-08-08 16:31:08,437+0200 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'VM.migrationCreate' in bridge with {'vmID': 'cde311f9-9a33-4eb9-8338-fa22ff49edc2', 'params': {'_srcDomXML': '<domain type=\'kvm\' id=\'6\' xmlns:qemu=\'http://libvirt.org/
...
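(Side note: to pull the disk source paths out of these domain XML dumps for a quick source/destination comparison, a minimal sketch like the following can be used - assuming the XML from the log is saved locally as domain.xml; this is not vdsm code:)

    import xml.etree.ElementTree as ET

    # List each disk's target, type and source path from a saved libvirt
    # domain XML dump (file disks carry source/@file, block disks source/@dev).
    tree = ET.parse("domain.xml")
    for disk in tree.findall("./devices/disk"):
        source = disk.find("source")
        if source is None:
            continue
        path = source.get("file") or source.get("dev")
        print(disk.find("target").get("dev"), disk.get("type"), path)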
We prepare the qcow2 disk:

2021-08-08 16:31:09,313+0200 INFO (vm/cde311f9) [vdsm.api] FINISH prepareImage return={'path': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87', 'info': {'type': 'file', 'path': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87'}, 'imgVolumesInfo': [{'domainID': '46fa5761-bb9e-46be-8f1c-35f4b03d0203', 'imageID': '20002ad2-4a97-4d2f-b3fc-c103477b5b91', 'volumeID': '1d3f07dc-b481-492f-a2a6-7c46689d82ba', 'path': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/1d3f07dc-b481-492f-a2a6-7c46689d82ba', 'leasePath': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/1d3f07dc-b481-492f-a2a6-7c46689d82ba.lease', 'leaseOffset': 0}, {'domainID': '46fa5761-bb9e-46be-8f1c-35f4b03d0203', 'imageID': '20002ad2-4a97-4d2f-b3fc-c103477b5b91', 'volumeID': '614abd56-4d4f-4412-aa2a-3f7bad2f3a87', 'path': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87', 'leasePath': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87.lease', 'leaseOffset': 0}, {'domainID': '46fa5761-bb9e-46be-8f1c-35f4b03d0203', 'imageID': '20002ad2-4a97-4d2f-b3fc-c103477b5b91', 'volumeID': 'a4309ef3-01bb-45db-8bf7-0f9498a7feeb', 'path': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/a4309ef3-01bb-45db-8bf7-0f9498a7feeb', 'leasePath': '/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/a4309ef3-01bb-45db-8bf7-0f9498a7feeb.lease', 'leaseOffset': 0}]} from=internal, task_id=f2dfbacb-154c-4f5b-b57d-affcbf419691 (api:54)
2021-08-08 16:31:09,314+0200 INFO (vm/cde311f9) [vds] prepared volume path: /rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87 (clientIF:518)

And the direct LUN:

2021-08-08 16:31:09,315+0200 INFO (vm/cde311f9) [vdsm.api] START appropriateDevice(guid='36001405bc9d94e4419b4b80a2f702e2f', thiefId='cde311f9-9a33-4eb9-8338-fa22ff49edc2', deviceType='mpath') from=internal, task_id=220f1c1f-caec-4327-a157-7a4fab3b54a5 (api:48)
2021-08-08 16:31:09,550+0200 INFO (vm/cde311f9) [vdsm.api] FINISH appropriateDevice return={'truesize': '21474836480', 'apparentsize': '21474836480', 'path': '/dev/mapper/36001405bc9d94e4419b4b80a2f702e2f'} from=internal, task_id=220f1c1f-caec-4327-a157-7a4fab3b54a5 (api:54)
2021-08-08 16:31:09,550+0200 INFO (vm/cde311f9) [vds] prepared volume path: /dev/mapper/36001405bc9d94e4419b4b80a2f702e2f (clientIF:518)

The interesting thing is that the qcow2 disk uses a different path on the source and destination VMs:

source:
/rhev/data-center/7d97ea80-f849-11eb-ac79-5452d501341a/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87

destination:
/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87

On the source we have:

    /rhev/data-center/pool-id/domain-id/images/image-id/volume-id

On the destination we have:

    /rhev/data-center/mnt/mountpoint/domain-id/images/image-id/volume-id

Both lead to the same disk, but libvirt probably compared the strings and decided that we don't have shared storage.

It may be a new validation in libvirt, or maybe the engine recently changed the way the disk path is added to the xml?

In vdsm we use os.path.realpath() when we compare disk paths, since we know that we can have different paths to the same volume.

This kind of issue is likely to be reproducible without OST.

Nir
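(For illustration only - a minimal sketch of the path comparison described above, not actual vdsm or libvirt code. The two paths are copied from the logs; on a host, /rhev/data-center/<pool-id>/<domain-id> is expected to be a symlink into the mount path, so realpath() should collapse both to the same file - that symlink layout is an assumption here.)

    import os.path

    # Paths copied from the source and destination logs for the qcow2 disk.
    src = "/rhev/data-center/7d97ea80-f849-11eb-ac79-5452d501341a/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87"
    dst = "/rhev/data-center/mnt/192.168.200.2:_exports_nfs_share1/46fa5761-bb9e-46be-8f1c-35f4b03d0203/images/20002ad2-4a97-4d2f-b3fc-c103477b5b91/614abd56-4d4f-4412-aa2a-3f7bad2f3a87"

    # Plain string comparison (apparently what libvirt does here) says "different".
    print(src == dst)  # False
    # Resolving symlinks first, as vdsm does, should identify the same volume
    # on a host where the pool-id path links into the mount path.
    print(os.path.realpath(src) == os.path.realpath(dst))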