On Wed, Aug 12, 2020 at 2:25 AM <tho...@hoberg.net> wrote:

> While trying to diagnose an issue with a set of VMs that get stopped for
> I/O problems at startup, I try to deal with the fact that their boot disks
> cause this issue, no matter where I connect them. They might have been the
> first disks I ever tried to sparsify and I was afraid that might have
> messed them up. The images are for a nested oVirt deployment and they
> worked just fine, before I shut down those VMs...
>
> So I first tried to hook them as secondary disks to another VM to have a
> look, but that just caused the other VM to stop at boot.
>
> Also tried downloading, exporting, and plain copying the disks to no
> avail, OVA exports on the entire VM fail again (fix is in!).
>
> So to make sure copying disks between volumes *generally* works, I tried
> copying a disk from a working (but stopped) VM from 'vmstore' to 'data' on
> my 3nHCI farm, but that failed, too!
>
> Plenty of space all around, but all disks are using thin/sparse/VDO on SSD
> underneath.
>
> Before I open a bug, I'd like to have some feedback if this is a standard
> QA test, this is happening to you etc.
>
> Still on oVirt 4.3.11 with pack_ova.py patched to wait for the udev
> settle,
>
> This is from the engine.log on the hosted-engine:
>
> 2020-08-12 00:04:15,870+02 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (EE-ManagedThreadFactory-engineScheduled-Thread-67) [] EVENT_ID:
> VDS_BROKER_COMMAND_FAILURE(10,802), VDSM gem2 command
> HSMGetAllTasksStatusesVDS failed: low level Image copy failed: ("Command
> ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f',
> 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
> '-O', 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 131072: Transport endpoint is not connected\\nqemu-img: error while
> reading sector 135168: Transport endpoint is not connected\\nqemu-img:
> error while reading sector 139264: Transport endpoint is not
> connected\\nqemu-img: error while reading sector 143360: Transport endpoint
> is not connected\\nqemu-img: error while reading sector 147456: Transport
> endpoint is not connected\\nqemu-img: error while reading sector 151552:
> Transport endpoint is not connected\\n')",)
>
> and this is from the vdsm.log on the gem2 node:
> Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T',
> 'none', '-f', 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
> '-O', 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 131072: Transport endpoint is not connected\nqemu-img: error while
> reading sector 135168: Transport endpoint is not connected\nqemu-img: error
> while reading sector 139264: Transport endpoint is not connected\nqemu-img:
> error while reading sector 143360: Transport endpoint is not
> connected\nqemu-img: error while reading sector 147456: Transport endpoint
> is not connected\nqemu-img: error while reading sector 151552: Transport
> endpoint is not connected\n')
> 2020-08-12 00:03:15,428+0200 ERROR (tasks/7) [storage.Image] Unexpected
> error (image:849)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 837,
> in copyCollapsed
>     raise se.CopyImageError(str(e))
> CopyImageError: low level Image copy failed: ("Command
> ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f',
> 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
> '-O', 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 131072: Transport endpoint is not connected\\nqemu-img: error while
> reading sector 135168: Transport endpoint is not connected\\nqemu-img:
> error while reading sector 139264: Transport endpoint is not
> connected\\nqemu-img: error while reading sector 143360: Transport endpoint
> is not connected\\nqemu-img: error while reading sector 147456: Transport
> endpoint is not connected\\nqemu-img: error while reading sector 151552:
> Transport endpoint is not connected\\n')",)
>
Please file a gluster bug for this. You should be able to reproduce by
running qemu-img manually:

    qemu-img convert -p -t none -T none -f raw -O raw \
        /rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14 \
        /rhev/data-center/mnt/glusterSD/192.168.0.91:_data/test.raw

> 2020-08-12 00:03:15,429+0200 ERROR (tasks/7) [storage.TaskManager.Task]
> (Task='6399d533-e96a-412d-b0c3-0548e24d658d') Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
> in _run
>     return fn(*args, **kargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336,
> in run
>     return self.cmd(*self.argslist, **self.argsdict)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line
> 79, in wrapper
>     return method(self, *args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1633,
> in copyImage
>     postZero, force, discard)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 837,
> in copyCollapsed
>     raise se.CopyImageError(str(e))
> CopyImageError: low level Image copy failed: ("Command
> ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f',
> 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
> '-O', 'raw',
> u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 131072: Transport endpoint is not connected\\nqemu-img: error while
> reading sector 135168: Transport endpoint is not connected\\nqemu-img:
> error while reading sector 139264: Transport endpoint is not
> connected\\nqemu-img: error while reading sector 143360: Transport endpoint
> is not connected\\nqemu-img: error while reading sector 147456: Transport
> endpoint is not connected\\nqemu-img: error while reading sector 151552:
> Transport endpoint is not connected\\n')",)
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/PRK4LTN3VTOQTBXOHS5R5IOXSIPYR64I/
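
When filing the bug, it may help to state where in the image the reads fail. Here is a minimal Python sketch that pulls the sector numbers out of the qemu-img errors quoted above and converts them to byte offsets. It assumes qemu-img reports positions in 512-byte sectors. SECTOR_SIZE, READ_ERROR and failed_reads are names made up for this illustration and are not part of vdsm or qemu.

```python
import re

# Assumption: qemu-img reports positions in 512-byte sectors.
SECTOR_SIZE = 512

# Matches lines such as:
#   qemu-img: error while reading sector 131072: Transport endpoint is not connected
READ_ERROR = re.compile(r"error while reading sector (\d+): (.+)")


def failed_reads(stderr_text):
    """Return a list of (sector, byte_offset, message), one per read error."""
    results = []
    for line in stderr_text.splitlines():
        match = READ_ERROR.search(line)
        if match:
            sector = int(match.group(1))
            results.append((sector, sector * SECTOR_SIZE, match.group(2).strip()))
    return results


# Sector numbers copied from the vdsm.log excerpt in this thread.
SECTORS = [131072, 135168, 139264, 143360, 147456, 151552]
sample = "".join(
    f"qemu-img: error while reading sector {s}: Transport endpoint is not connected\n"
    for s in SECTORS
)

for sector, offset, message in failed_reads(sample):
    print(f"sector {sector}: byte offset {offset} ({offset // 2**20} MiB): {message}")
```

Under that assumption, the first failed read is 64 MiB into the image and the following ones are 2 MiB apart. The same function can be fed the stderr of a manual qemu-img run.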