So, I did some more digging and I now know how to reproduce it.
I created a VM and added a disk on a local SSD using the scratchpad
hook, then formatted and mounted this scratch disk.
When I run heavy IO against this scratch disk on the local SSD, e.g.
dd if=/dev/zero of=/mnt/scratchdisk/test bs=1M count=10000, qemu
pauses the VM.
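
For reference, the full reproduction inside the guest looks roughly
like this (the device name and filesystem are assumptions; use
whatever the scratchpad hook actually attached, /dev/vdb and xfs
here):

  mkfs.xfs /dev/vdb                 # format the scratch disk
  mkdir -p /mnt/scratchdisk
  mount /dev/vdb /mnt/scratchdisk
  # ~10G of sequential writes to trigger the pause
  dd if=/dev/zero of=/mnt/scratchdisk/test bs=1M count=10000
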
The libvirt debug logs show:

2021-09-23 11:04:32.765+0000: 463319: debug : virThreadJobSet:94 :
Thread 463319 (rpc-worker) is now running job
remoteDispatchNodeGetFreePages
2021-09-23 11:04:32.765+0000: 463319: debug : virNodeGetFreePages:1614
: conn=0x7f8620018ba0, npages=3, pages=0x7f8670009960,
startCell=4294967295, cellCount=1, counts=0x7f8670007db0, flags=0x0
2021-09-23 11:04:32.765+0000: 463319: debug : virThreadJobClear:119 :
Thread 463319 (rpc-worker) finished job remoteDispatchNodeGetFreePages
with ret=0
2021-09-23 11:04:34.235+0000: 488774: debug :
qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds":
1632395074, "microseconds": 235454}, "event": "BLOCK_IO_ERROR",
"data": {"device": "", "nospace": false, "node-name":
"libvirt-3-format", "reason": "Input/output error", "operation":
"write", "action": "stop"}}]
2021-09-23 11:04:34.235+0000: 488774: info :
qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT:
mon=0x7f860c14b700 event={"timestamp": {"seconds": 1632395074,
"microseconds": 235454}, "event": "BLOCK_IO_ERROR", "data": {"device":
"", "nospace": false, "node-name": "libvirt-3-format", "reason":
"Input/output error", "operation": "write", "action": "stop"}}
2021-09-23 11:04:34.235+0000: 488774: debug :
qemuMonitorJSONIOProcessEvent:181 : mon=0x7f860c14b700
obj=0x7f860c0e7450
2021-09-23 11:04:34.235+0000: 488774: debug :
qemuMonitorEmitEvent:1166 : mon=0x7f860c14b700 event=BLOCK_IO_ERROR
2021-09-23 11:04:34.235+0000: 488774: debug :
qemuProcessHandleEvent:581 : vm=0x7f86201d6df0
2021-09-23 11:04:34.235+0000: 488774: debug : virObjectEventNew:624 :
obj=0x7f860c0d82f0
2021-09-23 11:04:34.235+0000: 488774: debug :
qemuMonitorJSONIOProcessEvent:206 : handle BLOCK_IO_ERROR
handler=0x7f8639c77a90 data=0x7f860c0661c0
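
When the VM is paused like this, the error is also visible from the
host via virsh (the domain name below is a placeholder):

  virsh domstate myvm --reason   # reports "paused (I/O error)"
  virsh domblkerror myvm         # shows which device hit the error
  virsh resume myvm              # resume once the error is cleared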

To confirm that the local SSD is fine and has enough space where the
scratch disk is located: I could run the same dd on the host without
any issues.

This happens on other storage types as well, so this looks like an
issue in qemu when heavy IO is happening on a disk.

On Thu, Sep 23, 2021 at 7:19 AM Tommy Sway <sz_cui...@163.com> wrote:
>
> Another option (still tech preview) is Managed Block Storage (Cinder-based
> storage).
>
> Is it still tech preview in 4.4?
>
> -----Original Message-----
> From: users-boun...@ovirt.org <users-boun...@ovirt.org> On Behalf Of Nir 
> Soffer
> Sent: Wednesday, August 11, 2021 4:26 AM
> To: Shantur Rathore <shantur.rath...@gmail.com>
> Cc: users <users@ovirt.org>; Roman Bednar <rbed...@redhat.com>
> Subject: [ovirt-users] Re: Sparse VMs from Templates - Storage issues
>
> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore <shantur.rath...@gmail.com> 
> wrote:
> >
> > Hi all,
> >
> > I have a setup as detailed below
> >
> > - iSCSI Storage Domain
> > - Template with Thin QCOW2 disk
> > - Multiple VMs from Template with Thin disk
>
> Note that a single template disk used by many vms can become a performance 
> bottleneck, and is a single point of failure. Cloning the template when 
> creating vms avoids such issues.
>
> > oVirt Node 4.4.4
>
> 4.4.4 is old, you should upgrade to 4.4.7.
>
> > When the VMs boot up they download some data, and that leads to an
> > increase in volume size.
> > I see that every few seconds the VM gets paused with
> >
> > "VM X has been paused due to no Storage space error."
> >
> >  and then after a few seconds
> >
> > "VM X has recovered from paused back to up"
>
> This is normal operation when a vm writes too quickly and oVirt cannot extend
> the disk quickly enough. To mitigate this, you can increase the volume chunk
> size.
>
> Create this configuration drop-in file:
>
> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
> [irs]
> volume_utilization_percent = 25
> volume_utilization_chunk_mb = 2048
>
> And restart vdsm.
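>
> On the host that would be something like (vdsmd being the vdsm
> service name on oVirt hosts):
>
>   # systemctl restart vdsmd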
>
> With this setting, when the free space in a disk drops to 1.5g, the disk will
> be extended by 2g. With the default settings, the disk is extended by 1g when
> the free space drops to 0.5g.
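>
> (If I read the config semantics right, the extend threshold works out
> to volume_utilization_chunk_mb * (100 - volume_utilization_percent) /
> 100: 2048 * 75 / 100 = 1536 MB ~= 1.5g with the settings above, and
> 1024 * 50 / 100 = 512 MB ~= 0.5g with the defaults.)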
>
> If this does not eliminate the pauses, try a larger chunk size like 4096.
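>
> That is, the same drop-in file with:
>
>   [irs]
>   volume_utilization_percent = 25
>   volume_utilization_chunk_mb = 4096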
>
> > Sometimes, after many pause and recovery cycles, the VM dies with
> >
> > "VM X is down with error. Exit message: Lost connection with qemu process."
>
> This means qemu has crashed. You can find more info in the vm log at:
> /var/log/libvirt/qemu/vm-name.log
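>
> For example, the end of that log:
>
>   # tail -n 50 /var/log/libvirt/qemu/vm-name.log
>
> should show why the qemu process went away.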
>
> We know about bugs in qemu that cause such crashes when a vm disk is
> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
> will fix this issue.
>
> Even with these settings, if you have very bursty io in the vm, it may become
> paused. The only way to completely avoid these pauses is to use a preallocated
> disk, or use file storage (e.g. NFS). A preallocated disk can be thin
> provisioned on the server side, so it does not mean you need more storage, but
> you will not be able to use shared templates in the way you use them now. You
> can create a vm from a template, but the template is cloned to the new vm.
>
> Another option (still tech preview) is Managed Block Storage (Cinder-based
> storage). If your storage server is supported by Cinder, we can manage it
> using cinderlib. In this setup every disk is a LUN, which may be thin
> provisioned on the storage server. This can also offload storage operations to
> the server, like cloning disks, which may be much faster and more efficient.
>
> Nir