I tried to measure I/O using "gluster volume top", but its results seem very 
cryptic to me (they need a deeper analysis and I don't have the time right now).

Thank you very much for your analysis. If I understood correctly, the problem is 
that the consumer SSD cache is too weak to help even under a small number (~15) 
of not particularly I/O-intensive VMs, so I/O stalls as performance degrades, 
and this hangs the VMs. Each VM's kernel thinks the CPU has hung, and so it 
crashes.

This seems to be the case...

If possible, it would be very useful to have a sort of profiler in the Gluster 
environment that surfaces evidence of issues related to the speed of the 
underlying storage infrastructure, whether the problem lies in the disks or in 
the network. In any case, the errors currently reported to the user are rather 
misleading, as they suggest a data integrity issue ("cannot read..." or 
something like that).
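For what it's worth, Gluster already ships a basic form of this: the "volume profile" command collects per-brick, per-FOP latency statistics. A minimal sketch, assuming the volume name gv1 from the output below (the exact fields shown by "info" may vary between Gluster versions):

```shell
# Start collecting per-brick latency statistics (adds a small overhead):
gluster volume profile gv1 start

# After letting the VMs run for a while, dump the cumulative statistics.
# The output lists min/max/avg latency per file operation (READ, WRITE,
# FSYNC, ...); abnormally high WRITE or FSYNC latencies would point at a
# slow storage backend rather than a data-integrity problem.
gluster volume profile gv1 info

# Stop profiling when done:
gluster volume profile gv1 stop
```

This is only an operations sketch against a live cluster, not something the errors themselves would surface automatically as suggested above.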
Only for reference, these are the first lines of the "open" top command 
(currently I am not experiencing problems):
[root@ovirt-node2 ~]# gluster volume top gv1 open
Brick: ovirt-node2.ovirt:/brickgv1/gv1
Current open fds: 15, Max open fds: 38, Max openfd time: 2022-09-19 
07:27:20.033304 +0000
Count           filename
=======================
331763          /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/inbox
66284           /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/leases
53939           /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/metadata.new
169             
/45b4f14c-8323-482f-90ab-99d8fd610018/images/910fa026-d30b-4be2-9111-3c9f4f646fde/b7d6f39a-1481-4f5c-84fd-fc43f9e14d71
[...]
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/F7FKIJHYOANZM657KDZMIKC23CHXKRDS/
