Hi Rainer,

On 12/7/21 at 8:53, Rainer Krienke wrote:
Hello,

I run a 5-node PVE cluster with pve-manager/6.4-8/185e14db (running kernel: 5.4.119-1-pve). The storage backend is an HDD-based "external" Ceph cluster running Ceph 14.2.16 with 144 OSDs on 9 hosts. Currently there are about 70 VMs running on this PVE cluster, all Linux (Ubuntu, SLES).

The problem I have is that writing on VMs has become slower and slower over time, and running Linux updates (e.g. apt upgrade) on the VMs takes longer and longer. The reason seems to be a steadily rising write IOPS rate on the storage side. Of course, the number of VMs has also increased over time up to the current count, which by itself causes higher numbers.

During weekdays I can see rates on the Ceph side of up to 1000 write IOPS and about 300 read IOPS. The really strange thing, however, is that even at weekends, when the services the VMs offer are hardly used at all, there is still a quite high write rate of about 400 IOPS, whereas the read rate is only about 50 IOPS then. The bytes read/written are minimal at that time, with only about 100 KBytes/sec read and about 5 MBytes/sec written.

I don't think you should have I/O problems with a healthy Ceph cluster of 144 OSDs on 9 hosts; it should be able to handle much more than that. I'd suspect some host or some OSDs performing poorly and dragging down the whole cluster's performance...
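
If you want to rule that out, a few commands on the Ceph side can help spot outlier OSDs (just a sketch, assuming you run them on a node with admin access):

    # Per-OSD commit/apply latencies; a single OSD with much higher values is suspect
    ceph osd perf
    # Utilization and placement per OSD; look for uneven or near-full OSDs
    ceph osd df tree
    # Overall health, including any slow ops / slow request warnings
    ceph -s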


So what I am looking for is what could be causing this "always there" write rate of about 400 IOPS. My guess is that it could be caused by file time (mtime, ctime, atime) write updates to the VMs' filesystems. If this were true, then using lazytime in /etc/fstab on all VMs could help to avoid this behaviour.
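
For example, a line like the following in a VM's /etc/fstab would enable it (just a sketch; device, mount point and filesystem are placeholders for whatever the VM actually uses):

    # lazytime keeps pure timestamp updates (atime/mtime/ctime) in memory and only
    # flushes them occasionally, instead of dirtying the inode on disk every time
    /dev/vda1  /  ext4  defaults,lazytime  0  1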

But on the other hand, all VMs use the (safe) "writeback" cache setting. So shouldn't this cache mode also cache the writes caused by file time updates?
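
For reference, this is where the cache mode shows up on the PVE side; VMID 100 and the volume name below are only placeholders:

    # Show a VM's disk lines, including the cache= option
    qm config 100 | grep -E '^(scsi|virtio|sata|ide)'
    # Setting writeback on an existing disk; keep the volume spec exactly as
    # printed by "qm config" and only add/change the cache=writeback option
    qm set 100 --scsi0 ceph-rbd:vm-100-disk-0,cache=writeback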

If yes, then I have to look for other reasons for my write IOPS problem, although I have no idea what they could be at the moment. Any suggestions?

We have a cluster with 62 VMs running (mostly Linux but also some Windows). Right now I'm seeing 5-15 MB/s reads and 5-35 MB/s writes, with about 500 read IOPS and about 200 write IOPS. This is with two pools, one backed by 4 SSD OSDs and the other by 11 HDD OSDs. The HDD pool has 45 VMs running on it and apt upgrade performance is good...
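
If you want to see which VMs are actually generating that constant write load, you can break the I/O down per pool and per RBD image on the Ceph side (a sketch; I'm assuming your pool is called "rbd" here and that the rbd_support mgr module is enabled):

    # Per-pool client I/O rates
    ceph osd pool stats
    # Per-image read/write IOPS and bandwidth, refreshed periodically
    rbd perf image iostat rbd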

Cheers

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/


