[PVE-User] Parallel VM creation/destruction issue

Eneko Lacunza via pve-user Mon, 05 Jul 2021 03:37:48 -0700

--- Begin Message ---
Hi all,
We have split the BIG 88-node cluster into 6 clusters of 15 nodes each (there were some spare servers); now things seem much better :)
Sadly, we are seeing some issues when the VDI management system (USDEnterprise) performs mass (in the order of 100s or even 1000s) destruction and creation of VMs. In a fraction of the clone operations, the clone fails with the following message:
"Error: clone failed. Failed to change directory to '/mnt/pve/vdi-prod1/images/103': No such file or directory at /usr/share/perl5/PVE/Storage/Plugin.pm line 708."
This happens when the destroy for that VMID ran only a few seconds earlier (5s-14s, for example). When another clone tries to use that VMID later (as soon as 54s after destruction), it works OK.
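Given the timings above, one possible workaround on the management side would be to avoid reusing a VMID until some time has passed since its destruction. The following is only a minimal sketch of that idea: `VmidPool` is a hypothetical helper (it is not part of PVE or of the VDI manager), and the 60-second cool-down is an assumption based on the single 54s observation, not a verified safe value.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class VmidPool:
    """Hands out VMIDs, skipping any that were freed too recently.

    Hypothetical helper for illustration only.
    """
    free_ids: Set[int]
    cooldown_s: float = 60.0  # assumed value, not a verified safe delay
    _freed_at: Dict[int, float] = field(default_factory=dict, init=False)

    def release(self, vmid: int, now: float) -> None:
        """Record that `vmid` was destroyed at time `now` (in seconds)."""
        self.free_ids.add(vmid)
        self._freed_at[vmid] = now

    def acquire(self, now: float) -> Optional[int]:
        """Return the lowest free VMID that is past its cool-down, or None."""
        for vmid in sorted(self.free_ids):
            freed = self._freed_at.get(vmid)
            if freed is None or now - freed >= self.cooldown_s:
                self.free_ids.remove(vmid)
                self._freed_at.pop(vmid, None)
                return vmid
        return None
```

With this, a VMID destroyed at t=100s would not be offered again before t=160s, while IDs that were never used remain available immediately. This only hides the symptom; it does not explain why the cloning node fails on the recently removed directory.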
PVE version is 6.4 ISO (details below), and storage is NFS 4.2 with pNFS with two pairs of NetApp servers in HA.
It seems like a "race condition" is happening, where the node that is cloning sees the removal of the storage directory by the destroy operation late (?).
I have checked "qemu-server.git/PVE/QemuServer.pm:sub destroy_vm" and I see that the storage disks are freed first and the VM config is removed after that, which seems quite correct. Could it be that the NFS servers are a bit "late" propagating the directory removal to the client nodes?
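One thing that could be checked on the client nodes is how the NFS mount caches directory attributes and lookups. The options `lookupcache`, `actimeo`, `acdirmin` and `acdirmax` are documented in nfs(5); whether they are related to this problem is only an assumption. The sketch below extracts those options from text in /proc/mounts format. The function name and the sample line (server name and option values) are made up for illustration; options that are not listed for a mount are left at the client's defaults.

```python
from typing import Dict, Optional

# NFS client options (see nfs(5)) that control how long directory
# attributes and lookup results may be served from the client cache.
CACHE_OPTIONS = ("lookupcache", "actimeo", "acdirmin", "acdirmax")


def nfs_cache_options(mounts_text: str, mountpoint: str) -> Optional[Dict[str, str]]:
    """Return cache-related options for `mountpoint`, or None if it is
    not found as an NFS mount.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        _device, mnt, fstype, options = parts[:4]
        if mnt != mountpoint or not fstype.startswith("nfs"):
            continue
        found = {}
        for opt in options.split(","):
            key, _, value = opt.partition("=")
            if key in CACHE_OPTIONS:
                found[key] = value
        return found
    return None


# Sample line with made-up server name and values, only to show the format.
SAMPLE = (
    "netapp1:/vdi /mnt/pve/vdi-prod1 nfs4 "
    "rw,relatime,vers=4.2,acdirmin=30,acdirmax=60,lookupcache=all 0 0\n"
)
print(nfs_cache_options(SAMPLE, "/mnt/pve/vdi-prod1"))
```

On a real node, the text could be read with `open("/proc/mounts").read()` and the result compared across the nodes that perform the destroy and the clone.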
Any ideas?

Thanks

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
--- End Message ---

_______________________________________________
pve-user mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
