--- Begin Message ---
On 27.12.2022 at 18:54, Óscar de Arriba wrote:
Hello all,
Since about a week ago, the data LVM on one of my Proxmox nodes has been doing strange things.
For storage I'm using a consumer Crucial MX500 SATA SSD connected directly
to the motherboard controller (no PCIe HBA for the system+data disk). The drive is
brand new, S.M.A.R.T. checks pass, and it reports only 4% wearout. I have set
up Proxmox in a cluster with LVM and make backups to an external NFS
location.
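For reference, the S.M.A.R.T. status and wearout figures come from something like
the following (just a sketch; /dev/sda is a placeholder for the actual device):
`smartctl -a /dev/sda | grep -iE 'overall-health|wear|lifetime'`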
Last week I tried to migrate a stopped VM of ~64 GiB from one server to
another, and found that *the SSD started to underperform (~5 MB/s) after roughly
55 GiB had been copied* (this pattern repeated several times).
It was so bad that *even after cancelling the migration, the SSD stayed busy
writing at that speed and I had to reboot the node, as it was completely
unusable* (it is in my homelab, not running mission-critical workloads, so it
was okay to do that). After the reboot, I could remove the half-copied VM disk.
After that (and several retries, including making a backup to external storage
and restoring it, just in case the bottleneck was in the
migration process), I ended up creating the instance from scratch and migrating
the data from one VM to the other - so the VM was created brand new and no
bottleneck was hit.
The problem is that *the pve/data logical volume now shows 377 GiB used,
but the total size of the stored VM disks (even if they were 100% provisioned) is
168 GiB*. I checked, and neither VM has any snapshots.
I don't know whether rebooting while the disk was being written to (I always
cancelled the migration first) damaged the LV in some way, but on reflection it
does not even make sense for an SSD of this type to end up writing at 5 MB/s,
even with the write cache full. It should write far faster than that
even without the cache.
Some information about the storage:
`root@venom:~# lvs -a
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            pve twi-aotz-- 377.55g             96.13  1.54
  [data_tdata]    pve Twi-ao---- 377.55g
  [data_tmeta]    pve ewi-ao----  <3.86g
  [lvol0_pmspare] pve ewi-------  <3.86g
  root            pve -wi-ao----  60.00g
  swap            pve -wi-ao----   4.00g
  vm-150-disk-0   pve Vwi-a-tz--   4.00m data        14.06
  vm-150-disk-1   pve Vwi-a-tz-- 128.00g data        100.00
  vm-201-disk-0   pve Vwi-aotz--   4.00m data        14.06
  vm-201-disk-1   pve Vwi-aotz--  40.00g data        71.51`
The same information can also be seen in this forum post I made a couple of days ago:
https://forum.proxmox.com/threads/thin-lvm-showing-more-used-space-than-expected.120051/
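For reference, a rough way to compare what the pool reports as used with the space the
thin volumes actually map (just a sketch based on the Data% column above; it ignores
pool metadata):
`# usage reported by the pool itself
lvs --noheadings --units g -o lv_name,lv_size,data_percent pve/data

# sum of the space actually mapped by the thin volumes (Data% of each virtual size)
lvs --noheadings --units g -o lv_name,lv_size,data_percent pve \
  | awk '/vm-/ {gsub("g","",$2); sum += $2 * $3 / 100} END {print sum, "GiB mapped by thin volumes"}'`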
Any ideas, aside from making a backup and reinstalling from scratch?
Thanks in advance!
Hi,
I've never used lvm-thin, so beware, this is just a guess, but to me this
looks like something filled up your pool at some point (probably the
migration?). Consumer SSDs don't perform well once all of their space has been
allocated (at least to my knowledge): even though there is still free
space in the pool, there are no free blocks as far as the SSD's
controller is concerned. The low speed may come from this, as
the controller needs to erase blocks before writing to them again, due to
the lack of (known) free space. Did you try running fstrim in the VMs
to regain the allocated space? At least on Linux, something like "fstrim
-av" should do the trick. The "discard" option also needs to be enabled
for all volumes you want to trim, so check the VM config first.
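Something along these lines, for example (a sketch only; the VM ID, disk bus, and
storage name are placeholders, so adjust them to your setup):
`# inside the Linux guest: trim all mounted filesystems that support it
fstrim -av

# on the Proxmox host: check whether discard is already set on the VM's disks
qm config 201 | grep -E 'scsi|virtio|sata'

# if not, re-specify the disk with discard=on (keep any other options you had set)
qm set 201 --scsi1 local-lvm:vm-201-disk-1,discard=on`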
hth
Martin
--- End Message ---
_______________________________________________
pve-user mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user