From: Pavel Tikhomirov <ptikhomi...@virtuozzo.com> Recently we noticed that fstrim does not work properly on ploop images. The problem is actually worse: current code preallocates excessive space on the image and loses it later.
Example: [root@vz9-demens-1 fstrim]# du -h . 4.0K . [root@vz9-demens-1 fstrim]# ploop init -s 10G image ... [root@vz9-demens-1 fstrim]# du -h . 513M . [root@vz9-demens-1 fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml ... [root@vz9-demens-1 fstrim]# dd if=/dev/urandom of=/mnt/fstrim/file bs=1M count=1024 oflag=direct 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.93694 s, 217 MB/s [root@vz9-demens-1 fstrim]# ploop umount DiskDescriptor.xml ... [root@vz9-demens-1 fstrim]# du -h . 8.7G . [root@vz9-demens-1 fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml ... [root@vz9-demens-1 fstrim]# rm -rf /mnt/fstrim/file [root@vz9-demens-1 fstrim]# fstrim /mnt/fstrim [root@vz9-demens-1 fstrim]# ploop umount DiskDescriptor.xml ... [root@vz9-demens-1 fstrim]# du -h . 7.7G . The image just after the creation was already 512M. After writing 1GB data it was almost 9GB in size. The fstrim discarded only known blocks, so final size is 7.7GB. Mainly two factors contribute to that: - file_preallocated_area_start is never updated - parallel preallocation requests scale preallocation size by 128M each time To fix this, the patch changes preallocation logic a bit: - Update file_preallocated_area_start after allocation starts using preallocated space. - Truncate preallocated space on ploop destruction. Following several patches also do the following: - Make preallocations absolute instead of relative in ploop_req_prealloc(). Thus only request what is really needed. - Rename s/ploop_pending_prealloc/ploop_no_pending_prealloc/ as it returns true in case of no pending preallocation. - Make ploop_preallocate_cluster() return error on unexpected file size change. Also simplify the logic a little bit (e.g. more is excess). - Preallocation in ploop_allocate_cluster() should not depend on (pos < prealloc_start), so always try to preallocate. Also simplify ploop_allocate_cluster() a little bit (e.g. old_size is excess). Same example after the patchset: [root@ptikh-hci fstrim]# du -h . 4.0K . [root@ptikh-hci fstrim]# ploop init -s 10G image [root@ptikh-hci fstrim]# du -h . 3.2M . [root@ptikh-hci fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml ... [root@ptikh-hci fstrim]# dd if=/dev/urandom of=/mnt/fstrim/file bs=1M count=1024 oflag=direct ... [root@ptikh-hci fstrim]# ploop umount DiskDescriptor.xml ... [root@ptikh-hci fstrim]# du -h . 1.1G . [root@ptikh-hci fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml ... [root@ptikh-hci fstrim]# rm -rf /mnt/fstrim/file [root@ptikh-hci fstrim]# fstrim /mnt/fstrim [root@ptikh-hci fstrim]# ploop umount DiskDescriptor.xml ... [root@ptikh-hci fstrim]# du -h . 4.2M . Now it is much better. Empty image size is quite small, write does not cause excessive preallocation and fstrim therefore reduces the size. https://virtuozzo.atlassian.net/browse/VSTOR-108868 Co-developed-by: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com> Feature: dm-ploop: ploop target driver --- v3: This is rework of v2 which makes it possible to apply it in ready-kernel (no ploop structure changes). Also it covers some new cases, like resetting the state on unexpected file size change. In v2 preemptive_prealloc was not reset and can probably lead to problems. v4: fix compilation and incorporate v2 description v5: splitted the patch into several small ones --- drivers/md/dm-ploop-map.c | 2 ++ drivers/md/dm-ploop-target.c | 6 ++++++ 2 files changed, 8 insertions(+) diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c index 11767dd919503..79c2f4c7db033 100644 --- a/drivers/md/dm-ploop-map.c +++ b/drivers/md/dm-ploop-map.c @@ -1339,6 +1339,8 @@ static int ploop_allocate_cluster(struct ploop *ploop, u32 *dst_clu, struct file } } + if (end > top->file_preallocated_area_start) + top->file_preallocated_area_start = end; spin_unlock_irqrestore(&ploop->bat_lock, flags); return 0; diff --git a/drivers/md/dm-ploop-target.c b/drivers/md/dm-ploop-target.c index 8fa0043a71420..ea5f3a8dab6b6 100644 --- a/drivers/md/dm-ploop-target.c +++ b/drivers/md/dm-ploop-target.c @@ -162,6 +162,7 @@ static bool ploop_empty_htable(struct hlist_head head[]) static void ploop_destroy(struct ploop *ploop) { + struct ploop_delta *top = ploop_top_delta(ploop); int i; if (ploop->kt_worker) { @@ -201,6 +202,11 @@ static void ploop_destroy(struct ploop *ploop) for (i = 0; i < 2; i++) percpu_ref_exit(&ploop->inflight_bios_ref[i]); + + if (top->file_preallocated_area_start < top->file_size && + vfs_truncate2(&top->file->f_path, top->file_preallocated_area_start, top->file)) + PL_ERR("Unable to truncate preallocated area on destroy"); + /* Nobody uses it after destroy_workqueue() */ while (ploop->nr_deltas-- > 0) { if (ploop->deltas[ploop->nr_deltas].file) { -- 2.43.0 _______________________________________________ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel