The commit is pushed to "branch-rh9-5.14.0-427.55.1.vz9.82.x-ovz" and will 
appear at g...@bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.55.1.vz9.82.3
------>
commit 34451d746b817b69b5e9df07bfef622bd4e1a2df
Author: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
Date:   Mon Jul 7 10:04:02 2025 +0200

    dm-ploop: truncate preallocated space on ploop destruction
    
    Recently we noticed that fstrim does not work properly on ploop images.
    The problem is actually worse: current code preallocates excessive space
    on the image and loses it later.
    
    Example:
      [root@vz9-demens-1 fstrim]# du -h .
      4.0K  .
      [root@vz9-demens-1 fstrim]# ploop init -s 10G image
      ...
      [root@vz9-demens-1 fstrim]# du -h .
      513M  .
      [root@vz9-demens-1 fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml
      ...
      [root@vz9-demens-1 fstrim]# dd if=/dev/urandom of=/mnt/fstrim/file bs=1M count=1024 oflag=direct
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.93694 s, 217 MB/s
      [root@vz9-demens-1 fstrim]# ploop umount DiskDescriptor.xml
      ...
      [root@vz9-demens-1 fstrim]# du -h .
      8.7G  .
      [root@vz9-demens-1 fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml
      ...
      [root@vz9-demens-1 fstrim]# rm -rf /mnt/fstrim/file
      [root@vz9-demens-1 fstrim]# fstrim /mnt/fstrim
      [root@vz9-demens-1 fstrim]# ploop umount DiskDescriptor.xml
      ...
      [root@vz9-demens-1 fstrim]# du -h .
      7.7G  .
    
    Right after creation the image already took about 512M. After writing
    1GB of data it grew to almost 9GB. fstrim discarded only known blocks,
    so the final size is still 7.7GB.
    
    Mainly two factors contribute to that:
     - file_preallocated_area_start is never updated
     - parallel preallocation requests scale preallocation size by 128M
       each time
    
    To fix this, the patch changes preallocation logic a bit:
    
     - Update file_preallocated_area_start after allocation starts using
       preallocated space.
     - Truncate preallocated space on ploop destruction.
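
The two changes above can be pictured with a small userspace toy model. This
is only a sketch: struct toy_delta and the toy_* helpers are hypothetical
names invented for illustration, the model keeps plain byte counters instead
of touching a real file, and only the field name
file_preallocated_area_start and the two comparisons are taken from the
patch itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the top delta's bookkeeping; sizes in bytes. */
struct toy_delta {
	uint64_t file_size;                    /* current size of the image file */
	uint64_t file_preallocated_area_start; /* end of the space already in use */
};

/* Grow the file ahead of use (stands in for preallocating the image file). */
static void toy_preallocate(struct toy_delta *d, uint64_t new_size)
{
	if (new_size > d->file_size)
		d->file_size = new_size;
}

/*
 * Hand out one cluster at offset pos.
 * Returns 0 on success, -1 if the cluster does not fit into the file.
 */
static int toy_allocate_cluster(struct toy_delta *d, uint64_t pos,
				uint64_t clu_size)
{
	uint64_t end = pos + clu_size;

	if (end > d->file_size)
		return -1;

	/* Same comparison as the lines added to ploop_allocate_cluster(). */
	if (end > d->file_preallocated_area_start)
		d->file_preallocated_area_start = end;
	return 0;
}

/*
 * On destruction, cut off the unused preallocated tail.
 * Returns the number of bytes given back.
 */
static uint64_t toy_destroy(struct toy_delta *d)
{
	uint64_t reclaimed = 0;

	if (d->file_preallocated_area_start < d->file_size) {
		reclaimed = d->file_size - d->file_preallocated_area_start;
		/* The patch calls vfs_truncate2() at this point. */
		d->file_size = d->file_preallocated_area_start;
	}
	return reclaimed;
}
```

In this model, preallocating 128M and using a single 1M cluster leaves 127M
to be reclaimed by toy_destroy(). If the start marker were never advanced,
the destroy step would have no reliable point to truncate to.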
    
    The next several patches also make these changes:
    
     - Make preallocations absolute instead of relative in
       ploop_req_prealloc(). Thus only request what is really needed.
     - Rename s/ploop_pending_prealloc/ploop_no_pending_prealloc/ as it
       returns true in case of no pending preallocation.
     - Make ploop_preallocate_cluster() return an error on unexpected file
       size change. Also simplify the logic a little (e.g. the more variable
       is redundant).
     - Preallocation in ploop_allocate_cluster() should not depend on (pos <
       prealloc_start), so always try to preallocate. Also simplify
       ploop_allocate_cluster() a little (e.g. old_size is redundant).
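
The difference between relative and absolute preallocation requests can be
sketched with a toy model. Everything here is an assumption made for
illustration: the toy_* functions are hypothetical and do not mirror the real
ploop_req_prealloc(). Only the 128M step and the idea that several requests
can pile up before the preallocation work runs come from the description
above.

```c
#include <stdint.h>

#define TOY_PREALLOC_STEP (128ULL << 20) /* 128M, as named in the message */

/* Relative request: every caller adds one more step to the pending size. */
static uint64_t toy_req_relative(uint64_t pending_size)
{
	return pending_size + TOY_PREALLOC_STEP;
}

/* Absolute request: the caller names the size it wants; repeats collapse. */
static uint64_t toy_req_absolute(uint64_t pending_size, uint64_t wanted_size)
{
	return wanted_size > pending_size ? wanted_size : pending_size;
}

/* Pending size after n relative requests that arrive before the work runs. */
static uint64_t toy_relative_after(uint64_t file_size, unsigned int n)
{
	uint64_t pending = file_size;
	unsigned int i;

	for (i = 0; i < n; i++)
		pending = toy_req_relative(pending);
	return pending;
}

/* Same n requests, each asking for "current file size plus one step". */
static uint64_t toy_absolute_after(uint64_t file_size, unsigned int n)
{
	uint64_t pending = file_size;
	unsigned int i;

	for (i = 0; i < n; i++)
		pending = toy_req_absolute(pending,
					   file_size + TOY_PREALLOC_STEP);
	return pending;
}
```

With eight requests queued against an empty file, the relative variant asks
for 1G while the absolute variant still asks for 128M, which is the kind of
over-allocation the series is meant to avoid.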
    
    Same example after the patchset:
    
    [root@ptikh-hci fstrim]# du -h .
    4.0K    .
    [root@ptikh-hci fstrim]# ploop init -s 10G image
    [root@ptikh-hci fstrim]# du -h .
    3.2M    .
    [root@ptikh-hci fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml
    ...
    [root@ptikh-hci fstrim]# dd if=/dev/urandom of=/mnt/fstrim/file bs=1M count=1024 oflag=direct
    ...
    [root@ptikh-hci fstrim]# ploop umount DiskDescriptor.xml
    ...
    [root@ptikh-hci fstrim]# du -h .
    1.1G    .
    [root@ptikh-hci fstrim]# ploop mount -m /mnt/fstrim/ DiskDescriptor.xml
    ...
    [root@ptikh-hci fstrim]# rm -rf /mnt/fstrim/file
    [root@ptikh-hci fstrim]# fstrim /mnt/fstrim
    [root@ptikh-hci fstrim]# ploop umount DiskDescriptor.xml
    ...
    [root@ptikh-hci fstrim]# du -h .
    4.2M    .
    
    This is much better: the empty image is small, writes do not cause
    excessive preallocation, and fstrim therefore actually shrinks the
    image.
    
    https://virtuozzo.atlassian.net/browse/VSTOR-108868
    Co-developed-by: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com>
    Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
    
    Feature: dm-ploop: ploop target driver
    
    ---
    v3: This is a rework of v2 which makes it possible to apply it in
    ready-kernel (no ploop structure changes). It also covers some new
    cases, like resetting the state on unexpected file size change. In v2
    preemptive_prealloc was not reset, which could probably lead to problems.
    
    v4: fix compilation and incorporate v2 description
    
    v5: split the patch into several small ones
---
 drivers/md/dm-ploop-map.c    | 2 ++
 drivers/md/dm-ploop-target.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
index 55b39874beee9..0ce482d3644a1 100644
--- a/drivers/md/dm-ploop-map.c
+++ b/drivers/md/dm-ploop-map.c
@@ -1339,6 +1339,8 @@ static int ploop_allocate_cluster(struct ploop *ploop, u32 *dst_clu, struct file
                }
        }
 
+       if (end > top->file_preallocated_area_start)
+               top->file_preallocated_area_start = end;
        spin_unlock_irqrestore(&ploop->bat_lock, flags);
 
        return 0;
diff --git a/drivers/md/dm-ploop-target.c b/drivers/md/dm-ploop-target.c
index 8fa0043a71420..ea5f3a8dab6b6 100644
--- a/drivers/md/dm-ploop-target.c
+++ b/drivers/md/dm-ploop-target.c
@@ -162,6 +162,7 @@ static bool ploop_empty_htable(struct hlist_head head[])
 
 static void ploop_destroy(struct ploop *ploop)
 {
+       struct ploop_delta *top = ploop_top_delta(ploop);
        int i;
 
        if (ploop->kt_worker) {
@@ -201,6 +202,11 @@ static void ploop_destroy(struct ploop *ploop)
 
        for (i = 0; i < 2; i++)
                percpu_ref_exit(&ploop->inflight_bios_ref[i]);
+
+       if (top->file_preallocated_area_start < top->file_size &&
+           vfs_truncate2(&top->file->f_path, top->file_preallocated_area_start, top->file))
+               PL_ERR("Unable to truncate preallocated area on destroy");
+
        /* Nobody uses it after destroy_workqueue() */
        while (ploop->nr_deltas-- > 0) {
                if (ploop->deltas[ploop->nr_deltas].file) {
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
