Re: [PATCH 7/7] iotests/271: check disk usage on subcluster-based discard/unmap

2023-11-09 Thread Andrey Drobyshev
On 11/3/23 17:51, Hanna Czenczek wrote:
> On 20.10.23 23:56, Andrey Drobyshev wrote:
>> Add _verify_du_delta() checker which is used to check that real disk
>> usage delta meets the expectations.  For now we use it for checking that
>> subcluster-based discard/unmap operations lead to actual disk usage
>> decrease (i.e. PUNCH_HOLE operation is performed).
> 
> I’m not too happy about checking the disk usage because that relies on
> the underlying filesystem actually accepting and executing the unmap. 
> Why is it not enough to check the L2 bitmap?
> 
> …Coming back later (I had to fix the missing `ret = ` I mentioned in
> patch 2, or this test would hang, so I couldn’t run it at first), I note
> that checking the disk usage in fact doesn’t work on tmpfs.  I usually
> run the iotests in tmpfs, so that’s not great.
> 

My original idea was to make sure that the PUNCH_HOLE operation did
indeed take place, i.e. there was an actual discard.  For instance,
currently the discard operation initiated by qemu-io is called with the
QCOW2_DISCARD_REQUEST discard type, but if some other type is passed by
mistake, qcow2_queue_discard() won't be called, and though the
subclusters will be marked unallocated in L2 the data will still be
there.  Not quite what we expect from discard operation.

BTW checking the disk usage on tmpfs works on my machine:

> # cd /tmp; df -Th /tmp
> Filesystem Type   Size  Used Avail Use% Mounted on
> tmpfs  tmpfs   32G  2.5M   32G   1% /tmp
> # BUILD=/root/src/qemu/master/build
> # $BUILD/qemu-img create -f qcow2 -o extended_l2=on img.qcow2 1M
> Formatting 'img.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=on 
> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
> # $BUILD/qemu-io -c 'write -q 0 128k' img.qcow2
> # du --block-size=1 img.qcow2
> 397312  img.qcow2
> # $BUILD/qemu-io -f qcow2 -c 'discard -q 0 8k' img.qcow2
> # du --block-size=1 img.qcow2
> 389120  img.qcow2
> # $BUILD/qemu-io -f qcow2 -c 'discard -q 8k 120k' img.qcow2
> # du --block-size=1 img.qcow2
> 266240  img.qcow2

I'm wondering what this might depend on and can't we overcome this?

>> Also add separate test case for discarding particular subcluster within
>> one cluster.
>>
>> Signed-off-by: Andrey Drobyshev 
>> ---
>>   tests/qemu-iotests/271 | 25 -
>>   tests/qemu-iotests/271.out |  2 ++
>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
>> index c7c2cadda0..5fcb209f5f 100755
>> --- a/tests/qemu-iotests/271
>> +++ b/tests/qemu-iotests/271
>> @@ -81,6 +81,15 @@ _verify_l2_bitmap()
>>   fi
>>   }
>>   +# Check disk usage delta after a discard/unmap operation
>> +# _verify_du_delta $before $after $expected_delta
>> +_verify_du_delta()
>> +{
>> +    if [ $(($1 - $2)) -ne $3 ]; then
>> +    printf "ERROR: unexpected delta: $1 - $2 = $(($1 - $2)) != $3\n"
>> +    fi
>> +}
>> +
>>   # This should be called as _run_test c=XXX sc=XXX off=XXX len=XXX
>> cmd=XXX
>>   # c:   cluster number (0 if unset)
>>   # sc:  subcluster number inside cluster @c (0 if unset)
>> @@ -198,9 +207,12 @@ for use_backing_file in yes no; do
>>   alloc="$(seq 0 31)"; zero=""
>>   _run_test sc=0 len=64k
>>   -    ### Zero and unmap half of cluster #0 (this won't unmap it)
>> +    ### Zero and unmap half of cluster #0 (this will unmap it)
> 
> I think “it” refers to the cluster, and it is not unmapped.  This test
> case does not use a discard, but write -z instead, so it worked before. 
> (The L2 bitmap shown in the output doesn’t change, so functionally, this
> patch series didn’t change this case.)
> 

>From the _run_test() implementation:

> # cmd: the command to pass to qemu-io, must be one of 
>   
> #  write-> write  
>   
> #  zero -> write -z   
>   
> #  unmap-> write -z -u   <-   
> 
> #  compress -> write -c   
>   
> #  discard  -> discard
>   
> _run_test()

So it actually uses 'write -z -u', and we end up with an actual unmap.
I agree that the l2 bitmap doesn't change, that's why I specifically
added disk usage check to catch the changed functionality.

>>   alloc="$(seq 16 31)"; zero="$(seq 0 15)"
>> +    before=$(disk_usage "$TEST_IMG")
>>   _run_test sc=0 len=32k cmd=unmap
>> +    after=$(disk_usage "$TEST_IMG")
>> +    _verify_du_delta $before $after 32768
>>     ### Zero and unmap cluster #0
>>   alloc=""; zero="$(seq 0 31)"
> 
> For this following case shown truncated here, why don’t we try
> “_run_test sc=16 len=32k cmd=unmap” instead of “sc=0 len=64k”?  I.e.
> unmap only the second half, which, thanks to patch 3, should still unmap
> the whole cluster, because the 

Re: [PATCH 7/7] iotests/271: check disk usage on subcluster-based discard/unmap

2023-11-09 Thread Andrey Drobyshev
On 11/3/23 17:59, Hanna Czenczek wrote:
> On 03.11.23 16:51, Hanna Czenczek wrote:
>> On 20.10.23 23:56, Andrey Drobyshev wrote: 
> 
> [...]
> 
>>> @@ -528,6 +543,14 @@ for use_backing_file in yes no; do
>>>   else
>>>   _make_test_img -o extended_l2=on 1M
>>>   fi
>>> +    # Write cluster #0 and discard its subclusters #0-#3
>>> +    $QEMU_IO -c 'write -q 0 64k' "$TEST_IMG"
>>> +    before=$(disk_usage "$TEST_IMG")
>>> +    $QEMU_IO -c 'discard -q 0 8k' "$TEST_IMG"
>>> +    after=$(disk_usage "$TEST_IMG")
>>> +    _verify_du_delta $before $after 8192
>>> +    alloc="$(seq 4 31)"; zero="$(seq 0 3)"
>>> +    _verify_l2_bitmap 0
>>>   # Write clusters #0-#2 and then discard them
>>>   $QEMU_IO -c 'write -q 0 128k' "$TEST_IMG"
>>>   $QEMU_IO -c 'discard -q 0 128k' "$TEST_IMG"
>>
>> Similarly to above, I think it would be good if we combined this
>> following case with the one you added, i.e. to write 128k from the
>> beginning, drop the write here, and change the discard to be “discard
>> -q 8k 120k”, i.e. skip the subclusters we have already discarded, to
>> see that this is still combined to discard the whole first cluster.
>>
>> ...Ah, see, and when I try this, the following assertion fails:
>>
>> qemu-io: ../block/qcow2-cache.c:156: qcow2_cache_destroy: Assertion
>> `c->entries[i].ref == 0' failed.
>> ./common.rc: line 220: 128894 Aborted (core dumped) (
>> VALGRIND_QEMU="${VALGRIND_QEMU_IO}" _qemu_proc_exec
>> "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
>>
>> Looks like an L2 table is leaked somewhere.  That’s why SCRI should be
>> a g_auto()-able type.
> 
> Forgot to add: This single test case here is the only place where we
> test the added functionality.  I think there should be more cases. It
> doesn’t really make sense now that 271 has so many cases for writing
> zeroes, but so few for discarding, now that discarding works on
> subclusters.  Most of them should at least be considered whether we can
> run them for discard as well.
> 
> I didn’t want to push for such an extensive set of tests, but, well, now
> it turned out I overlooked a bug in patch 4, and only found it because I
> thought “this place might also make a nice test case for this series”.
> 
> Hanna
> 

I agree that more coverage should be added.  Based on the previous
email, I see the following cases:

1. Direct 'discard' on the subclusters range (discard_l2_subclusters()).
2. Direct 'discard' on the subclusters range, complementary to an
unallocated range (i.e. discard_l2_subclusters() -> discard_in_l2_slice()).
3. 'write -u -z' on the subclusters range (zero_l2_subclusters() ->
discard_l2_subclusters()).
4. 'write -u -z' on the subclusters range, complementary to an
unallocated range (zero_l2_subclusters() -> discard_l2_subclusters() ->
discard_in_l2_slice()).

Would also be nice to test the zero_l2_subclusters() ->
zero_in_l2_slice() path, but we'd have to somehow check the refcount
table for that since the L2 bitmap doesn't change.

Please let me know if you can think of anything else.



Re: [PATCH 7/7] iotests/271: check disk usage on subcluster-based discard/unmap

2023-11-03 Thread Hanna Czenczek

On 03.11.23 16:51, Hanna Czenczek wrote:
On 20.10.23 23:56, Andrey Drobyshev wrote: 


[...]


@@ -528,6 +543,14 @@ for use_backing_file in yes no; do
  else
  _make_test_img -o extended_l2=on 1M
  fi
+    # Write cluster #0 and discard its subclusters #0-#3
+    $QEMU_IO -c 'write -q 0 64k' "$TEST_IMG"
+    before=$(disk_usage "$TEST_IMG")
+    $QEMU_IO -c 'discard -q 0 8k' "$TEST_IMG"
+    after=$(disk_usage "$TEST_IMG")
+    _verify_du_delta $before $after 8192
+    alloc="$(seq 4 31)"; zero="$(seq 0 3)"
+    _verify_l2_bitmap 0
  # Write clusters #0-#2 and then discard them
  $QEMU_IO -c 'write -q 0 128k' "$TEST_IMG"
  $QEMU_IO -c 'discard -q 0 128k' "$TEST_IMG"


Similarly to above, I think it would be good if we combined this 
following case with the one you added, i.e. to write 128k from the 
beginning, drop the write here, and change the discard to be “discard 
-q 8k 120k”, i.e. skip the subclusters we have already discarded, to 
see that this is still combined to discard the whole first cluster.


...Ah, see, and when I try this, the following assertion fails:

qemu-io: ../block/qcow2-cache.c:156: qcow2_cache_destroy: Assertion 
`c->entries[i].ref == 0' failed.
./common.rc: line 220: 128894 Aborted (core dumped) ( 
VALGRIND_QEMU="${VALGRIND_QEMU_IO}" _qemu_proc_exec 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )


Looks like an L2 table is leaked somewhere.  That’s why SCRI should be 
a g_auto()-able type.


Forgot to add: This single test case here is the only place where we 
test the added functionality.  I think there should be more cases. It 
doesn’t really make sense now that 271 has so many cases for writing 
zeroes, but so few for discarding, now that discarding works on 
subclusters.  Most of them should at least be considered whether we can 
run them for discard as well.


I didn’t want to push for such an extensive set of tests, but, well, now 
it turned out I overlooked a bug in patch 4, and only found it because I 
thought “this place might also make a nice test case for this series”.


Hanna




Re: [PATCH 7/7] iotests/271: check disk usage on subcluster-based discard/unmap

2023-11-03 Thread Hanna Czenczek

On 20.10.23 23:56, Andrey Drobyshev wrote:

Add _verify_du_delta() checker which is used to check that real disk
usage delta meets the expectations.  For now we use it for checking that
subcluster-based discard/unmap operations lead to actual disk usage
decrease (i.e. PUNCH_HOLE operation is performed).


I’m not too happy about checking the disk usage because that relies on 
the underlying filesystem actually accepting and executing the unmap.  
Why is it not enough to check the L2 bitmap?


…Coming back later (I had to fix the missing `ret = ` I mentioned in 
patch 2, or this test would hang, so I couldn’t run it at first), I note 
that checking the disk usage in fact doesn’t work on tmpfs.  I usually 
run the iotests in tmpfs, so that’s not great.



Also add separate test case for discarding particular subcluster within
one cluster.

Signed-off-by: Andrey Drobyshev 
---
  tests/qemu-iotests/271 | 25 -
  tests/qemu-iotests/271.out |  2 ++
  2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
index c7c2cadda0..5fcb209f5f 100755
--- a/tests/qemu-iotests/271
+++ b/tests/qemu-iotests/271
@@ -81,6 +81,15 @@ _verify_l2_bitmap()
  fi
  }
  
+# Check disk usage delta after a discard/unmap operation

+# _verify_du_delta $before $after $expected_delta
+_verify_du_delta()
+{
+if [ $(($1 - $2)) -ne $3 ]; then
+printf "ERROR: unexpected delta: $1 - $2 = $(($1 - $2)) != $3\n"
+fi
+}
+
  # This should be called as _run_test c=XXX sc=XXX off=XXX len=XXX cmd=XXX
  # c:   cluster number (0 if unset)
  # sc:  subcluster number inside cluster @c (0 if unset)
@@ -198,9 +207,12 @@ for use_backing_file in yes no; do
  alloc="$(seq 0 31)"; zero=""
  _run_test sc=0 len=64k
  
-### Zero and unmap half of cluster #0 (this won't unmap it)

+### Zero and unmap half of cluster #0 (this will unmap it)


I think “it” refers to the cluster, and it is not unmapped.  This test 
case does not use a discard, but write -z instead, so it worked before.  
(The L2 bitmap shown in the output doesn’t change, so functionally, this 
patch series didn’t change this case.)



  alloc="$(seq 16 31)"; zero="$(seq 0 15)"
+before=$(disk_usage "$TEST_IMG")
  _run_test sc=0 len=32k cmd=unmap
+after=$(disk_usage "$TEST_IMG")
+_verify_du_delta $before $after 32768
  
  ### Zero and unmap cluster #0

  alloc=""; zero="$(seq 0 31)"


For this following case shown truncated here, why don’t we try 
“_run_test sc=16 len=32k cmd=unmap” instead of “sc=0 len=64k”?  I.e. 
unmap only the second half, which, thanks to patch 3, should still unmap 
the whole cluster, because the first half is already unmapped.



@@ -447,7 +459,10 @@ for use_backing_file in yes no; do
  
  # Subcluster-aligned request from clusters #12 to #14

  alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+before=$(disk_usage "$TEST_IMG")
  _run_test c=12 sc=16 len=128k cmd=unmap
+after=$(disk_usage "$TEST_IMG")
+_verify_du_delta $before $after $((128 * 1024))
  alloc=""; zero="$(seq 0 31)"
  _verify_l2_bitmap 13
  alloc="$(seq 16 31)"; zero="$(seq 0 15)"
@@ -528,6 +543,14 @@ for use_backing_file in yes no; do
  else
  _make_test_img -o extended_l2=on 1M
  fi
+# Write cluster #0 and discard its subclusters #0-#3
+$QEMU_IO -c 'write -q 0 64k' "$TEST_IMG"
+before=$(disk_usage "$TEST_IMG")
+$QEMU_IO -c 'discard -q 0 8k' "$TEST_IMG"
+after=$(disk_usage "$TEST_IMG")
+_verify_du_delta $before $after 8192
+alloc="$(seq 4 31)"; zero="$(seq 0 3)"
+_verify_l2_bitmap 0
  # Write clusters #0-#2 and then discard them
  $QEMU_IO -c 'write -q 0 128k' "$TEST_IMG"
  $QEMU_IO -c 'discard -q 0 128k' "$TEST_IMG"


Similarly to above, I think it would be good if we combined this 
following case with the one you added, i.e. to write 128k from the 
beginning, drop the write here, and change the discard to be “discard -q 
8k 120k”, i.e. skip the subclusters we have already discarded, to see 
that this is still combined to discard the whole first cluster.


...Ah, see, and when I try this, the following assertion fails:

qemu-io: ../block/qcow2-cache.c:156: qcow2_cache_destroy: Assertion 
`c->entries[i].ref == 0' failed.
./common.rc: line 220: 128894 Aborted (core dumped) ( 
VALGRIND_QEMU="${VALGRIND_QEMU_IO}" _qemu_proc_exec 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )


Looks like an L2 table is leaked somewhere.  That’s why SCRI should be a 
g_auto()-able type.


Hanna


diff --git a/tests/qemu-iotests/271.out b/tests/qemu-iotests/271.out
index 5be780de76..0da8d72cde 100644
--- a/tests/qemu-iotests/271.out
+++ b/tests/qemu-iotests/271.out
@@ -426,6 +426,7 @@ L2 entry #29: 0x 
  ### Discarding clusters with non-zero bitmaps (backing file: yes) ###
  
  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576