On Mon, 2020-11-23 at 18:38 +0100, Kevin Wolf wrote:
> On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> > Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > introduced a subtle change to code in zero_in_l2_slice:
> >
> > It swapped the order of
> >
> > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> > 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> >
> > To
> >
> > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> >
> > It seems harmless, however the call to qcow2_free_any_clusters
> > can trigger a cache flush which can mark the L2 table as clean,
> > and assuming that this was the last write to it,
> > a stale version of it will remain on the disk.
>
> Do you have more details on this last paragraph? I'm trying to come up
> with a reproducer, but I don't see how qcow2_free_any_clusters() could
> flush the L2 table cache. (It's easy to get it to flush the refcount
> block cache, but that's useless for a reproducer.)
>
> The only way I see to flush any cache with it is in update_refcount()
> the qcow2_cache_set_dependency() call. This will always flush the cache
> that the L2 cache depends on - which will never be the L2 cache itself,
> but always either the refcount cache or nothing.
>
> There are more options in alloc_refcount_block() if we're allocating a
> new refcount block, but in the context of freeing clusters we'll never
> need to do that.
>
> Whatever I tried, at the end of zero_in_l2_slice(), I have a dirty L2
> table and a dirty refcount block in the cache, with a dependency that
> makes sure that the L2 table will be written out first.
>
> If you don't have the information yet, can you try to debug your manual
> reproducer a bit more to find out how this happens?

I'll do this tomorrow.

Best regards,
	Maxim Levitsky

>
> Kevin
>
> > Now we have a valid L2 entry pointing to a freed cluster. Oops.
> >
> > Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > Signed-off-by: Maxim Levitsky <mlevi...@redhat.com>
> > ---
> >  block/qcow2-cluster.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index 485b4cb92e..267b46a4ca 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
> >              continue;
> >          }
> >
> > -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> >          if (unmap) {
> >              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
> >          }
> >          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> > +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> >          if (has_subclusters(s)) {
> >              set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
> >          }
> > --
> > 2.26.2
> >