On 31.10.25 14:03, Kevin Wolf wrote:
Am 31.10.2025 um 10:29 hat Hanna Czenczek geschrieben:
On 29.10.25 21:23, Kevin Wolf wrote:
Am 28.10.2025 um 17:33 hat Hanna Czenczek geschrieben:
[...]

@@ -1239,12 +1310,6 @@ static void qcow2_update_options_commit(BlockDriverState 
*bs,
       s->discard_no_unref = r->discard_no_unref;
-    if (s->cache_clean_interval != r->cache_clean_interval) {
-        cache_clean_timer_del(bs);
-        s->cache_clean_interval = r->cache_clean_interval;
-        cache_clean_timer_init(bs, bdrv_get_aio_context(bs));
-    }
-
I think the del/init pair here won't be necessary any more after
switching to the background coroutine. It will just start using the new
value of s->cache_clean_interval the next time it sleeps.
One problem is that if we don’t lock s->lock, the coroutine can read
s->l2_table_cache and s->refcount_block_cache while they’re invalid, which
is why I moved the deletion above.
I see. This is a preexisting problem, right? The timer runs in the BDS
main context, while qcow2_update_options_commit() runs in the main loop
or... essentially anywhere else?

Sorry for the late reply, yes.  There may be the additional problem noted in the commit message, specifically that the timer may fire, run the CB in the BDS’s context, and is concurrently deleted from the main context.  It will then still run, so just moving the delete up is not a 100 % solution I think.  We also need to make sure the timer code really isn’t running.

Should it be a separate patch then? A comment in the code wouldn't hurt
either.

I’ll add a comment.

We also still need to delete if the interval is set to 0 (or
special-case that in the coroutine to wait forever).
If the coroutine terminates on 0 as you suggested above, that would
automatically be solved.

I don’t think so, because we still need to be able to restart it (when transitioning from a value of 0 to a different value).

We could run all of this in a coroutine so we can lock s->lock, or we
have to force-stop the timer/coroutine at the start.  Maybe running it
in a coroutine is better…
So qcow2_reopen_commit() would immediately enter a coroutine and
BDRV_POLL_WHILE() for its completion?

It's not exactly pretty (and I hope polling in reopen callbacks is even
allowed), but maybe more local ugliness than having additional state
(like s->cache_clean_running) to allow BDRV_POLL_WHILE() to wait for the
cache clean coroutine to stop.

I think I’ll keep it as-is.  We’ll probably actually want to wait in detach_aio_context, too.

Hanna


Reply via email to