The clear dirty bitmap operation needs to walk memory_listeners, but the
calling context may not hold the BQL.

These callers hold BQL for it:

  cpu_physical_memory_sync_dirty_bitmap
  dirtyrate_manual_reset_protect

These callers hold RCU for it:

  migration_clear_memory_region_dirty_bitmap [1]
  cpu_physical_memory_test_and_clear_dirty

The above case [1] is extremely unobvious and probably still buggy, because:

  - Either the RCU read lock was taken very high up the stack in
    ram_save_iterate() or ram_save_queue_pages(), where it was probably
    taken for the sake of ramblock references (which are also protected by
    RCU), or,

  - I _think_ there's a path that misses taking any lock (e.g. the other
    path, migration_clear_memory_region_dirty_bitmap_range, which is used
    by virtio-mem or virtio-balloon and may or may not really take RCU at
    all, nor the BQL).

Add the RCU read lock in memory_region_clear_dirty_bitmap() to make sure
it's not missed.  RCU is also needed for address_space_get_flatview(), so
this will generally make the RCU section larger, covering the whole walk
in the cases where the lock was wanted but not already taken.

This should be the only place where we reference memory_listeners (or
as->listeners) without being guaranteed to hold either the BQL or RCU.

Signed-off-by: Peter Xu <pet...@redhat.com>
---
 softmmu/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index c48e9cc6ed..95cdcaeccf 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2270,7 +2270,8 @@ void memory_region_clear_dirty_bitmap(MemoryRegion *mr, hwaddr start,
     FlatRange *fr;
     hwaddr sec_start, sec_end, sec_size;
 
-    QTAILQ_FOREACH(listener, &memory_listeners, link) {
+    RCU_READ_LOCK_GUARD();
+    QTAILQ_FOREACH_RCU(listener, &memory_listeners, link) {
         if (!listener->log_clear) {
             continue;
         }
-- 
2.39.1
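
For readers unfamiliar with the pattern, here is a minimal standalone sketch of
why a scope-bound read-side guard works whether or not the caller already holds
the lock. It is not QEMU code: every mock_/Mock name below is hypothetical, a
plain nesting counter stands in for the RCU read lock, and the guard relies on
the GCC/Clang cleanup attribute. It only illustrates the idea the patch depends
on, namely that read-side sections nest and that the guard is dropped on every
exit path of the function.

```c
#include <assert.h>
#include <stddef.h>

/* Mock read-side lock: a nesting counter standing in for the RCU read lock. */
static int mock_read_depth;

static void mock_read_lock(void)
{
    mock_read_depth++;
}

static void mock_read_unlock(void)
{
    assert(mock_read_depth > 0);
    mock_read_depth--;
}

/* Cleanup handler: called when the guard variable goes out of scope. */
static void mock_guard_release(int *unused)
{
    (void)unused;
    mock_read_unlock();
}

/* Hypothetical guard macro: takes the lock now, drops it at scope exit. */
#define MOCK_READ_LOCK_GUARD()                                        \
    int mock_guard_                                                   \
        __attribute__((cleanup(mock_guard_release), unused)) =        \
        (mock_read_lock(), 0)

typedef struct MockListener {
    int has_log_clear;          /* stands in for listener->log_clear != NULL */
    int clears;                 /* number of times the listener was invoked */
    struct MockListener *next;
} MockListener;

/*
 * Walk the listener list under the guard.  Returns the read-side nesting
 * depth observed inside the guarded section.
 */
static int mock_clear_dirty(MockListener *head)
{
    MockListener *l;

    MOCK_READ_LOCK_GUARD();
    int depth_inside = mock_read_depth;

    for (l = head; l != NULL; l = l->next) {
        if (!l->has_log_clear) {
            continue;           /* skipped listeners do not affect the guard */
        }
        l->clears++;
    }
    return depth_inside;        /* the guard is released after this point */
}
```

Calling mock_clear_dirty() with no lock held runs the walk at depth 1; calling
it from a context that already took mock_read_lock() runs it at depth 2. In
both cases the depth on return equals the depth on entry, which is the property
that makes it safe to add the guard unconditionally inside the callee.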