On Thu, Mar 31, 2022 at 07:14:12PM +0300, Andrey Ryabinin wrote: > > > On 3/30/22 22:25, Peter Xu wrote: > > On Fri, Mar 25, 2022 at 06:40:13PM +0300, Andrey Ryabinin wrote: > >> The sequence of ram_block_add()/qemu_ram_free()/ram_block_add() > >> function calls leads to leaking some memory. > >> > >> ram_block_add() calls dirty_memory_extend() to allocate bitmap blocks > >> for new memory. These blocks only grow but never shrink. So the > >> qemu_ram_free() restores RAM size back to it's original stat but > >> doesn't touch dirty memory bitmaps. > >> > >> After qemu_ram_free() there is no way of knowing that we have > >> allocated dirty memory bitmaps beyond current RAM size. > >> So the next ram_block_add() will call dirty_memory_extend() again to > >> to allocate new bitmaps and rewrite pointers to bitmaps left after > >> the first ram_block_add()/dirty_memory_extend() calls. > >> > >> Rework dirty_memory_extend() to be able to shrink dirty maps, > >> also rename it to dirty_memory_resize(). And fix the leak by > >> shrinking dirty memory maps on qemu_ram_free() if needed. > >> > >> Fixes: 5b82b703b69a ("memory: RCU ram_list.dirty_memory[] for safe RAM > >> hotplug") > >> Cc: qemu-sta...@nongnu.org > >> Signed-off-by: Andrey Ryabinin <a...@yandex-team.com> > >> --- > >> include/exec/ramlist.h | 2 ++ > >> softmmu/physmem.c | 38 ++++++++++++++++++++++++++++++++------ > >> 2 files changed, 34 insertions(+), 6 deletions(-) > >> > >> diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h > >> index 2ad2a81acc..019e238e7c 100644 > >> --- a/include/exec/ramlist.h > >> +++ b/include/exec/ramlist.h > >> @@ -41,6 +41,8 @@ typedef struct RAMBlockNotifier RAMBlockNotifier; > >> #define DIRTY_MEMORY_BLOCK_SIZE ((ram_addr_t)256 * 1024 * 8) > >> typedef struct { > >> struct rcu_head rcu; > >> + unsigned int nr_blocks; > >> + unsigned int nr_blocks_inuse; > >> unsigned long *blocks[]; > >> } DirtyMemoryBlocks; > > > > The problem looks valid, but can we avoid introducing these variables at > > all? > > > > IIUC what we miss here is the proper releasing of dirty blocks when ram is > > released. IMHO as long as we properly release the extra dirty memory > > blocks in qemu_ram_free(), then last_ram_page() will reflect the existing > > dirty memory block size. That looks simpler if I'm correct.. > > > > ram_list.dirty_memory is RCU protected which means we can't free it > immediately. > Freeing must be delayed until the end of RCU grace period. > One way to do it is using rcu callback, like in this patch. That why we need > these > variables - to pass the information into rcu callback. > > Another way is using synchronize_rcu() before freeing the memory. In that case > the variables won't be needed. But it's expensive. > Also I'm not sure if using synchronize_rcu() under a mutex lock is a good > idea. > Perhaps it will needed somehow to rework dirty_memory_resize() to move > synchronize_rcu() and freeing > outside of mutex_lock()/unlock() section.
I suggested that just because from the 1st glance I failed to read clearly on the current patch due to some trivial details (I commented inline), which made me thought that merging extend/shrink is not necessary. But I agree it looks mostly correct, and call_rcu() should indeed be preferred. Thanks, -- Peter Xu