On 3/30/22 22:25, Peter Xu wrote:
> On Fri, Mar 25, 2022 at 06:40:13PM +0300, Andrey Ryabinin wrote:
>> The sequence of ram_block_add()/qemu_ram_free()/ram_block_add()
>> function calls leads to leaking some memory.
>>
>> ram_block_add() calls dirty_memory_extend() to allocate bitmap blocks
>> for new memory. These blocks only grow but never shrink. So the
>> qemu_ram_free() restores RAM size back to it's original stat but
>> doesn't touch dirty memory bitmaps.
>>
>> After qemu_ram_free() there is no way of knowing that we have
>> allocated dirty memory bitmaps beyond current RAM size.
>> So the next ram_block_add() will call dirty_memory_extend() again to
>> to allocate new bitmaps and rewrite pointers to bitmaps left after
>> the first ram_block_add()/dirty_memory_extend() calls.
>>
>> Rework dirty_memory_extend() to be able to shrink dirty maps,
>> also rename it to dirty_memory_resize(). And fix the leak by
>> shrinking dirty memory maps on qemu_ram_free() if needed.
>>
>> Fixes: 5b82b703b69a ("memory: RCU ram_list.dirty_memory[] for safe RAM 
>> hotplug")
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Andrey Ryabinin <a...@yandex-team.com>
>> ---
>>  include/exec/ramlist.h |  2 ++
>>  softmmu/physmem.c      | 38 ++++++++++++++++++++++++++++++++------
>>  2 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
>> index 2ad2a81acc..019e238e7c 100644
>> --- a/include/exec/ramlist.h
>> +++ b/include/exec/ramlist.h
>> @@ -41,6 +41,8 @@ typedef struct RAMBlockNotifier RAMBlockNotifier;
>>  #define DIRTY_MEMORY_BLOCK_SIZE ((ram_addr_t)256 * 1024 * 8)
>>  typedef struct {
>>      struct rcu_head rcu;
>> +    unsigned int nr_blocks;
>> +    unsigned int nr_blocks_inuse;
>>      unsigned long *blocks[];
>>  } DirtyMemoryBlocks;
> 
> The problem looks valid, but can we avoid introducing these variables at
> all?
> 
> IIUC what we miss here is the proper releasing of dirty blocks when ram is
> released.  IMHO as long as we properly release the extra dirty memory
> blocks in qemu_ram_free(), then last_ram_page() will reflect the existing
> dirty memory block size.  That looks simpler if I'm correct..
> 

ram_list.dirty_memory is RCU protected which means we can't free it immediately.
Freeing must be delayed until the end of RCU grace period.
One way to do it is using rcu callback, like in this patch. That why we need 
these
variables - to pass the information into rcu callback.

Another way is using synchronize_rcu() before freeing the memory. In that case
the variables won't be needed. But it's expensive.
Also I'm not sure if using synchronize_rcu() under a mutex lock is a good idea.
Perhaps it will needed somehow to rework dirty_memory_resize() to move 
synchronize_rcu() and freeing
outside of mutex_lock()/unlock() section.

> Side question: both of the patches are not regression introduced in this
> release, right?  So they are targeting for the next release?
> 

Right, these bugs are old. I'm not familiar with qemu release 
schedule/procedure,
so whatever release is more appropriate.

Reply via email to