On Fri, Jun 13, 2025 at 8:01 AM Lorenzo Stoakes
<lorenzo.stoa...@oracle.com> wrote:
>
> Hi Suren,
>
> I promised I'd share VMA merging scenarios so we can be absolutely sure we
> have all cases covered, I share that below. I also included information on
> split.

Thanks Lorenzo! This is great and very helpful.

>
> Hopefully this is useful! And maybe we can somehow put in a comment or commit
> msg or something somewhere? Not sure if a bit much for that though :)

I'll see if I can add a short version into my next cover letter.

>
> Note that in all of the below we hold exclusive mmap, vma + rmap write locks.
>
> ## Merge with change to EXISTING VMA
>
> ### Merge both
>
>                       start    end
>                          |<---->|
>                  |-------********-------|
>                    prev   middle   next
>                   extend  delete  delete
>
> 1. Set prev VMA range [prev->vm_start, next->vm_end)
> 2. Overwrite prev, middle, next nodes in maple tree with prev
> 3. Detach middle VMA
> 4. Free middle VMA
> 5. Detach next VMA
> 6. Free next VMA

This case should be fine with per-vma locks while reading
/proc/pid/maps. In the worst case we will report some of the original
vmas before the merge and then the final merged vma, so prev might be
seen twice but no gaps should be observed.
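
To make the "prev seen twice, no gap" claim concrete, here is a toy
userspace sketch, not kernel code. It assumes a reader that remembers the
end of the last range it reported and looks up the next range from that
address. The names toy_vma, toy_find, toy_sees_gap and toy_reports_twice are
made up for illustration, and the linear scan only stands in for the maple
tree lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical lookup: return the first range whose end lies above addr,
 * scanning an array sorted by address. Returns NULL if there is none.
 */
const struct toy_vma *toy_find(const struct toy_vma *vmas, size_t n,
			       unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (vmas[i].end > addr)
			return &vmas[i];
	return NULL;
}

/* The reader sees a gap if the next range starts above what it last reported. */
int toy_sees_gap(unsigned long last_end, const struct toy_vma *next)
{
	assert(next != NULL);
	return next->start > last_end;
}

/* The reader repeats addresses if the next range starts below last_end. */
int toy_reports_twice(unsigned long last_end, const struct toy_vma *next)
{
	assert(next != NULL);
	return next->start < last_end;
}
```

If the reader reports the old prev [0x1000, 0x2000) and the merge then
completes, the lookup at 0x2000 lands in the merged [0x1000, 0x4000): the
prev addresses show up twice, but toy_sees_gap() stays false.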

>
> ### Merge left full
>
>                        start        end
>                          |<--------->|
>                  |-------*************
>                    prev     middle
>                   extend    delete
>
> 1. Set prev VMA range [prev->vm_start, end)
> 2. Overwrite prev, middle nodes in maple tree with prev
> 3. Detach middle VMA
> 4. Free middle VMA

Same as the previous case. Worst case we report prev twice - once
before the merge, once after the merge.

>
> ### Merge left partial
>
>                        start   end
>                          |<---->|
>                  |-------*************
>                    prev     middle
>                   extend  partial overwrite
>
> 1. Set prev VMA range [prev->vm_start, end)
> 2. Set middle range [end, middle->vm_end)
> 3. Overwrite prev, middle (partial) nodes in maple tree with prev

We might report prev twice here and this might cause us to retry if we
see a temporary gap between old prev and new middle vma. But retry
should handle this case, so I think we are good here.
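
A small sketch of the range updates in steps 1 and 2, plus the kind of check
that would notice the temporary gap. This is a simplified userspace model
under my own assumptions: toy_vma, toy_merge_left_partial and toy_needs_retry
are hypothetical names, and the real reader's retry condition is not spelled
out in this thread.

```c
#include <assert.h>

/* Toy stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/*
 * Steps 1 and 2 of "merge left partial": prev grows to [prev->start, end)
 * and middle shrinks to [end, middle->end).
 */
void toy_merge_left_partial(struct toy_vma *prev, struct toy_vma *middle,
			    unsigned long end)
{
	assert(prev->end == middle->start);		/* must be adjacent */
	assert(end > middle->start && end < middle->end); /* partial only */
	prev->end = end;
	middle->start = end;
}

/*
 * Simplified reader check: the range following a prev snapshot should start
 * exactly where that snapshot ended. A mismatch means the view is stale.
 */
int toy_needs_retry(unsigned long snapshot_prev_end,
		    const struct toy_vma *next)
{
	return next->start != snapshot_prev_end;
}
```

A reader holding the old prev end (0x2000) that then sees the shrunken
middle starting at 0x3000 gets a mismatch and retries. With the updated prev
end the two ranges are contiguous again.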

>
> ### Merge right full
>
>                start        end
>                  |<--------->|
>                  *************-------|
>                     middle     next
>                     delete    extend
>
> 1. Set next range [start, next->vm_end)
> 2. Overwrite middle, next nodes in maple tree with next
> 3. Detach middle VMA
> 4. Free middle VMA

Worst case we report middle twice.

>
> ### Merge right partial
>
>                    start    end
>                      |<----->|
>                  *************-------|
>                     middle     next
>                     shrink    extend
>
> 1. Set middle range [middle->vm_start, start)
> 2. Set next range [start, next->vm_end)
> 3. Overwrite middle (partial), next nodes in maple tree with next

Worst case we retry and report middle twice.

>
> ## Merge due to introduction of proposed NEW VMA
>
> These cases are easier as there's no existing VMA to either remove or
> partially adjust.
>
> ### Merge both
>
>                        start     end
>                          |<------>|
>                  |-------..........-------|
>                    prev  (proposed)  next
>                   extend            delete
>
> 1. Set prev VMA range [prev->vm_start, next->vm_end)
> 2. Overwrite prev, next nodes in maple tree with prev
> 3. Detach next VMA
> 4. Delete next VMA

Worst case we report prev twice after retry.

>
> ### Merge left
>
>                        start     end
>                          |<------>|
>                  |-------..........
>                    prev  (proposed)
>                   extend
>
> 1. Set prev VMA range [prev->vm_start, end)
> 2. Overwrite prev node in maple tree with newly extended prev

Worst case we report prev twice.

>
> (This is what's used for brk() and bprm_mm_init() stack relocation in
> relocate_vma_down() too)
>
> ### Merge right
>
>                        start     end
>                          |<------>|
>                          ..........-------|
>                          (proposed)  next
>                                     extend
>
> 1. Set next VMA range [start, next->vm_end)
> 2. Overwrite next node in maple tree with newly extended next

This will show either a legit gap + original next or the extended next
with no gap. Both ways we are fine.
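
The two possible views can be sketched with a toy model (userspace only, not
kernel code). toy_vma, toy_merge_right and toy_gap_before are hypothetical
names, and last_end is assumed to be the address the reader has reported up
to.

```c
#include <assert.h>

/* Toy stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/* Step 1 of "merge right": next is extended down to start; its end stays. */
void toy_merge_right(struct toy_vma *next, unsigned long start)
{
	assert(start < next->start);
	next->start = start;
}

/* Size of the hole between what the reader last reported and next. */
unsigned long toy_gap_before(unsigned long last_end,
			     const struct toy_vma *next)
{
	return next->start > last_end ? next->start - last_end : 0;
}
```

Before the merge the reader sees a real 0x1000-byte hole followed by the
original next. After it, the hole is gone and next covers it. Either
snapshot is a state the address space was actually in.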

>
> ## Split VMA
>
> If new below:
>
>                     addr
>                 |-----.-----|
>                 | new .     |
>                 |-----.-----|
>                      vma
> Otherwise:
>
>                     addr
>                 |-----.-----|
>                 |     . new |
>                 |-----.-----|
>                      vma
>
> 1. Duplicate vma
> 2. If new below, set new range to [vma->vm_start, addr)
> 3. Otherwise, set new range to [addr, vma->vm_end)
> 4. If new below, set vma range to [addr, vma->vm_end)
> 5. Otherwise, set vma range to [vma->vm_start, addr)
> 6. Partially overwrite vma node in maple tree with new

These are fine too. We will either report before-split view or after-split view.
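
For reference, steps 1-5 of the split reduce to the range arithmetic below.
This is a toy userspace sketch with hypothetical names (toy_vma, toy_split),
and it leaves out the maple tree update in step 6 entirely.

```c
#include <assert.h>

/* Toy stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/*
 * Split vma at addr. If new_below is nonzero, new_vma takes the lower half
 * and vma keeps the upper half; otherwise it is the other way round.
 */
void toy_split(struct toy_vma *vma, struct toy_vma *new_vma,
	       unsigned long addr, int new_below)
{
	assert(addr > vma->start && addr < vma->end);

	*new_vma = *vma;		/* step 1: duplicate vma */
	if (new_below) {
		new_vma->end = addr;	/* step 2: new = [vma->start, addr) */
		vma->start = addr;	/* step 4: vma = [addr, vma->end) */
	} else {
		new_vma->start = addr;	/* step 3: new = [addr, vma->end) */
		vma->end = addr;	/* step 5: vma = [vma->start, addr) */
	}
}
```

In both variants the two resulting ranges meet exactly at addr, so their
union is the original range and no hole is introduced.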

Thanks,
Suren.

>
> Cheers, Lorenzo
