On Fri, Apr 18, 2025 at 12:56 PM Lorenzo Stoakes
<lorenzo.stoa...@oracle.com> wrote:
>
> On Fri, Apr 18, 2025 at 12:31:29PM -0700, Suren Baghdasaryan wrote:
> > On Fri, Apr 18, 2025 at 11:30 AM Lorenzo Stoakes
> > <lorenzo.stoa...@oracle.com> wrote:
> > >
> > > On Fri, Apr 18, 2025 at 10:49:54AM -0700, Suren Baghdasaryan wrote:
> > > > Test that /proc/pid/maps does not report unexpected holes in the address
> > > > space when we concurrently remap a part of a vma into the middle of
> > > > another vma. This remapping results in the destination vma being split
> > > > into three parts and the part in the middle being patched back, all done
> > > > concurrently from under the reader. We should always see either the
> > > > original vma or the split one with no holes.
> > > >
> > > > Signed-off-by: Suren Baghdasaryan <sur...@google.com>
> > >
> > > Umm, but we haven't fixed this in the mremap code right? :) So isn't this
> > > test failing right now? :P
> > >
> > > My understanding from the meeting was you'd drop this commit/mark it
> > > skipped for now or something like this?
> >
> > After you pointed out the issue in mremap_to() I spent some time
> > investigating that code. IIUC the fact that mremap_to() does
> > do_munmap() and move_vma() as two separate operations should not
> > affect lockless reading because both those operations are done under
> > mmap_write_lock() without dropping it in between. Since my lockless
> > mechanism uses the mmap_lock_speculate_xxx API to detect address space
> > modifications and retry if a concurrent update happens, it should be
> > able to handle this case. The vma it reports should be either the
> > version before mmap_write_lock was taken (before the mremap()) or after
> > it got dropped (after mremap() is complete). Did I miss something more
> > subtle?
>
> Speaking to Liam, seems perhaps the implementation has changed since we first
> started looking at this?
>
> I guess it's this speculation stuff that keeps you safe then, yeah we hold the
> write lock over both.
>
> So actually ideal if we can avoid having to fix up the mremap implementation!
>
> If you're sure that the speculation combined with the mmap write lock keeps us
> safe then we should be ok I'd say.
Yeah, I tested that quite rigorously and confirmed (using the
mm->mm_lock_seq) that mmap_write_lock is not dropped somewhere in the
middle of those operations. I think we should be safe.

> >
> > > >
> > > > ---
> > > >  tools/testing/selftests/proc/proc-pid-vm.c | 92 ++++++++++++++++++++++
> > > >  1 file changed, 92 insertions(+)
> > > >
> > > > diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/selftests/proc/proc-pid-vm.c
> > > > index 39842e4ec45f..1aef2db7e893 100644
> > > > --- a/tools/testing/selftests/proc/proc-pid-vm.c
> > > > +++ b/tools/testing/selftests/proc/proc-pid-vm.c
> > > > @@ -663,6 +663,95 @@ static void test_maps_tearing_from_resize(int maps_fd,
> > > >  	signal_state(mod_info, TEST_DONE);
> > > >  }
> > > >
> > > > +static inline void remap_vma(const struct vma_modifier_info *mod_info)
> > > > +{
> > > > +	/*
> > > > +	 * Remap the last page of the next vma into the middle of the vma.
> > > > +	 * This splits the current vma and the first and middle parts (the
> > > > +	 * parts at lower addresses) become the last vma observed in the
> > > > +	 * first page and the first vma observed in the last page.
> > > > +	 */
> > > > +	assert(mremap(mod_info->next_addr + page_size * 2, page_size,
> > > > +		      page_size, MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
> > > > +		      mod_info->addr + page_size) != MAP_FAILED);
> > > > +}
> > > > +
> > > > +static inline void patch_vma(const struct vma_modifier_info *mod_info)
> > > > +{
> > > > +	assert(!mprotect(mod_info->addr + page_size, page_size,
> > > > +			 mod_info->prot));
> > > > +}
> > > > +
> > > > +static inline void check_remap_result(struct line_content *mod_last_line,
> > > > +				      struct line_content *mod_first_line,
> > > > +				      struct line_content *restored_last_line,
> > > > +				      struct line_content *restored_first_line)
> > > > +{
> > > > +	/* Make sure vmas at the boundaries are changing */
> > > > +	assert(strcmp(mod_last_line->text, restored_last_line->text) != 0);
> > > > +	assert(strcmp(mod_first_line->text, restored_first_line->text) != 0);
> > > > +}
> > > > +
> > > > +static void test_maps_tearing_from_remap(int maps_fd,
> > > > +		struct vma_modifier_info *mod_info,
> > > > +		struct page_content *page1,
> > > > +		struct page_content *page2,
> > > > +		struct line_content *last_line,
> > > > +		struct line_content *first_line)
> > > > +{
> > > > +	struct line_content remapped_last_line;
> > > > +	struct line_content remapped_first_line;
> > > > +	struct line_content restored_last_line;
> > > > +	struct line_content restored_first_line;
> > > > +
> > > > +	wait_for_state(mod_info, SETUP_READY);
> > > > +
> > > > +	/* re-read the file to avoid using stale data from previous test */
> > > > +	read_boundary_lines(maps_fd, page1, page2, last_line, first_line);
> > > > +
> > > > +	mod_info->vma_modify = remap_vma;
> > > > +	mod_info->vma_restore = patch_vma;
> > > > +	mod_info->vma_mod_check = check_remap_result;
> > > > +
> > > > +	capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_line,
> > > > +			    &remapped_last_line, &remapped_first_line,
> > > > +			    &restored_last_line, &restored_first_line);
> > > > +
> > > > +	/* Now start concurrent modifications for test_duration_sec */
> > > > +	signal_state(mod_info, TEST_READY);
> > > > +
> > > > +	struct line_content new_last_line;
> > > > +	struct line_content new_first_line;
> > > > +	struct timespec start_ts, end_ts;
> > > > +
> > > > +	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
> > > > +	do {
> > > > +		read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_line);
> > > > +
> > > > +		/* Check if we read the vmas after remapping */
> > > > +		if (!strcmp(new_last_line.text, remapped_last_line.text)) {
> > > > +			/*
> > > > +			 * The vmas should be consistent with the remap results;
> > > > +			 * however, if the vma was concurrently restored, it
> > > > +			 * can be reported twice (first as the split one, then
> > > > +			 * as the restored one) because we found it as the next
> > > > +			 * vma again. In that case the new first line will be
> > > > +			 * the same as the last restored line.
> > > > +			 */
> > > > +			assert(!strcmp(new_first_line.text, remapped_first_line.text) ||
> > > > +			       !strcmp(new_first_line.text, restored_last_line.text));
> > > > +		} else {
> > > > +			/* The vmas should be consistent with the original/restored state */
> > > > +			assert(!strcmp(new_last_line.text, restored_last_line.text) &&
> > > > +			       !strcmp(new_first_line.text, restored_first_line.text));
> > > > +		}
> > > > +		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
> > > > +	} while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec);
> > > > +
> > > > +	/* Signal the modifier thread to stop and wait until it exits */
> > > > +	signal_state(mod_info, TEST_DONE);
> > > > +}
> > > > +
> > > >  static int test_maps_tearing(void)
> > > >  {
> > > >  	struct vma_modifier_info *mod_info;
> > > > @@ -757,6 +846,9 @@ static int test_maps_tearing(void)
> > > >  	test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2,
> > > >  				      &last_line, &first_line);
> > > >
> > > > +	test_maps_tearing_from_remap(maps_fd, mod_info, &page1, &page2,
> > > > +				     &last_line, &first_line);
> > > > +
> > > >  	stop_vma_modifier(mod_info);
> > > >
> > > >  	free(page2.data);
> > > > --
> > > > 2.49.0.805.g082f7c87e0-goog
> > > >