Re: [Nouveau] [PATCH v9 03/10] mm/rmap: Split try_to_munlock from try_to_unmap
On Saturday, 5 June 2021 10:41:03 AM AEST Shakeel Butt wrote:
> External email: Use caution opening links or attachments
>
> On Fri, Jun 4, 2021 at 1:49 PM Liam Howlett wrote:
> >
> > * Shakeel Butt [210525 19:45]:
> > > On Tue, May 25, 2021 at 11:40 AM Liam Howlett wrote:
> > > >
> > > [...]
> > > >
> > > > > +/*
> > > > > + * Walks the vma's mapping a page and mlocks the page if any locked vma's are
> > > > > + * found. Once one is found the page is locked and the scan can be terminated.
> > > > > + */
> > > >
> > > > Can you please add that this requires the mmap_sem() lock to the
> > > > comments?
> > >
> > > Why does this require mmap_sem() lock? Also mmap_sem() lock of which mm_struct?
> >
> > Doesn't the mlock_vma_page() require the mmap_sem() for reading? The
> > mm_struct in vma->vm_mm;
>
> We are traversing all the vmas where this page is mapped of possibly
> different mm_structs. I don't think we want to take mmap_sem() of all
> those mm_structs. The commit b87537d9e2fe ("mm: rmap use pte lock not
> mmap_sem to set PageMlocked") removed exactly that.
>
> > From what I can see, at least the following paths have mmap_lock held
> > for writing:
> >
> > munlock_vma_pages_range() from __do_munmap()
> > munlock_vma_pages_range() from remap_file_pages()
> >
> > The following path does not hold mmap_sem:
> > exit_mmap() -> munlock_vma_pages_all() -> munlock_vma_pages_range().
>
> I would really suggest all to carefully read the commit message of
> b87537d9e2fe ("mm: rmap use pte lock not mmap_sem to set
> PageMlocked").
>
> Particularly the following paragraph:
> ...
> Vlastimil Babka points out another race which this patch protects against.
> try_to_unmap_one() might reach its mlock_vma_page() TestSetPageMlocked a
> moment after munlock_vma_pages_all() did its Phase 1 TestClearPageMlocked:
> leaving PageMlocked and unevictable when it should be evictable. mmap_sem
> is ineffective because exit_mmap() does not hold it; page lock ineffective
> because __munlock_pagevec() only takes it afterwards, in Phase 2; pte lock
> is effective because __munlock_pagevec_fill() takes it to get the page,
> after VM_LOCKED was cleared from vm_flags, so visible to try_to_unmap_one.
> ...
>
> Alistair, please bring back the VM_LOCKED check with pte lock held and
> the comment "Holding pte lock, we do *not* need mmap_lock here".

Actually thanks for highlighting that paragraph. I have gone back through
the code again in munlock_vma_pages_range() and think I have a better
understanding of it now.

So now I agree - the check of VM_LOCKED under the PTL is important to
ensure mlock_vma_page() does not run after VM_LOCKED has been cleared and
__munlock_pagevec_fill() has run.

Will post v10 to fix this and the try_to_munlock reference pointed out by
Liam which I missed for v9. Thanks Shakeel for taking the time to point
this out.

> One positive outcome of this cleanup patch is the removal of
> unnecessary invalidation (unmapping for kvm case) of secondary mmus.

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH v9 03/10] mm/rmap: Split try_to_munlock from try_to_unmap
* Shakeel Butt [210604 20:41]:
> On Fri, Jun 4, 2021 at 1:49 PM Liam Howlett wrote:
> >
> > * Shakeel Butt [210525 19:45]:
> > > On Tue, May 25, 2021 at 11:40 AM Liam Howlett wrote:
> > > >
> > > [...]
> > > >
> > > > > +/*
> > > > > + * Walks the vma's mapping a page and mlocks the page if any locked vma's are
> > > > > + * found. Once one is found the page is locked and the scan can be terminated.
> > > > > + */
> > > >
> > > > Can you please add that this requires the mmap_sem() lock to the
> > > > comments?
> > >
> > > Why does this require mmap_sem() lock? Also mmap_sem() lock of which
> > > mm_struct?
> >
> > Doesn't the mlock_vma_page() require the mmap_sem() for reading? The
> > mm_struct in vma->vm_mm;
>
> We are traversing all the vmas where this page is mapped of possibly
> different mm_structs. I don't think we want to take mmap_sem() of all
> those mm_structs. The commit b87537d9e2fe ("mm: rmap use pte lock not
> mmap_sem to set PageMlocked") removed exactly that.
>
> > From what I can see, at least the following paths have mmap_lock held
> > for writing:
> >
> > munlock_vma_pages_range() from __do_munmap()
> > munlock_vma_pages_range() from remap_file_pages()
> >
> > The following path does not hold mmap_sem:
> > exit_mmap() -> munlock_vma_pages_all() -> munlock_vma_pages_range().

Isn't this the benign race referenced by Hugh in the commit you point to
below?

> I would really suggest all to carefully read the commit message of
> b87537d9e2fe ("mm: rmap use pte lock not mmap_sem to set
> PageMlocked").
>
> Particularly the following paragraph:
> ...
> Vlastimil Babka points out another race which this patch protects against.
> try_to_unmap_one() might reach its mlock_vma_page() TestSetPageMlocked a
> moment after munlock_vma_pages_all() did its Phase 1 TestClearPageMlocked:
> leaving PageMlocked and unevictable when it should be evictable. mmap_sem
> is ineffective because exit_mmap() does not hold it; page lock ineffective
> because __munlock_pagevec() only takes it afterwards, in Phase 2; pte lock
> is effective because __munlock_pagevec_fill() takes it to get the page,
> after VM_LOCKED was cleared from vm_flags, so visible to try_to_unmap_one.
> ...

So this is saying the race with exit_mmap() isn't benign after all?

> Alistair, please bring back the VM_LOCKED check with pte lock held and
> the comment "Holding pte lock, we do *not* need mmap_lock here".
>
> One positive outcome of this cleanup patch is the removal of
> unnecessary invalidation (unmapping for kvm case) of secondary mmus.
Re: [Nouveau] [PATCH v9 03/10] mm/rmap: Split try_to_munlock from try_to_unmap
* Shakeel Butt [210525 19:45]:
> On Tue, May 25, 2021 at 11:40 AM Liam Howlett wrote:
> >
> [...]
> >
> > > +/*
> > > + * Walks the vma's mapping a page and mlocks the page if any locked vma's are
> > > + * found. Once one is found the page is locked and the scan can be terminated.
> > > + */
> >
> > Can you please add that this requires the mmap_sem() lock to the
> > comments?
>
> Why does this require mmap_sem() lock? Also mmap_sem() lock of which
> mm_struct?

Doesn't the mlock_vma_page() require the mmap_sem() for reading? The
mm_struct in vma->vm_mm;

From what I can see, at least the following paths have mmap_lock held
for writing:

munlock_vma_pages_range() from __do_munmap()
munlock_vma_pages_range() from remap_file_pages()

> > > +static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
> > > +			unsigned long address, void *unused)
> > > +{
> > > +	struct page_vma_mapped_walk pvmw = {
> > > +		.page = page,
> > > +		.vma = vma,
> > > +		.address = address,
> > > +	};
> > > +
> > > +	/* An un-locked vma doesn't have any pages to lock, continue the scan */
> > > +	if (!(vma->vm_flags & VM_LOCKED))
> > > +		return true;
> > > +
> > > +	while (page_vma_mapped_walk(&pvmw)) {
> > > +		/* PTE-mapped THP are never mlocked */
> > > +		if (!PageTransCompound(page))
> > > +			mlock_vma_page(page);
> > > +		page_vma_mapped_walk_done(&pvmw);
> > > +
> > > +		/*
> > > +		 * no need to continue scanning other vma's if the page has
> > > +		 * been locked.
> > > +		 */
> > > +		return false;
> > > +	}
> > > +
> > > +	return true;
> > > +}

The comments in munlock_vma_pages_range() still reference try_to_{munlock|unmap}
Re: [Nouveau] [PATCH v9 03/10] mm/rmap: Split try_to_munlock from try_to_unmap
On Tue, May 25, 2021 at 11:40 AM Liam Howlett wrote:
>
[...]
>
> > +/*
> > + * Walks the vma's mapping a page and mlocks the page if any locked vma's are
> > + * found. Once one is found the page is locked and the scan can be terminated.
> > + */
>
> Can you please add that this requires the mmap_sem() lock to the
> comments?

Why does this require mmap_sem() lock? Also mmap_sem() lock of which
mm_struct?

> > +static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
> > +			unsigned long address, void *unused)
> > +{
> > +	struct page_vma_mapped_walk pvmw = {
> > +		.page = page,
> > +		.vma = vma,
> > +		.address = address,
> > +	};
> > +
> > +	/* An un-locked vma doesn't have any pages to lock, continue the scan */
> > +	if (!(vma->vm_flags & VM_LOCKED))
> > +		return true;
> > +
> > +	while (page_vma_mapped_walk(&pvmw)) {
> > +		/* PTE-mapped THP are never mlocked */
> > +		if (!PageTransCompound(page))
> > +			mlock_vma_page(page);
> > +		page_vma_mapped_walk_done(&pvmw);
> > +
> > +		/*
> > +		 * no need to continue scanning other vma's if the page has
> > +		 * been locked.
> > +		 */
> > +		return false;
> > +	}
> > +
> > +	return true;
> > +}
Re: [Nouveau] [PATCH v9 03/10] mm/rmap: Split try_to_munlock from try_to_unmap
* Alistair Popple [210524 09:29]:
> The behaviour of try_to_unmap_one() is difficult to follow because it
> performs different operations based on a fairly large set of flags used
> in different combinations.
>
> TTU_MUNLOCK is one such flag. However it is exclusively used by
> try_to_munlock() which specifies no other flags. Therefore rather than
> overload try_to_unmap_one() with unrelated behaviour split this out into
> its own function and remove the flag.
>
> Signed-off-by: Alistair Popple
> Reviewed-by: Ralph Campbell
> Reviewed-by: Christoph Hellwig
>
> ---
>
> v9:
>  * Improved comments
>
> v8:
>  * Renamed try_to_munlock to page_mlock to better reflect what the
>    function actually does.
>  * Removed the TODO from the documentation that this patch addresses.
>
> v7:
>  * Added Christoph's Reviewed-by
>
> v4:
>  * Removed redundant check for VM_LOCKED
> ---
>  Documentation/vm/unevictable-lru.rst | 33 ++-
>  include/linux/rmap.h                 |  3 +-
>  mm/mlock.c                           | 10 ++---
>  mm/rmap.c                            | 61
>  4 files changed, 63 insertions(+), 44 deletions(-)
>
> diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst
> index 0e1490524f53..eae3af17f2d9 100644
> --- a/Documentation/vm/unevictable-lru.rst
> +++ b/Documentation/vm/unevictable-lru.rst
> @@ -389,14 +389,14 @@ mlocked, munlock_vma_page() updates that zone statistics for the number of
> mlocked pages. Note, however, that at this point we haven't checked whether
> the page is mapped by other VM_LOCKED VMAs.
>
> -We can't call try_to_munlock(), the function that walks the reverse map to
> +We can't call page_mlock(), the function that walks the reverse map to
> check for other VM_LOCKED VMAs, without first isolating the page from the LRU.
> -try_to_munlock() is a variant of try_to_unmap() and thus requires that the page
> +page_mlock() is a variant of try_to_unmap() and thus requires that the page
> not be on an LRU list [more on these below]. However, the call to
> -isolate_lru_page() could fail, in which case we couldn't try_to_munlock(). So,
> +isolate_lru_page() could fail, in which case we can't call page_mlock(). So,
> we go ahead and clear PG_mlocked up front, as this might be the only chance we
> -have. If we can successfully isolate the page, we go ahead and
> -try_to_munlock(), which will restore the PG_mlocked flag and update the zone
> +have. If we can successfully isolate the page, we go ahead and call
> +page_mlock(), which will restore the PG_mlocked flag and update the zone
> page statistics if it finds another VMA holding the page mlocked. If we fail
> to isolate the page, we'll have left a potentially mlocked page on the LRU.
> This is fine, because we'll catch it later if and if vmscan tries to reclaim
> @@ -545,31 +545,24 @@ munlock or munmap system calls, mm teardown (munlock_vma_pages_all), reclaim,
> holepunching, and truncation of file pages and their anonymous COWed pages.
>
>
> -try_to_munlock() Reverse Map Scan
> +page_mlock() Reverse Map Scan
> -
>
> -.. warning::
> -   [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the
> -   page_referenced() reverse map walker.
> -
> When munlock_vma_page() [see section :ref:`munlock()/munlockall() System Call
> Handling ` above] tries to munlock a
> page, it needs to determine whether or not the page is mapped by any
> VM_LOCKED VMA without actually attempting to unmap all PTEs from the
> page. For this purpose, the unevictable/mlock infrastructure
> -introduced a variant of try_to_unmap() called try_to_munlock().
> +introduced a variant of try_to_unmap() called page_mlock().
>
> -try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
> -mapped file and KSM pages with a flag argument specifying unlock versus unmap
> -processing. Again, these functions walk the respective reverse maps looking
> -for VM_LOCKED VMAs. When such a VMA is found, as in the try_to_unmap() case,
> -the functions mlock the page via mlock_vma_page() and return SWAP_MLOCK. This
> -undoes the pre-clearing of the page's PG_mlocked done by munlock_vma_page.
> +page_mlock() walks the respective reverse maps looking for VM_LOCKED VMAs. When
> +such a VMA is found the page is mlocked via mlock_vma_page(). This undoes the
> +pre-clearing of the page's PG_mlocked done by munlock_vma_page.
>
> -Note that try_to_munlock()'s reverse map walk must visit every VMA in a page's
> +Note that page_mlock()'s reverse map walk must visit every VMA in a page's
> reverse map to determine that a page is NOT mapped into any VM_LOCKED VMA.
> However, the scan can terminate when it encounters a VM_LOCKED VMA.
> -Although try_to_munlock() might be called a great many times when munlocking a
> +Although page_mlock() might be called a great many times when