"Sierra Guiza, Alejandro (Alex)" <alex.sie...@amd.com> writes:
> @apop...@nvidia.com Could you please check this patch? It's somehow related
> to migrate_device_page() for long term device coherent pages.
>
> Regards,
> Alex Sierra
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Alex
>> Sierra
>> Sent: Thursday, May 5, 2022 4:34 PM
>> To: j...@nvidia.com
>> Cc: rcampb...@nvidia.com; wi...@infradead.org; da...@redhat.com;
>> Kuehling, Felix <felix.kuehl...@amd.com>; apop...@nvidia.com; amd-
>> g...@lists.freedesktop.org; linux-...@vger.kernel.org; linux...@kvack.org;
>> jgli...@redhat.com; dri-de...@lists.freedesktop.org; akpm@linux-
>> foundation.org; linux-e...@vger.kernel.org; h...@lst.de
>> Subject: [PATCH v1 04/15] mm: add device coherent checker to remove
>> migration pte
>>
>> During remove_migration_pte(), entries for device coherent type pages that
>> were not created through special migration ptes, ignore _PAGE_RW flag. This
>> path can be found at migrate_device_page(), where valid vma is not
>> required. In this case, migrate_vma_collect_pmd() is not called and special
>> migration ptes are not set.

It's true that we don't call migrate_vma_collect_pmd() for
migrate_device_page(), but this doesn't imply migration entries are not
created. We still call migrate_vma_unmap(), which calls try_to_migrate()
to install migration entries.

When we have a vma, migrate_vma_collect_pmd() is a fast path for the
common case where a page is only mapped once. So migrate_vma_collect_pmd()
should fairly closely match try_to_migrate_one(). I did experiment
locally with removing the fast path to simplify the code, but it does
provide a meaningful performance improvement so I abandoned it.

I think you're running into the problem addressed by
https://lkml.kernel.org/r/20211018045247.3128058-1-apop...@nvidia.com
but for DEVICE_COHERENT pages.

Based on that I think the approach below is wrong. You should update
try_to_migrate_one() to deal with DEVICE_COHERENT pages.
It would make sense to do that as part of patch 1 in this series. The
problem is that try_to_migrate_one() assumes folio_is_zone_device()
implies it is a DEVICE_PRIVATE page due to the check in
try_to_migrate().

>> Signed-off-by: Alex Sierra <alex.sie...@amd.com>
>> ---
>>  mm/migrate.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 6c31ee1e1c9b..e18ddee56f37 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -206,7 +206,8 @@ static bool remove_migration_pte(struct folio *folio,
>>  		 * Recheck VMA as permissions can change since migration started
>>  		 */
>>  		entry = pte_to_swp_entry(*pvmw.pte);
>> -		if (is_writable_migration_entry(entry))
>> +		if (is_writable_migration_entry(entry) ||
>> +		    is_device_coherent_page(pfn_to_page(pvmw.pfn)))
>>  			pte = maybe_mkwrite(pte, vma);
>>  		else if (pte_swp_uffd_wp(*pvmw.pte))
>>  			pte = pte_mkuffd_wp(pte);
>> --
>> 2.32.0
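
As an illustration of the point about try_to_migrate_one() treating every
zone device folio as DEVICE_PRIVATE, here is a small userspace toy model.
It is not kernel code and not a proposed patch. Every toy_* name is
hypothetical, and it assumes (this is not stated in the thread) that a
device coherent page is mapped by an ordinary present pte whose write bit
carries the permission, while a device private page is mapped by a
swap-style entry with its own writable flag.

```c
/*
 * Userspace toy model only. None of the toy_* types or helpers exist in
 * the kernel; they are simplified stand-ins for the sake of the example.
 */
#include <assert.h>
#include <stdbool.h>

enum toy_page_type { TOY_NORMAL, TOY_DEVICE_PRIVATE, TOY_DEVICE_COHERENT };

struct toy_pte {
	bool pte_write;    /* write bit of an ordinary present pte (assumed) */
	bool swp_writable; /* writable flag of a swap-style entry (assumed) */
};

/* Stand-in for a "is this a zone device page" test: true for both types. */
static bool toy_is_zone_device(enum toy_page_type type)
{
	return type == TOY_DEVICE_PRIVATE || type == TOY_DEVICE_COHERENT;
}

/*
 * Models the assumption described above: any zone device page is handled
 * as if it were device private, so only the swap-style flag is consulted.
 */
static bool toy_writable_assuming_private(enum toy_page_type type,
					  struct toy_pte pte)
{
	if (toy_is_zone_device(type))
		return pte.swp_writable;
	return pte.pte_write;
}

/*
 * Models checking for the device private type explicitly, so a device
 * coherent page falls through to the ordinary write-bit check.
 */
static bool toy_writable_checking_type(enum toy_page_type type,
				       struct toy_pte pte)
{
	if (type == TOY_DEVICE_PRIVATE)
		return pte.swp_writable;
	return pte.pte_write;
}
```

In this model a writable device coherent mapping loses its write
permission under the first helper and keeps it under the second, which is
the kind of mismatch the remove_migration_pte() change above appears to be
papering over. How the real try_to_migrate_one() should be changed is for
the actual patch to decide.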