"Zhou, Xianrong" writes:
> [AMD Official Use Only - General]
>
>> > The vmf_insert_pfn_prot could cause unnecessary double faults on a
>> > device pfn. Because currently the vmf_insert_pfn_prot does not
>> > make the pfn writable so the pte entry is normally read-only or
>> > di
Christian König writes:
> On 01.12.23 at 06:48, Zeng, Oak wrote:
>> [SNIP]
>> Besides memory eviction/oversubscription, there are a few other pain points
>> when I use hmm:
>>
>> 1) hmm doesn't support file-backed memory, so it is hard to share
>> memory between processes in a GPU environment. You me
"Zeng, Oak" writes:
> See inline comments
>
>> -Original Message-
>> From: dri-devel On Behalf Of
>> zhuweixi
>> Sent: Thursday, November 30, 2023 5:48 AM
>> To: Christian König ; Zeng, Oak
>> ; Christian König ; linux-
>> m...@kvack.org; linux-ker...@vger.kernel.org; a...@linux-founda
zhuweixi writes:
> Glad to know that there is a common demand for a new syscall like
> hmadvise(). I expect it would also be useful for homogeneous NUMA
> cases. Credits to cudaMemAdvise() API which brought this idea to
> GMEM's design.
It's not clear to me that this would need to be a new sys
"Vlastimil Babka (SUSE)" writes:
> On 9/28/22 14:01, Alistair Popple wrote:
>> This series aims to fix a number of page reference counting issues in
>> drivers dealing with device private ZONE_DEVICE pages. These result in
>> use-after-free type bugs, either fro
Felix Kuehling writes:
> On 2022-09-28 08:01, Alistair Popple wrote:
>> When the CPU tries to access a device private page the migrate_to_ram()
>> callback associated with the pgmap for the page is called. However no
>> reference is taken on the faulting page. T
Dan Williams writes:
> Alistair Popple wrote:
>>
>> Jason Gunthorpe writes:
>>
>> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
>> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
>> >>
Andrew Morton writes:
> On Wed, 28 Sep 2022 22:01:22 +1000 Alistair Popple wrote:
>
>> @@ -1401,22 +1494,7 @@ static int dmirror_device_init(struct dmirror_device
>> *mdevice, int id)
>>
>> static void dmirror_device_remove(struct dmirror_device *mdevice
Michael Ellerman writes:
> Alistair Popple writes:
>> When the CPU tries to access a device private page the migrate_to_ram()
>> callback associated with the pgmap for the page is called. However no
>> reference is taken on the faulting page. Therefore a concurrent
>>
o see
if it's expected or not.
Signed-off-by: Alistair Popple
Cc: Jason Gunthorpe
Cc: John Hubbard
Cc: Ralph Campbell
Cc: Michael Ellerman
Cc: Felix Kuehling
Cc: Lyude Paul
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 15 ++-
drivers/gpu/drm/amd/amdkfd/kfd_migr
n
free up device memory.
To allow that, this patch introduces the migrate_device family of
functions, which are functionally similar to migrate_vma but which skip
the initial lookup based on mapping.
Signed-off-by: Alistair Popple
Cc: "Huang, Ying"
Cc: Zi Yan
Cc: Matthew Wilcox
Cc:
this
isn't true for device private memory, and a future change requires
similar functionality for device private memory. So refactor the code
into something more sensible for migrating device memory without a vma.
Signed-off-by: Alistair Popple
Cc: "Huang, Ying"
Cc: Zi Yan
Cc: Mat
Signed-off-by: Alistair Popple
Cc: Jason Gunthorpe
Cc: Ralph Campbell
Cc: John Hubbard
Cc: Alex Sierra
Cc: Felix Kuehling
---
lib/test_hmm.c | 120 +-
lib/test_hmm_uapi.h| 1 +-
tools/testing/selftests/vm/hmm-tests.c
.
Refactor out the core functionality so that it is not specific to fault
handling.
Signed-off-by: Alistair Popple
Reviewed-by: Lyude Paul
Cc: Ben Skeggs
Cc: Ralph Campbell
Cc: John Hubbard
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 58 +--
1 file changed, 28 insertions
ough pages are still mapped by the kernel which can
lead to kernel crashes, particularly if a driver frees the pagemap.
To fix this drivers should take a pagemap reference when allocating the
page. This reference can then be returned when the page is freed.
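
As a rough illustration of the rule described above (take a pagemap reference when a page is allocated, return it when the page is freed), here is a minimal user-space sketch. It is not kernel code: `model_pagemap`, `model_page` and the `model_*` helpers are hypothetical names invented for this example, and the real patch may well be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct model_pagemap {
    long refs; /* references held by currently allocated pages */
};

struct model_page {
    struct model_pagemap *pgmap;
};

/* Allocating a device page takes a reference on its pagemap. */
static struct model_page *model_alloc_page(struct model_pagemap *pgmap)
{
    struct model_page *page = malloc(sizeof(*page));

    if (!page)
        return NULL;
    pgmap->refs++;
    page->pgmap = pgmap;
    return page;
}

/* Freeing the page returns the reference taken at allocation. */
static void model_free_page(struct model_page *page)
{
    assert(page->pgmap->refs > 0);
    page->pgmap->refs--;
    free(page);
}

/* The pagemap may only be torn down once no page still references it. */
static bool model_pagemap_can_release(const struct model_pagemap *pgmap)
{
    return pgmap->refs == 0;
}
```

In this model a driver would check `model_pagemap_can_release()` before freeing the pagemap, so a pagemap with live pages is never torn down underneath them.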
Signed-off-by: Alistair Popple
Fixes: 27
device pages have been freed which may never happen.
Fix this by migrating device mappings back to normal CPU memory prior to
freeing the GPU memory chunks and associated device private pages.
Signed-off-by: Alistair Popple
Cc: Lyude Paul
Cc: Ben Skeggs
Cc: Ralph Campbell
Cc: John Hubbard
-gfx@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Alistair Popple (8):
mm/memory.c: Fix race when faulting a device private page
mm: Free device private pages have zero refcount
mm/memremap.c: Take a pgmap reference on page allocation
mm
ns such as
get_page_unless_zero().
Signed-off-by: Alistair Popple
Cc: Jason Gunthorpe
Cc: Michael Ellerman
Cc: Felix Kuehling
Cc: Alex Deucher
Cc: Christian König
Cc: Ben Skeggs
Cc: Lyude Paul
Cc: Ralph Campbell
Cc: Alex Sierra
Cc: John Hubbard
Cc: Dan Williams
---
This will conflict with Dan'
Lyude Paul writes:
> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>> nouveau_dmem_fault_copy_one() is used during handling of CPU faults via
>> the migrate_to_ram() callback and is used to copy data from GPU to CPU
>> memory. It is currently specific to faul
Felix Kuehling writes:
> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in curr
John Hubbard writes:
> On 9/26/22 14:35, Lyude Paul wrote:
>>> + for (i = 0; i < npages; i++) {
>>> + if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> + struct page *dpage;
>>> +
>>> + /*
>>> +* _GFP_NOFAIL because the GPU is going
Jason Gunthorpe writes:
> On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
>> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
>> refcount") device private pages have no longer had an extra reference
>> count when the page is in u
n
free up device memory.
To allow that, this patch introduces the migrate_device family of
functions, which are functionally similar to migrate_vma but which skip
the initial lookup based on mapping.
Signed-off-by: Alistair Popple
---
include/linux/migrate.h | 7 +++-
mm/migrate_device.c
. Unfortunately I lack the
hardware to test on either of these so would appreciate it if someone with
access could test those.
Alistair Popple (7):
mm/memory.c: Fix race when faulting a device private page
mm: Free device private pages have zero refcount
mm/migrate_device.c: Refactor
Signed-off-by: Alistair Popple
---
lib/test_hmm.c | 119 +-
lib/test_hmm_uapi.h| 1 +-
tools/testing/selftests/vm/hmm-tests.c | 49 +++-
3 files changed, 148 insertions(+), 21 deletions(-)
diff --git a/lib/test_hmm.c
this
isn't true for device private memory, and a future change requires
similar functionality for device private memory. So refactor the code
into something more sensible for migrating device memory without a vma.
Signed-off-by: Alistair Popple
---
mm/migrate_device.c
o see
if it's expected or not.
Signed-off-by: Alistair Popple
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 15 ++-
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 +++--
drivers/gpu/drm/amd/amdkfd/kfd_migrate.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +--
callbacks have all been freed.
Fix this by migrating any mappings back to normal CPU memory prior to
freeing the GPU memory chunks and associated device private pages.
Signed-off-by: Alistair Popple
---
I assume the AMD driver might have a similar issue. However I can't see
where device privat
.
Refactor out the core functionality so that it is not specific to fault
handling.
Signed-off-by: Alistair Popple
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 59 +--
1 file changed, 29 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c
b
ns such as
get_page_unless_zero().
Signed-off-by: Alistair Popple
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 1 +
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 +
lib/test_hmm.c | 1 +
mm/memremap.c| 5
Commit b05a79d4377f ("mm/gup: migrate device coherent pages when pinning
instead of failing") added a badly formatted if statement. Fix it.
Signed-off-by: Alistair Popple
Reported-by: David Hildenbrand
---
Apologies Andrew for missing this. Hopefully this fixes things.
mm/gup.c |
: Alistair Popple
Acked-by: Felix Kuehling
Signed-off-by: Christoph Hellwig
---
This patch hopefully addresses all of David's comments. It replaces both my "mm:
remove the vma check in migrate_vma_setup()" and "mm/gup: migrate device
coherent pages when pinning instead of failing"
David Hildenbrand writes:
> On 07.07.22 21:03, Alex Sierra wrote:
>> From: Alistair Popple
>>
>> migrate_vma_setup() checks that a valid vma is passed so that the page
>> tables can be walked to find the pfns associated with a given address
>> range. However i
David Hildenbrand writes:
> On 07.07.22 21:03, Alex Sierra wrote:
>> From: Alistair Popple
>>
>> Currently any attempts to pin a device coherent page will fail. This is
>> because device coherent pages need to be managed by a device driver, and
>> pinning
David Hildenbrand writes:
> On 29.06.22 05:54, Alex Sierra wrote:
>> This case is used to migrate pages from device memory, back to system
>> memory. Device coherent type memory is cache coherent from device and CPU
>> point of view.
>>
>> Signed-off-by: Alex Sierra
>> Acked-by: Felix Kuehling
David Hildenbrand writes:
> On 21.06.22 18:08, Sierra Guiza, Alejandro (Alex) wrote:
>>
>> On 6/21/2022 7:25 AM, David Hildenbrand wrote:
>>> On 21.06.22 13:55, Alistair Popple wrote:
>>>> David Hildenbrand writes:
>>>>
>>>>> On 2
ew.
>>>>>>>> This is used on platforms that have an advanced system bus (like CAPI
>>>>>>>> or CXL). Any page of a process can be migrated to such memory. However,
>>>>>>>> no one should be allowed to pin such memory so that it c
Oded Gabbay writes:
> On Mon, Jun 20, 2022 at 3:33 AM Alistair Popple wrote:
>>
>>
>> Oded Gabbay writes:
>>
>> > On Fri, Jun 17, 2022 at 8:20 PM Sierra Guiza, Alejandro (Alex)
>> > wrote:
>> >>
>> >>
>> >>
> >> evicted.
>> >>
>> >> Signed-off-by: Alex Sierra
>> >> Acked-by: Felix Kuehling
>> >> Reviewed-by: Alistair Popple
>> >> [hch: rebased ontop of the refcount changes,
>> >>removed is_dev_private_or_coherent_page]
I can't see any issues with this now so:
Reviewed-by: Alistair Popple
Alex Sierra writes:
> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
> device-managed anonymous pages that are not LRU pages. Although they
> behave like normal pages for purposes of
Felix Kuehling writes:
> On 2022-05-25 at 00:11, Alistair Popple wrote:
>> Alex Sierra writes:
>>
>>> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
>>> device-managed anonymous pages that are not LRU pages. Although they
>>> b
"Sierra Guiza, Alejandro (Alex)" writes:
> On 5/24/2022 11:11 PM, Alistair Popple wrote:
>> Alex Sierra writes:
>>
>>> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
>>> device-managed anonymous pages that are not LRU pages
Alex Sierra writes:
> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
> device-managed anonymous pages that are not LRU pages. Although they
> behave like normal pages for purposes of mapping in CPU page, and for
> COW. They do not support LRU lists, NUMA migration or THP.
>
> We
Technically I think this patch should be earlier in the series. As I
understand it patch 1 allows DEVICE_COHERENT pages to be inserted in the
page tables and therefore makes it possible for page table walkers to
see non-LRU pages.
Some more comments below:
Alex Sierra writes:
> With DEVICE_CO
type(variant->device_number)) {
> + ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL |
> HMM_DMIRROR_PROT_WRITE, m[0]);
> + ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL |
> HMM_DMIRROR_PROT_WRITE, m[1]);
> + } else {
> +
Alex Sierra writes:
[...]
> diff --git a/mm/rmap.c b/mm/rmap.c
> index fedb82371efe..d57102cd4b43 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1995,7 +1995,8 @@ void try_to_migrate(struct folio *folio, enum ttu_flags
> flags)
> TTU_SYNC)))
>
"Sierra Guiza, Alejandro (Alex)" writes:
> @apop...@nvidia.com Could you please check this patch? It's somehow related
> to migrate_device_page() for long term device coherent pages.
>
> Regards,
> Alex Sierra
>> -Original Message-
>> From: amd-gfx On Behalf Of Alex
>> Sierra
>> Sent:
"Sierra Guiza, Alejandro (Alex)" writes:
> @apop...@nvidia.com Could you please check this patch? It's somehow related to
> migrate_device_page() for long term device coherent pages.
Sure thing. This whole series is in my queue of things to review once I make it
home from LSF/MM.
- Alistair
>
Felix Kuehling writes:
> On 2022-03-11 04:16, David Hildenbrand wrote:
>> On 10.03.22 18:26, Alex Sierra wrote:
>>> DEVICE_COHERENT pages introduce a subtle distinction in the way
>>> "normal" pages can be used by various callers throughout the kernel.
>>> They behave like normal pages for purpos
Felix Kuehling writes:
> On 2022-03-10 at 14:25, Matthew Wilcox wrote:
>> On Thu, Mar 10, 2022 at 11:26:31AM -0600, Alex Sierra wrote:
>>> @@ -606,7 +606,7 @@ static void print_bad_pte(struct vm_area_struct *vma,
>>> unsigned long addr,
>>>* PFNMAP mappings in order to support COWable mappi
Felix Kuehling writes:
> On 2022-02-16 at 07:26, Jason Gunthorpe wrote:
>> The other place that needs careful audit is all the callers using
>> vm_normal_page() - they must all be able to accept a ZONE_DEVICE page
>> if we don't set pte_devmap.
>
> How much code are we talking about here? A quic
Jason Gunthorpe writes:
> On Wed, Feb 16, 2022 at 09:31:03AM +0100, David Hildenbrand wrote:
>> On 16.02.22 03:36, Alistair Popple wrote:
>> > On Wednesday, 16 February 2022 1:03:57 PM AEDT Jason Gunthorpe wrote:
>> >> On Wed, Feb 16, 2022 at 12:23:44P
Jason Gunthorpe writes:
> On Tue, Feb 15, 2022 at 04:35:56PM -0500, Felix Kuehling wrote:
>>
>> On 2022-02-15 14:41, Jason Gunthorpe wrote:
>> > On Tue, Feb 15, 2022 at 07:32:09PM +0100, Christoph Hellwig wrote:
>> > > On Tue, Feb 15, 2022 at 10:45:24AM -0400, Jason Gunthorpe wrote:
>> > > > > Do
On Wednesday, 16 February 2022 1:03:57 PM AEDT Jason Gunthorpe wrote:
> On Wed, Feb 16, 2022 at 12:23:44PM +1100, Alistair Popple wrote:
>
> > Device private and device coherent pages are not marked with pte_devmap and
> > they
> > are backed by a struct page. The only
John Hubbard writes:
> On 2/11/22 18:51, Alistair Popple wrote:
[…]
>>> See below…
>>>
>>>> + }
>>>> +
>>>> + pages[i] = migrate_device_page(head, gup_flags);
>> migrate_device_page() will return
>>> or CXL). Any page of a process can be migrated to such memory. However,
>>> no one should be allowed to pin such memory so that it can always be
>>> evicted.
>>>
>>> Signed-off-by: Alex Sierra
>>> Acked-by: Felix Kuehling
>>> Reviewed
On Saturday, 12 February 2022 1:10:29 PM AEDT John Hubbard wrote:
> On 2/6/22 20:26, Alistair Popple wrote:
> > Currently any attempts to pin a device coherent page will fail. This is
> > because device coherent pages need to be managed by a device driver, and
> > pinning
On Thursday, 10 February 2022 10:47:35 PM AEDT David Hildenbrand wrote:
> On 10.02.22 12:39, Alistair Popple wrote:
> > On Thursday, 10 February 2022 9:53:38 PM AEDT David Hildenbrand wrote:
> >> On 07.02.22 05:26, Alistair Popple wrote:
> >>> Currently any attempts
On Thursday, 10 February 2022 9:53:38 PM AEDT David Hildenbrand wrote:
> On 07.02.22 05:26, Alistair Popple wrote:
> > Currently any attempts to pin a device coherent page will fail. This is
> > because device coherent pages need to be managed by a device driver, and
> > pinni
On Thursday, 10 February 2022 6:28:01 PM AEDT Christoph Hellwig wrote:
[...]
> Changes since v1:
> - add a missing memremap.h include in memcontrol.c
> - include rebased versions of the device coherent support and
>device coherent migration support series as well as additional
>cleanup
Reviewed-by: Alistair Popple
On Thursday, 10 February 2022 6:28:12 PM AEDT Christoph Hellwig wrote:
> Make the flow a little more clear and prepare for adding a new
> ZONE_DEVICE memory type.
>
> Signed-off-by: Christoph Hellwig
> ---
> mm/migrate.c | 31 +++---
Reviewed-by: Alistair Popple
On Thursday, 10 February 2022 6:28:13 PM AEDT Christoph Hellwig wrote:
> Make the flow a little more clear and prepare for adding a new
> ZONE_DEVICE memory type.
>
> Signed-off-by: Christoph Hellwig
> ---
> mm/migrate.c | 27 ---
Thanks, it's also better than more stubbed functions.
Reviewed-by: Alistair Popple
On Thursday, 10 February 2022 6:28:15 PM AEDT Christoph Hellwig wrote:
> This code will be used for device coherent memory as well in a bit,
> so relax the ifdef a bit.
>
> Signed-off-by:
I got the following build error:
/data/source/linux/mm/migrate_device.c: In function ‘migrate_vma_collect_pmd’:
/data/source/linux/mm/migrate_device.c:242:3: error: implicit declaration of
function ‘flush_tlb_range’; did you mean ‘flush_pmd_tlb_range’?
[-Werror=implicit-function-declaration]
2
On Thursday, 10 February 2022 4:48:36 AM AEDT Christoph Hellwig wrote:
> On Mon, Feb 07, 2022 at 04:19:29PM -0500, Felix Kuehling wrote:
> >
> > On 2022-02-07 at 01:32, Christoph Hellwig wrote:
> >> Move the check for the actual pgmap types that need the free at refcount
> >> one behavior into the
't required.
Signed-off-by: Alistair Popple
Acked-by: Felix Kuehling
---
Changes for v2:
- Added Felix's Acked-by
mm/migrate.c | 34 +-
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index a9aed12..0d6570d 10064
accessible from the CPU so can be migrated just like
pinning ZONE_MOVABLE pages. So instead of failing all attempts to pin
them first try migrating them out of ZONE_DEVICE.
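
The "migrate first, then pin" behaviour described above can be sketched in user space as follows. This is only a model under stated assumptions: `pin_page`, `pin_zone`, `pin_migrate_to_system()` and `pin_longterm()` are hypothetical names made up for illustration and do not correspond to the functions in mm/gup.c.

```c
#include <stdbool.h>

/* Hypothetical locations for a modelled page; not kernel zone types. */
enum pin_zone { PIN_ZONE_NORMAL, PIN_ZONE_DEVICE_COHERENT };

struct pin_page {
    enum pin_zone zone;
    int pincount;
};

/*
 * Stand-in for migration to system memory. In this model a page that
 * is already pinned cannot be moved, so migration fails for it.
 */
static bool pin_migrate_to_system(struct pin_page *page)
{
    if (page->pincount > 0)
        return false;
    page->zone = PIN_ZONE_NORMAL;
    return true;
}

/*
 * Long-term pin: rather than failing outright for a device coherent
 * page, first try to migrate it out of device memory, then pin it.
 * Returns 0 on success and -1 if the page could not be migrated.
 */
static int pin_longterm(struct pin_page *page)
{
    if (page->zone == PIN_ZONE_DEVICE_COHERENT &&
        !pin_migrate_to_system(page))
        return -1;
    page->pincount++;
    return 0;
}
```

The point of the sketch is the ordering: the pin count is only raised after the page has left device memory, so the device's memory manager never ends up with a long-term pinned page it cannot evict.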
Signed-off-by: Alistair Popple
Acked-by: Felix Kuehling
---
Changes for v2:
- Added Felix's Acked-by
- Fixed mi
On Wednesday, 2 February 2022 2:03:01 AM AEDT Felix Kuehling wrote:
>
> On 2022-02-01 at 02:05, Alistair Popple wrote:
> > Currently any attempts to pin a device coherent page will fail. This is
> > because device coherent pages need to be managed by a device driver, and
>
- Rebased on to linux-next-20220204
Alex Sierra (1):
tools: add hmm gup test for long term pinned device pages
Alistair Popple (2):
migrate.c: Remove vma check in migrate_vma_setup()
mm/gup.c: Migrate device coherent pages when pinning instead of failing
mm/gup.c
From: Alex Sierra
The intention is to test device coherent type pages that have been
called through get user pages with PIN_LONGTERM flag set. These pages
should get migrated back to normal system memory.
Signed-off-by: Alex Sierra
Signed-off-by: Alistair Popple
Reviewed-by: Felix Kuehling
Oh sorry, I had looked at this but forgotten to add my reviewed by:
Reviewed-by: Alistair Popple
On Tuesday, 1 February 2022 10:27:25 AM AEDT Sierra Guiza, Alejandro (Alex)
wrote:
> Hi Alistair,
> This is the last patch to be reviewed from this series. It already has
> the changes fr
accessible from the CPU so can be migrated just like
pinning ZONE_MOVABLE pages. So instead of failing all attempts to pin
them first try migrating them out of ZONE_DEVICE.
Signed-off-by: Alistair Popple
---
mm/gup.c | 105 ++--
1 file changed
ong term pinned device pages
Alistair Popple (2):
migrate.c: Remove vma check in migrate_vma_setup()
mm/gup.c: Migrate device coherent pages when pinning instead of failing
mm/gup.c | 105 +++---
mm/migrate.c | 34
't required.
Signed-off-by: Alistair Popple
---
mm/migrate.c | 34 +-
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index d3cc358..31ba8ca 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2581,24 +2581,24 @
From: Alex Sierra
The intention is to test device coherent type pages that have been
called through get user pages with PIN_LONGTERM flag set. These pages
should get migrated back to normal system memory.
Signed-off-by: Alex Sierra
Signed-off-by: Alistair Popple
---
tools/testing/selftests
Thanks for fixing. I'm guessing Andrew will want you to resend this as part of
a new v6 series, but please add:
Reviewed-by: Alistair Popple
On Tuesday, 1 February 2022 6:48:13 AM AEDT Alex Sierra wrote:
> This case is used to migrate pages from device memory, back to system
> mem
Looks good, feel free to add:
Reviewed-by: Alistair Popple
On Saturday, 29 January 2022 7:08:16 AM AEDT Alex Sierra wrote:
> Device memory that is cache coherent from device and CPU point of view.
> This is used on platforms that have an advanced system bus (like CAPI
> or CXL). Any
On Saturday, 29 January 2022 7:08:17 AM AEDT Alex Sierra wrote:
[...]
> struct migrate_vma {
> diff --git a/mm/migrate.c b/mm/migrate.c
> index cd137aedcfe5..d3cc3589e1e8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2264,7 +2264,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>
On Thursday, 27 January 2022 2:09:41 PM AEDT Alex Sierra wrote:
[...]
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 277562cd4cf5..2b3375e165b1 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> if (
Reviewed-by: Alistair Popple
On Thursday, 27 January 2022 2:09:45 PM AEDT Alex Sierra wrote:
> new ioctl cmd added to query zone device type. This will be
> used once the test_hmm adds zone device coherent type.
>
> Signed-off-by: Alex Sierra
> ---
> lib/te
Thanks for the updates, looks good now.
Reviewed-by: Alistair Popple
On Thursday, 27 January 2022 2:09:46 PM AEDT Alex Sierra wrote:
> In order to configure device coherent in test_hmm, two module parameters
> should be passed, which correspond to the SP start address of each
>
On Thursday, 27 January 2022 2:09:40 PM AEDT Alex Sierra wrote:
[...]
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 1852d787e6ab..277562cd4cf5 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -362,7 +362,7 @@ static int expected_page_refs(struct address_space
> *mapping, struct page *pa
On Thursday, 27 January 2022 2:09:43 PM AEDT Alex Sierra wrote:
[...]
> @@ -984,3 +990,4 @@ int svm_migrate_init(struct amdgpu_device *adev)
>
> return 0;
> }
> +
>
git-am complained about this when I applied the series. Given you have to
rebase anyway it would be worth fixing this.
On Thursday, 27 January 2022 2:09:42 PM AEDT Alex Sierra wrote:
> Avoid long term pinning for Coherent device type pages. This could
> interfere with their own device memory manager. For now, we are just
> returning error for PIN_LONGTERM Coherent device type pages. Eventually,
> these type of page
I haven't tested the change which checks that pages migrated back to sysmem,
but it looks ok so:
Reviewed-by: Alistair Popple
On Thursday, 27 January 2022 2:09:47 PM AEDT Alex Sierra wrote:
> Device Coherent type uses device memory that is coherently accessible by
> the CPU. This cou
On Thursday, 20 January 2022 11:36:21 PM AEDT Joao Martins wrote:
> On 1/10/22 22:31, Alex Sierra wrote:
> > Avoid long term pinning for Coherent device type pages. This could
> > interfere with their own device memory manager. For now, we are just
> > returning error for PIN_LONGTERM Coherent devi
On Wednesday, 12 January 2022 10:06:03 PM AEDT Alistair Popple wrote:
> I have been looking at this in relation to the migration code and noticed we
> have the following in try_to_migrate():
>
> if (is_zone_device_page(page) && !is_device_private_page(page))
>
On Tuesday, 11 January 2022 9:31:59 AM AEDT Alex Sierra wrote:
> Device Coherent type uses device memory that is coherently accessible by
> the CPU. This could be shown as SP (special purpose) memory range
> at the BIOS-e820 memory enumeration. If no SP memory is supported in
> system, this could be
Looks good,
Reviewed-by: Alistair Popple
On Tuesday, 11 January 2022 9:32:01 AM AEDT Alex Sierra wrote:
> Add two more parameters to set spm_addr_dev0 & spm_addr_dev1
> addresses. These two parameters configure the start SP
> addresses for each device in test_hmm driver.
> Co
Thanks for splitting the coherent devices into separate device nodes. Couple of
comments below.
On Tuesday, 11 January 2022 9:31:58 AM AEDT Alex Sierra wrote:
> In order to configure device coherent in test_hmm, two module parameters
> should be passed, which correspond to the SP start address of
On Tuesday, 11 January 2022 9:31:57 AM AEDT Alex Sierra wrote:
[...]
> +enum {
> + /* 0 is reserved to catch uninitialized type fields */
This seems unnecessary and can be dropped to start at zero.
Reviewed-by: Alistair Popple
> + HMM_DMIRROR_MEMORY_DEVICE_PR
On Tuesday, 11 January 2022 9:32:00 AM AEDT Alex Sierra wrote:
> Test cases such as migrate_fault and migrate_multiple were modified to
> explicitly migrate from device to system memory without the need for page
> faults when using the device coherent type.
>
> Snapshot test case updated to read memory de
On Tuesday, 11 January 2022 9:31:52 AM AEDT Alex Sierra wrote:
> Device memory that is cache coherent from device and CPU point of view.
> This is used on platforms that have an advanced system bus (like CAPI
> or CXL). Any page of a process can be migrated to such memory. However,
> no one should
I have been looking at this in relation to the migration code and noticed we
have the following in try_to_migrate():
if (is_zone_device_page(page) && !is_device_private_page(page))
return;
Which if I'm understanding correctly means that migration of device coherent
pages w
On Friday, 10 December 2021 3:54:31 AM AEDT Sierra Guiza, Alejandro (Alex)
wrote:
>
> On 12/9/2021 10:29 AM, Felix Kuehling wrote:
> > On 2021-12-09 at 5:53 a.m., Alistair Popple wrote:
> >> On Thursday, 9 December 2021 5:55:26 AM AEDT Sierra Guiza, Alejandro
> >&g
On Thursday, 9 December 2021 12:53:45 AM AEDT Jason Gunthorpe wrote:
> > I think a similar problem exists for device private fault handling as well
> > and
> > it has been on my list of things to fix for a while. I think the solution
> > is to
> > call try_get_page(), except it doesn't work with
On Thursday, 9 December 2021 5:55:26 AM AEDT Sierra Guiza, Alejandro (Alex)
wrote:
>
> On 12/8/2021 11:30 AM, Felix Kuehling wrote:
> > On 2021-12-08 at 11:58 a.m., Felix Kuehling wrote:
> >> On 2021-12-08 at 6:31 a.m., Alistair Popple wrote:
> >>> On Tuesday
On Tuesday, 7 December 2021 5:52:43 AM AEDT Alex Sierra wrote:
> Avoid long term pinning for Coherent device type pages. This could
> interfere with their own device memory manager.
> If caller tries to get user device coherent pages with PIN_LONGTERM flag
> set, those pages will be migrated back t
On Tuesday, 23 November 2021 4:16:55 AM AEDT Felix Kuehling wrote:
[...]
> > Right, so long as my fix goes in I don't think there is anything wrong with
> > pinning device public pages. Agree that we should avoid FOLL_LONGTERM pins
> > for
> > device memory though. I think the way to do that is