Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
On 2019-04-03 1:24 p.m., Koenig, Christian wrote: > Am 01.04.19 um 20:58 schrieb Kuehling, Felix: >> On 2019-04-01 2:03 p.m., Christian König wrote: >>> Am 01.04.19 um 19:59 schrieb Kuehling, Felix: On 2019-04-01 7:23 a.m., Christian König wrote: > Am 30.03.19 um 01:41 schrieb Kuehling, Felix: >> Patches 1-3 are Reviewed-by: Felix Kuehling > Thanks. > >> About the direct mode, that removes a bunch of synchronization, so it >> must make some assumptions about the state of the page tables. What >> makes that safe? > Direct mode is only supposed to be used during page fault handling. > > E.g. we know that the page tables are in the correct place in this > situation because the hardware is hammering on a PTE and waiting for > it to become valid. A fence could also indicate a concurrent modification of the page table. For example a PTB may be allocated and initialized concurrently, not in direct mode. Would direct mode need to wait for a fence that indicates completion of the PTB initialization? Or do we have some way to ensure such concurrent allocation and initialization of a PTB cannot happen? >>> Yeah, that is a very good question I haven't solved yet either. >>> >>> My currently best idea is to separate the address space, e.g. use the >>> lower address space for on demand paging and the higher with classic >>> pre-filled page tables for the MM and display engines. >> That may work for graphics, but doesn't work for KFD. I need the ability >> to mix pre-filled page tables with HMM in the same SVM address space. > Even after thinking for multiple days about it I can't of hand find a > way to make this work. > >> That's why I was thinking that all page table updates for a given VM >> would need to use the same method. > Well what exactly do you mean with that? Essentially there are two methods: > > 1. Pre-fill the page tables before accessing them with the hardware. > > 2. Fill on demand with page faults. > > I don't think we can mix those two methods together in the same address > range. That's what I was hoping to do. For example an application could use "old" BO-based memory management APIs that pre-fill page tables with "new" HMM-based memory management APIs that rely on page faults. Those may be different libraries written in different languages running in the same application. E.g. a GPU BLAS implementation that's optimized and uses old-style memory allocations linked to an OpenMP application that relies on HMM. If that's not possible, I'd need to emulate all the old memory APIs on top of HMM. I was hoping to avoid that. Even when page faults are enabled, we want to be able to pre-fault stuff to avoid the performance it on the first access. Are you saying that won't be possible? Regards, Felix > > E.g. we can say to use pre-fill for MM engines in the upper range and on > demand filling in the lower range, but we can't mix them. > > Regards, > Christian. > >> Regards, >> Felix >> >>> Christian. >>> Regards, Felix > Christian. > >> Is it safe to use direct-mode on a >> per-page-table-update basis? Or do all page table updates have to go >> through direct mode to avoid hazards? If yes, then maybe this >> should be >> a property of the VM rather than a parameter that gets passed to a >> bunch >> of function calls. >> >> Regards, >> Felix >> >> On 2019-03-29 6:45 a.m., Christian König wrote: >>> Otherwise we don't correctly use translate further. >>> >>> Signed-off-by: Christian König >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - >>> 1 file changed, 8 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> index 3d221f044183..059d9802e713 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct >>> amdgpu_device *adev, >>> addr = 0; >>> if (ats_entries) { >>> - uint64_t ats_value; >>> + uint64_t value = 0, flags; >>> - ats_value = AMDGPU_PTE_DEFAULT_ATC; >>> - if (level != AMDGPU_VM_PTB) >>> - ats_value |= AMDGPU_PDE_PTE; >>> + flags = AMDGPU_PTE_DEFAULT_ATC; >>> + if (level != AMDGPU_VM_PTB) { >>> + /* Handle leaf PDEs as PTEs */ >>> + flags |= AMDGPU_PDE_PTE; >>> + amdgpu_gmc_get_vm_pde(adev, level, , ); >>> + } >>> r = vm->update_funcs->update(, bo, addr, 0, >>> ats_entries, >>> - 0, ats_value); >>> + value, flags); >>> if (r) >>> return r;
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
Am 01.04.19 um 20:58 schrieb Kuehling, Felix: > On 2019-04-01 2:03 p.m., Christian König wrote: >> Am 01.04.19 um 19:59 schrieb Kuehling, Felix: >>> On 2019-04-01 7:23 a.m., Christian König wrote: Am 30.03.19 um 01:41 schrieb Kuehling, Felix: > Patches 1-3 are Reviewed-by: Felix Kuehling Thanks. > About the direct mode, that removes a bunch of synchronization, so it > must make some assumptions about the state of the page tables. What > makes that safe? Direct mode is only supposed to be used during page fault handling. E.g. we know that the page tables are in the correct place in this situation because the hardware is hammering on a PTE and waiting for it to become valid. >>> A fence could also indicate a concurrent modification of the page table. >>> For example a PTB may be allocated and initialized concurrently, not in >>> direct mode. Would direct mode need to wait for a fence that indicates >>> completion of the PTB initialization? Or do we have some way to ensure >>> such concurrent allocation and initialization of a PTB cannot happen? >> Yeah, that is a very good question I haven't solved yet either. >> >> My currently best idea is to separate the address space, e.g. use the >> lower address space for on demand paging and the higher with classic >> pre-filled page tables for the MM and display engines. > That may work for graphics, but doesn't work for KFD. I need the ability > to mix pre-filled page tables with HMM in the same SVM address space. Even after thinking for multiple days about it I can't of hand find a way to make this work. > That's why I was thinking that all page table updates for a given VM > would need to use the same method. Well what exactly do you mean with that? Essentially there are two methods: 1. Pre-fill the page tables before accessing them with the hardware. 2. Fill on demand with page faults. I don't think we can mix those two methods together in the same address range. E.g. we can say to use pre-fill for MM engines in the upper range and on demand filling in the lower range, but we can't mix them. Regards, Christian. > > Regards, > Felix > >> Christian. >> >>> Regards, >>> Felix >>> >>> Christian. > Is it safe to use direct-mode on a > per-page-table-update basis? Or do all page table updates have to go > through direct mode to avoid hazards? If yes, then maybe this > should be > a property of the VM rather than a parameter that gets passed to a > bunch > of function calls. > > Regards, > Felix > > On 2019-03-29 6:45 a.m., Christian König wrote: >> Otherwise we don't correctly use translate further. >> >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - >> 1 file changed, 8 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> index 3d221f044183..059d9802e713 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct >> amdgpu_device *adev, >> addr = 0; >> if (ats_entries) { >> - uint64_t ats_value; >> + uint64_t value = 0, flags; >> - ats_value = AMDGPU_PTE_DEFAULT_ATC; >> - if (level != AMDGPU_VM_PTB) >> - ats_value |= AMDGPU_PDE_PTE; >> + flags = AMDGPU_PTE_DEFAULT_ATC; >> + if (level != AMDGPU_VM_PTB) { >> + /* Handle leaf PDEs as PTEs */ >> + flags |= AMDGPU_PDE_PTE; >> + amdgpu_gmc_get_vm_pde(adev, level, , ); >> + } >> r = vm->update_funcs->update(, bo, addr, 0, >> ats_entries, >> - 0, ats_value); >> + value, flags); >> if (r) >> return r; >>> ___ >>> amd-gfx mailing list >>> amd-gfx@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
On 2019-04-01 2:03 p.m., Christian König wrote: > Am 01.04.19 um 19:59 schrieb Kuehling, Felix: >> On 2019-04-01 7:23 a.m., Christian König wrote: >>> Am 30.03.19 um 01:41 schrieb Kuehling, Felix: Patches 1-3 are Reviewed-by: Felix Kuehling >>> Thanks. >>> About the direct mode, that removes a bunch of synchronization, so it must make some assumptions about the state of the page tables. What makes that safe? >>> Direct mode is only supposed to be used during page fault handling. >>> >>> E.g. we know that the page tables are in the correct place in this >>> situation because the hardware is hammering on a PTE and waiting for >>> it to become valid. >> A fence could also indicate a concurrent modification of the page table. >> For example a PTB may be allocated and initialized concurrently, not in >> direct mode. Would direct mode need to wait for a fence that indicates >> completion of the PTB initialization? Or do we have some way to ensure >> such concurrent allocation and initialization of a PTB cannot happen? > > Yeah, that is a very good question I haven't solved yet either. > > My currently best idea is to separate the address space, e.g. use the > lower address space for on demand paging and the higher with classic > pre-filled page tables for the MM and display engines. That may work for graphics, but doesn't work for KFD. I need the ability to mix pre-filled page tables with HMM in the same SVM address space. That's why I was thinking that all page table updates for a given VM would need to use the same method. Regards, Felix > > Christian. > >> >> Regards, >> Felix >> >> >>> Christian. >>> Is it safe to use direct-mode on a per-page-table-update basis? Or do all page table updates have to go through direct mode to avoid hazards? If yes, then maybe this should be a property of the VM rather than a parameter that gets passed to a bunch of function calls. Regards, Felix On 2019-03-29 6:45 a.m., Christian König wrote: > Otherwise we don't correctly use translate further. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 3d221f044183..059d9802e713 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct > amdgpu_device *adev, > addr = 0; > if (ats_entries) { > - uint64_t ats_value; > + uint64_t value = 0, flags; > - ats_value = AMDGPU_PTE_DEFAULT_ATC; > - if (level != AMDGPU_VM_PTB) > - ats_value |= AMDGPU_PDE_PTE; > + flags = AMDGPU_PTE_DEFAULT_ATC; > + if (level != AMDGPU_VM_PTB) { > + /* Handle leaf PDEs as PTEs */ > + flags |= AMDGPU_PDE_PTE; > + amdgpu_gmc_get_vm_pde(adev, level, , ); > + } > r = vm->update_funcs->update(, bo, addr, 0, > ats_entries, > - 0, ats_value); > + value, flags); > if (r) > return r; >> ___ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
Am 01.04.19 um 19:59 schrieb Kuehling, Felix: On 2019-04-01 7:23 a.m., Christian König wrote: Am 30.03.19 um 01:41 schrieb Kuehling, Felix: Patches 1-3 are Reviewed-by: Felix Kuehling Thanks. About the direct mode, that removes a bunch of synchronization, so it must make some assumptions about the state of the page tables. What makes that safe? Direct mode is only supposed to be used during page fault handling. E.g. we know that the page tables are in the correct place in this situation because the hardware is hammering on a PTE and waiting for it to become valid. A fence could also indicate a concurrent modification of the page table. For example a PTB may be allocated and initialized concurrently, not in direct mode. Would direct mode need to wait for a fence that indicates completion of the PTB initialization? Or do we have some way to ensure such concurrent allocation and initialization of a PTB cannot happen? Yeah, that is a very good question I haven't solved yet either. My currently best idea is to separate the address space, e.g. use the lower address space for on demand paging and the higher with classic pre-filled page tables for the MM and display engines. Christian. Regards, Felix Christian. Is it safe to use direct-mode on a per-page-table-update basis? Or do all page table updates have to go through direct mode to avoid hazards? If yes, then maybe this should be a property of the VM rather than a parameter that gets passed to a bunch of function calls. Regards, Felix On 2019-03-29 6:45 a.m., Christian König wrote: Otherwise we don't correctly use translate further. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 3d221f044183..059d9802e713 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, addr = 0; if (ats_entries) { - uint64_t ats_value; + uint64_t value = 0, flags; - ats_value = AMDGPU_PTE_DEFAULT_ATC; - if (level != AMDGPU_VM_PTB) - ats_value |= AMDGPU_PDE_PTE; + flags = AMDGPU_PTE_DEFAULT_ATC; + if (level != AMDGPU_VM_PTB) { + /* Handle leaf PDEs as PTEs */ + flags |= AMDGPU_PDE_PTE; + amdgpu_gmc_get_vm_pde(adev, level, , ); + } r = vm->update_funcs->update(, bo, addr, 0, ats_entries, - 0, ats_value); + value, flags); if (r) return r; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
On 2019-04-01 7:23 a.m., Christian König wrote: > Am 30.03.19 um 01:41 schrieb Kuehling, Felix: >> Patches 1-3 are Reviewed-by: Felix Kuehling > > Thanks. > >> >> About the direct mode, that removes a bunch of synchronization, so it >> must make some assumptions about the state of the page tables. What >> makes that safe? > > Direct mode is only supposed to be used during page fault handling. > > E.g. we know that the page tables are in the correct place in this > situation because the hardware is hammering on a PTE and waiting for > it to become valid. A fence could also indicate a concurrent modification of the page table. For example a PTB may be allocated and initialized concurrently, not in direct mode. Would direct mode need to wait for a fence that indicates completion of the PTB initialization? Or do we have some way to ensure such concurrent allocation and initialization of a PTB cannot happen? Regards, Felix > > Christian. > >> Is it safe to use direct-mode on a >> per-page-table-update basis? Or do all page table updates have to go >> through direct mode to avoid hazards? If yes, then maybe this should be >> a property of the VM rather than a parameter that gets passed to a bunch >> of function calls. >> >> Regards, >> Felix >> >> On 2019-03-29 6:45 a.m., Christian König wrote: >>> Otherwise we don't correctly use translate further. >>> >>> Signed-off-by: Christian König >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - >>> 1 file changed, 8 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> index 3d221f044183..059d9802e713 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>> @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct >>> amdgpu_device *adev, >>> addr = 0; >>> if (ats_entries) { >>> - uint64_t ats_value; >>> + uint64_t value = 0, flags; >>> - ats_value = AMDGPU_PTE_DEFAULT_ATC; >>> - if (level != AMDGPU_VM_PTB) >>> - ats_value |= AMDGPU_PDE_PTE; >>> + flags = AMDGPU_PTE_DEFAULT_ATC; >>> + if (level != AMDGPU_VM_PTB) { >>> + /* Handle leaf PDEs as PTEs */ >>> + flags |= AMDGPU_PDE_PTE; >>> + amdgpu_gmc_get_vm_pde(adev, level, , ); >>> + } >>> r = vm->update_funcs->update(, bo, addr, 0, >>> ats_entries, >>> - 0, ats_value); >>> + value, flags); >>> if (r) >>> return r; > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
Am 30.03.19 um 01:41 schrieb Kuehling, Felix: Patches 1-3 are Reviewed-by: Felix Kuehling Thanks. About the direct mode, that removes a bunch of synchronization, so it must make some assumptions about the state of the page tables. What makes that safe? Direct mode is only supposed to be used during page fault handling. E.g. we know that the page tables are in the correct place in this situation because the hardware is hammering on a PTE and waiting for it to become valid. Christian. Is it safe to use direct-mode on a per-page-table-update basis? Or do all page table updates have to go through direct mode to avoid hazards? If yes, then maybe this should be a property of the VM rather than a parameter that gets passed to a bunch of function calls. Regards, Felix On 2019-03-29 6:45 a.m., Christian König wrote: Otherwise we don't correctly use translate further. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 3d221f044183..059d9802e713 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, addr = 0; if (ats_entries) { - uint64_t ats_value; + uint64_t value = 0, flags; - ats_value = AMDGPU_PTE_DEFAULT_ATC; - if (level != AMDGPU_VM_PTB) - ats_value |= AMDGPU_PDE_PTE; + flags = AMDGPU_PTE_DEFAULT_ATC; + if (level != AMDGPU_VM_PTB) { + /* Handle leaf PDEs as PTEs */ + flags |= AMDGPU_PDE_PTE; + amdgpu_gmc_get_vm_pde(adev, level, , ); + } r = vm->update_funcs->update(, bo, addr, 0, ats_entries, -0, ats_value); +value, flags); if (r) return r; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen
Patches 1-3 are Reviewed-by: Felix Kuehling About the direct mode, that removes a bunch of synchronization, so it must make some assumptions about the state of the page tables. What makes that safe? Is it safe to use direct-mode on a per-page-table-update basis? Or do all page table updates have to go through direct mode to avoid hazards? If yes, then maybe this should be a property of the VM rather than a parameter that gets passed to a bunch of function calls. Regards, Felix On 2019-03-29 6:45 a.m., Christian König wrote: > Otherwise we don't correctly use translate further. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 3d221f044183..059d9802e713 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device > *adev, > > addr = 0; > if (ats_entries) { > - uint64_t ats_value; > + uint64_t value = 0, flags; > > - ats_value = AMDGPU_PTE_DEFAULT_ATC; > - if (level != AMDGPU_VM_PTB) > - ats_value |= AMDGPU_PDE_PTE; > + flags = AMDGPU_PTE_DEFAULT_ATC; > + if (level != AMDGPU_VM_PTB) { > + /* Handle leaf PDEs as PTEs */ > + flags |= AMDGPU_PDE_PTE; > + amdgpu_gmc_get_vm_pde(adev, level, , ); > + } > > r = vm->update_funcs->update(, bo, addr, 0, ats_entries, > - 0, ats_value); > + value, flags); > if (r) > return r; > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx