Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)

2021-08-17 Thread Andrey Grodzovsky



On 2021-08-02 1:16 a.m., Guchun Chen wrote:

In amdgpu_fence_driver_hw_fini, there is no need to call drm_sched_fini to
stop the scheduler in the s3 test; otherwise, fence-related failures will
arrive after resume. To fix this, and for a cleaner teardown, move
drm_sched_fini from fence_hw_fini to fence_sw_fini, as it is part of driver
shutdown and should never be called in hw_fini.

v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
to keep sw_init and sw_fini paired.

Fixes: cd87a6dcf6af ("drm/amdgpu: adjust fence driver enable sequence")
Suggested-by: Christian König 
Signed-off-by: Guchun Chen 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 12 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  4 ++--
  3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b1d2dc39e8be..9e53ff851496 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3646,9 +3646,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  
  fence_driver_init:

/* Fence driver */
-   r = amdgpu_fence_driver_init(adev);
+   r = amdgpu_fence_driver_sw_init(adev);
if (r) {
-   dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");
+   dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n");
	amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 0, 0);
goto failed;
}
@@ -3988,7 +3988,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
}
amdgpu_fence_driver_hw_init(adev);
  
-

r = amdgpu_device_ip_late_init(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 49c5c7331c53..7495911516c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
  }
  
  /**

- * amdgpu_fence_driver_init - init the fence driver
+ * amdgpu_fence_driver_sw_init - init the fence driver
   * for all possible rings.
   *
   * @adev: amdgpu device pointer
@@ -509,13 +509,13 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
   * amdgpu_fence_driver_start_ring().
   * Returns 0 for success.
   */
-int amdgpu_fence_driver_init(struct amdgpu_device *adev)
+int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev)
  {
return 0;
  }
  
  /**

- * amdgpu_fence_driver_fini - tear down the fence driver
+ * amdgpu_fence_driver_hw_fini - tear down the fence driver
   * for all possible rings.
   *
   * @adev: amdgpu device pointer
@@ -531,8 +531,7 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
  
  		if (!ring || !ring->fence_drv.initialized)

continue;
-   if (!ring->no_scheduler)
-   drm_sched_fini(&ring->sched);
+
/* You can't wait for HW to signal if it's gone */
if (!drm_dev_is_unplugged(&adev->ddev))
r = amdgpu_fence_wait_empty(ring);



Sorry for the late notice, I missed this patch. By moving drm_sched_fini
past amdgpu_fence_wait_empty, a race is created: even after you have waited
for all fences on the ring to signal, the SW scheduler will keep submitting
new jobs on the ring, so the ring won't stay empty.

For hot device removal, we also want to prevent any access to HW past PCI
removal, so that no MMIO accesses land inside the physical MMIO range that
no longer belongs to this device after its removal by the PCI core.
Stopping all the schedulers prevents any MMIO accesses done during job
submissions, and that is why drm_sched_fini was done as part of
amdgpu_fence_driver_hw_fini and not amdgpu_fence_driver_sw_fini.

Andrey


@@ -560,6 +559,9 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)
if (!ring || !ring->fence_drv.initialized)
continue;
  
+		if (!ring->no_scheduler)
+			drm_sched_fini(&ring->sched);
+
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
dma_fence_put(ring->fence_drv.fences[j]);
kfree(ring->fence_drv.fences);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 27adffa7658d..9c11ced4312c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -106,7 +106,6 @@ struct amdgpu_fence_driver {
struct dma_fence**fences;
  };
  
-int amdgpu_fence_driver_init(struct amdgpu_device *adev);

  void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
  
  int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,

@@ -115,9 +114,10 @@ int amdgpu_fence_driver_init_ring(struct 

Re: [PATCH v3 2/2] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest fails

2021-08-17 Thread Felix Kuehling
Am 2021-08-17 um 11:12 p.m. schrieb Yifan Zhang:
> [ RUN  ] KFDSVMRangeTest.PartialUnmapSysMemTest
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: Failure
> Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, &m_pBuf))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: Failure
> Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, mapFlags, 1, &m_Node))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: Failure
> Expected: ((void *)__null) != (ptr), actual: NULL vs NULL
> Segmentation fault (core dumped)
> [  ] Profile: Full Test
> [  ] HW capabilities: 0x9
>
> kernel log:
>
> [  102.029150]  ret_from_fork+0x22/0x30
> [  102.029158] ---[ end trace 15c34e782714f9a3 ]---
> [ 3613.603598] amdgpu: Address: 0x7f7149ccc000 already allocated by SVM
> [ 3613.610620] show_signal_msg: 27 callbacks suppressed
>
> There is a race with deferred actions from previous memory map
> changes (e.g. munmap). Flush pending deferred work to avoid such cases.
>
> Signed-off-by: Yifan Zhang 

The series is

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 3177c4a0e753..4de907f3e66a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1261,7 +1261,12 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
>   return -EINVAL;
>  
>  #if IS_ENABLED(CONFIG_HSA_AMD_SVM)
> + /* Flush pending deferred work to avoid racing with deferred actions
> +  * from previous memory map changes (e.g. munmap).
> +  */
> + svm_range_list_lock_and_flush_work(svms, current->mm);
>   mutex_lock(&svms->lock);
> + mmap_write_unlock(current->mm);
> if (interval_tree_iter_first(&svms->objects,
>args->va_addr >> PAGE_SHIFT,
>(args->va_addr + args->size - 1) >> PAGE_SHIFT)) {


[PATCH v3 2/2] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest fails

2021-08-17 Thread Yifan Zhang
[ RUN  ] KFDSVMRangeTest.PartialUnmapSysMemTest
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: Failure
Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, &m_pBuf))
  Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: Failure
Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, mapFlags, 1, &m_Node))
  Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: Failure
Expected: ((void *)__null) != (ptr), actual: NULL vs NULL
Segmentation fault (core dumped)
[  ] Profile: Full Test
[  ] HW capabilities: 0x9

kernel log:

[  102.029150]  ret_from_fork+0x22/0x30
[  102.029158] ---[ end trace 15c34e782714f9a3 ]---
[ 3613.603598] amdgpu: Address: 0x7f7149ccc000 already allocated by SVM
[ 3613.610620] show_signal_msg: 27 callbacks suppressed

There is a race with deferred actions from previous memory map
changes (e.g. munmap). Flush pending deferred work to avoid such cases.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 3177c4a0e753..4de907f3e66a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1261,7 +1261,12 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
return -EINVAL;
 
 #if IS_ENABLED(CONFIG_HSA_AMD_SVM)
+   /* Flush pending deferred work to avoid racing with deferred actions
+* from previous memory map changes (e.g. munmap).
+*/
+   svm_range_list_lock_and_flush_work(svms, current->mm);
	mutex_lock(&svms->lock);
+   mmap_write_unlock(current->mm);
	if (interval_tree_iter_first(&svms->objects,
				     args->va_addr >> PAGE_SHIFT,
				     (args->va_addr + args->size - 1) >> PAGE_SHIFT)) {
-- 
2.25.1



[PATCH v3 1/2] drm/amdkfd: export svm_range_list_lock_and_flush_work

2021-08-17 Thread Yifan Zhang
Export svm_range_list_lock_and_flush_work so that other KFD parts can
synchronize with svm_range_list.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index c1833acc54c7..d4a43c94bcf9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1500,7 +1500,7 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
  * Context: Returns with mmap write lock held, pending deferred work flushed
  *
  */
-static void
+void
 svm_range_list_lock_and_flush_work(struct svm_range_list *svms,
   struct mm_struct *mm)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index 3fc1fd8b4fbc..e7fc5e8998aa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -188,6 +188,7 @@ void svm_range_prefault(struct svm_range *prange, struct 
mm_struct *mm,
void *owner);
 struct kfd_process_device *
svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device *adev);
+void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct mm_struct *mm);
 
 /* SVM API and HMM page migration work together, device memory type
  * is initialized to not 0 when page migration register device memory.
-- 
2.25.1



Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)

2021-08-17 Thread Mike Lothian
Hi

I've just noticed something similar when starting Weston. I still see it
with this patch, but not on Linus's tree.

I'll confirm for sure tomorrow and send the stack trace if I can save it

Cheers

Mike

On Tue, 3 Aug 2021 at 02:56, Chen, Guchun  wrote:

> [Public]
>
> Hi Alex,
>
> I submitted the patch before your message, I will take care of this next
> time.
>
> Regards,
> Guchun
>
> -Original Message-
> From: Alex Deucher 
> Sent: Monday, August 2, 2021 9:35 PM
> To: Chen, Guchun 
> Cc: Christian König ;
> amd-gfx@lists.freedesktop.org; Gao, Likun ; Koenig,
> Christian ; Zhang, Hawking <
> hawking.zh...@amd.com>; Deucher, Alexander 
> Subject: Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in
> s3 test (v2)
>
> On Mon, Aug 2, 2021 at 4:23 AM Chen, Guchun  wrote:
> >
> > [Public]
> >
> > Thank you, Christian.
> >
> > Regarding fence_drv.initialized, it looks to a bit redundant, anyway let
> me look into this more.
>
> Does this patch fix this bug?
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/1668
>
> If so, please add:
> Bug:
> https://gitlab.freedesktop.org/drm/amd/-/issues/1668
> to the commit message.
>
> Alex
>
> >
> > Regards,
> > Guchun
> >
> > -Original Message-
> > From: Christian König 
> > Sent: Monday, August 2, 2021 2:56 PM
> > To: Chen, Guchun ; amd-gfx@lists.freedesktop.org;
> > Gao, Likun ; Koenig, Christian
> > ; Zhang, Hawking ;
> > Deucher, Alexander 
> > Subject: Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver
> > fini in s3 test (v2)
> >
> > Am 02.08.21 um 07:16 schrieb Guchun Chen:
> > > In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to
> > > stop scheduler in s3 test, otherwise, fence related failure will
> > > arrive after resume. To fix this and for a better clean up, move
> > > drm_sched_fini from fence_hw_fini to fence_sw_fini, as it's part of
> > > driver shutdown, and should never be called in hw_fini.
> > >
> > > v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
> > > to keep sw_init and sw_fini paired.
> > >
> > > Fixes: cd87a6dcf6af drm/amdgpu: adjust fence driver enable sequence
> > > Suggested-by: Christian König 
> > > Signed-off-by: Guchun Chen 
> >
> > It's a bit ambiguous now what fence_drv.initialized means, but I think
> we can live with that for now.
> >
> > Patch is Reviewed-by: Christian König .
> >
> > Regards,
> > Christian.
> >
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 ++---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 12 +++-
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  4 ++--
> > >   3 files changed, 11 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index b1d2dc39e8be..9e53ff851496 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -3646,9 +3646,9 @@ int amdgpu_device_init(struct amdgpu_device
> > > *adev,
> > >
> > >   fence_driver_init:
> > >   /* Fence driver */
> > > - r = amdgpu_fence_driver_init(adev);
> > > + r = amdgpu_fence_driver_sw_init(adev);
> > >   if (r) {
> > > - dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");
> > > + dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n");
> > >   amdgpu_vf_error_put(adev,
> AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 0, 0);
> > >   goto failed;
> > >   }
> > > @@ -3988,7 +3988,6 @@ int amdgpu_device_resume(struct drm_device *dev,
> bool fbcon)
> > >   }
> > >   amdgpu_fence_driver_hw_init(adev);
> > >
> > > -
> > >   r = amdgpu_device_ip_late_init(adev);
> > >   if (r)
> > >   return r;
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > index 49c5c7331c53..7495911516c2 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > @@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct
> amdgpu_ring *ring,
> > >   }
> > >
> > >   /**
> > > - * amdgpu_fence_driver_init - init the fence driver
> > > + * amdgpu_fence_driver_sw_init - init the 

Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-17 Thread Felix Kuehling


Am 2021-08-17 um 8:01 p.m. schrieb Ralph Campbell:
> On 8/12/21 11:31 PM, Alex Sierra wrote:
>> From: Ralph Campbell 
>>
>> ZONE_DEVICE struct pages have an extra reference count that
>> complicates the
>> code for put_page() and several places in the kernel that need to
>> check the
>> reference count to see that a page is not being used (gup, compaction,
>> migration, etc.). Clean up the code so the reference count doesn't
>> need to
>> be treated specially for ZONE_DEVICE.
>>
>> v2:
>> AS: merged this patch in linux 5.11 version
>>
>> v5:
>> AS: add condition at try_grab_page to check for the zone device type,
>> while
>> page ref counter is checked less/equal to zero. In case of device
>> zone, pages
>> ref counter are initialized to zero.
>>
>> Signed-off-by: Ralph Campbell 
>> Signed-off-by: Alex Sierra 
>> ---
>>   arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
>>   drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
>>   fs/dax.c   |  4 +-
>>   include/linux/dax.h    |  2 +-
>>   include/linux/memremap.h   |  7 +--
>>   include/linux/mm.h | 13 +
>>   lib/test_hmm.c |  2 +-
>>   mm/internal.h  |  8 +++
>>   mm/memremap.c  | 68 +++---
>>   mm/migrate.c   |  5 --
>>   mm/page_alloc.c    |  3 ++
>>   mm/swap.c  | 45 ++---
>>   12 files changed, 46 insertions(+), 115 deletions(-)
>>
> I haven't seen a response to the issues I raised back at v3 of this
> series.
> https://lore.kernel.org/linux-mm/4f6dd918-d79b-1aa7-3a4c-caa67ddc2...@nvidia.com/
>
>
> Did I miss something?

I think part of the response was that we did more testing. Alex added
support for DEVICE_GENERIC pages to test_hmm and he ran DAX tests
recommended by Theodore Tso. In that testing he ran into a WARN_ON_ONCE
about a zero page refcount in try_get_page. The fix is in the latest
version of patch 2. But it's already obsolete because John Hubbard is
about to remove that function altogether.

I think the issues you raised were more uncertainty than known bugs. It
seems the fact that you can have DAX pages with 0 refcount is a feature
more than a bug.

Regards,
  Felix




Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-17 Thread Ralph Campbell

On 8/12/21 11:31 PM, Alex Sierra wrote:

From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for ZONE_DEVICE.

v2:
AS: merged this patch in linux 5.11 version

v5:
AS: add a condition at try_grab_page to check for the zone device type while
the page ref counter is checked less/equal to zero. In the case of a device
zone, page ref counters are initialized to zero.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
  arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
  fs/dax.c   |  4 +-
  include/linux/dax.h|  2 +-
  include/linux/memremap.h   |  7 +--
  include/linux/mm.h | 13 +
  lib/test_hmm.c |  2 +-
  mm/internal.h  |  8 +++
  mm/memremap.c  | 68 +++---
  mm/migrate.c   |  5 --
  mm/page_alloc.c|  3 ++
  mm/swap.c  | 45 ++---
  12 files changed, 46 insertions(+), 115 deletions(-)


I haven't seen a response to the issues I raised back at v3 of this series.
https://lore.kernel.org/linux-mm/4f6dd918-d79b-1aa7-3a4c-caa67ddc2...@nvidia.com/

Did I miss something?



Re: [PATCH][next] drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"

2021-08-17 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Aug 17, 2021 at 10:35 AM Colin King  wrote:
>
> From: Colin Ian King 
>
> There is a spelling mistake in a dev_err error message. Fix it.
>
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> index 5d2605df32e8..a0e50f23b1dd 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> @@ -311,7 +311,7 @@ static int smu_v13_0_get_pptable_from_firmware(struct 
> smu_context *smu, void **t
> version_major = le16_to_cpu(hdr->header.header_version_major);
> version_minor = le16_to_cpu(hdr->header.header_version_minor);
> if (version_major != 2) {
> -   dev_err(adev->dev, "Unsupported smu firwmare version %d.%d\n",
> +   dev_err(adev->dev, "Unsupported smu firmware version %d.%d\n",
> version_major, version_minor);
> return -EINVAL;
> }
> --
> 2.32.0
>


Re: [PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Tue, Aug 17, 2021 at 10:26:18AM -0500, Tom Lendacky wrote:
> >>/*
> >> -   * If SME is active we need to be sure that kexec pages are
> >> -   * not encrypted because when we boot to the new kernel the
> >> +   * If host memory encryption is active we need to be sure that kexec
> >> +   * pages are not encrypted because when we boot to the new kernel the
> >> * pages won't be accessed encrypted (initially).
> >> */
> > 
> > That hunk belongs logically into the previous patch which removes
> > sme_active().
> 
> I was trying to keep the sev_active() changes separate... so even though
> it's an SME thing, I kept it here. But I can move it to the previous
> patch, it just might look strange.

Oh I meant only the comment because it is a SME-related change. But not
too important so whatever.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Tue, Aug 17, 2021 at 09:46:58AM -0500, Tom Lendacky wrote:
> I'm ok with letting the TDX folks make changes to these calls to be SME or
> SEV specific, if necessary, later.

Yap, exactly. Let's add the specific stuff only when really needed.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Tue, Aug 17, 2021 at 10:22:52AM -0500, Tom Lendacky wrote:
> I can change it to be an AMD/HYGON check...  although, I'll have to check
> to see if any (very) early use of the function will work with that.

We can always change it later if really needed. It is just that I'm not
a fan of such "preemptive" changes.

> At a minimum, the check in arch/x86/kernel/head64.c will have to be
> changed or removed. I'll take a closer look.

Yeah, sme_me_mask, already discussed on IRC.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH][next] drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"

2021-08-17 Thread Colin King
From: Colin Ian King 

There is a spelling mistake in a dev_err error message. Fix it.

Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index 5d2605df32e8..a0e50f23b1dd 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -311,7 +311,7 @@ static int smu_v13_0_get_pptable_from_firmware(struct 
smu_context *smu, void **t
version_major = le16_to_cpu(hdr->header.header_version_major);
version_minor = le16_to_cpu(hdr->header.header_version_minor);
if (version_major != 2) {
-   dev_err(adev->dev, "Unsupported smu firwmare version %d.%d\n",
+   dev_err(adev->dev, "Unsupported smu firmware version %d.%d\n",
version_major, version_minor);
return -EINVAL;
}
-- 
2.32.0



[PATCH] drm/amdgpu: fix radv vulkan fps drop after s3 resume

2021-08-17 Thread Mohan Marimuthu, Yogesh
[Public]

[Why]
After s3 resume, there is a huge fps drop in games under radv. When memory
is allocated using radv_amdgpu_winsys_bo_create() with both
AMDGPU_GEM_DOMAIN_VRAM and AMDGPU_GEM_DOMAIN_GTT domains set, kernel memory
management fails to move the data back to VRAM after resume:
ttm_bo_mem_compat() returns true, so the data is never migrated.

[How]
Implement the idea suggested by Christian Koenig: during suspend, move the
data to system RAM instead of GTT. ttm_bo_mem_compat() will then return
false and the data will be moved back to VRAM.

Signed-off-by: Christian König <christian.koe...@amd.com>
Signed-off-by: Yogesh mohan marimuthu <yogesh.mohanmarimu...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 446943e32..44ec59998 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -136,7 +136,13 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
 		return;
 	case TTM_PL_VRAM:
-		if (!adev->mman.buffer_funcs_enabled) {
+		/* Move data to system memory for S3 so that while resume
+		 * ttm_bo_mem_compat() will return false and data will be
+		 * moved back to VRAM also in case of bo with both
+		 * AMDGPU_GEM_DOMAIN_GTT and AMDGPU_GEM_DOMAIN_VRAM domain
+		 * set in bo->preferred_domains.
+		 */
+		if (!adev->mman.buffer_funcs_enabled || adev->in_s3) {
 			/* Move to system memory */
 			amdgpu_bo_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU);
 		} else if (!amdgpu_gmc_vram_full_visible(&adev->gmc) &&
--
2.25.1



Re: [PATCH 2/2] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest fails

2021-08-17 Thread Felix Kuehling
Am 2021-08-17 um 5:17 a.m. schrieb Yifan Zhang:
> [ RUN  ] KFDSVMRangeTest.PartialUnmapSysMemTest
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: Failure
> Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, &m_pBuf))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: Failure
> Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, mapFlags, 1, &m_Node))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: Failure
> Expected: ((void *)__null) != (ptr), actual: NULL vs NULL
> Segmentation fault (core dumped)
> [  ] Profile: Full Test
> [  ] HW capabilities: 0x9
>
> kernel log:
>
> [  102.029150]  ret_from_fork+0x22/0x30
> [  102.029158] ---[ end trace 15c34e782714f9a3 ]---
> [ 3613.603598] amdgpu: Address: 0x7f7149ccc000 already allocated by SVM
> [ 3613.610620] show_signal_msg: 27 callbacks suppressed
>
> There is a race with deferred actions from previous memory map
> changes (e.g. munmap). Flush pending deferred work to avoid such cases.
>
> Signed-off-by: Yifan Zhang 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 3177c4a0e753..e1c4abb98b35 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1261,6 +1261,10 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
>   return -EINVAL;
>  
>  #if IS_ENABLED(CONFIG_HSA_AMD_SVM)
> + /* Flush pending deferred work to avoid racing with deferred actions
> +  * from previous memory map changes (e.g. munmap).
> +  */
> + svm_range_list_lock_and_flush_work(svms, current->mm);
>   mutex_lock(&svms->lock);
> if (interval_tree_iter_first(&svms->objects,
>args->va_addr >> PAGE_SHIFT,
> @@ -1271,6 +1275,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
>   return -EADDRINUSE;
>   }
>   mutex_unlock(&svms->lock);
> + mmap_write_unlock(current->mm);

I think you can probably drop the mmap_write_unlock just after the
mutex_lock above. There is no need to hold that lock any longer. And I
believe the locking doesn't need to be strictly nested either.

With that fixed, the series is

Reviewed-by: Felix Kuehling 


>  #endif
>   dev = kfd_device_by_id(args->gpu_id);
>   if (!dev)


[PATCH v2 6/6] drm/amd/display: Add DP 2.0 SST DC Support

2021-08-17 Thread Fangzhi Zuo
1. Retrieve 128/132b link cap.
2. 128/132b link training and payload allocation.
3. UHBR10 link rate support.

Signed-off-by: Fangzhi Zuo 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |6 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   17 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  503 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 1071 +++--
 .../drm/amd/display/dc/core/dc_link_hwss.c|  291 -
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  104 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |   17 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |  199 ++-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |1 +
 drivers/gpu/drm/amd/display/dc/dc_types.h |   15 +
 .../display/dc/dce110/dce110_hw_sequencer.c   |   86 +-
 .../amd/display/dc/dcn10/dcn10_link_encoder.c |9 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   26 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |4 +
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |   18 +-
 drivers/gpu/drm/amd/display/dc/dm_cp_psp.h|1 +
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   |2 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   22 +
 .../amd/display/dc/inc/hw_sequencer_private.h |1 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h |3 +
 .../gpu/drm/amd/display/include/dpcd_defs.h   |   14 +-
 .../amd/display/include/link_service_types.h  |   18 +-
 .../drm/amd/display/include/logger_types.h|2 +
 23 files changed, 2201 insertions(+), 229 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 6fee12c91ef5..ba142e8bbee2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -751,3 +751,9 @@ void dm_helpers_mst_enable_stream_features(const struct 
dc_stream_state *stream)
 &new_downspread.raw,
 sizeof(new_downspread));
 }
+
+void dm_set_phyd32clk(struct dc_context *ctx, int freq_khz)
+{
+   // FPGA programming for this clock in diags framework that
+   // needs to go through the dm layer, therefore leave a dummy interface here
+}
\ No newline at end of file
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 7a442fcfa6ac..29f5ce2fed36 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -186,6 +186,7 @@ static bool create_links(
int i;
int connectors_num;
struct dc_bios *bios = dc->ctx->dc_bios;
+   int dp_hpo_link_count = 0;
 
dc->link_count = 0;
 
@@ -255,6 +256,22 @@ static bool create_links(
goto failed_alloc;
}
 
+		if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment) &&
+		    dc->caps.dp_hpo &&
+		    link->dc->res_pool->res_cap->num_hpo_dp_link_encoder > 0) {
+			/* FPGA case - Allocate HPO DP link encoder */
+			if (i < link->dc->res_pool->res_cap->num_hpo_dp_link_encoder) {
+				link->hpo_dp_link_enc = link->dc->res_pool->hpo_dp_link_enc[i];
+
+				if (link->hpo_dp_link_enc == NULL) {
+					BREAK_TO_DEBUGGER();
+					goto failed_alloc;
+				}
+				link->hpo_dp_link_enc->hpd_source = link->link_enc->hpd_source;
+				link->hpo_dp_link_enc->transmitter = link->link_enc->transmitter;
+			}
+		}
+
link->link_status.dpcd_caps = >dpcd_caps;
 
enc_init.ctx = dc->ctx;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 8bd7f42a8053..293b1c6aac42 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -64,6 +64,29 @@
 
/***
  * Private functions
  
**/
+static bool add_dp_hpo_link_encoder_to_link(struct dc_link *link)
+{
+	struct hpo_dp_link_encoder *enc = resource_get_unused_hpo_dp_link_encoder(
+			link->dc->res_pool);
+
+	if (!link->hpo_dp_link_enc && enc) {
+		link->hpo_dp_link_enc = enc;
+		link->hpo_dp_link_enc->transmitter = link->link_enc->transmitter;
+		link->hpo_dp_link_enc->hpd_source = link->link_enc->hpd_source;
+	}
+
+	return (link->hpo_dp_link_enc != NULL);
+}
+
+static void remove_dp_hpo_link_encoder_from_link(struct dc_link *link)
+{
+   if (link->hpo_dp_link_enc) {
+   

[PATCH v2 5/6] drm/amd/display: Add DP 2.0 BIOS and DMUB Support

2021-08-17 Thread Fangzhi Zuo
Parse DP2 encoder caps and hpo instance from bios

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c| 8 
 drivers/gpu/drm/amd/display/dc/bios/command_table2.c  | 6 ++
 .../gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c | 4 
 drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h  | 4 
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 2 +-
 drivers/gpu/drm/amd/display/include/bios_parser_types.h   | 6 ++
 drivers/gpu/drm/amd/include/atomfirmware.h| 4 
 7 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 6dbde74c1e06..3cabcd13f71c 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -1604,6 +1604,14 @@ static enum bp_result bios_parser_get_encoder_cap_info(
ATOM_ENCODER_CAP_RECORD_HBR3_EN) ? 1 : 0;
info->HDMI_6GB_EN = (record->encodercaps &
ATOM_ENCODER_CAP_RECORD_HDMI6Gbps_EN) ? 1 : 0;
+   info->IS_DP2_CAPABLE = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_DP2) ? 1 : 0;
+   info->DP_UHBR10_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR10_EN) ? 1 : 0;
+   info->DP_UHBR13_5_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR13_5_EN) ? 1 : 0;
+   info->DP_UHBR20_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR20_EN) ? 1 : 0;
info->DP_IS_USB_C = (record->encodercaps &
ATOM_ENCODER_CAP_RECORD_USB_C_TYPE) ? 1 : 0;
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
index f1f672a997d7..29197a943aeb 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
@@ -340,6 +340,11 @@ static enum bp_result transmitter_control_v1_7(
const struct command_table_helper *cmd = bp->cmd_helper;
struct dmub_dig_transmitter_control_data_v1_7 dig_v1_7 = {0};
 
+   uint8_t hpo_instance = (uint8_t)cntl->hpo_engine_id - ENGINE_ID_HPO_0;
+
+   if (dc_is_dp_signal(cntl->signal))
+   hpo_instance = (uint8_t)cntl->hpo_engine_id - ENGINE_ID_HPO_DP_0;
+
dig_v1_7.phyid = cmd->phy_id_to_atom(cntl->transmitter);
dig_v1_7.action = (uint8_t)cntl->action;
 
@@ -353,6 +358,7 @@ static enum bp_result transmitter_control_v1_7(
dig_v1_7.hpdsel = cmd->hpd_sel_to_atom(cntl->hpd_sel);
dig_v1_7.digfe_sel = cmd->dig_encoder_sel_to_atom(cntl->engine_id);
dig_v1_7.connobj_id = (uint8_t)cntl->connector_obj_id.id;
+   dig_v1_7.HPO_instance = hpo_instance;
dig_v1_7.symclk_units.symclk_10khz = cntl->pixel_clock/10;
 
if (cntl->action == TRANSMITTER_CONTROL_ENABLE ||
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
index 46ea39f5ef8d..6f3c2fb60790 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
@@ -192,6 +192,10 @@ void dcn30_link_encoder_construct(
enc10->base.features.flags.bits.IS_HBR3_CAPABLE =
bp_cap_info.DP_HBR3_EN;
enc10->base.features.flags.bits.HDMI_6GB_EN = bp_cap_info.HDMI_6GB_EN;
+   enc10->base.features.flags.bits.IS_DP2_CAPABLE = bp_cap_info.IS_DP2_CAPABLE;
+   enc10->base.features.flags.bits.IS_UHBR10_CAPABLE = bp_cap_info.DP_UHBR10_EN;
+   enc10->base.features.flags.bits.IS_UHBR13_5_CAPABLE = bp_cap_info.DP_UHBR13_5_EN;
+   enc10->base.features.flags.bits.IS_UHBR20_CAPABLE = bp_cap_info.DP_UHBR20_EN;
enc10->base.features.flags.bits.DP_IS_USB_C =
bp_cap_info.DP_IS_USB_C;
} else {
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
index 58db885e4d12..67b911b6d273 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
@@ -59,6 +59,10 @@ struct encoder_feature_support {
uint32_t IS_TPS3_CAPABLE:1;
uint32_t IS_TPS4_CAPABLE:1;
uint32_t HDMI_6GB_EN:1;
+   uint32_t IS_DP2_CAPABLE:1;
+   uint32_t IS_UHBR10_CAPABLE:1;
+   uint32_t IS_UHBR13_5_CAPABLE:1;
+   uint32_t IS_UHBR20_CAPABLE:1;
uint32_t DP_IS_USB_C:1;
} bits;
uint32_t raw;
diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 

[PATCH v2 4/6] drm/amd/display: Add DP 2.0 DCCG

2021-08-17 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |        |
     |         |        |
HPO =|=========|========|=====
  |  |         |        |
  |  |         v        |
  |  |      +-----+     |
  |  |      | APG |     |
  |  |      +-----+     |
  |  |         |        |
  |  v         v        v
  | +---------------------+
  | | HPO Stream Encoder  |
  | +---------------------+
  |            |
  |            v
  |  +--------------------+
  |  |  HPO Link Encoder  |
  |  +--------------------+
  |            |
  v ===========|=============
               v
     +------------------+
     |  DIO Output Mux  |
     +------------------+
               |
               v
            +-----+
            | PHY |
            +-----+
               | PHYD32CLK[0]
               v
            +------+
            | DCCG |
            +------+
               |
               v
            SYMCLK32

Signed-off-by: Fangzhi Zuo 
---
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c | 162 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.h |  18 ++
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  |  21 +++
 .../amd/display/dc/inc/hw/timing_generator.h  |   1 +
 4 files changed, 202 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
index 696c9307715d..9896adf67425 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
@@ -42,6 +42,155 @@
 #define DC_LOGGER \
dccg->ctx->logger
 
+void dccg31_set_dpstreamclk(
+   struct dccg *dccg,
+   enum hdmistreamclk_source src,
+   int otg_inst)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* enabled to select one of the DTBCLKs for pipe */
+   switch (otg_inst) {
+   case 0:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE0_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 1:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE1_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 2:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE2_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 3:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE3_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+}
+
+void dccg31_enable_symclk32_se(
+   struct dccg *dccg,
+   int hpo_se_inst,
+   enum phyd32clk_clock_source phyd32clk)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* select one of the PHYD32CLKs as the source for symclk32_se */
+   switch (hpo_se_inst) {
+   case 0:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE0_SRC_SEL, phyd32clk,
+   SYMCLK32_SE0_EN, 1);
+   break;
+   case 1:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE1_SRC_SEL, phyd32clk,
+   SYMCLK32_SE1_EN, 1);
+   break;
+   case 2:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE2_SRC_SEL, phyd32clk,
+   SYMCLK32_SE2_EN, 1);
+   break;
+   case 3:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE3_SRC_SEL, phyd32clk,
+   SYMCLK32_SE3_EN, 1);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+}
+
+void dccg31_disable_symclk32_se(
+   struct dccg *dccg,
+   int hpo_se_inst)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* set refclk as the source for symclk32_se */
+   switch (hpo_se_inst) {
+   case 0:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE0_SRC_SEL, 0,
+   SYMCLK32_SE0_EN, 0);
+   break;
+   case 1:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE1_SRC_SEL, 0,
+   SYMCLK32_SE1_EN, 0);
+   break;
+   case 2:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE2_SRC_SEL, 0,
+   SYMCLK32_SE2_EN, 0);
+   

[PATCH v2 3/6] drm/amd/display: Add DP 2.0 HPO Link Encoder

2021-08-17 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |        |
     |         |        |
HPO =|=========|========|=====
  |  |         |        |
  |  |         v        |
  |  |      +-----+     |
  |  |      | APG |     |
  |  |      +-----+     |
  |  |         |        |
  |  v         v        v
  | +---------------------+
  | | HPO Stream Encoder  |
  | +---------------------+
  |            |
  |            v
  |  +--------------------+
  |  |  HPO Link Encoder  |
  v  +--------------------+

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   2 +
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  32 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../display/dc/dcn31/dcn31_dio_link_encoder.c |   4 +
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.c  | 620 ++
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.h  | 222 +++
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  50 ++
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   2 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   1 +
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |  85 +++
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   5 +
 .../amd/display/include/link_service_types.h  |  13 +-
 12 files changed, 1036 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.h

diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h b/drivers/gpu/drm/amd/display/dc/dc_link.h
index 83845d006c54..0ed7085384f0 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -45,6 +45,8 @@ struct dc_link_status {
 struct link_mst_stream_allocation {
/* DIG front */
const struct stream_encoder *stream_enc;
+   /* HPO DP Stream Encoder */
+   const struct hpo_dp_stream_encoder *hpo_dp_stream_enc;
/* associate DRM payload table with DC stream encoder */
uint8_t vcp_id;
/* number of slots required for the DP stream in transport packet */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index be98f5513fe5..70d47773c23c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -497,6 +497,38 @@ void dcn10_log_hw_state(struct dc *dc,
 
DTN_INFO("\n");
}
+
+   /* log DP HPO L_ENC section if any hpo_dp_link_enc exists */
+   for (i = 0; i < dc->link_count; i++)
+   if (dc->links[i]->hpo_dp_link_enc)
+   hpo_dp_link_enc_count++;
+
+   if (hpo_dp_link_enc_count) {
+   DTN_INFO("DP HPO L_ENC:  Enabled  Mode   Lanes   Stream   Slots   VC Rate X   VC Rate Y\n");
+
+   for (i = 0; i < dc->link_count; i++) {
+   struct hpo_dp_link_encoder *hpo_dp_link_enc = dc->links[i]->hpo_dp_link_enc;
+   struct hpo_dp_link_enc_state hpo_dp_le_state = {0};
+
+   if (hpo_dp_link_enc && hpo_dp_link_enc->funcs->read_state) {
+   hpo_dp_link_enc->funcs->read_state(hpo_dp_link_enc, &hpo_dp_le_state);
+   DTN_INFO("[%d]: %d  %6s %d    %d  %d %d %d\n",
+   hpo_dp_link_enc->inst,
+   hpo_dp_le_state.link_enc_enabled,
+   (hpo_dp_le_state.link_mode == 0) ? "TPS1" :
+   (hpo_dp_le_state.link_mode == 1) ? "TPS2" :
+   (hpo_dp_le_state.link_mode == 2) ? "ACTIVE" : "TEST",
+   hpo_dp_le_state.lane_count,
+   hpo_dp_le_state.stream_src[0],
+   hpo_dp_le_state.slot_count[0],
+   hpo_dp_le_state.vc_rate_x[0],
+   hpo_dp_le_state.vc_rate_y[0]);
+   DTN_INFO("\n");
+   }
+   }
+
+   DTN_INFO("\n");
+   }
}
 
DTN_INFO_END();
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index 

[PATCH v2 2/6] drm/amd/display: Add DP 2.0 HPO Stream Encoder

2021-08-17 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |        |
     |         |        |
HPO =|=========|========|=====
  |  |         |        |
  |  |         v        |
  |  |      +-----+     |
  |  |      | APG |     |
  |  |      +-----+     |
  |  |         |        |
  v  v         v        v
 +------------------------+
 |   HPO Stream Encoder   |
 +------------------------+

Signed-off-by: Fangzhi Zuo 
---
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  33 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c| 749 ++
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h| 241 ++
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  85 ++
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   4 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   1 +
 .../amd/display/dc/inc/hw/stream_encoder.h|  79 ++
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   4 +
 .../amd/display/include/grph_object_defs.h|  10 +
 .../drm/amd/display/include/grph_object_id.h  |   6 +
 11 files changed, 1213 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index df8a7718a85f..be98f5513fe5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -466,6 +466,39 @@ void dcn10_log_hw_state(struct dc *dc,
 
log_mpc_crc(dc, log_ctx);
 
+   {
+   int hpo_dp_link_enc_count = 0;
+
+   if (pool->hpo_dp_stream_enc_count > 0) {
+   DTN_INFO("DP HPO S_ENC:  Enabled  OTG   Format   Depth   Vid   SDP   Compressed  Link\n");
+   for (i = 0; i < pool->hpo_dp_stream_enc_count; i++) {
+   struct hpo_dp_stream_encoder_state hpo_dp_se_state = {0};
+   struct hpo_dp_stream_encoder *hpo_dp_stream_enc = pool->hpo_dp_stream_enc[i];
+
+   if (hpo_dp_stream_enc && hpo_dp_stream_enc->funcs->read_state) {
+   hpo_dp_stream_enc->funcs->read_state(hpo_dp_stream_enc, &hpo_dp_se_state);
+
+   DTN_INFO("[%d]: %d    %d   %6s   %d %d %d    %d %d\n",
+   hpo_dp_stream_enc->id - ENGINE_ID_HPO_DP_0,
+   hpo_dp_se_state.stream_enc_enabled,
+   hpo_dp_se_state.otg_inst,
+   (hpo_dp_se_state.pixel_encoding == 0) ? "4:4:4" :
+   ((hpo_dp_se_state.pixel_encoding == 1) ? "4:2:2" :
+   (hpo_dp_se_state.pixel_encoding == 2) ? "4:2:0" : "Y-Only"),
+   (hpo_dp_se_state.component_depth == 0) ? 6 :
+   ((hpo_dp_se_state.component_depth == 1) ? 8 :
+   (hpo_dp_se_state.component_depth == 2) ? 10 : 12),
+   hpo_dp_se_state.vid_stream_enabled,
+   hpo_dp_se_state.sdp_enabled,
+   hpo_dp_se_state.compressed_format,
+   hpo_dp_se_state.mapped_to_link_enc);
+   }
+   }
+
+   DTN_INFO("\n");
+   }
+   }
+
DTN_INFO_END();
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index bc2087f6dcb2..8b811f589524 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -12,7 +12,7 @@
 
 DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o dcn31_hubp.o \
dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o \
-   dcn31_apg.o
+   dcn31_apg.o dcn31_hpo_dp_stream_encoder.o
 
 ifdef CONFIG_X86
 CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -msse
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
new file mode 100644
index ..bc265ee06824

[PATCH v2 1/6] drm/amd/display: Add DP 2.0 Audio Package Generator

2021-08-17 Thread Fangzhi Zuo
HW Blocks:

+-----+
| HDA |
+-----+
   |
   |
HPO ===|===
 |     v
 |  +-----+
 |  | APG |
 v  +-----+

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   3 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.c  | 173 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.h  | 115 
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  38 
 4 files changed, 328 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.h

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index 4bab97acb155..bc2087f6dcb2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -11,7 +11,8 @@
 # Makefile for dcn31.
 
 DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o dcn31_hubp.o \
-   dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o
+   dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o \
+   dcn31_apg.o
 
 ifdef CONFIG_X86
 CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -msse
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
new file mode 100644
index ..6bd7a0626665
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ *  and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+
+#include "dc_bios_types.h"
+#include "hw_shared.h"
+#include "dcn31_apg.h"
+#include "reg_helper.h"
+
+#define DC_LOGGER \
+   apg31->base.ctx->logger
+
+#define REG(reg)\
+   (apg31->regs->reg)
+
+#undef FN
+#define FN(reg_name, field_name) \
+   apg31->apg_shift->field_name, apg31->apg_mask->field_name
+
+
+#define CTX \
+   apg31->base.ctx
+
+
+static void apg31_enable(
+   struct apg *apg)
+{
+   struct dcn31_apg *apg31 = DCN31_APG_FROM_APG(apg);
+
+   /* Reset APG */
+   REG_UPDATE(APG_CONTROL, APG_RESET, 1);
+   REG_WAIT(APG_CONTROL,
+   APG_RESET_DONE, 1,
+   1, 10);
+   REG_UPDATE(APG_CONTROL, APG_RESET, 0);
+   REG_WAIT(APG_CONTROL,
+   APG_RESET_DONE, 0,
+   1, 10);
+
+   /* Enable APG */
+   REG_UPDATE(APG_CONTROL2, APG_ENABLE, 1);
+}
+
+static void apg31_disable(
+   struct apg *apg)
+{
+   struct dcn31_apg *apg31 = DCN31_APG_FROM_APG(apg);
+
+   /* Disable APG */
+   REG_UPDATE(APG_CONTROL2, APG_ENABLE, 0);
+}
+
+static union audio_cea_channels speakers_to_channels(
+   struct audio_speaker_flags speaker_flags)
+{
+   union audio_cea_channels cea_channels = {0};
+
+   /* these are one to one */
+   cea_channels.channels.FL = speaker_flags.FL_FR;
+   cea_channels.channels.FR = speaker_flags.FL_FR;
+   cea_channels.channels.LFE = speaker_flags.LFE;
+   cea_channels.channels.FC = speaker_flags.FC;
+
+   /* if Rear Left and Right exist move RC speaker to channel 7
+* otherwise to channel 5
+*/
+   if (speaker_flags.RL_RR) {
+   cea_channels.channels.RL_RC = speaker_flags.RL_RR;
+   cea_channels.channels.RR = speaker_flags.RL_RR;
+   cea_channels.channels.RC_RLC_FLC = speaker_flags.RC;
+   } else {
+   cea_channels.channels.RL_RC = speaker_flags.RC;
+   }
+
+   /* FRONT Left Right Center and REAR Left Right Center are exclusive */
+   if (speaker_flags.FLC_FRC) {
+   cea_channels.channels.RC_RLC_FLC = speaker_flags.FLC_FRC;
+   

[PATCH v2 0/6] Add DP 2.0 SST Support

2021-08-17 Thread Fangzhi Zuo
The patch series adds SST UHBR10 support

Fangzhi Zuo (6):
  drm/amd/display: Add DP 2.0 Audio Package Generator
  drm/amd/display: Add DP 2.0 HPO Stream Encoder
  drm/amd/display: Add DP 2.0 HPO Link Encoder
  drm/amd/display: Add DP 2.0 DCCG
  drm/amd/display: Add DP 2.0 BIOS and DMUB Support
  drm/amd/display: Add DP 2.0 SST DC Support

 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |6 +
 .../drm/amd/display/dc/bios/bios_parser2.c|8 +
 .../drm/amd/display/dc/bios/command_table2.c  |6 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   17 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  503 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 1071 +++--
 .../drm/amd/display/dc/core/dc_link_hwss.c|  291 -
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  104 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |   17 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |  199 ++-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |3 +
 drivers/gpu/drm/amd/display/dc/dc_types.h |   15 +
 .../display/dc/dce110/dce110_hw_sequencer.c   |   86 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |   65 +
 .../amd/display/dc/dcn10/dcn10_link_encoder.c |9 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   26 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |4 +
 .../display/dc/dcn30/dcn30_dio_link_encoder.c |4 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |3 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.c  |  173 +++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.h  |  115 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c |  162 +++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.h |   18 +
 .../display/dc/dcn31/dcn31_dio_link_encoder.c |4 +
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.c  |  620 ++
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.h  |  222 
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c|  749 
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h|  241 
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  181 +++
 drivers/gpu/drm/amd/display/dc/dm_cp_psp.h|1 +
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   |2 +
 .../gpu/drm/amd/display/dc/inc/core_types.h   |6 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   22 +
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  |   21 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |2 +
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |   89 ++
 .../amd/display/dc/inc/hw/stream_encoder.h|   79 ++
 .../amd/display/dc/inc/hw/timing_generator.h  |1 +
 .../amd/display/dc/inc/hw_sequencer_private.h |1 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   12 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |2 +-
 .../amd/display/include/bios_parser_types.h   |6 +
 .../gpu/drm/amd/display/include/dpcd_defs.h   |   14 +-
 .../amd/display/include/grph_object_defs.h|   10 +
 .../drm/amd/display/include/grph_object_id.h  |6 +
 .../amd/display/include/link_service_types.h  |   31 +-
 .../drm/amd/display/include/logger_types.h|2 +
 drivers/gpu/drm/amd/include/atomfirmware.h|4 +
 48 files changed, 5006 insertions(+), 227 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h

-- 
2.25.1



Re: [PATCH] drm/amd/amdgpu:flush ttm delayed work before cancel_sync

2021-08-17 Thread Andrey Grodzovsky

Looks reasonable to me.

Reviewed-by: Andrey Grodzovsky 

Andrey

On 2021-08-17 5:50 a.m., YuBiao Wang wrote:

[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.

[How]
We should flush delayed work to make sure the delay deleting is done.

Signed-off-by: YuBiao Wang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4d266c40382c..0b5764aa98a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3824,8 +3824,10 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
  {
dev_info(adev->dev, "amdgpu: finishing device.\n");
	flush_delayed_work(&adev->delayed_init_work);
-   if (adev->mman.initialized)
+   if (adev->mman.initialized) {
+   flush_delayed_work(&adev->mman.bdev.wq);
	ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+   }
adev->shutdown = true;
  
  	/* make sure IB test finished before entering exclusive mode


Re: [PATCH 2/6] drm/amd/display: Add DP 2.0 HPO Stream Encoder

2021-08-17 Thread Kazlauskas, Nicholas

On 2021-08-16 4:59 p.m., Fangzhi Zuo wrote:

HW Blocks:

 +--------+  +-----+  +------+
 |  OPTC  |  | HDA |  | HUBP |
 +--------+  +-----+  +------+
      |         |        |
      |         |        |
 HPO =|=========|========|=====
   |  |         |        |
   |  |         v        |
   |  |      +-----+     |
   |  |      | APG |     |
   |  |      +-----+     |
   |  |         |        |
   v  v         v        v
  +------------------------+
  |   HPO Stream Encoder   |
  +------------------------+

Signed-off-by: Fangzhi Zuo 
---
  .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  35 +
  drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
  .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c| 761 ++
  .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h| 241 ++
  .../drm/amd/display/dc/dcn31/dcn31_resource.c |  85 ++
  .../gpu/drm/amd/display/dc/inc/core_types.h   |   4 +
  .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   1 +
  .../amd/display/dc/inc/hw/stream_encoder.h|  79 ++
  drivers/gpu/drm/amd/display/dc/inc/resource.h |   4 +
  .../amd/display/include/grph_object_defs.h|  10 +
  .../drm/amd/display/include/grph_object_id.h  |   6 +
  11 files changed, 1227 insertions(+), 1 deletion(-)
   create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
   create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index df8a7718a85f..cffd9e6f44b2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -466,6 +466,41 @@ void dcn10_log_hw_state(struct dc *dc,
  
  	log_mpc_crc(dc, log_ctx);
  
+#if defined(CONFIG_DRM_AMD_DC_DCN3_1)

+   {
+   int hpo_dp_link_enc_count = 0;
+
+   if (pool->hpo_dp_stream_enc_count > 0) {
+   DTN_INFO("DP HPO S_ENC:  Enabled  OTG   Format   Depth   Vid   SDP   Compressed  Link\n");
+   for (i = 0; i < pool->hpo_dp_stream_enc_count; i++) {
+   struct hpo_dp_stream_encoder_state hpo_dp_se_state = {0};
+   struct hpo_dp_stream_encoder *hpo_dp_stream_enc = pool->hpo_dp_stream_enc[i];
+
+   if (hpo_dp_stream_enc && hpo_dp_stream_enc->funcs->read_state) {
+   hpo_dp_stream_enc->funcs->read_state(hpo_dp_stream_enc, &hpo_dp_se_state);
+
+   DTN_INFO("[%d]: %d    %d   %6s   %d %d %d    %d %d\n",
+   hpo_dp_stream_enc->id - ENGINE_ID_HPO_DP_0,
+   hpo_dp_se_state.stream_enc_enabled,
+   hpo_dp_se_state.otg_inst,
+   (hpo_dp_se_state.pixel_encoding == 0) ? "4:4:4" :
+   ((hpo_dp_se_state.pixel_encoding == 1) ? "4:2:2" :
+   (hpo_dp_se_state.pixel_encoding == 2) ? "4:2:0" : "Y-Only"),
+   (hpo_dp_se_state.component_depth == 0) ? 6 :
+   ((hpo_dp_se_state.component_depth == 1) ? 8 :
+   (hpo_dp_se_state.component_depth == 2) ? 10 : 12),
+   hpo_dp_se_state.vid_stream_enabled,
+   hpo_dp_se_state.sdp_enabled,
+   hpo_dp_se_state.compressed_format,
+   hpo_dp_se_state.mapped_to_link_enc);
+   }
+   }
+
+   DTN_INFO("\n");
+   }
+   }
+#endif
+
DTN_INFO_END();
  }
  
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile

index bc2087f6dcb2..8b811f589524 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -12,7 +12,7 @@
  
  DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o dcn31_hubp.o \

dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o \
-   dcn31_apg.o
+   dcn31_apg.o dcn31_hpo_dp_stream_encoder.o
  
  ifdef CONFIG_X86

  CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -msse
diff --git 

Re: [PATCH] drm/amdgpu: increase max xgmi physical node for aldebaran

2021-08-17 Thread Alex Deucher
Acked-by: Alex Deucher 

On Tue, Aug 17, 2021 at 12:30 AM Hawking Zhang  wrote:
>
> aldebaran supports up to 16 xgmi physical nodes.
>
> Signed-off-by: Hawking Zhang 
> Reviewed-by: John Clements 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
> index 8fca72e..497b86c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
> @@ -75,9 +75,8 @@ int gfxhub_v1_1_get_xgmi_info(struct amdgpu_device *adev)
> max_physical_node_id = 7;
> break;
> case CHIP_ALDEBARAN:
> -   /* just using duplicates for Aldebaran support, revisit later */
> -   max_num_physical_nodes   = 8;
> -   max_physical_node_id = 7;
> +   max_num_physical_nodes   = 16;
> +   max_physical_node_id = 15;
> break;
> default:
> return -EINVAL;
> --
> 2.7.4
>


Re: [PATCH v6 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-08-17 Thread Felix Kuehling
Am 2021-08-17 um 1:50 a.m. schrieb Christoph Hellwig:
> On Mon, Aug 16, 2021 at 03:00:49PM -0400, Felix Kuehling wrote:
>> Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig:
>>> On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote:
 Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback.
 Device generic type memory case is now able to free its pages properly.
>>> How is this going to work for the two existing MEMORY_DEVICE_GENERIC
>>> that now change behavior?  And which don't have a ->page_free callback
>>> at all?
>> That's a good catch. Existing drivers shouldn't need a page_free
>> callback if they didn't have one before. That means we need to add a
>> NULL-pointer check in free_device_page.
> Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/
> ->mapping = NULL).
>
> In many ways this seems like you want to bring back the DEVICE_PUBLIC
> pgmap type that was removed a while ago due to the lack of users
> instead of overloading the generic type.

I think so. I'm not clear about how DEVICE_PUBLIC differed from what
DEVICE_GENERIC is today. As I understand it, DEVICE_PUBLIC was removed
because it was unused and also known to be broken in some ways.
DEVICE_GENERIC seemed close enough to what we need, other than not being
supported in the migration helpers.

Would you see benefit in re-introducing DEVICE_PUBLIC as a distinct
memory type from DEVICE_GENERIC? What would be the benefits of making
that distinction?

Thanks,
  Felix




Re: [PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function

2021-08-17 Thread Tom Lendacky
On 8/17/21 5:24 AM, Borislav Petkov wrote:
> On Tue, Aug 17, 2021 at 12:22:33PM +0200, Borislav Petkov wrote:
>> This one wants to be part of the previous patch.
> 
> ... and the three following patches too - the treewide patch does a
> single atomic :) replacement and that's it.

Ok, I'll squash those all together.

Thanks,
Tom

> 


Re: [PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 5:02 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:25AM -0500, Tom Lendacky wrote:
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 8e7b517ad738..66ff788b79c9 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
>>  }
>>  pte = pte_offset_kernel(pmd, vaddr);
>>  
>> -if (sev_active())
>> +if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
>>  prot = PAGE_KERNEL_EXEC;
>>  
>>  set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
>> @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
>>  level4p = (pgd_t *)__va(start_pgtable);
>>  clear_page(level4p);
>>  
>> -if (sev_active()) {
>> +if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
>>  info.page_flag   |= _PAGE_ENC;
>>  info.kernpg_flag |= _PAGE_ENC;
>>  }
>> @@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
>>   */
>>  int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
>>  {
>> -if (sev_active())
>> +if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>>  return 0;
>>  
>>  /*
>> - * If SME is active we need to be sure that kexec pages are
>> - * not encrypted because when we boot to the new kernel the
>> + * If host memory encryption is active we need to be sure that kexec
>> + * pages are not encrypted because when we boot to the new kernel the
>>   * pages won't be accessed encrypted (initially).
>>   */
> 
> That hunk belongs logically into the previous patch which removes
> sme_active().

I was trying to keep the sev_active() changes separate... so even though
it's an SME thing, I kept it here. But I can move it to the previous
patch, it just might look strange.

> 
>>  return set_memory_decrypted((unsigned long)vaddr, pages);
>> @@ -583,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned 
>> int pages, gfp_t gfp)
>>  
>>  void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
>>  {
>> -if (sev_active())
>> +if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>>  return;
>>  
>>  /*
>> - * If SME is active we need to reset the pages back to being
>> - * an encrypted mapping before freeing them.
>> + * If host memory encryption is active we need to reset the pages back
>> + * to being an encrypted mapping before freeing them.
>>   */
>>  set_memory_encrypted((unsigned long)vaddr, pages);
>>  }
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e8ccab50ebf6..b69f5ac622d5 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -25,6 +25,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  #include 
>> @@ -457,7 +458,7 @@ static int has_svm(void)
>>  return 0;
>>  }
>>  
>> -if (sev_active()) {
>> +if (prot_guest_has(PATTR_SEV)) {
>>  pr_info("KVM is unsupported when running as an SEV guest\n");
>>  return 0;
> 
> Same question as for PATTR_SME. PATTR_GUEST_MEM_ENCRYPT should be enough.

Yup, I'll change them all.

> 
>> @@ -373,7 +373,7 @@ int __init early_set_memory_encrypted(unsigned long 
>> vaddr, unsigned long size)
>>   * up under SME the trampoline area cannot be encrypted, whereas under SEV
>>   * the trampoline area must be encrypted.
>>   */
>> -bool sev_active(void)
>> +static bool sev_active(void)
>>  {
>>  return sev_status & MSR_AMD64_SEV_ENABLED;
>>  }
>> @@ -382,7 +382,6 @@ static bool sme_active(void)
>>  {
>>  return sme_me_mask && !sev_active();
>>  }
>> -EXPORT_SYMBOL_GPL(sev_active);
> 
> Just get rid of it altogether.

Ok.

Thanks,
Tom

> 
> Thx.
> 


Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/15/21 9:39 AM, Borislav Petkov wrote:
> On Sun, Aug 15, 2021 at 08:53:31AM -0500, Tom Lendacky wrote:
>> It's not a cross-vendor thing as opposed to a KVM or other hypervisor
>> thing where the family doesn't have to be reported as AMD or HYGON.
> 
> What would be the use case? A HV starts a guest which is supposed to be
> encrypted using the AMD's confidential guest technology but the HV tells
> the guest that it is not running on an AMD SVM HV but something else?
> 
> Is that even an actual use case?
> 
> Or am I way off?
> 
> I know we have talked about this in the past but this still sounds
> insane.

Maybe the KVM folks have a better understanding of it...

I can change it to be an AMD/HYGON check...  although, I'll have to check
to see if any (very) early use of the function will work with that.

At a minimum, the check in arch/x86/kernel/head64.c will have to be
changed or removed. I'll take a closer look.

Thanks,
Tom

> 


Re: [PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 4:00 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:24AM -0500, Tom Lendacky wrote:
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index edc67ddf065d..5635ca9a1fbe 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
>>  struct boot_params *boot_data;
>>  unsigned long cmdline_paddr;
>>  
>> -if (!sme_active())
>> +if (!amd_prot_guest_has(PATTR_SME))
>>  return;
>>  
>>  /* Get the command line address before unmapping the real_mode_data */
>> @@ -164,7 +164,7 @@ void __init sme_map_bootdata(char *real_mode_data)
>>  struct boot_params *boot_data;
>>  unsigned long cmdline_paddr;
>>  
>> -if (!sme_active())
>> +if (!amd_prot_guest_has(PATTR_SME))
>>  return;
>>  
>>  __sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
>> @@ -378,7 +378,7 @@ bool sev_active(void)
>>  return sev_status & MSR_AMD64_SEV_ENABLED;
>>  }
>>  
>> -bool sme_active(void)
>> +static bool sme_active(void)
> 
> Just get rid of it altogether. Also, there's an
> 
> EXPORT_SYMBOL_GPL(sev_active);
> 
> which needs to go under the actual function. Here's a diff on top:

Will do.

> 
> ---
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 5635ca9a1fbe..a3a2396362a5 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -364,8 +364,9 @@ int __init early_set_memory_encrypted(unsigned long 
> vaddr, unsigned long size)
>  /*
>   * SME and SEV are very similar but they are not the same, so there are
>   * times that the kernel will need to distinguish between SME and SEV. The
> - * sme_active() and sev_active() functions are used for this.  When a
> - * distinction isn't needed, the mem_encrypt_active() function can be used.
> + * PATTR_HOST_MEM_ENCRYPT and PATTR_GUEST_MEM_ENCRYPT flags to
> + * amd_prot_guest_has() are used for this. When a distinction isn't needed,
> + * the mem_encrypt_active() function can be used.
>   *
>   * The trampoline code is a good example for this requirement.  Before
>   * paging is activated, SME will access all memory as decrypted, but SEV
> @@ -377,11 +378,6 @@ bool sev_active(void)
>  {
>   return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
> -
> -static bool sme_active(void)
> -{
> - return sme_me_mask && !sev_active();
> -}
>  EXPORT_SYMBOL_GPL(sev_active);
>  
>  /* Needs to be called from non-instrumentable code */
> @@ -398,7 +394,7 @@ bool amd_prot_guest_has(unsigned int attr)
>  
>   case PATTR_SME:
>   case PATTR_HOST_MEM_ENCRYPT:
> - return sme_active();
> + return sme_me_mask && !sev_active();
>  
>   case PATTR_SEV:
>   case PATTR_GUEST_MEM_ENCRYPT:
> 
>>  {
>>  return sme_me_mask && !sev_active();
>>  }
>> @@ -428,7 +428,7 @@ bool force_dma_unencrypted(struct device *dev)
>>   * device does not support DMA to addresses that include the
>>   * encryption mask.
>>   */
>> -if (sme_active()) {
>> +if (amd_prot_guest_has(PATTR_SME)) {
> 
> So I'm not sure: you add PATTR_SME which you call with
> amd_prot_guest_has() and PATTR_HOST_MEM_ENCRYPT which you call with
> prot_guest_has() and they both end up being the same thing on AMD.
> 
> So why even bother with PATTR_SME?
> 
> This is only going to cause confusion later and I'd say let's simply use
> prot_guest_has(PATTR_HOST_MEM_ENCRYPT) everywhere...

Ok, I can do that. I was trying to ensure that anything that is truly SME
or SEV specific would be called out now.

I'm ok with letting the TDX folks make changes to these calls to be SME or
SEV specific, if necessary, later.

Thanks,
Tom

> 


Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 3:35 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the prot_guest_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the PATTR_MEM_ENCRYPT
>> attribute.
>>
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/powerpc/include/asm/protected_guest.h | 30 ++
>>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>>  2 files changed, 31 insertions(+)
>>  create mode 100644 arch/powerpc/include/asm/protected_guest.h
>>
>> diff --git a/arch/powerpc/include/asm/protected_guest.h 
>> b/arch/powerpc/include/asm/protected_guest.h
>> new file mode 100644
>> index ..ce55c2c7e534
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/protected_guest.h
>> @@ -0,0 +1,30 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Protected Guest (and Host) Capability checks
>> + *
>> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
>> + *
>> + * Author: Tom Lendacky 
>> + */
>> +
>> +#ifndef _POWERPC_PROTECTED_GUEST_H
>> +#define _POWERPC_PROTECTED_GUEST_H
>> +
>> +#include 
>> +
>> +#ifndef __ASSEMBLY__
> 
> Same thing here. Pls audit the whole set whether those __ASSEMBLY__
> guards are really needed and remove them if not.

Will do.

Thanks,
Tom

> 
> Thx.
> 


Re: [PATCH] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-17 Thread Christian König




Am 17.08.21 um 15:37 schrieb Andrey Grodzovsky:

On 2021-08-17 12:28 a.m., Jingwen Chen wrote:

[Why]
For a bailing job, this commit deletes it from the pending list, so the
bailing job never has a chance to be resubmitted, even in advance
TDR mode.

[How]
Now that embedding the hw_fence into amdgpu_job is done, the race
condition that this commit tries to work around is completely solved.
So revert this commit.
This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.



Can you elaborate please how this solves the race?
As far as I can see, with this patch reverted, in drm_sched_job_timedout
you get a pointer to the next job to process in the timed out handler;
immediately after, this job actually finishes and its fence is signaled.
This in turn triggers drm_sched_get_cleanup_job, which fetches this job
and returns to drm_sched_main, which in turn calls the free_job
callback->...->amdgpu_fence_free, which frees the job from the HW
dma_fence release callback. After that you proceed with a freed job in
the timed out handler.

If you could take the HW fence reference from drm_sched_job_timedout
before starting processing then yes, I think it would work.


Yes, precisely that's what I had in mind as well, and it seems to be
missing from this patch.


Regards,
Christian.



Andrey




Signed-off-by: Jingwen Chen 
---
  drivers/gpu/drm/scheduler/sched_main.c | 27 --
  1 file changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..31d1176d939f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -317,21 +317,10 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
-
-	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
-	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job) {
-		/*
-		 * Remove the bad job so it cannot be freed by concurrent
-		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
-		 * is parked at which point it's safe.
-		 */
-		list_del_init(&job->list);
-		spin_unlock(&sched->job_list_lock);
-
 		status = job->sched->ops->timedout_job(job);
 
 		/*
@@ -342,8 +331,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
-	} else {
-		spin_unlock(&sched->job_list_lock);
 	}
 
 	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
@@ -392,20 +379,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 
 	kthread_park(sched->thread);
 
-	/*
-	 * Reinsert back the bad job here - now it's safe as
-	 * drm_sched_get_cleanup_job cannot race against us and release the
-	 * bad job at this point - we parked (waited for) any in progress
-	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-	 * now until the scheduler thread is unparked.
-	 */
-	if (bad && bad->sched == sched)
-		/*
-		 * Add at the head of the queue to reflect it was the earliest
-		 * job extracted.
-		 */
-		list_add(&bad->list, &sched->pending_list);
-
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from pending list if they already




Re: [PATCH] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-17 Thread Andrey Grodzovsky

On 2021-08-17 12:28 a.m., Jingwen Chen wrote:

[Why]
For a bailing job, this commit deletes it from the pending list, so the
bailing job never has a chance to be resubmitted, even in advance
TDR mode.

[How]
Now that embedding the hw_fence into amdgpu_job is done, the race
condition that this commit tries to work around is completely solved.
So revert this commit.
This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.



Can you elaborate please how this solves the race?
As far as I can see, with this patch reverted, in drm_sched_job_timedout
you get a pointer to the next job to process in the timed out handler;
immediately after, this job actually finishes and its fence is signaled.
This in turn triggers drm_sched_get_cleanup_job, which fetches this job
and returns to drm_sched_main, which in turn calls the free_job
callback->...->amdgpu_fence_free, which frees the job from the HW
dma_fence release callback. After that you proceed with a freed job in
the timed out handler.

If you could take the HW fence reference from drm_sched_job_timedout
before starting processing then yes, I think it would work.

Andrey




Signed-off-by: Jingwen Chen 
---
  drivers/gpu/drm/scheduler/sched_main.c | 27 --
  1 file changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..31d1176d939f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -317,21 +317,10 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
-
-	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
-	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job) {
-		/*
-		 * Remove the bad job so it cannot be freed by concurrent
-		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
-		 * is parked at which point it's safe.
-		 */
-		list_del_init(&job->list);
-		spin_unlock(&sched->job_list_lock);
-
 		status = job->sched->ops->timedout_job(job);
 
 		/*
@@ -342,8 +331,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
-	} else {
-		spin_unlock(&sched->job_list_lock);
 	}
 
 	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
@@ -392,20 +379,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 
 	kthread_park(sched->thread);
 
-	/*
-	 * Reinsert back the bad job here - now it's safe as
-	 * drm_sched_get_cleanup_job cannot race against us and release the
-	 * bad job at this point - we parked (waited for) any in progress
-	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-	 * now until the scheduler thread is unparked.
-	 */
-	if (bad && bad->sched == sched)
-		/*
-		 * Add at the head of the queue to reflect it was the earliest
-		 * job extracted.
-		 */
-		list_add(&bad->list, &sched->pending_list);
-
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from pending list if they already


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 5:19 PM, Lazar, Lijo wrote:



On 8/17/2021 4:36 PM, Michel Dänzer wrote:

On 2021-08-17 12:37 p.m., Lazar, Lijo wrote:



On 8/17/2021 3:29 PM, Michel Dänzer wrote:

On 2021-08-17 11:37 a.m., Lazar, Lijo wrote:



On 8/17/2021 2:56 PM, Michel Dänzer wrote:

On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:



On 8/17/2021 1:53 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
 	struct amdgpu_device *adev =
 		container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work);
 
-	mutex_lock(&adev->gfx.gfx_off_mutex);
-	if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
-			adev->gfx.gfx_off_state = true;
-	}
-	mutex_unlock(&adev->gfx.gfx_off_mutex);
+	WARN_ON_ONCE(adev->gfx.gfx_off_state);
+	WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+	if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
+		adev->gfx.gfx_off_state = true;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
 
 	mutex_lock(&adev->gfx.gfx_off_mutex);
 
-	if (!enable)
-		adev->gfx.gfx_off_req_count++;
-	else if (adev->gfx.gfx_off_req_count > 0)
+	if (enable) {
+		/* If the count is already 0, it means there's an imbalance bug somewhere.
+		 * Note that the bug may be in a different caller than the one which triggers the
+		 * WARN_ON_ONCE.
+		 */
+		if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+			goto unlock;
+
 		adev->gfx.gfx_off_req_count--;
 
-	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
-	} else if (!enable && adev->gfx.gfx_off_state) {
-		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
-			adev->gfx.gfx_off_state = false;
+		if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
+			schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
+	} else {
+		if (adev->gfx.gfx_off_req_count == 0) {
+			cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+			if (adev->gfx.gfx_off_state &&


More of a question which I didn't check last time - Is this 
expected to be true when the disable call comes in first?


My assumption is that cancel_delayed_work_sync guarantees 
amdgpu_device_delay_enable_gfx_off's assignment is visible here.




To clarify - when nothing is scheduled. If enable() is called when 
the count is 0, it goes to unlock. Now the expectation is that 
disable is called first.


Yes, the very first amdgpu_gfx_off_ctrl call must pass enable=false, 
or it's a bug, which


  if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))

will catch.


Let's say  

Re: [PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function

2021-08-17 Thread Borislav Petkov
On Fri, Aug 13, 2021 at 11:59:28AM -0500, Tom Lendacky wrote:
> The mem_encrypt_active() function has been replaced by prot_guest_has(),
> so remove the implementation.
> 
> Reviewed-by: Joerg Roedel 
> Signed-off-by: Tom Lendacky 
> ---
>  include/linux/mem_encrypt.h | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
> index 5c4a18a91f89..ae4526389261 100644
> --- a/include/linux/mem_encrypt.h
> +++ b/include/linux/mem_encrypt.h
> @@ -16,10 +16,6 @@
>  
>  #include 
>  
> -#else/* !CONFIG_ARCH_HAS_MEM_ENCRYPT */
> -
> -static inline bool mem_encrypt_active(void) { return false; }
> -
>  #endif   /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
>  
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
> -- 

This one wants to be part of the previous patch.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 07/12] x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Fri, Aug 13, 2021 at 11:59:26AM -0500, Tom Lendacky wrote:
> Replace occurrences of sev_es_active() with the more generic
> prot_guest_has() using PATTR_GUEST_PROT_STATE, except for in
> arch/x86/kernel/sev*.c and arch/x86/mm/mem_encrypt*.c where PATTR_SEV_ES
> will be used. If future support is added for other memory encryption
> technologies, the use of PATTR_GUEST_PROT_STATE can be updated, as
> required, to specifically use PATTR_SEV_ES.
> 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/mem_encrypt.h | 2 --
>  arch/x86/kernel/sev.c  | 6 +++---
>  arch/x86/mm/mem_encrypt.c  | 7 +++
>  arch/x86/realmode/init.c   | 3 +--
>  4 files changed, 7 insertions(+), 11 deletions(-)

Same comments to this one as for the previous two.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
> Introduce a powerpc version of the prot_guest_has() function. This will
> be used to replace the powerpc mem_encrypt_active() implementation, so
> the implementation will initially only support the PATTR_MEM_ENCRYPT
> attribute.
> 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/powerpc/include/asm/protected_guest.h | 30 ++
>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>  2 files changed, 31 insertions(+)
>  create mode 100644 arch/powerpc/include/asm/protected_guest.h
> 
> diff --git a/arch/powerpc/include/asm/protected_guest.h 
> b/arch/powerpc/include/asm/protected_guest.h
> new file mode 100644
> index ..ce55c2c7e534
> --- /dev/null
> +++ b/arch/powerpc/include/asm/protected_guest.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Protected Guest (and Host) Capability checks
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky 
> + */
> +
> +#ifndef _POWERPC_PROTECTED_GUEST_H
> +#define _POWERPC_PROTECTED_GUEST_H
> +
> +#include 
> +
> +#ifndef __ASSEMBLY__

Same thing here. Pls audit the whole set whether those __ASSEMBLY__
guards are really needed and remove them if not.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Fri, Aug 13, 2021 at 11:59:25AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/kernel/machine_kexec_64.c 
> b/arch/x86/kernel/machine_kexec_64.c
> index 8e7b517ad738..66ff788b79c9 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, 
> pgd_t *pgd)
>   }
>   pte = pte_offset_kernel(pmd, vaddr);
>  
> - if (sev_active())
> + if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
>   prot = PAGE_KERNEL_EXEC;
>  
>   set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
> @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned 
> long start_pgtable)
>   level4p = (pgd_t *)__va(start_pgtable);
>   clear_page(level4p);
>  
> - if (sev_active()) {
> + if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
>   info.page_flag   |= _PAGE_ENC;
>   info.kernpg_flag |= _PAGE_ENC;
>   }
> @@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
>   */
>  int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
>  {
> - if (sev_active())
> + if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>   return 0;
>  
>   /*
> -  * If SME is active we need to be sure that kexec pages are
> -  * not encrypted because when we boot to the new kernel the
> +  * If host memory encryption is active we need to be sure that kexec
> +  * pages are not encrypted because when we boot to the new kernel the
>* pages won't be accessed encrypted (initially).
>*/

That hunk belongs logically into the previous patch which removes
sme_active().

>   return set_memory_decrypted((unsigned long)vaddr, pages);
> @@ -583,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned 
> int pages, gfp_t gfp)
>  
>  void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
>  {
> - if (sev_active())
> + if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>   return;
>  
>   /*
> -  * If SME is active we need to reset the pages back to being
> -  * an encrypted mapping before freeing them.
> +  * If host memory encryption is active we need to reset the pages back
> +  * to being an encrypted mapping before freeing them.
>*/
>   set_memory_encrypted((unsigned long)vaddr, pages);
>  }
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e8ccab50ebf6..b69f5ac622d5 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -457,7 +458,7 @@ static int has_svm(void)
>   return 0;
>   }
>  
> - if (sev_active()) {
> + if (prot_guest_has(PATTR_SEV)) {
>   pr_info("KVM is unsupported when running as an SEV guest\n");
>   return 0;

Same question as for PATTR_SME. PATTR_GUEST_MEM_ENCRYPT should be enough.

> @@ -373,7 +373,7 @@ int __init early_set_memory_encrypted(unsigned long 
> vaddr, unsigned long size)
>   * up under SME the trampoline area cannot be encrypted, whereas under SEV
>   * the trampoline area must be encrypted.
>   */
> -bool sev_active(void)
> +static bool sev_active(void)
>  {
>   return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
> @@ -382,7 +382,6 @@ static bool sme_active(void)
>  {
>   return sme_me_mask && !sev_active();
>  }
> -EXPORT_SYMBOL_GPL(sev_active);

Just get rid of it altogether.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-08-17 Thread Borislav Petkov
On Fri, Aug 13, 2021 at 11:59:24AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index edc67ddf065d..5635ca9a1fbe 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
>   struct boot_params *boot_data;
>   unsigned long cmdline_paddr;
>  
> - if (!sme_active())
> + if (!amd_prot_guest_has(PATTR_SME))
>   return;
>  
>   /* Get the command line address before unmapping the real_mode_data */
> @@ -164,7 +164,7 @@ void __init sme_map_bootdata(char *real_mode_data)
>   struct boot_params *boot_data;
>   unsigned long cmdline_paddr;
>  
> - if (!sme_active())
> + if (!amd_prot_guest_has(PATTR_SME))
>   return;
>  
>   __sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
> @@ -378,7 +378,7 @@ bool sev_active(void)
>   return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
>  
> -bool sme_active(void)
> +static bool sme_active(void)

Just get rid of it altogether. Also, there's an

EXPORT_SYMBOL_GPL(sev_active);

which needs to go under the actual function. Here's a diff on top:

---
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5635ca9a1fbe..a3a2396362a5 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -364,8 +364,9 @@ int __init early_set_memory_encrypted(unsigned long vaddr, 
unsigned long size)
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this.  When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * PATTR_HOST_MEM_ENCRYPT and PATTR_GUEST_MEM_ENCRYPT flags to
+ * amd_prot_guest_has() are used for this. When a distinction isn't needed,
+ * the mem_encrypt_active() function can be used.
  *
  * The trampoline code is a good example for this requirement.  Before
  * paging is activated, SME will access all memory as decrypted, but SEV
@@ -377,11 +378,6 @@ bool sev_active(void)
 {
return sev_status & MSR_AMD64_SEV_ENABLED;
 }
-
-static bool sme_active(void)
-{
-   return sme_me_mask && !sev_active();
-}
 EXPORT_SYMBOL_GPL(sev_active);
 
 /* Needs to be called from non-instrumentable code */
@@ -398,7 +394,7 @@ bool amd_prot_guest_has(unsigned int attr)
 
case PATTR_SME:
case PATTR_HOST_MEM_ENCRYPT:
-   return sme_active();
+   return sme_me_mask && !sev_active();
 
case PATTR_SEV:
case PATTR_GUEST_MEM_ENCRYPT:

>  {
>   return sme_me_mask && !sev_active();
>  }
> @@ -428,7 +428,7 @@ bool force_dma_unencrypted(struct device *dev)
>* device does not support DMA to addresses that include the
>* encryption mask.
>*/
> - if (sme_active()) {
> + if (amd_prot_guest_has(PATTR_SME)) {

So I'm not sure: you add PATTR_SME which you call with
amd_prot_guest_has() and PATTR_HOST_MEM_ENCRYPT which you call with
prot_guest_has() and they both end up being the same thing on AMD.

So why even bother with PATTR_SME?

This is only going to cause confusion later and I'd say let's simply use
prot_guest_has(PATTR_HOST_MEM_ENCRYPT) everywhere...

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-17 Thread Michael Ellerman
Tom Lendacky  writes:
> Introduce a powerpc version of the prot_guest_has() function. This will
> be used to replace the powerpc mem_encrypt_active() implementation, so
> the implementation will initially only support the PATTR_MEM_ENCRYPT
> attribute.
>
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/powerpc/include/asm/protected_guest.h | 30 ++
>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>  2 files changed, 31 insertions(+)
>  create mode 100644 arch/powerpc/include/asm/protected_guest.h
>
> diff --git a/arch/powerpc/include/asm/protected_guest.h 
> b/arch/powerpc/include/asm/protected_guest.h
> new file mode 100644
> index ..ce55c2c7e534
> --- /dev/null
> +++ b/arch/powerpc/include/asm/protected_guest.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Protected Guest (and Host) Capability checks
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky 
> + */
> +
> +#ifndef _POWERPC_PROTECTED_GUEST_H
> +#define _POWERPC_PROTECTED_GUEST_H

Minor nit, we would usually use _ASM_POWERPC_PROTECTED_GUEST_H

Otherwise looks OK to me.

Acked-by: Michael Ellerman 

cheers


Re: [PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function

2021-08-17 Thread Borislav Petkov
On Tue, Aug 17, 2021 at 12:22:33PM +0200, Borislav Petkov wrote:
> This one wants to be part of the previous patch.

... and the three following patches too - the treewide patch does a
single atomic :) replacement and that's it.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 4:36 PM, Michel Dänzer wrote:

On 2021-08-17 12:37 p.m., Lazar, Lijo wrote:



On 8/17/2021 3:29 PM, Michel Dänzer wrote:

On 2021-08-17 11:37 a.m., Lazar, Lijo wrote:



On 8/17/2021 2:56 PM, Michel Dänzer wrote:

On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:



On 8/17/2021 1:53 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
     drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
     2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
     struct amdgpu_device *adev =
     container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
     -    mutex_lock(&adev->gfx.gfx_off_mutex);
-    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
-    adev->gfx.gfx_off_state = true;
-    }
-    mutex_unlock(&adev->gfx.gfx_off_mutex);
+    WARN_ON_ONCE(adev->gfx.gfx_off_state);
+    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
+    adev->gfx.gfx_off_state = true;
     }
       /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
       mutex_lock(>gfx.gfx_off_mutex);
     -    if (!enable)
-    adev->gfx.gfx_off_req_count++;
-    else if (adev->gfx.gfx_off_req_count > 0)
+    if (enable) {
+    /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+ * Note that the bug may be in a different caller than the one which 
triggers the
+ * WARN_ON_ONCE.
+ */
+    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+    goto unlock;
+
     adev->gfx.gfx_off_req_count--;
     -    if (enable && !adev->gfx.gfx_off_state && 
!adev->gfx.gfx_off_req_count) {
-    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-    } else if (!enable && adev->gfx.gfx_off_state) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
false)) {
-    adev->gfx.gfx_off_state = false;
+    if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
+    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+    } else {
+    if (adev->gfx.gfx_off_req_count == 0) {
+    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+    if (adev->gfx.gfx_off_state &&


More of a question which I didn't check last time - Is this expected to be true 
when the disable call comes in first?


My assumption is that cancel_delayed_work_sync guarantees 
amdgpu_device_delay_enable_gfx_off's assignment is visible here.



To clarify - when nothing is scheduled. If enable() is called when the count is 
0, it goes to unlock. Now the expectation is someone to call Disable first.


Yes, the very first amdgpu_gfx_off_ctrl call must pass enable=false, or it's a 
bug, which

  if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))

will catch.



Let's say  Disable() is called first, then the variable will be false, right?


Ohh, I see 
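The transition-based scheme described in the commit message above — cancel the delayed work when the disable count goes from 0 to 1, schedule it only on the reverse transition — can be sketched as a userspace model. The field names below mirror the patch, but the workqueue and SMU calls are replaced by plain flags, so this is an illustration of the counting logic only, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of amdgpu_gfx_off_ctrl (v4): the delayed work is reduced
 * to a 'work_scheduled' flag and powergating to 'gfx_off_state'. */
struct gfx_model {
	int req_count;       /* outstanding disable requests; driver starts at 1 */
	bool gfx_off_state;  /* GFXOFF currently enabled in "HW" */
	bool work_scheduled; /* delayed work pending */
};

/* What amdgpu_device_delay_enable_gfx_off would do when the timer fires. */
static void delayed_work_run(struct gfx_model *m)
{
	m->work_scheduled = false;
	m->gfx_off_state = true; /* powergate */
}

static void gfx_off_ctrl(struct gfx_model *m, bool enable)
{
	if (enable) {
		/* count already 0 would be an imbalance bug */
		assert(m->req_count > 0);
		/* schedule only on the 1 -> 0 transition */
		if (--m->req_count == 0 && !m->gfx_off_state)
			m->work_scheduled = true; /* schedule_delayed_work */
	} else {
		/* cancel only on the 0 -> 1 transition */
		if (m->req_count == 0) {
			m->work_scheduled = false; /* cancel_delayed_work_sync */
			if (m->gfx_off_state)
				m->gfx_off_state = false; /* ungate now */
		}
		m->req_count++;
	}
}
```

Repeated disable/enable pairs within the delay window now cancel the pending work instead of letting it fire at an unexpected time, which is what fixes the stutter described above.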

Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Michel Dänzer
On 2021-08-17 12:37 p.m., Lazar, Lijo wrote:
> 
> 
> On 8/17/2021 3:29 PM, Michel Dänzer wrote:
>> On 2021-08-17 11:37 a.m., Lazar, Lijo wrote:
>>>
>>>
>>> On 8/17/2021 2:56 PM, Michel Dänzer wrote:
 On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:
>
>
> On 8/17/2021 1:53 PM, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> schedule_delayed_work does not push back the work if it was already
>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
>> was disabled and re-enabled again during those 100 ms.
>>
>> This resulted in frame drops / stutter with the upcoming mutter 41
>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
>> disabling it again (for getting the GPU clock counter).
>>
>> To fix this, call cancel_delayed_work_sync when the disable count
>> transitions from 0 to 1, and only schedule the delayed work on the
>> reverse transition, not if the disable count was already 0. This makes
>> sure the delayed work doesn't run at unexpected times, and allows it to
>> be lock-free.
>>
>> v2:
>> * Use cancel_delayed_work_sync & mutex_trylock instead of
>>  mod_delayed_work.
>> v3:
>> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
>> v4:
>> * Fix race condition between amdgpu_gfx_off_ctrl incrementing
>>  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
>>  checking for it to be 0 (Evan Quan)
>>
>> Cc: sta...@vger.kernel.org
>> Reviewed-by: Lijo Lazar  # v3
>> Acked-by: Christian König  # v3
>> Signed-off-by: Michel Dänzer 
>> ---
>>
>> Alex, probably best to wait a bit longer before picking this up. :)
>>
>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
>>     drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 
>> +++---
>>     2 files changed, 30 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index f3fd5ec710b6..f944ed858f3e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2777,12 +2777,11 @@ static void 
>> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>>     struct amdgpu_device *adev =
>>     container_of(work, struct amdgpu_device, 
>> gfx.gfx_off_delay_work.work);
>>     -    mutex_lock(&adev->gfx.gfx_off_mutex);
>> -    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
>> AMD_IP_BLOCK_TYPE_GFX, true))
>> -    adev->gfx.gfx_off_state = true;
>> -    }
>> -    mutex_unlock(&adev->gfx.gfx_off_mutex);
>> +    WARN_ON_ONCE(adev->gfx.gfx_off_state);
>> +    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
>> +
>> +    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
>> true))
>> +    adev->gfx.gfx_off_state = true;
>>     }
>>       /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index a0be0772c8b3..b4ced45301be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device 
>> *adev, bool enable)
>>       mutex_lock(>gfx.gfx_off_mutex);
>>     -    if (!enable)
>> -    adev->gfx.gfx_off_req_count++;
>> -    else if (adev->gfx.gfx_off_req_count > 0)
>> +    if (enable) {
>> +    /* If the count is already 0, it means there's an imbalance bug 
>> somewhere.
>> + * Note that the bug may be in a different caller than the one 
>> which triggers the
>> + * WARN_ON_ONCE.
>> + */
>> +    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
>> +    goto unlock;
>> +
>>     adev->gfx.gfx_off_req_count--;
>>     -    if (enable && !adev->gfx.gfx_off_state && 
>> !adev->gfx.gfx_off_req_count) {
>> -    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
>> GFX_OFF_DELAY_ENABLE);
>> -    } else if (!enable && adev->gfx.gfx_off_state) {
>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
>> AMD_IP_BLOCK_TYPE_GFX, false)) {
>> -    adev->gfx.gfx_off_state = false;
>> +    if (adev->gfx.gfx_off_req_count == 0 && 
>> !adev->gfx.gfx_off_state)
>> +    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
>> GFX_OFF_DELAY_ENABLE);
>> +    } else {
>> +    if (adev->gfx.gfx_off_req_count == 0) {
>> +    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
>> +
>> +    if 

Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 3:29 PM, Michel Dänzer wrote:

On 2021-08-17 11:37 a.m., Lazar, Lijo wrote:



On 8/17/2021 2:56 PM, Michel Dänzer wrote:

On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:



On 8/17/2021 1:53 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
     mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
     adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
     checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
    drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
    2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
    struct amdgpu_device *adev =
    container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
    -    mutex_lock(&adev->gfx.gfx_off_mutex);
-    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
-    adev->gfx.gfx_off_state = true;
-    }
-    mutex_unlock(&adev->gfx.gfx_off_mutex);
+    WARN_ON_ONCE(adev->gfx.gfx_off_state);
+    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
+    adev->gfx.gfx_off_state = true;
    }
      /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
      mutex_lock(>gfx.gfx_off_mutex);
    -    if (!enable)
-    adev->gfx.gfx_off_req_count++;
-    else if (adev->gfx.gfx_off_req_count > 0)
+    if (enable) {
+    /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+ * Note that the bug may be in a different caller than the one which 
triggers the
+ * WARN_ON_ONCE.
+ */
+    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+    goto unlock;
+
    adev->gfx.gfx_off_req_count--;
    -    if (enable && !adev->gfx.gfx_off_state && 
!adev->gfx.gfx_off_req_count) {
-    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-    } else if (!enable && adev->gfx.gfx_off_state) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
false)) {
-    adev->gfx.gfx_off_state = false;
+    if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
+    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+    } else {
+    if (adev->gfx.gfx_off_req_count == 0) {
+    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+    if (adev->gfx.gfx_off_state &&


More of a question which I didn't check last time - Is this expected to be true 
when the disable call comes in first?


My assumption is that cancel_delayed_work_sync guarantees 
amdgpu_device_delay_enable_gfx_off's assignment is visible here.



To clarify - when nothing is scheduled. If enable() is called when the count is 
0, it goes to unlock. Now the expectation is someone to call Disable first.


Yes, the very first amdgpu_gfx_off_ctrl call must pass enable=false, or it's a 
bug, which

 if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))

will catch.



Let's say  Disable() is called first, then the variable will be false, right?


Ohh, I see what you mean. The first time amdgpu_gfx_off_ctrl is called with 
enable=false, adev->gfx.gfx_off_state == 

Re: [PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE on suspend

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 3:13 PM, Quan, Evan wrote:

[AMD Official Use Only]

+Leo to share his insights


-Original Message-
From: Lazar, Lijo 
Sent: Tuesday, August 17, 2021 3:28 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Chen, Guchun
; Pan, Xinhui 
Subject: Re: [PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE
on suspend



On 8/17/2021 12:10 PM, Evan Quan wrote:

If the powergating of UVD/VCE is in progress, wait for its completion
before proceeding (suspending). This can fix some hangs observed on
suspend when UVD/VCE is still in use (e.g. issuing "pm-suspend" while a
video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan 
Signed-off-by: xinhui pan 
---
   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 5 +
   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 5 +
   2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..2fdce572baeb 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,11 @@ static int uvd_v6_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* If the powergating is in process, wait for its completion.
+*/
+   flush_delayed_work(&adev->uvd.idle_work);
+

If running idle is a prerequisite before going to suspend, then something else
is missing here.

Otherwise, the hang looks more like a pending work launched after
hardware is suspended and trying to access hardware. As the hardware is
going to be suspended anyway, doesn't it work with
cancel_delayed_work_sync - making sure that nothing is going to be
launched later to access hardware?

[Quan, Evan] The reason we chose flush_delayed_work instead of cancel_delayed_work_sync 
is that we think those operations performed in idle_work (dpm disablement, powergating) 
seem needed considering the action is 'suspend'. So, instead of "cancel", 
waiting for their completion seems more proper.


But it will do so only if the work is scheduled - so it doesn't seem to 
be a prerequisite for suspend. If it was a prerequisite, then the 
existing code is missing that (so that it gets done for all cases).




Then this may be a potential issue for other suspend calls also where work is
pending to be launched when hardware is suspended.

[Quan, Evan] Do you mean we need to check whether there is similar issue for 
other IPs?



Yes, if there are cases where other IPs may schedule a delayed work and 
call hw_fini without cancelling the work.


Thanks,
Lijo


BR
Evan


Thanks,
Lijo


r = uvd_v6_0_hw_fini(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..f0adecd5ec0b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,11 @@ static int vce_v3_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* If the powergating is in process, wait for its completion.
+*/
+   flush_delayed_work(&adev->vce.idle_work);
+
r = vce_v3_0_hw_fini(adev);
if (r)
return r;
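The flush-versus-cancel distinction debated in this thread can be reduced to a small userspace model: flush_delayed_work runs a pending callback before returning, while cancel_delayed_work_sync discards it. The functions below are simplified stand-ins for the kernel primitives, ignoring timers and concurrency:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of a delayed work item: a pending flag plus the side
 * effect its callback would have (here: powergating the block). */
struct idle_work {
	bool pending;
	bool powergated; /* set by the callback */
};

static void idle_callback(struct idle_work *w)
{
	w->powergated = true; /* dpm disablement / powergating */
}

/* flush_delayed_work(): if pending, run the callback now, then return. */
static void flush_delayed_work_model(struct idle_work *w)
{
	if (w->pending) {
		w->pending = false;
		idle_callback(w);
	}
}

/* cancel_delayed_work_sync(): drop the pending callback without running it. */
static void cancel_delayed_work_sync_model(struct idle_work *w)
{
	w->pending = false;
}
```

Under this model, Evan's choice of flush guarantees the powergating side effect still happens before hw_fini, whereas Lijo's cancel only guarantees the callback won't touch the hardware afterwards.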



Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Michel Dänzer
On 2021-08-17 11:37 a.m., Lazar, Lijo wrote:
> 
> 
> On 8/17/2021 2:56 PM, Michel Dänzer wrote:
>> On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:
>>>
>>>
>>> On 8/17/2021 1:53 PM, Michel Dänzer wrote:
 From: Michel Dänzer 

 schedule_delayed_work does not push back the work if it was already
 scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
 after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
 was disabled and re-enabled again during those 100 ms.

 This resulted in frame drops / stutter with the upcoming mutter 41
 release on Navi 14, due to constantly enabling GFXOFF in the HW and
 disabling it again (for getting the GPU clock counter).

 To fix this, call cancel_delayed_work_sync when the disable count
 transitions from 0 to 1, and only schedule the delayed work on the
 reverse transition, not if the disable count was already 0. This makes
 sure the delayed work doesn't run at unexpected times, and allows it to
 be lock-free.

 v2:
 * Use cancel_delayed_work_sync & mutex_trylock instead of
     mod_delayed_work.
 v3:
 * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
 v4:
 * Fix race condition between amdgpu_gfx_off_ctrl incrementing
     adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
     checking for it to be 0 (Evan Quan)

 Cc: sta...@vger.kernel.org
 Reviewed-by: Lijo Lazar  # v3
 Acked-by: Christian König  # v3
 Signed-off-by: Michel Dänzer 
 ---

 Alex, probably best to wait a bit longer before picking this up. :)

    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
    drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
    2 files changed, 30 insertions(+), 17 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 index f3fd5ec710b6..f944ed858f3e 100644
 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 @@ -2777,12 +2777,11 @@ static void 
 amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
    struct amdgpu_device *adev =
    container_of(work, struct amdgpu_device, 
 gfx.gfx_off_delay_work.work);
    -    mutex_lock(&adev->gfx.gfx_off_mutex);
 -    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
 -    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
 AMD_IP_BLOCK_TYPE_GFX, true))
 -    adev->gfx.gfx_off_state = true;
 -    }
 -    mutex_unlock(&adev->gfx.gfx_off_mutex);
 +    WARN_ON_ONCE(adev->gfx.gfx_off_state);
 +    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
 +
 +    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
 true))
 +    adev->gfx.gfx_off_state = true;
    }
      /**
 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
 index a0be0772c8b3..b4ced45301be 100644
 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
 +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
 @@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, 
 bool enable)
      mutex_lock(>gfx.gfx_off_mutex);
    -    if (!enable)
 -    adev->gfx.gfx_off_req_count++;
 -    else if (adev->gfx.gfx_off_req_count > 0)
 +    if (enable) {
 +    /* If the count is already 0, it means there's an imbalance bug 
 somewhere.
 + * Note that the bug may be in a different caller than the one 
 which triggers the
 + * WARN_ON_ONCE.
 + */
 +    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
 +    goto unlock;
 +
    adev->gfx.gfx_off_req_count--;
    -    if (enable && !adev->gfx.gfx_off_state && 
 !adev->gfx.gfx_off_req_count) {
 -    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
 GFX_OFF_DELAY_ENABLE);
 -    } else if (!enable && adev->gfx.gfx_off_state) {
 -    if (!amdgpu_dpm_set_powergating_by_smu(adev, 
 AMD_IP_BLOCK_TYPE_GFX, false)) {
 -    adev->gfx.gfx_off_state = false;
 +    if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
 +    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
 GFX_OFF_DELAY_ENABLE);
 +    } else {
 +    if (adev->gfx.gfx_off_req_count == 0) {
 +    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
 +
 +    if (adev->gfx.gfx_off_state &&
>>>
>>> More of a question which I didn't check last time - Is this expected to be 
>>> true when the disable call comes in first?
>>
>> My assumption is that cancel_delayed_work_sync guarantees 
>> amdgpu_device_delay_enable_gfx_off's assignment is visible here.
>>
> 
> To clarify - when nothing is scheduled. If enable() 

[PATCH] drm/amd/amdgpu:flush ttm delayed work before cancel_sync

2021-08-17 Thread YuBiao Wang
[Why]
In some cases when we unload the driver, a warning call trace shows up in
vram_mgr_fini claiming that the LRU is not empty, caused by ttm BOs still
sitting on the delayed-delete queue.

[How]
We should flush the delayed work to make sure the delayed deleting is done.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4d266c40382c..0b5764aa98a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3824,8 +3824,10 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(&adev->delayed_init_work);
-   if (adev->mman.initialized)
+   if (adev->mman.initialized) {
+   flush_delayed_work(&adev->mman.bdev.wq);
ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+   }
adev->shutdown = true;
 
/* make sure IB test finished before entering exclusive mode
-- 
2.25.1



RE: [PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE on suspend

2021-08-17 Thread Quan, Evan
[AMD Official Use Only]

+Leo to share his insights

> -Original Message-
> From: Lazar, Lijo 
> Sent: Tuesday, August 17, 2021 3:28 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Chen, Guchun
> ; Pan, Xinhui 
> Subject: Re: [PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE
> on suspend
> 
> 
> 
> On 8/17/2021 12:10 PM, Evan Quan wrote:
> > If the powergating of UVD/VCE is in progress, wait for its completion
> > before proceeding (suspending). This can fix some hangs observed on
> > suspend when UVD/VCE is still in use (e.g. issuing "pm-suspend" while a
> > video is still playing).
> >
> > Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
> > Signed-off-by: Evan Quan 
> > Signed-off-by: xinhui pan 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 5 +
> >   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 5 +
> >   2 files changed, 10 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > index 4eebf973a065..2fdce572baeb 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > @@ -554,6 +554,11 @@ static int uvd_v6_0_suspend(void *handle)
> > int r;
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> >
> > +   /*
> > +* If the powergating is in process, wait for its completion.
> > +*/
> > +   flush_delayed_work(&adev->uvd.idle_work);
> > +
> If running idle is a prerequisite before going to suspend, then something else
> is missing here.
> 
> Otherwise, the hang looks more like a pending work launched after
> hardware is suspended and trying to access hardware. As the hardware is
> going to be suspended anyway, doesn't it work with
> cancel_delayed_work_sync - making sure that nothing is going to be
> launched later to access hardware?
[Quan, Evan] The reason we chose flush_delayed_work instead of 
cancel_delayed_work_sync is that we think those operations performed in 
idle_work (dpm disablement, powergating) seem needed considering the action is 
'suspend'. So, instead of "cancel", waiting for their completion seems more 
proper. 
> 
> Then this may be a potential issue for other suspend calls also where work is
> pending to be launched when hardware is suspended.
[Quan, Evan] Do you mean we need to check whether there is similar issue for 
other IPs?

BR
Evan
> 
> Thanks,
> Lijo
> 
> > r = uvd_v6_0_hw_fini(adev);
> > if (r)
> > return r;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> > b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> > index 6d9108fa22e0..f0adecd5ec0b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> > @@ -503,6 +503,11 @@ static int vce_v3_0_suspend(void *handle)
> > int r;
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> >
> > +   /*
> > +* If the powergating is in process, wait for its completion.
> > +*/
> > +   flush_delayed_work(&adev->vce.idle_work);
> > +
> > r = vce_v3_0_hw_fini(adev);
> > if (r)
> > return r;
> >


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 2:56 PM, Michel Dänzer wrote:

On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:



On 8/17/2021 1:53 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
    mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
    adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
    checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
   2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
   struct amdgpu_device *adev =
   container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
   -    mutex_lock(&adev->gfx.gfx_off_mutex);
-    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
-    adev->gfx.gfx_off_state = true;
-    }
-    mutex_unlock(&adev->gfx.gfx_off_mutex);
+    WARN_ON_ONCE(adev->gfx.gfx_off_state);
+    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
+    adev->gfx.gfx_off_state = true;
   }
     /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
     mutex_lock(>gfx.gfx_off_mutex);
   -    if (!enable)
-    adev->gfx.gfx_off_req_count++;
-    else if (adev->gfx.gfx_off_req_count > 0)
+    if (enable) {
+    /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+ * Note that the bug may be in a different caller than the one which 
triggers the
+ * WARN_ON_ONCE.
+ */
+    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+    goto unlock;
+
   adev->gfx.gfx_off_req_count--;
   -    if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) 
{
-    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-    } else if (!enable && adev->gfx.gfx_off_state) {
-    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
false)) {
-    adev->gfx.gfx_off_state = false;
+    if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
+    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+    } else {
+    if (adev->gfx.gfx_off_req_count == 0) {
+    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+    if (adev->gfx.gfx_off_state &&


More of a question which I didn't check last time - Is this expected to be true 
when the disable call comes in first?


My assumption is that cancel_delayed_work_sync guarantees 
amdgpu_device_delay_enable_gfx_off's assignment is visible here.



To clarify - when nothing is scheduled. If enable() is called when the 
count is 0, it goes to unlock. Now the expectation is someone to call 
Disable first.  Let's say  Disable() is called first, then the variable 
will be false, right?


Thanks,
Lijo


RE: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Quan, Evan
[AMD Official Use Only]

Thanks! This seems fine to me.
Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Tuesday, August 17, 2021 4:23 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-de...@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was
> disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41 release
> on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it
> again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when the disable count transitions
> from 0 to 1, and only schedule the delayed work on the reverse transition,
> not if the disable count was already 0. This makes sure the delayed work
> doesn't run at unexpected times, and allows it to be lock-free.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> v3:
> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> v4:
> * Fix race condition between amdgpu_gfx_off_ctrl incrementing
>   adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
>   checking for it to be 0 (Evan Quan)
> 
> Cc: sta...@vger.kernel.org
> Reviewed-by: Lijo Lazar  # v3
> Acked-by: Christian König  # v3
> Signed-off-by: Michel Dänzer 
> ---
> 
> Alex, probably best to wait a bit longer before picking this up. :)
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 36 +++
> ---
>  2 files changed, 30 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..f944ed858f3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,12 +2777,11 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> - mutex_lock(&adev->gfx.gfx_off_mutex);
> - if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> - adev->gfx.gfx_off_state = true;
> - }
> - mutex_unlock(&adev->gfx.gfx_off_mutex);
> + WARN_ON_ONCE(adev->gfx.gfx_off_state);
> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> +
> + if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> + adev->gfx.gfx_off_state = true;
>  }
> 
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..b4ced45301be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> 
>   mutex_lock(&adev->gfx.gfx_off_mutex);
> 
> - if (!enable)
> - adev->gfx.gfx_off_req_count++;
> - else if (adev->gfx.gfx_off_req_count > 0)
> + if (enable) {
> + /* If the count is already 0, it means there's an imbalance bug
> somewhere.
> +  * Note that the bug may be in a different caller than the one
> which triggers the
> +  * WARN_ON_ONCE.
> +  */
> + if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
> + goto unlock;
> +
>   adev->gfx.gfx_off_req_count--;
> 
> - if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> - adev->gfx.gfx_off_state = false;
> + if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
> + } else {
> + if (adev->gfx.gfx_off_req_count == 0) {
> + cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
> +
> + if (adev->gfx.gfx_off_state &&
> + !amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + adev->gfx.gfx_off_state = false;
> 
> - if (adev->gfx.funcs->init_spm_golden) {
> - dev_dbg(adev->dev, 
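The transition logic in the quoted hunk can be modelled in plain userspace C. The following is a hypothetical sketch, not amdgpu code: the struct, function names, and the `work_pending` flag are stand-ins (`work_pending` models the delayed work being scheduled, and the "cancel" branch models `cancel_delayed_work_sync`). It shows the key property of the fix: the delayed work is only scheduled on the 1 -> 0 disable-count transition and is cancelled on the 0 -> 1 transition, so it can never run while GFXOFF is supposed to stay disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the gfx_off_req_count state machine. */
struct gfx_model {
	int req_count;      /* like adev->gfx.gfx_off_req_count, starts at 1 */
	bool gfx_off_state; /* true while GFXOFF is enabled in HW */
	bool work_pending;  /* stands in for the delayed work being scheduled */
};

/* The delayed work body: enables GFXOFF in "HW". */
static void delayed_work(struct gfx_model *m)
{
	m->work_pending = false;
	m->gfx_off_state = true;
}

static void gfx_off_ctrl(struct gfx_model *m, bool enable)
{
	if (enable) {
		if (m->req_count == 0)
			return; /* imbalance bug; mirrors WARN_ON_ONCE + goto unlock */
		m->req_count--;
		/* Only schedule on the 1 -> 0 transition. */
		if (m->req_count == 0 && !m->gfx_off_state)
			m->work_pending = true;
	} else {
		if (m->req_count == 0) {
			/* cancel_delayed_work_sync analog: after this, the
			 * work is guaranteed not to run. */
			m->work_pending = false;
			if (m->gfx_off_state)
				m->gfx_off_state = false; /* disable GFXOFF */
		}
		m->req_count++;
	}
}
```

Walking the model through enable/disable pairs shows that a disable arriving during the delay window cancels the pending work instead of letting it fire ~100 ms later, which is exactly the stutter scenario the patch describes.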

Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Michel Dänzer
On 2021-08-17 11:12 a.m., Lazar, Lijo wrote:
> 
> 
> On 8/17/2021 1:53 PM, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> schedule_delayed_work does not push back the work if it was already
>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
>> was disabled and re-enabled again during those 100 ms.
>>
>> This resulted in frame drops / stutter with the upcoming mutter 41
>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
>> disabling it again (for getting the GPU clock counter).
>>
>> To fix this, call cancel_delayed_work_sync when the disable count
>> transitions from 0 to 1, and only schedule the delayed work on the
>> reverse transition, not if the disable count was already 0. This makes
>> sure the delayed work doesn't run at unexpected times, and allows it to
>> be lock-free.
>>
>> v2:
>> * Use cancel_delayed_work_sync & mutex_trylock instead of
>>    mod_delayed_work.
>> v3:
>> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
>> v4:
>> * Fix race condition between amdgpu_gfx_off_ctrl incrementing
>>    adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
>>    checking for it to be 0 (Evan Quan)
>>
>> Cc: sta...@vger.kernel.org
>> Reviewed-by: Lijo Lazar  # v3
>> Acked-by: Christian König  # v3
>> Signed-off-by: Michel Dänzer 
>> ---
>>
>> Alex, probably best to wait a bit longer before picking this up. :)
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 36 +++---
>>   2 files changed, 30 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index f3fd5ec710b6..f944ed858f3e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2777,12 +2777,11 @@ static void 
>> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>>   struct amdgpu_device *adev =
>>   container_of(work, struct amdgpu_device, 
>> gfx.gfx_off_delay_work.work);
>>   -    mutex_lock(&adev->gfx.gfx_off_mutex);
>> -    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
>> true))
>> -    adev->gfx.gfx_off_state = true;
>> -    }
>> -    mutex_unlock(&adev->gfx.gfx_off_mutex);
>> +    WARN_ON_ONCE(adev->gfx.gfx_off_state);
>> +    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
>> +
>> +    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
>> true))
>> +    adev->gfx.gfx_off_state = true;
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index a0be0772c8b3..b4ced45301be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, 
>> bool enable)
>>     mutex_lock(&adev->gfx.gfx_off_mutex);
>>   -    if (!enable)
>> -    adev->gfx.gfx_off_req_count++;
>> -    else if (adev->gfx.gfx_off_req_count > 0)
>> +    if (enable) {
>> +    /* If the count is already 0, it means there's an imbalance bug 
>> somewhere.
>> + * Note that the bug may be in a different caller than the one 
>> which triggers the
>> + * WARN_ON_ONCE.
>> + */
>> +    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
>> +    goto unlock;
>> +
>>   adev->gfx.gfx_off_req_count--;
>>   -    if (enable && !adev->gfx.gfx_off_state && 
>> !adev->gfx.gfx_off_req_count) {
>> -    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
>> GFX_OFF_DELAY_ENABLE);
>> -    } else if (!enable && adev->gfx.gfx_off_state) {
>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
>> false)) {
>> -    adev->gfx.gfx_off_state = false;
>> +    if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
>> +    schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
>> GFX_OFF_DELAY_ENABLE);
>> +    } else {
>> +    if (adev->gfx.gfx_off_req_count == 0) {
>> +    cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
>> +
>> +    if (adev->gfx.gfx_off_state &&
> 
> More of a question which I didn't check last time - Is this expected to be 
> true when the disable call comes in first?

My assumption is that cancel_delayed_work_sync guarantees 
amdgpu_device_delay_enable_gfx_off's assignment is visible here.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
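The visibility argument above (that cancel_delayed_work_sync guarantees the work item's plain stores are visible afterwards) can be illustrated with a userspace analogy. This is a hypothetical pthread sketch, not kernel code: `gfx_off_state` stands in for `adev->gfx.gfx_off_state`, and `pthread_join` stands in for `cancel_delayed_work_sync` waiting on a running work item. Both are synchronization points, so no extra locking is needed around the store.

```c
#include <pthread.h>
#include <stdbool.h>

/* Shared flag written by the "delayed work" with a plain store. */
static bool gfx_off_state;

/* The "delayed work": a plain store, no lock taken. */
static void *delayed_work(void *arg)
{
	(void)arg;
	gfx_off_state = true;
	return NULL;
}

/* Waiting synchronously for the worker (like cancel_delayed_work_sync
 * waiting for a running work item) synchronizes-with its completion,
 * so the plain store above is guaranteed visible to the caller. */
static bool run_and_observe(void)
{
	pthread_t worker;

	gfx_off_state = false;
	if (pthread_create(&worker, NULL, delayed_work, NULL) != 0)
		return false;
	pthread_join(worker, NULL);
	return gfx_off_state; /* the store is visible after the join */
}
```

POSIX specifies `pthread_join` as a memory-synchronizing function, which is the userspace counterpart of the guarantee the kernel's sync cancel provides here.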


RE: [PATCH] drm/amd: consolidate TA shared memory structures

2021-08-17 Thread Clements, John
[AMD Official Use Only]

Reviewed-by: John Clements 

-Original Message-
From: Li, Candice  
Sent: Tuesday, August 17, 2021 5:11 PM
To: amd-gfx@lists.freedesktop.org
Cc: Clements, John ; Li, Candice 
Subject: [PATCH] drm/amd: consolidate TA shared memory structures

Change-Id: I81be5a824fced3d2244cf209444c2391f6bc6c50
Signed-off-by: Candice Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 218 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |  68 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c   |   4 +-
 .../gpu/drm/amd/amdgpu/amdgpu_securedisplay.c |   4 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c|  12 +-
 .../drm/amd/display/modules/hdcp/hdcp_psp.c   |  56 ++---
 6 files changed, 167 insertions(+), 195 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index cf40609f39d4f0..ebb827b6331b65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -908,9 +908,9 @@ static int psp_xgmi_init_shared_buf(struct psp_context *psp)
 */
ret = amdgpu_bo_create_kernel(psp->adev, PSP_XGMI_SHARED_MEM_SIZE,
  PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
- &psp->xgmi_context.xgmi_shared_bo,
- &psp->xgmi_context.xgmi_shared_mc_addr,
- &psp->xgmi_context.xgmi_shared_buf);
+ &psp->xgmi_context.context.mem_context.shared_bo,
+ &psp->xgmi_context.context.mem_context.shared_mc_addr,
+ &psp->xgmi_context.context.mem_context.shared_buf);
 
return ret;
 }
@@ -957,15 +957,15 @@ static int psp_xgmi_load(struct psp_context *psp)
psp_prep_ta_load_cmd_buf(cmd,
 psp->fw_pri_mc_addr,
 psp->xgmi.size_bytes,
-psp->xgmi_context.xgmi_shared_mc_addr,
+psp->xgmi_context.context.mem_context.shared_mc_addr,
 PSP_XGMI_SHARED_MEM_SIZE);
 
ret = psp_cmd_submit_buf(psp, NULL, cmd,
 psp->fence_buf_mc_addr);
 
if (!ret) {
-   psp->xgmi_context.initialized = 1;
-   psp->xgmi_context.session_id = cmd->resp.session_id;
+   psp->xgmi_context.context.initialized = true;
+   psp->xgmi_context.context.session_id = cmd->resp.session_id;
}
 
release_psp_cmd_buf(psp);
@@ -990,7 +990,7 @@ static int psp_xgmi_unload(struct psp_context *psp)
 
cmd = acquire_psp_cmd_buf(psp);
 
-   psp_prep_ta_unload_cmd_buf(cmd, psp->xgmi_context.session_id);
+   psp_prep_ta_unload_cmd_buf(cmd, psp->xgmi_context.context.session_id);
 
ret = psp_cmd_submit_buf(psp, NULL, cmd,
 psp->fence_buf_mc_addr);
@@ -1002,26 +1002,26 @@ static int psp_xgmi_unload(struct psp_context *psp)
 
 int psp_xgmi_invoke(struct psp_context *psp, uint32_t ta_cmd_id)  {
-   return psp_ta_invoke(psp, ta_cmd_id, psp->xgmi_context.session_id);
+   return psp_ta_invoke(psp, ta_cmd_id, 
+psp->xgmi_context.context.session_id);
 }
 
 int psp_xgmi_terminate(struct psp_context *psp)  {
int ret;
 
-   if (!psp->xgmi_context.initialized)
+   if (!psp->xgmi_context.context.initialized)
return 0;
 
ret = psp_xgmi_unload(psp);
if (ret)
return ret;
 
-   psp->xgmi_context.initialized = 0;
+   psp->xgmi_context.context.initialized = false;
 
/* free xgmi shared memory */
-   amdgpu_bo_free_kernel(&psp->xgmi_context.xgmi_shared_bo,
-   &psp->xgmi_context.xgmi_shared_mc_addr,
-   &psp->xgmi_context.xgmi_shared_buf);
+   amdgpu_bo_free_kernel(&psp->xgmi_context.context.mem_context.shared_bo,
+   &psp->xgmi_context.context.mem_context.shared_mc_addr,
+   &psp->xgmi_context.context.mem_context.shared_buf);
 
return 0;
 }
@@ -1036,7 +1036,7 @@ int psp_xgmi_initialize(struct psp_context *psp)
!psp->xgmi.start_addr)
return -ENOENT;
 
-   if (!psp->xgmi_context.initialized) {
+   if (!psp->xgmi_context.context.initialized) {
ret = psp_xgmi_init_shared_buf(psp);
if (ret)
return ret;
@@ -1048,7 +1048,7 @@ int psp_xgmi_initialize(struct psp_context *psp)
return ret;
 
/* Initialize XGMI session */
-   xgmi_cmd = (struct ta_xgmi_shared_memory *)(psp->xgmi_context.xgmi_shared_buf);
+   xgmi_cmd = (struct ta_xgmi_shared_memory *)(psp->xgmi_context.context.mem_context.shared_buf);
memset(xgmi_cmd, 0, sizeof(struct ta_xgmi_shared_memory));
xgmi_cmd->cmd_id = TA_COMMAND_XGMI__INITIALIZE;
 
@@ -1062,7 +1062,7 @@ int 

[PATCH 2/2] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest fails

2021-08-17 Thread Yifan Zhang
[ RUN  ] KFDSVMRangeTest.PartialUnmapSysMemTest
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: Failure
Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, _pBuf))
  Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: Failure
Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, mapFlags, 1, 
_Node))
  Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: Failure
Expected: ((void *)__null) != (ptr), actual: NULL vs NULL
Segmentation fault (core dumped)
[  ] Profile: Full Test
[  ] HW capabilities: 0x9

kernel log:

[  102.029150]  ret_from_fork+0x22/0x30
[  102.029158] ---[ end trace 15c34e782714f9a3 ]---
[ 3613.603598] amdgpu: Address: 0x7f7149ccc000 already allocated by SVM
[ 3613.610620] show_signal_msg: 27 callbacks suppressed

There is a race with deferred actions from previous memory map
changes (e.g. munmap). Flush pending deferred work to avoid such a case.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 3177c4a0e753..e1c4abb98b35 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1261,6 +1261,10 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
return -EINVAL;
 
 #if IS_ENABLED(CONFIG_HSA_AMD_SVM)
+   /* Flush pending deferred work to avoid racing with deferred actions
+* from previous memory map changes (e.g. munmap).
+*/
+   svm_range_list_lock_and_flush_work(svms, current->mm);
	mutex_lock(&svms->lock);
	if (interval_tree_iter_first(&svms->objects,
 args->va_addr >> PAGE_SHIFT,
@@ -1271,6 +1275,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
return -EADDRINUSE;
}
	mutex_unlock(&svms->lock);
+   mmap_write_unlock(current->mm);
 #endif
dev = kfd_device_by_id(args->gpu_id);
if (!dev)
-- 
2.25.1
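The race fixed above can be modelled with two flags in plain C. This is a hypothetical sketch, not KFD code: `range_registered` stands in for the SVM interval-tree entry, `unmap_deferred` for the pending deferred-list work, and `deferred_work_flush` for the `svm_range_list_lock_and_flush_work` call. Without the flush, a new allocation at the same address sees the stale entry and spuriously fails with -EADDRINUSE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deferred-unmap vs. allocation race. */
static bool range_registered; /* stands in for the SVM interval tree entry */
static bool unmap_deferred;   /* deferred_list work still pending */

/* svm_range_list_lock_and_flush_work analog: run pending cleanup now. */
static void deferred_work_flush(void)
{
	if (unmap_deferred) {
		range_registered = false; /* deferred munmap cleanup runs */
		unmap_deferred = false;
	}
}

static int alloc_memory(bool flush_first)
{
	if (flush_first)
		deferred_work_flush();
	if (range_registered)
		return -1; /* -EADDRINUSE: "Address ... already allocated by SVM" */
	range_registered = true;
	return 0;
}
```

Registering a range, queuing its unmap, and then allocating again at the same address fails without the flush and succeeds with it, which is exactly the KFDSVMRangeTest failure mode described in the log above.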



[PATCH 1/2] drm/amdkfd: export svm_range_list_lock_and_flush_work

2021-08-17 Thread Yifan Zhang
export svm_range_list_lock_and_flush_work to make other kfd parts be
able to sync svm_range_list.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index c1833acc54c7..d4a43c94bcf9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1500,7 +1500,7 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
  * Context: Returns with mmap write lock held, pending deferred work flushed
  *
  */
-static void
+void
 svm_range_list_lock_and_flush_work(struct svm_range_list *svms,
   struct mm_struct *mm)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index 3fc1fd8b4fbc..e7fc5e8998aa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -188,6 +188,7 @@ void svm_range_prefault(struct svm_range *prange, struct 
mm_struct *mm,
void *owner);
 struct kfd_process_device *
 svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device 
*adev);
+void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct 
mm_struct *mm);
 
 /* SVM API and HMM page migration work together, device memory type
  * is initialized to not 0 when page migration register device memory.
-- 
2.25.1



RE: [PATCH] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest fails

2021-08-17 Thread Zhang, Yifan
[AMD Official Use Only]

Hi Felix, 

Thanks for comments. I will send Patch V2.

BRs,
Yifan

-Original Message-
From: Kuehling, Felix  
Sent: Tuesday, August 17, 2021 9:16 AM
To: Zhang, Yifan ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: fix KFDSVMRangeTest.PartialUnmapSysMemTest 
fails


Am 2021-08-14 um 6:12 a.m. schrieb Yifan Zhang:
> [ RUN  ] KFDSVMRangeTest.PartialUnmapSysMemTest
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: 
> Failure Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, _pBuf))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: 
> Failure Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, 
> mapFlags, 1, _Node))
>   Actual: 1
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: 
> Failure
> Expected: ((void *)__null) != (ptr), actual: NULL vs NULL Segmentation 
> fault (core dumped)
> [  ] Profile: Full Test
> [  ] HW capabilities: 0x9
>
> kernel log:
>
> [  102.029150]  ret_from_fork+0x22/0x30 [  102.029158] ---[ end trace 
> 15c34e782714f9a3 ]--- [ 3613.603598] amdgpu: Address: 0x7f7149ccc000 
> already allocated by SVM [ 3613.610620] show_signal_msg: 27 callbacks 
> suppressed
>
> There is a race with deferred actions from previous memory map changes 
> (e.g. munmap). Flush pending deferred work to avoid such a case.
>
> Signed-off-by: Yifan Zhang 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 3177c4a0e753..982bf538dc3d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1261,6 +1261,13 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
>   return -EINVAL;
>  
>  #if IS_ENABLED(CONFIG_HSA_AMD_SVM)
> + /* Flush pending deferred work to avoid racing with deferred actions 
> from
> +  * previous memory map changes (e.g. munmap). Concurrent memory map 
> changes
> +  * can still race with get_attr because we don't hold the mmap lock. 
> +But that

This comment would need to be updated. This is not get_attr. Whether or not 
undefined behaviour is acceptable in this case is a different question from 
get_attr. In the get_attr case, a race is caused by a broken application and 
causes potentially incorrect results being reported to user mode. Garbage-in 
==> garbage-out. No problem.

In the case of this race here, the cause is still a broken application.
But this check is here to catch broken applications and prevent them from 
mapping the same virtual address range with two different allocators. So I'd 
argue that the race condition is not acceptable here because it has 
consequences for VM mappings managed by the kernel mode driver.


> +  * would be a race condition in the application anyway, and undefined
> +  * behaviour is acceptable in that case.
> +  */
> + flush_work(&svms->deferred_list_work);
>   mutex_lock(&svms->lock);

This still leaves a brief race. There is an easy way to fix that: Use 
svm_range_list_lock_and_flush_work like this:

svm_range_list_lock_and_flush_work(svms, current->mm);
mutex_lock(&svms->lock);
mmap_write_unlock(current->mm);
...

Regards,
  Felix


>   if (interval_tree_iter_first(&svms->objects,
>args->va_addr >> PAGE_SHIFT,


Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 1:53 PM, Michel Dänzer wrote:

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
   mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
   adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
   checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 36 +++---
  2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
  
-	mutex_lock(&adev->gfx.gfx_off_mutex);

-   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))
-   adev->gfx.gfx_off_state = true;
-   }
-   mutex_unlock(&adev->gfx.gfx_off_mutex);
+   WARN_ON_ONCE(adev->gfx.gfx_off_state);
+   WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+   if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
+   adev->gfx.gfx_off_state = true;
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
  
  	mutex_lock(>gfx.gfx_off_mutex);
  
-	if (!enable)

-   adev->gfx.gfx_off_req_count++;
-   else if (adev->gfx.gfx_off_req_count > 0)
+   if (enable) {
+   /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+* Note that the bug may be in a different caller than the one 
which triggers the
+* WARN_ON_ONCE.
+*/
+   if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+   goto unlock;
+
adev->gfx.gfx_off_req_count--;
  
-	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {

-   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_state) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
-   adev->gfx.gfx_off_state = false;
+   if (adev->gfx.gfx_off_req_count == 0 && 
!adev->gfx.gfx_off_state)
+   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+   } else {
+   if (adev->gfx.gfx_off_req_count == 0) {
+   cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+   if (adev->gfx.gfx_off_state &&


More of a question which I didn't check last time - Is this expected to 
be true when the disable call comes in first?


Thanks,
Lijo


+   !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
+   adev->gfx.gfx_off_state = false;
  
-			if (adev->gfx.funcs->init_spm_golden) {

-   dev_dbg(adev->dev, "GFXOFF is disabled, re-init SPM 
golden settings\n");
-   amdgpu_gfx_init_spm_golden(adev);
+   if (adev->gfx.funcs->init_spm_golden) {
+   dev_dbg(adev->dev,
+   "GFXOFF is disabled, re-init SPM 
golden settings\n");
+   

[PATCH] drm/amd: consolidate TA shared memory structures

2021-08-17 Thread Candice Li
Change-Id: I81be5a824fced3d2244cf209444c2391f6bc6c50
Signed-off-by: Candice Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 218 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |  68 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c   |   4 +-
 .../gpu/drm/amd/amdgpu/amdgpu_securedisplay.c |   4 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c|  12 +-
 .../drm/amd/display/modules/hdcp/hdcp_psp.c   |  56 ++---
 6 files changed, 167 insertions(+), 195 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index cf40609f39d4f0..ebb827b6331b65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -908,9 +908,9 @@ static int psp_xgmi_init_shared_buf(struct psp_context *psp)
 */
ret = amdgpu_bo_create_kernel(psp->adev, PSP_XGMI_SHARED_MEM_SIZE,
  PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
- &psp->xgmi_context.xgmi_shared_bo,
- &psp->xgmi_context.xgmi_shared_mc_addr,
- &psp->xgmi_context.xgmi_shared_buf);
+ &psp->xgmi_context.context.mem_context.shared_bo,
+ &psp->xgmi_context.context.mem_context.shared_mc_addr,
+ &psp->xgmi_context.context.mem_context.shared_buf);
 
return ret;
 }
@@ -957,15 +957,15 @@ static int psp_xgmi_load(struct psp_context *psp)
psp_prep_ta_load_cmd_buf(cmd,
 psp->fw_pri_mc_addr,
 psp->xgmi.size_bytes,
-psp->xgmi_context.xgmi_shared_mc_addr,
+psp->xgmi_context.context.mem_context.shared_mc_addr,
 PSP_XGMI_SHARED_MEM_SIZE);
 
ret = psp_cmd_submit_buf(psp, NULL, cmd,
 psp->fence_buf_mc_addr);
 
if (!ret) {
-   psp->xgmi_context.initialized = 1;
-   psp->xgmi_context.session_id = cmd->resp.session_id;
+   psp->xgmi_context.context.initialized = true;
+   psp->xgmi_context.context.session_id = cmd->resp.session_id;
}
 
release_psp_cmd_buf(psp);
@@ -990,7 +990,7 @@ static int psp_xgmi_unload(struct psp_context *psp)
 
cmd = acquire_psp_cmd_buf(psp);
 
-   psp_prep_ta_unload_cmd_buf(cmd, psp->xgmi_context.session_id);
+   psp_prep_ta_unload_cmd_buf(cmd, psp->xgmi_context.context.session_id);
 
ret = psp_cmd_submit_buf(psp, NULL, cmd,
 psp->fence_buf_mc_addr);
@@ -1002,26 +1002,26 @@ static int psp_xgmi_unload(struct psp_context *psp)
 
 int psp_xgmi_invoke(struct psp_context *psp, uint32_t ta_cmd_id)
 {
-   return psp_ta_invoke(psp, ta_cmd_id, psp->xgmi_context.session_id);
+   return psp_ta_invoke(psp, ta_cmd_id, psp->xgmi_context.context.session_id);
 }
 
 int psp_xgmi_terminate(struct psp_context *psp)
 {
int ret;
 
-   if (!psp->xgmi_context.initialized)
+   if (!psp->xgmi_context.context.initialized)
return 0;
 
ret = psp_xgmi_unload(psp);
if (ret)
return ret;
 
-   psp->xgmi_context.initialized = 0;
+   psp->xgmi_context.context.initialized = false;
 
/* free xgmi shared memory */
-   amdgpu_bo_free_kernel(&psp->xgmi_context.xgmi_shared_bo,
-   &psp->xgmi_context.xgmi_shared_mc_addr,
-   &psp->xgmi_context.xgmi_shared_buf);
+   amdgpu_bo_free_kernel(&psp->xgmi_context.context.mem_context.shared_bo,
+   &psp->xgmi_context.context.mem_context.shared_mc_addr,
+   &psp->xgmi_context.context.mem_context.shared_buf);
 
return 0;
 }
@@ -1036,7 +1036,7 @@ int psp_xgmi_initialize(struct psp_context *psp)
!psp->xgmi.start_addr)
return -ENOENT;
 
-   if (!psp->xgmi_context.initialized) {
+   if (!psp->xgmi_context.context.initialized) {
ret = psp_xgmi_init_shared_buf(psp);
if (ret)
return ret;
@@ -1048,7 +1048,7 @@ int psp_xgmi_initialize(struct psp_context *psp)
return ret;
 
/* Initialize XGMI session */
-   xgmi_cmd = (struct ta_xgmi_shared_memory *)(psp->xgmi_context.xgmi_shared_buf);
+   xgmi_cmd = (struct ta_xgmi_shared_memory *)(psp->xgmi_context.context.mem_context.shared_buf);
memset(xgmi_cmd, 0, sizeof(struct ta_xgmi_shared_memory));
xgmi_cmd->cmd_id = TA_COMMAND_XGMI__INITIALIZE;
 
@@ -1062,7 +1062,7 @@ int psp_xgmi_get_hive_id(struct psp_context *psp, 
uint64_t *hive_id)
struct ta_xgmi_shared_memory *xgmi_cmd;
int ret;
 
-   xgmi_cmd = (struct ta_xgmi_shared_memory *)psp->xgmi_context.xgmi_shared_buf;
+   xgmi_cmd = (struct ta_xgmi_shared_memory 

Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Michel Dänzer
On 2021-08-17 10:17 a.m., Lazar, Lijo wrote:
> On 8/17/2021 1:21 PM, Quan, Evan wrote:
>>> -Original Message-
>>> From: amd-gfx  On Behalf Of
>>> Michel Dänzer
>>> Sent: Monday, August 16, 2021 6:35 PM
>>> To: Deucher, Alexander ; Koenig, Christian
>>> 
>>> Cc: Liu, Leo ; Zhu, James ; amd-
>>> g...@lists.freedesktop.org; dri-de...@lists.freedesktop.org
>>> Subject: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is
>>> disabled
>>>
>>> From: Michel Dänzer 
>>>
>>> schedule_delayed_work does not push back the work if it was already
>>> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
>>> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
>>> was disabled and re-enabled again during those 100 ms.
>>>
>>> This resulted in frame drops / stutter with the upcoming mutter 41
>>> release on Navi 14, due to constantly enabling GFXOFF in the HW and
>>> disabling it again (for getting the GPU clock counter).
>>>
>>> To fix this, call cancel_delayed_work_sync when the disable count
>>> transitions from 0 to 1, and only schedule the delayed work on the
>>> reverse transition, not if the disable count was already 0. This makes
>>> sure the delayed work doesn't run at unexpected times, and allows it to
>>> be lock-free.
>>>
>>> v2:
>>> * Use cancel_delayed_work_sync & mutex_trylock instead of
>>>    mod_delayed_work.
>>> v3:
>>> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
>>>
>>> Cc: sta...@vger.kernel.org
>>> Signed-off-by: Michel Dänzer 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 22 +-
>>> 
>>>   2 files changed, 22 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index f3fd5ec710b6..f944ed858f3e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2777,12 +2777,11 @@ static void
>>> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>>>   struct amdgpu_device *adev =
>>>   container_of(work, struct amdgpu_device,
>>> gfx.gfx_off_delay_work.work);
>>>
>>> -    mutex_lock(&adev->gfx.gfx_off_mutex);
>>> -    if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev,
>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>> -    adev->gfx.gfx_off_state = true;
>>> -    }
>>> -    mutex_unlock(&adev->gfx.gfx_off_mutex);
>>> +    WARN_ON_ONCE(adev->gfx.gfx_off_state);
>>> +    WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
>>> +
>>> +    if (!amdgpu_dpm_set_powergating_by_smu(adev,
>>> AMD_IP_BLOCK_TYPE_GFX, true))
>>> +    adev->gfx.gfx_off_state = true;
>>>   }
>>>
>>>   /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> index a0be0772c8b3..ca91aafcb32b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>>> @@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
>>> *adev, bool enable)
>>>
>>>   mutex_lock(&adev->gfx.gfx_off_mutex);
>>>
>>> -    if (!enable)
>>> -    adev->gfx.gfx_off_req_count++;
>>> -    else if (adev->gfx.gfx_off_req_count > 0)
>>> +    if (enable) {
>>> +    /* If the count is already 0, it means there's an imbalance bug
>>> somewhere.
>>> + * Note that the bug may be in a different caller than the one
>>> which triggers the
>>> + * WARN_ON_ONCE.
>>> + */
>>> +    if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
>>> +    goto unlock;
>>> +
>>>   adev->gfx.gfx_off_req_count--;
>>> +    } else {
>>> +    adev->gfx.gfx_off_req_count++;
>>> +    }
>>>
>>>   if (enable && !adev->gfx.gfx_off_state && !adev-
 gfx.gfx_off_req_count) {
>>>   schedule_delayed_work(>gfx.gfx_off_delay_work,
>>> GFX_OFF_DELAY_ENABLE);
>>> -    } else if (!enable && adev->gfx.gfx_off_state) {
>>> -    if (!amdgpu_dpm_set_powergating_by_smu(adev,
>>> AMD_IP_BLOCK_TYPE_GFX, false)) {
>>> +    } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
>> [Quan, Evan] It seems here will leave a small time window for race 
>> condition. If amdgpu_device_delay_enable_gfx_off() happens to occur here, it 
>> will "WARN_ON_ONCE(adev->gfx.gfx_off_req_count);". How about something as 
>> below?
>> @@ -573,13 +573,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, 
>> bool enable)
>>  goto unlock;
>>
>>  adev->gfx.gfx_off_req_count--;
>> -   } else {
>> -   adev->gfx.gfx_off_req_count++;
>>  }
>>
>>  if (enable && !adev->gfx.gfx_off_state && 
>> !adev->gfx.gfx_off_req_count) {
>>  schedule_delayed_work(>gfx.gfx_off_delay_work, 
>> GFX_OFF_DELAY_ENABLE);
>> -   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
>> +   } else if 

[PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Michel Dänzer
From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.
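The transition rules described above can be sanity-checked with a small stand-alone C model. This is a hypothetical user-space sketch, not the kernel code (which additionally serializes amdgpu_gfx_off_ctrl on gfx_off_mutex); req_count, work_pending and gfx_off_state stand in for adev->gfx.gfx_off_req_count, the queued delayed work and adev->gfx.gfx_off_state:

```c
#include <assert.h>
#include <stdbool.h>

static int req_count = 1;          /* disable count; starts at 1 as in sw_init */
static bool work_pending = false;  /* stands in for the queued delayed work */
static bool gfx_off_state = false; /* stands in for GFXOFF being enabled */

/* Model of amdgpu_device_delay_enable_gfx_off: after the fix, the work can
 * only run when the count is 0 and GFXOFF is not yet enabled. */
static void delay_enable_gfx_off(void)
{
    assert(work_pending);
    work_pending = false;
    assert(!gfx_off_state);        /* the WARN_ON_ONCE invariants */
    assert(req_count == 0);
    gfx_off_state = true;          /* powergating enabled via SMU */
}

/* Model of amdgpu_gfx_off_ctrl with the v4 ordering: cancel the work on the
 * 0 -> 1 disable transition, before the count is incremented. */
static void gfx_off_ctrl(bool enable)
{
    if (enable) {
        assert(req_count > 0);     /* imbalance bug otherwise */
        req_count--;
        if (req_count == 0 && !gfx_off_state)
            work_pending = true;   /* schedule_delayed_work */
    } else {
        if (req_count == 0) {
            work_pending = false;  /* cancel_delayed_work_sync */
            if (gfx_off_state)
                gfx_off_state = false; /* powergating disabled via SMU */
        }
        req_count++;
    }
}
```

Running a disable/enable cycle against this model shows the work never fires while the count is nonzero, which is the property the WARN_ON_ONCE checks rely on.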

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: sta...@vger.kernel.org
Reviewed-by: Lijo Lazar  # v3
Acked-by: Christian König  # v3
Signed-off-by: Michel Dänzer 
---

Alex, probably best to wait a bit longer before picking this up. :)

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 36 +++---
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void amdgpu_device_delay_enable_gfx_off(struct 
work_struct *work)
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device, 
gfx.gfx_off_delay_work.work);
 
-	mutex_lock(&adev->gfx.gfx_off_mutex);
-   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, true))
-   adev->gfx.gfx_off_state = true;
-   }
-	mutex_unlock(&adev->gfx.gfx_off_mutex);
+   WARN_ON_ONCE(adev->gfx.gfx_off_state);
+   WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+   if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, 
true))
+   adev->gfx.gfx_off_state = true;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..b4ced45301be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
 
	mutex_lock(&adev->gfx.gfx_off_mutex);
 
-   if (!enable)
-   adev->gfx.gfx_off_req_count++;
-   else if (adev->gfx.gfx_off_req_count > 0)
+   if (enable) {
+   /* If the count is already 0, it means there's an imbalance bug 
somewhere.
+* Note that the bug may be in a different caller than the one 
which triggers the
+* WARN_ON_ONCE.
+*/
+   if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+   goto unlock;
+
adev->gfx.gfx_off_req_count--;
 
-   if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) 
{
-		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_state) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
-   adev->gfx.gfx_off_state = false;
+   if (adev->gfx.gfx_off_req_count == 0 && 
!adev->gfx.gfx_off_state)
+			schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+   } else {
+   if (adev->gfx.gfx_off_req_count == 0) {
+			cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+   if (adev->gfx.gfx_off_state &&
+   !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {
+   adev->gfx.gfx_off_state = false;
 
-   if (adev->gfx.funcs->init_spm_golden) {
-   dev_dbg(adev->dev, "GFXOFF is disabled, re-init 
SPM golden settings\n");
-   amdgpu_gfx_init_spm_golden(adev);
+   if (adev->gfx.funcs->init_spm_golden) {
+   dev_dbg(adev->dev,
+   "GFXOFF is disabled, re-init 
SPM golden settings\n");
+   amdgpu_gfx_init_spm_golden(adev);
+   }
}
}
+
+   adev->gfx.gfx_off_req_count++;
  

Re: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 1:21 PM, Quan, Evan wrote:

[AMD Official Use Only]




-Original Message-
From: amd-gfx  On Behalf Of
Michel Dänzer
Sent: Monday, August 16, 2021 6:35 PM
To: Deucher, Alexander ; Koenig, Christian

Cc: Liu, Leo ; Zhu, James ; amd-
g...@lists.freedesktop.org; dri-de...@lists.freedesktop.org
Subject: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is
disabled

From: Michel Dänzer 

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
   mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-

  2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f3fd5ec710b6..f944ed858f3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2777,12 +2777,11 @@ static void
amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device,
gfx.gfx_off_delay_work.work);

-	mutex_lock(&adev->gfx.gfx_off_mutex);
-   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev,
AMD_IP_BLOCK_TYPE_GFX, true))
-   adev->gfx.gfx_off_state = true;
-   }
-	mutex_unlock(&adev->gfx.gfx_off_mutex);
+   WARN_ON_ONCE(adev->gfx.gfx_off_state);
+   WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+   if (!amdgpu_dpm_set_powergating_by_smu(adev,
AMD_IP_BLOCK_TYPE_GFX, true))
+   adev->gfx.gfx_off_state = true;
  }

  /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..ca91aafcb32b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
*adev, bool enable)

	mutex_lock(&adev->gfx.gfx_off_mutex);

-   if (!enable)
-   adev->gfx.gfx_off_req_count++;
-   else if (adev->gfx.gfx_off_req_count > 0)
+   if (enable) {
+   /* If the count is already 0, it means there's an imbalance bug
somewhere.
+* Note that the bug may be in a different caller than the one
which triggers the
+* WARN_ON_ONCE.
+*/
+   if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+   goto unlock;
+
adev->gfx.gfx_off_req_count--;
+   } else {
+   adev->gfx.gfx_off_req_count++;
+   }

	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {

		schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_state) {
-   if (!amdgpu_dpm_set_powergating_by_smu(adev,
AMD_IP_BLOCK_TYPE_GFX, false)) {
+   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {

[Quan, Evan] It seems this leaves a small time window for a race condition. If 
amdgpu_device_delay_enable_gfx_off() happens to run here, it will trigger 
"WARN_ON_ONCE(adev->gfx.gfx_off_req_count);". How about something like the below?
@@ -573,13 +573,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
 goto unlock;

 adev->gfx.gfx_off_req_count--;
-   } else {
-   adev->gfx.gfx_off_req_count++;
 }

 if (enable && !adev->gfx.gfx_off_state && 
!adev->gfx.gfx_off_req_count) {
 schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
+   } else if (!enable && adev->gfx.gfx_off_req_count == 0) {
 cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);

 if (adev->gfx.gfx_off_state &&
@@ -593,6 +591,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
 }
 }

+   if (!enable)
+   

RE: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Monday, August 16, 2021 6:35 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-de...@lists.freedesktop.org
> Subject: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when the disable count
> transitions from 0 to 1, and only schedule the delayed work on the
> reverse transition, not if the disable count was already 0. This makes
> sure the delayed work doesn't run at unexpected times, and allows it to
> be lock-free.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> v3:
> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-
> 
>  2 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..f944ed858f3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,12 +2777,11 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> -	mutex_lock(&adev->gfx.gfx_off_mutex);
> - if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> - adev->gfx.gfx_off_state = true;
> - }
> -	mutex_unlock(&adev->gfx.gfx_off_mutex);
> + WARN_ON_ONCE(adev->gfx.gfx_off_state);
> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> +
> + if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> + adev->gfx.gfx_off_state = true;
>  }
> 
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..ca91aafcb32b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> 
>   	mutex_lock(&adev->gfx.gfx_off_mutex);
> 
> - if (!enable)
> - adev->gfx.gfx_off_req_count++;
> - else if (adev->gfx.gfx_off_req_count > 0)
> + if (enable) {
> + /* If the count is already 0, it means there's an imbalance bug
> somewhere.
> +  * Note that the bug may be in a different caller than the one
> which triggers the
> +  * WARN_ON_ONCE.
> +  */
> + if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
> + goto unlock;
> +
>   adev->gfx.gfx_off_req_count--;
> + } else {
> + adev->gfx.gfx_off_req_count++;
> + }
> 
>   	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>   		schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
[Quan, Evan] It seems this leaves a small time window for a race condition. 
If amdgpu_device_delay_enable_gfx_off() happens to run here, it will trigger 
"WARN_ON_ONCE(adev->gfx.gfx_off_req_count);". How about something like the below?
@@ -573,13 +573,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
goto unlock;

adev->gfx.gfx_off_req_count--;
-   } else {
-   adev->gfx.gfx_off_req_count++;
}

if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) 
{
schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
-   } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
+   } else if (!enable && adev->gfx.gfx_off_req_count == 0) {
cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);

if (adev->gfx.gfx_off_state &&
@@ -593,6 +591,9 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device 

Re: [PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE on suspend

2021-08-17 Thread Lazar, Lijo




On 8/17/2021 12:10 PM, Evan Quan wrote:

If the powergating of UVD/VCE is in process, wait for its completion
before proceeding (suspending). This can fix some hangs observed on
suspend when UVD/VCE is still in use (e.g. issuing "pm-suspend" while a
video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan 
Signed-off-by: xinhui pan 
---
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 5 +
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 5 +
  2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..2fdce572baeb 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,11 @@ static int uvd_v6_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	/*
+	 * If the powergating is in process, wait for its completion.
+	 */
+	flush_delayed_work(&adev->uvd.idle_work);
+
If running idle is a prerequisite before going to suspend, then 
something else is missing here.


Otherwise, the hang looks more like pending work being launched after the 
hardware is suspended and then trying to access it. Since the hardware is 
going to be suspended anyway, doesn't cancel_delayed_work_sync work here, 
making sure that nothing is launched later to access the hardware?


This may then also be a potential issue for other suspend paths where work 
is still pending to be launched when the hardware is suspended.


Thanks,
Lijo
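
The flush-vs-cancel distinction in question can be made concrete with a toy single-threaded C model. This is a hypothetical sketch (the real workqueue APIs involve concurrency not modeled here): 'queued' stands for the pending idle work, and running the handler powergates the block.

```c
#include <assert.h>
#include <stdbool.h>

static bool queued;      /* stands in for a queued delayed work item */
static bool powergated;  /* stands in for the block being powergated */

/* The idle-work handler: powergates the block. */
static void idle_work(void) { powergated = true; }

/* flush_delayed_work: if work is queued, run it to completion now. */
static void flush_delayed_work_model(void)
{
    if (queued) {
        queued = false;
        idle_work();
    }
}

/* cancel_delayed_work_sync: drop queued work without running it
 * (the real API also waits for a concurrently running instance). */
static void cancel_delayed_work_sync_model(void)
{
    queued = false;
}
```

So flush guarantees the powergating completed before hw_fini, while cancel guarantees the handler will not touch the hardware afterwards; which guarantee is needed is exactly the question raised above.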


r = uvd_v6_0_hw_fini(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..f0adecd5ec0b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,11 @@ static int vce_v3_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	/*
+	 * If the powergating is in process, wait for its completion.
+	 */
+	flush_delayed_work(&adev->vce.idle_work);
+
r = vce_v3_0_hw_fini(adev);
if (r)
return r;



Re: [PATCH v6 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-08-17 Thread Christoph Hellwig
On Mon, Aug 16, 2021 at 03:00:49PM -0400, Felix Kuehling wrote:
> 
> Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig:
> > On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote:
> >> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback.
> >> Device generic type memory case is now able to free its pages properly.
> > How is this going to work for the two existing MEMORY_DEVICE_GENERIC
> > that now change behavior?  And which don't have a ->page_free callback
> > at all?
> 
> That's a good catch. Existing drivers shouldn't need a page_free
> callback if they didn't have one before. That means we need to add a
> NULL-pointer check in free_device_page.

Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/
->mapping = NULL).

In many ways this seems like you want to bring back the DEVICE_PUBLIC
pgmap type that was removed a while ago due to the lack of users
instead of overloading the generic type.
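
The NULL-pointer guard suggested above could be modeled roughly as follows. These are simplified stand-in types, not the real struct dev_pagemap / dev_pagemap_ops: the point is only that the callback is invoked solely when a driver registered one.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for the pagemap ops dispatch (hypothetical names). */
struct page { int dummy; };
struct pgmap_ops { void (*page_free)(struct page *); };
struct pgmap { const struct pgmap_ops *ops; };

static int freed_pages;
static void count_free(struct page *p) { (void)p; freed_pages++; }

/* Only call page_free when the driver actually registered a callback;
 * legacy MEMORY_DEVICE_GENERIC users may have no ops at all. */
static void free_device_page(struct pgmap *pgmap, struct page *page)
{
    if (pgmap->ops && pgmap->ops->page_free)
        pgmap->ops->page_free(page);
}
```

With this guard, a pgmap with ops == NULL passes through safely, while a driver-provided callback still fires.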


Re: [PATCH v6 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-08-17 Thread Christoph Hellwig
On Mon, Aug 16, 2021 at 02:53:18PM -0500, Sierra Guiza, Alejandro (Alex) wrote:
> For the condition above equal to connected_to_cpu, we're explicitly
> migrating from device memory to system memory with the device generic
> type. With this type, device PTEs are present in the CPU page table.
>
> During the migrate_vma_collect_pmd walk at the migrate_vma_setup call,
> there's a condition for a present pte that requires migrate->flags to
> have MIGRATE_VMA_SELECT_SYSTEM set.
> Otherwise, the migration for this entry will be ignored.

I think we might need a new SELECT flag here for IOMEM.


Re: [PATCH v6 04/13] drm/amdkfd: add SPM support for SVM

2021-08-17 Thread Christoph Hellwig
On Mon, Aug 16, 2021 at 02:54:30PM -0400, Felix Kuehling wrote:
> I think you're right. We only need the start and end address from
> lookup_resource and we already know that anyway. It means we can drop
> patch 3 from the series.
> 
> Just to be sure, we'll confirm that the end address determined by our
> driver matches the one from lookup_resource (coming from the system
> address map in the system BIOS). If there were a mismatch, it would
> probably be a bug (in the driver or the BIOS) that we'd need to fix anyway.

Or rather that the driver-claimed area is smaller than or the same as the
BIOS range.  No harm (except for potential performance implications) when
you don't use all of it.

> I don't really see the "mess" you're talking about. Including the above,
> there are only 3 conditional statements in that function that are not
> error-handling related:
> 
> /* Page migration works on Vega10 or newer */
> if (kfddev->device_info->asic_family < CHIP_VEGA10)
> return -EINVAL;
> ...
> if (xgmi_connected_to_cpu)
> res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
> else
> res = devm_request_free_mem_region(adev->dev, 
> &iomem_resource, size);
> ...
> pgmap->type = xgmi_connected_to_cpu ?
> MEMORY_DEVICE_GENERIC : MEMORY_DEVICE_PRIVATE;
> 

Plus the devm_release_mem_region error handling that is currently missing.


[PATCH] drm/amdgpu: properly powergate Polaris12 UVD/VCE on suspend

2021-08-17 Thread Evan Quan
If the powergating of UVD/VCE is in process, wait for its completion
before proceeding (suspending). This can fix some hangs observed on
suspend when UVD/VCE is still in use (e.g. issuing "pm-suspend" while a
video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan 
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 5 +
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 5 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..2fdce572baeb 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,11 @@ static int uvd_v6_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   /*
+* If the powergating is in process, wait for its completion.
+*/
+	flush_delayed_work(&adev->uvd.idle_work);
+
r = uvd_v6_0_hw_fini(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..f0adecd5ec0b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,11 @@ static int vce_v3_0_suspend(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   /*
+* If the powergating is in process, wait for its completion.
+*/
+	flush_delayed_work(&adev->vce.idle_work);
+
r = vce_v3_0_hw_fini(adev);
if (r)
return r;
-- 
2.29.0



RE: [PATCH] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-17 Thread Liu, Monk
[AMD Official Use Only]

Reviewed-by: Monk Liu 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Jingwen Chen  
Sent: Tuesday, August 17, 2021 12:28 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Koenig, Christian ; 
Grodzovsky, Andrey ; Chen, JingWen 

Subject: [PATCH] Revert "drm/scheduler: Avoid accessing freed bad job."

[Why]
For a bailing job, this commit deleted it from the pending list, so the bailing 
job never had a chance to be resubmitted, even in advance TDR mode.

[How]
After embedding the hw_fence into amdgpu_job, the race condition that this 
commit tried to work around is completely solved, so revert it.
This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.

Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/scheduler/sched_main.c | 27 --
 1 file changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..31d1176d939f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -317,21 +317,10 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
 
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
-
-   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
-	spin_lock(&sched->job_list_lock);
	job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(>list);
-		spin_unlock(&sched->job_list_lock);
-
status = job->sched->ops->timedout_job(job);
 
/*
@@ -342,8 +331,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
job->sched->ops->free_job(job);
sched->free_guilty = false;
}
-   } else {
-		spin_unlock(&sched->job_list_lock);
}
 
	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
@@ -392,20 +379,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 
kthread_park(sched->thread);
 
-   /*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-		list_add(&bad->list, &sched->pending_list);
-
/*
 * Iterate the job list from later to  earlier one and either deactive
 * their HW callbacks or remove them from pending list if they already
--
2.25.1


Re: [PATCH] drm/amdgpu/vcn:enable priority queues for encoder

2021-08-17 Thread Sharma, Shashank

Hi Satyajit,

On 8/10/2021 12:39 PM, Satyajit Sahu wrote:

VCN and VCE support multiple queues with different priority.
Use differnt encoder queue based on the priority set by UMD.

Signed-off-by: Satyajit Sahu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 35 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   | 14 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h   |  9 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 14 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h   |  8 ++
  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c |  4 ++-
  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c |  5 ++--
  13 files changed, 99 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index e7a010b7ca1f..b0ae2b45c7c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -73,15 +73,44 @@ static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_compute_prio(enum drm_sch
}
  }
  
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vcn_prio(enum drm_sched_priority prio)
+{
+   switch (prio) {
+   case DRM_SCHED_PRIORITY_HIGH:
+   return AMDGPU_VCN_ENC_PRIO_HIGH;
+   case DRM_SCHED_PRIORITY_KERNEL:
+   return AMDGPU_VCN_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCN_ENC_PRIO_NORMAL;
+   }
+}
+
+static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_vce_prio(enum 
drm_sched_priority prio)
+{
+   switch (prio) {
+   case DRM_SCHED_PRIORITY_HIGH:
+   return AMDGPU_VCE_ENC_PRIO_HIGH;
+   case DRM_SCHED_PRIORITY_KERNEL:
+   return AMDGPU_VCE_ENC_PRIO_VERY_HIGH;
+   default:
+   return AMDGPU_VCE_ENC_PRIO_NORMAL;
+   }
+}
+


As Nirmoy also mentioned, we can't use the kernel priority for 
VCE/VCN ENC work due to the delay/stale punishment.


But adding DRM_SCHED_PRIORITY_VERY_HIGH might not be a good approach either, 
as other drivers might not be very interested in it (or are they?)


I can think of two ways to go ahead from here:

Method 1:
=
- Introduce DRM_SCHED_PRIORITY_VERY_HIGH in the DRM scheduler layer, and 
demo a usecase of the same in our high priority VCN queues (in fact this 
should work for high priority gfx queues as well)


- Check the feedback from the community

If there are not many consumers or there is not much interest:

Method 2:

- Add a AMDGPU specific priority flag while submitting the work
- Use this flag to keep the DRM scheduler level: DRM_SCHED_PRIORITY_HIGH 
but the AMD scheduler level: AMDGPU_VCE_ENC_PRIO_VERY_HIGH
- Pass this flag parameter to 
amdgpu_ctx_sched_prio_to_vcn_prio/amdgpu_ctx_sched_prio_to_vce_prio 
functions and set the proper scheduler level which will allow AMDGPU 
driver to schedule it in the correct ring



  static unsigned int amdgpu_ctx_prio_sched_to_hw(struct amdgpu_device *adev,
 enum drm_sched_priority prio,
 u32 hw_ip)
  {
unsigned int hw_prio;
  
-	hw_prio = (hw_ip == AMDGPU_HW_IP_COMPUTE) ?
-   amdgpu_ctx_sched_prio_to_compute_prio(prio) :
-   AMDGPU_RING_PRIO_DEFAULT;
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE)
+   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(prio);
+   else if (hw_ip == AMDGPU_HW_IP_VCN_ENC)
+   hw_prio = amdgpu_ctx_sched_prio_to_vcn_prio(prio);
+   else if (hw_ip == AMDGPU_HW_IP_VCE)
+   hw_prio = amdgpu_ctx_sched_prio_to_vce_prio(prio);
+   else
+   hw_prio = AMDGPU_RING_PRIO_DEFAULT;


Maybe move to a switch/case for better readability?
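
The suggested switch form could look roughly like this. The enums are simplified stand-ins, not the real amdgpu definitions, and the per-IP priority helpers are only referenced in comments:

```c
#include <assert.h>

/* Simplified stand-ins for the amdgpu hw_ip types and ring priorities. */
enum hw_ip { HW_IP_GFX, HW_IP_COMPUTE, HW_IP_VCN_ENC, HW_IP_VCE };
enum hw_prio { PRIO_DEFAULT, PRIO_COMPUTE, PRIO_VCN_ENC, PRIO_VCE };

/* One switch instead of an if/else chain: each IP block maps to its own
 * priority translation; everything else falls back to the default. */
static enum hw_prio prio_sched_to_hw(enum hw_ip hw_ip)
{
    switch (hw_ip) {
    case HW_IP_COMPUTE:
        return PRIO_COMPUTE;   /* amdgpu_ctx_sched_prio_to_compute_prio() */
    case HW_IP_VCN_ENC:
        return PRIO_VCN_ENC;   /* amdgpu_ctx_sched_prio_to_vcn_prio() */
    case HW_IP_VCE:
        return PRIO_VCE;       /* amdgpu_ctx_sched_prio_to_vce_prio() */
    default:
        return PRIO_DEFAULT;
    }
}
```

A switch also lets the compiler warn about unhandled enum values, which the if/else chain does not.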


hw_ip = array_index_nospec(hw_ip, AMDGPU_HW_IP_NUM);
if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)
hw_prio = AMDGPU_RING_PRIO_DEFAULT;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index b7d861ed5284..4087acb6b8bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -37,7 +37,7 @@ int amdgpu_to_sched_priority(int amdgpu_priority,
  {
switch (amdgpu_priority) {
case AMDGPU_CTX_PRIORITY_VERY_HIGH:
-   *prio = DRM_SCHED_PRIORITY_HIGH;
+   *prio = DRM_SCHED_PRIORITY_KERNEL;

as mentioned above

break;
case AMDGPU_CTX_PRIORITY_HIGH:
*prio = DRM_SCHED_PRIORITY_HIGH;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index