Re: [PATCH] drm/amdgpu: always reset asic when going into suspend

2019-10-16 Thread Daniel Drake
On Wed, Oct 16, 2019 at 2:43 AM Alex Deucher  wrote:
> Is s2idle actually powering down the GPU?

My understanding is that s2idle (at a high level) just calls all
devices' suspend routines and then puts the CPU into its deepest idle
state.
So if there is something special to be done to power off the GPU, I
believe amdgpu is responsible for making arrangements for that to
happen.
In this case the amdgpu code already does:

pci_disable_device(dev->pdev);
pci_set_power_state(dev->pdev, PCI_D3hot);

And the PCI layer will call through to any appropriate ACPI methods
related to that low power state.
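
(For what it's worth, the D3cold experiment below only needs the target
state changed -- a minimal sketch assuming the same suspend path, not
something the driver currently ships:

	pci_disable_device(dev->pdev);
	pci_set_power_state(dev->pdev, PCI_D3cold);

pci_set_power_state() should fall back to the deepest state the platform
can actually reach for the device.)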

> Do you see a difference in power usage?  I think you are just working
> around the fact that the GPU never actually gets powered down.

I ran a series of experiments.

Base setup: no UI running, ran "setterm -powersave 1; setterm -blank 1"
and waited one minute for the screen to turn off.
Base power usage in this state is 4.7W as reported by BAT0/power_now.

1. Run amdgpu_device_suspend(ddev, true, true); before my change
--> Power usage increases to 6.1W

2. Run amdgpu_device_suspend(ddev, true, true); with my change applied
--> Power usage increases to 6.0W

3. Put amdgpu device in runtime suspend
--> Power usage increases to 6.2W

4. Try unmodified suspend path but d3cold instead of d3hot
--> Power usage increases to 6.1W

So, all of the suspend schemes actually increase the power usage by
roughly the same amount, reset or not, with and without my patch :/
Any ideas?

Thanks,
Daniel

Re: [PATCH hmm 08/15] xen/gntdev: Use select for DMA_SHARED_BUFFER

2019-10-16 Thread Oleksandr Andrushchenko
On 10/16/19 8:11 AM, Jürgen Groß wrote:
> On 15.10.19 20:12, Jason Gunthorpe wrote:
>> From: Jason Gunthorpe 
>>
>> DMA_SHARED_BUFFER can not be enabled by the user (it represents a 
>> library
>> set in the kernel). The kconfig convention is to use select for such
>> symbols so they are turned on implicitly when the user enables a kconfig
>> that needs them.
>>
>> Otherwise the XEN_GNTDEV_DMABUF kconfig is overly difficult to enable.
>>
>> Fixes: 932d6562179e ("xen/gntdev: Add initial support for dma-buf UAPI")
>> Cc: Oleksandr Andrushchenko 
>> Cc: Boris Ostrovsky 
>> Cc: xen-de...@lists.xenproject.org
>> Cc: Juergen Gross 
>> Cc: Stefano Stabellini 
>> Signed-off-by: Jason Gunthorpe 
>
> Reviewed-by: Juergen Gross 
>
Reviewed-by: Oleksandr Andrushchenko 
>
> Juergen

Re: [PATCH hmm 08/15] xen/gntdev: Use select for DMA_SHARED_BUFFER

2019-10-16 Thread Jürgen Groß

On 15.10.19 20:12, Jason Gunthorpe wrote:

From: Jason Gunthorpe 

DMA_SHARED_BUFFER cannot be enabled by the user (it represents a library
set in the kernel). The kconfig convention is to use select for such
symbols so they are turned on implicitly when the user enables a kconfig
symbol that needs them.

Otherwise the XEN_GNTDEV_DMABUF kconfig is overly difficult to enable.
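
For illustration, the convention looks roughly like this in Kconfig
terms (a sketch of the idea, not the exact hunk from this patch):

	config XEN_GNTDEV_DMABUF
		bool "Add support for dma-buf grant access device driver extension"
		depends on XEN_GNTDEV
		select DMA_SHARED_BUFFER

With select, enabling XEN_GNTDEV_DMABUF pulls DMA_SHARED_BUFFER in
automatically instead of requiring the user to find and enable it first.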

Fixes: 932d6562179e ("xen/gntdev: Add initial support for dma-buf UAPI")
Cc: Oleksandr Andrushchenko 
Cc: Boris Ostrovsky 
Cc: xen-de...@lists.xenproject.org
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Signed-off-by: Jason Gunthorpe 


Reviewed-by: Juergen Gross 


Juergen

[PATCH] drm/amdgpu/powerplay: implement interface pp_power_profile_mode

2019-10-16 Thread Liang, Prike
Implement get_power_profile_mode to report the power profile mode status.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
index fa314c2..953e347 100644
--- a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
@@ -640,6 +640,39 @@ static int renoir_set_watermarks_table(
return ret;
 }
 
+static int renoir_get_power_profile_mode(struct smu_context *smu,
+  char *buf)
+{
+   static const char *profile_name[] = {
+   "BOOTUP_DEFAULT",
+   "3D_FULL_SCREEN",
+   "POWER_SAVING",
+   "VIDEO",
+   "VR",
+   "COMPUTE",
+   "CUSTOM"};
+   uint32_t i, size = 0;
+   int16_t workload_type = 0;
+
+   if (!smu->pm_enabled || !buf)
+   return -EINVAL;
+
+   for (i = 0; i <= PP_SMC_POWER_PROFILE_CUSTOM; i++) {
+   /*
+* Conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT
+* Not all profile modes are supported on arcturus.
+*/
+   workload_type = smu_workload_get_type(smu, i);
+   if (workload_type < 0)
+   continue;
+
+   size += sprintf(buf + size, "%2d %14s%s\n",
+   i, profile_name[i], (i == smu->power_profile_mode) ? "*" : " ");
+   }
+
+   return size;
+}
+
 static const struct pptable_funcs renoir_ppt_funcs = {
.get_smu_msg_index = renoir_get_smu_msg_index,
.get_smu_table_index = renoir_get_smu_table_index,
@@ -658,6 +691,7 @@ static const struct pptable_funcs renoir_ppt_funcs = {
.set_performance_level = renoir_set_performance_level,
.get_dpm_clock_table = renoir_get_dpm_clock_table,
.set_watermarks_table = renoir_set_watermarks_table,
+   .get_power_profile_mode = renoir_get_power_profile_mode,
 };
 
 void renoir_set_ppt_funcs(struct smu_context *smu)
-- 
2.7.4


Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking

2019-10-16 Thread Christian König

On 15.10.19 at 20:12, Jason Gunthorpe wrote:

From: Jason Gunthorpe 

8 of the mmu_notifier using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
scif_dma, vhost, gntdev, hmm) drivers are using a common pattern where
they only use invalidate_range_start/end and immediately check the
invalidating range against some driver data structure to tell if the
driver is interested. Half of them use an interval_tree, the others are
simple linear search lists.

Of the ones I checked they largely seem to have various kinds of races,
bugs and poor implementation. This is a result of the complexity in how
the notifier interacts with get_user_pages(). It is extremely difficult to
use it correctly.

Consolidate all of this code together into the core mmu_notifier and
provide a locking scheme similar to hmm_mirror that allows the user to
safely use get_user_pages() and reliably know if the page list still
matches the mm.
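
The driver-side pattern is essentially a sequence-count retry loop.  A
rough sketch, using the helper names from this series (details may
differ in the final code):

	mmu_range_notifier_insert(&mrn, start, length, mm);

retry:
	seq = mmu_range_read_begin(&mrn);

	/* collect the pages with no locks held */
	pinned = get_user_pages(start, npages, FOLL_WRITE, pages, NULL);

	mutex_lock(&driver_lock);	/* same lock the invalidate callback takes */
	if (mmu_range_read_retry(&mrn, seq)) {
		mutex_unlock(&driver_lock);
		release_pages(pages, pinned);
		goto retry;
	}
	/* here the page list is known to still match the mm */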


That sounds really good, but could you outline for a moment how that is 
achieved?


Please keep in mind that the page reference get_user_pages() grabs is 
*NOT* sufficient to guarantee coherency here.


Regards,
Christian.



This new arrangement plays nicely with the !blockable mode for
OOM. Scanning the interval tree is done such that the intersection test
will always succeed, and since there is no invalidate_range_end exposed to
drivers the scheme safely allows multiple drivers to be subscribed.

Four places are converted as an example of how the new API is used.
Four are left for future patches:
  - i915_gem has complex locking around destruction of a registration,
needs more study
  - hfi1 (2nd user) needs access to the rbtree
  - scif_dma has a complicated logic flow
  - vhost's mmu notifiers are already being rewritten

This is still being tested, but I figured I would send it out to start getting
help with the xen, amd and hfi drivers, which I cannot test here.

It would be intended for the hmm tree.

Jason Gunthorpe (15):
   mm/mmu_notifier: define the header pre-processor parts even if
 disabled
   mm/mmu_notifier: add an interval tree notifier
   mm/hmm: allow hmm_range to be used with a mmu_range_notifier or
 hmm_mirror
   mm/hmm: define the pre-processor related parts of hmm.h even if
 disabled
   RDMA/odp: Use mmu_range_notifier_insert()
   RDMA/hfi1: Use mmu_range_notifier_inset for user_exp_rcv
   drm/radeon: use mmu_range_notifier_insert
   xen/gntdev: Use select for DMA_SHARED_BUFFER
   xen/gntdev: use mmu_range_notifier_insert
   nouveau: use mmu_notifier directly for invalidate_range_start
   nouveau: use mmu_range_notifier instead of hmm_mirror
   drm/amdgpu: Call find_vma under mmap_sem
   drm/amdgpu: Use mmu_range_insert instead of hmm_mirror
   drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
   mm/hmm: remove hmm_mirror and related

  Documentation/vm/hmm.rst  | 105 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   9 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  14 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 445 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|  53 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h|  13 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 111 ++--
  drivers/gpu/drm/nouveau/nouveau_svm.c | 229 +---
  drivers/gpu/drm/radeon/radeon.h   |   9 +-
  drivers/gpu/drm/radeon/radeon_mn.c| 218 ++-
  drivers/infiniband/core/device.c  |   1 -
  drivers/infiniband/core/umem_odp.c| 288 +-
  drivers/infiniband/hw/hfi1/file_ops.c |   2 +-
  drivers/infiniband/hw/hfi1/hfi.h  |   2 +-
  drivers/infiniband/hw/hfi1/user_exp_rcv.c | 144 ++---
  drivers/infiniband/hw/hfi1/user_exp_rcv.h |   3 +-
  drivers/infiniband/hw/mlx5/mlx5_ib.h  |   7 +-
  drivers/infiniband/hw/mlx5/mr.c   |   3 +-
  drivers/infiniband/hw/mlx5/odp.c  |  48 +-
  drivers/xen/Kconfig   |   3 +-
  drivers/xen/gntdev-common.h   |   8 +-
  drivers/xen/gntdev.c  | 179 ++
  include/linux/hmm.h   | 195 +--
  include/linux/mmu_notifier.h  | 124 +++-
  include/rdma/ib_umem_odp.h|  65 +--
  include/rdma/ib_verbs.h   |   2 -
  kernel/fork.c |   1 -
  mm/Kconfig|   2 +-
  mm/hmm.c  | 275 +
  mm/mmu_notifier.c | 542 +-
  32 files changed, 1180 insertions(+), 1923 deletions(-)




[PATCH] drm/amd/powerplay: bug fix for memory clock request from display

2019-10-16 Thread Kenneth Feng
In some cases, display fixes the memory clock frequency to a high value
rather than allowing natural memory clock switching.
When we come back from S3 resume, the request from display is not reset;
this causes a bug which makes the memory clock go to a low value.
Then, due to the insufficient memory clock, the screen flickers.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index e2a03f4..ee374df 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -1354,6 +1354,8 @@ static int smu_resume(void *handle)
if (smu->is_apu)
smu_set_gfx_cgpg(&adev->smu, true);
 
+   smu->disable_uclk_switch = 0;
+
mutex_unlock(&smu->mutex);
 
pr_info("SMU is resumed successfully!\n");
-- 
2.7.4


Re: [PATCH 1/3] drm/amdgpu/uvd:Add uvd enc session bo

2019-10-16 Thread Christian König

On 16.10.19 at 00:08, Zhu, James wrote:

Add uvd enc session bo for uvd encode IB test.

Signed-off-by: James Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
index 5eb6328..1e39c8a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
@@ -67,6 +67,10 @@ struct amdgpu_uvd {
unsignedharvest_config;
/* store image width to adjust nb memory state */
unsigneddecode_image_width;
+
+   struct amdgpu_bo *enc_session_bo;
+   void *enc_session_cpu_addr;
+   uint64_t  enc_session_gpu_addr;


Please don't keep that allocated all the time, but rather only allocate 
it on demand during the IB test.


Regards,
Christian.


  };
  
  int amdgpu_uvd_sw_init(struct amdgpu_device *adev);



Re: [PATCH 3/3] drm/amdgpu/uvd:Allocate enc session bo for uvd7.0 ring IB test

2019-10-16 Thread Christian König

On 16.10.19 at 00:08, Zhu, James wrote:

Allocate a 256K enc session bo for the uvd7.0 ring IB test to fix an S3
resume corruption issue.

Signed-off-by: James Zhu 
---
  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 16 ++--
  1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 01f658f..1b17fc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -228,7 +228,7 @@ static int uvd_v7_0_enc_get_create_msg(struct amdgpu_ring 
*ring, uint32_t handle
return r;
  
  	ib = &job->ibs[0];

-   dummy = ib->gpu_addr + 1024;
+   dummy = ring->adev->vcn.enc_session_gpu_addr;
  
  	ib->length_dw = 0;

ib->ptr[ib->length_dw++] = 0x0018;
@@ -289,7 +289,7 @@ static int uvd_v7_0_enc_get_destroy_msg(struct amdgpu_ring 
*ring, uint32_t handl
return r;
  
  	ib = &job->ibs[0];

-   dummy = ib->gpu_addr + 1024;
+   dummy = ring->adev->vcn.enc_session_gpu_addr + 128 * PAGE_SIZE;
  
  	ib->length_dw = 0;

ib->ptr[ib->length_dw++] = 0x0018;
@@ -333,9 +333,16 @@ static int uvd_v7_0_enc_get_destroy_msg(struct amdgpu_ring 
*ring, uint32_t handl
   */
  static int uvd_v7_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
  {
+   struct amdgpu_device *adev = ring->adev;
struct dma_fence *fence = NULL;
long r;
  
+	r = amdgpu_bo_create_kernel(adev, 2 * 128, PAGE_SIZE,

+   AMDGPU_GEM_DOMAIN_VRAM, &adev->vcn.enc_session_bo,
+   &adev->vcn.enc_session_gpu_addr, &adev->vcn.enc_session_cpu_addr);
+   if (r)
+   return r;
+


Looks like you actually do allocate that on demand, but please don't put 
the bo and addresses into the adev->vcn structure. They are only valid 
temporarily.


Regards,
Christian.


r = uvd_v7_0_enc_get_create_msg(ring, 1, NULL);
if (r)
goto error;
@@ -352,6 +359,11 @@ static int uvd_v7_0_enc_ring_test_ib(struct amdgpu_ring 
*ring, long timeout)
  
  error:

dma_fence_put(fence);
+
+   amdgpu_bo_free_kernel(&adev->vcn.enc_session_bo,
+ &adev->vcn.enc_session_gpu_addr,
+ (void **)&adev->vcn.enc_session_cpu_addr);
+
return r;
  }
  



Re: [PATCH 1/3] drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)

2019-10-16 Thread Christian König

On 16.10.19 at 00:18, Alex Deucher wrote:

We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
 - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Signed-off-by: Alex Deucher 


Yeah, that looks better than the version from James.

Acked-by: Christian König  for the series.


---
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 31 ++-
  1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 670784a78512..217084d56ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -206,13 +206,14 @@ static int uvd_v6_0_enc_ring_test_ring(struct amdgpu_ring 
*ring)
   * Open up a stream for HW test
   */
  static int uvd_v6_0_enc_get_create_msg(struct amdgpu_ring *ring, uint32_t 
handle,
+  struct amdgpu_bo *bo,
   struct dma_fence **fence)
  {
const unsigned ib_size_dw = 16;
struct amdgpu_job *job;
struct amdgpu_ib *ib;
struct dma_fence *f = NULL;
-   uint64_t dummy;
+   uint64_t addr;
int i, r;
  
  	r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4, &job);

@@ -220,15 +221,15 @@ static int uvd_v6_0_enc_get_create_msg(struct amdgpu_ring 
*ring, uint32_t handle
return r;
  
  	ib = &job->ibs[0];

-   dummy = ib->gpu_addr + 1024;
+   addr = amdgpu_bo_gpu_offset(bo);
  
  	ib->length_dw = 0;

ib->ptr[ib->length_dw++] = 0x0018;
ib->ptr[ib->length_dw++] = 0x0001; /* session info */
ib->ptr[ib->length_dw++] = handle;
ib->ptr[ib->length_dw++] = 0x0001;
-   ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
-   ib->ptr[ib->length_dw++] = dummy;
+   ib->ptr[ib->length_dw++] = upper_32_bits(addr);
+   ib->ptr[ib->length_dw++] = addr;
  
  	ib->ptr[ib->length_dw++] = 0x0014;

ib->ptr[ib->length_dw++] = 0x0002; /* task info */
@@ -268,13 +269,14 @@ static int uvd_v6_0_enc_get_create_msg(struct amdgpu_ring 
*ring, uint32_t handle
   */
  static int uvd_v6_0_enc_get_destroy_msg(struct amdgpu_ring *ring,
uint32_t handle,
+   struct amdgpu_bo *bo,
struct dma_fence **fence)
  {
const unsigned ib_size_dw = 16;
struct amdgpu_job *job;
struct amdgpu_ib *ib;
struct dma_fence *f = NULL;
-   uint64_t dummy;
+   uint64_t addr;
int i, r;
  
  	r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4, &job);

@@ -282,15 +284,15 @@ static int uvd_v6_0_enc_get_destroy_msg(struct 
amdgpu_ring *ring,
return r;
  
  	ib = &job->ibs[0];

-   dummy = ib->gpu_addr + 1024;
+   addr = amdgpu_bo_gpu_offset(bo);
  
  	ib->length_dw = 0;

ib->ptr[ib->length_dw++] = 0x0018;
ib->ptr[ib->length_dw++] = 0x0001; /* session info */
ib->ptr[ib->length_dw++] = handle;
ib->ptr[ib->length_dw++] = 0x0001;
-   ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
-   ib->ptr[ib->length_dw++] = dummy;
+   ib->ptr[ib->length_dw++] = upper_32_bits(addr);
+   ib->ptr[ib->length_dw++] = addr;
  
  	ib->ptr[ib->length_dw++] = 0x0014;

ib->ptr[ib->length_dw++] = 0x0002; /* task info */
@@ -327,13 +329,20 @@ static int uvd_v6_0_enc_get_destroy_msg(struct 
amdgpu_ring *ring,
  static int uvd_v6_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
  {
struct dma_fence *fence = NULL;
+   struct amdgpu_bo *bo = NULL;
long r;
  
-	r = uvd_v6_0_enc_get_create_msg(ring, 1, NULL);

+   r = amdgpu_bo_create_reserved(ring->adev, 128 * 1024, PAGE_SIZE,
+ AMDGPU_GEM_DOMAIN_VRAM,
+ &bo, NULL, NULL);
+   if (r)
+   return r;
+
+   r = uvd_v6_0_enc_get_create_msg(ring, 1, bo, NULL);
if (r)
goto error;
  
-	r = uvd_v6_0_enc_get_destroy_msg(ring, 1, &fence);

+   r = uvd_v6_0_enc_get_destroy_msg(ring, 1, bo, &fence);
if (r)
goto error;
  
@@ -345,6 +354,8 @@ static int uvd_v6_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
  
  error:

dma_fence_put(fence);
+   amdgpu_bo_unreserve(bo);
+   amdgpu_bo_unref(&bo);
return r;
  }
  



RE: [PATCH] drm/amdgpu/powerplay: implement interface pp_power_profile_mode

2019-10-16 Thread Quan, Evan
Reviewed-by: Evan Quan 

> -Original Message-
> From: Liang, Prike 
> Sent: 2019年10月16日 16:24
> To: amd-gfx@lists.freedesktop.org
> Cc: Quan, Evan ; Huang, Ray
> ; Liang, Prike 
> Subject: [PATCH] drm/amdgpu/powerplay: implement interface
> pp_power_profile_mode
> 
> implement get_power_profile_mode for getting power profile mode status.
> 
> Signed-off-by: Prike Liang 
> ---
>  drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 34
> ++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
> b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
> index fa314c2..953e347 100644
> --- a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
> +++ b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
> @@ -640,6 +640,39 @@ static int renoir_set_watermarks_table(
>   return ret;
>  }
> 
> +static int renoir_get_power_profile_mode(struct smu_context *smu,
> +char *buf)
> +{
> + static const char *profile_name[] = {
> + "BOOTUP_DEFAULT",
> + "3D_FULL_SCREEN",
> + "POWER_SAVING",
> + "VIDEO",
> + "VR",
> + "COMPUTE",
> + "CUSTOM"};
> + uint32_t i, size = 0;
> + int16_t workload_type = 0;
> +
> + if (!smu->pm_enabled || !buf)
> + return -EINVAL;
> +
> + for (i = 0; i <= PP_SMC_POWER_PROFILE_CUSTOM; i++) {
> + /*
> +  * Conv PP_SMC_POWER_PROFILE* to
> WORKLOAD_PPLIB_*_BIT
> +  * Not all profile modes are supported on arcturus.
> +  */
> + workload_type = smu_workload_get_type(smu, i);
> + if (workload_type < 0)
> + continue;
> +
> + size += sprintf(buf + size, "%2d %14s%s\n",
> + i, profile_name[i], (i == smu->power_profile_mode) ?
> "*" : " ");
> + }
> +
> + return size;
> +}
> +
>  static const struct pptable_funcs renoir_ppt_funcs = {
>   .get_smu_msg_index = renoir_get_smu_msg_index,
>   .get_smu_table_index = renoir_get_smu_table_index, @@ -658,6
> +691,7 @@ static const struct pptable_funcs renoir_ppt_funcs = {
>   .set_performance_level = renoir_set_performance_level,
>   .get_dpm_clock_table = renoir_get_dpm_clock_table,
>   .set_watermarks_table = renoir_set_watermarks_table,
> + .get_power_profile_mode = renoir_get_power_profile_mode,
>  };
> 
>  void renoir_set_ppt_funcs(struct smu_context *smu)
> --
> 2.7.4


RE: [PATCH] drm/amd/powerplay: bug fix for memory clock request from display

2019-10-16 Thread Xiao, Jack
Reviewed-by: Jack Xiao 

-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Wednesday, October 16, 2019 4:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/powerplay: bug fix for memory clock request from 
display

In some cases, display fixes the memory clock frequency to a high value rather than 
allowing natural memory clock switching.
When we come back from S3 resume, the request from display is not reset; this 
causes a bug which makes the memory clock go to a low value.
Then, due to the insufficient memory clock, the screen flickers.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index e2a03f4..ee374df 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -1354,6 +1354,8 @@ static int smu_resume(void *handle)
if (smu->is_apu)
smu_set_gfx_cgpg(&adev->smu, true);
 
+   smu->disable_uclk_switch = 0;
+
mutex_unlock(&smu->mutex);
 
pr_info("SMU is resumed successfully!\n");
--
2.7.4


Re: TTM refcount problem.

2019-10-16 Thread Bas Nieuwenhuizen
On Mon, Jul 29, 2019 at 11:32 AM Christian König
 wrote:
>
> > Is this a known issue?
> No, that looks like a new one to me.
>
> Is that somehow reproducible?

I tried to find a reliable reproducer (only Vulkan CTS runs occasionally
caught it), but could not find anything better.

However this issue seems to be fixed with one of the following patches
from drm-misc-fixes:

"drm/ttm: fix handling in ttm_bo_add_mem_to_lru"
"drm/ttm: fix busy reference in ttm_mem_evict_first"

I haven't seen the issue in 100 CTS runs.

Thanks,
Bas

>
> Christian.
>
> On 29.07.19 at 10:14, Bas Nieuwenhuizen wrote:
> > Hi all,
> >
> > I have a TTM refcount issue:
> >
> > [173774.309968] [ cut here ]
> > [173774.309970] kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:202!
> > [173774.309982] invalid opcode:  [#1] PREEMPT SMP NOPTI
> > [173774.309985] CPU: 13 PID: 128214 Comm: kworker/13:2 Not tainted
> > 5.2.0-rc1-g3f2e519b0974 #10
> > [173774.309986] Hardware name: To Be Filled By O.E.M. To Be Filled By
> > O.E.M./X399 Taichi, BIOS P1.50 09/05/2017
> > [173774.309995] Workqueue: events ttm_bo_delayed_workqueue [ttm]
> > [173774.31] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
> > [173774.310002] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
> > 00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
> > 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
> > 48 89
> > [173774.310003] RSP: 0018:b42e5589bde8 EFLAGS: 00010246
> > [173774.310005] RAX: b42e5589be40 RBX: 9395fd0cd908 RCX:
> > 9395fd0cd8f8
> > [173774.310006] RDX: b42e5589be40 RSI: 939b59b64f18 RDI:
> > 9395fd0cd87c
> > [173774.310007] RBP: c0930f40 R08: 0014 R09:
> > c091f100
> > [173774.310008] R10: 9399f69b0800 R11: 0001 R12:
> > 
> > [173774.310009] R13: 9395fd0cd850 R14: 0001 R15:
> > 0001
> > [173774.310010] FS:  () GS:939b7d34()
> > knlGS:
> > [173774.310011] CS:  0010 DS:  ES:  CR0: 80050033
> > [173774.310012] CR2: 7f4f64008838 CR3: 000643baa000 CR4:
> > 003406e0
> > [173774.310013] Call Trace:
> > [173774.310019]  ttm_bo_cleanup_refs+0x160/0x1e0 [ttm]
> > [173774.310025]  ttm_bo_delayed_delete+0xa8/0x1e0 [ttm]
> > [173774.310029]  ttm_bo_delayed_workqueue+0x17/0x40 [ttm]
> > [173774.310033]  process_one_work+0x1fd/0x430
> > [173774.310036]  worker_thread+0x2d/0x3d0
> > [173774.310038]  ? process_one_work+0x430/0x430
> > [173774.310040]  kthread+0x112/0x130
> > [173774.310042]  ? kthread_create_on_node+0x60/0x60
> > [173774.310045]  ret_from_fork+0x22/0x40
> > [173774.310048] Modules linked in: fuse nct6775 hwmon_vid
> > nls_iso8859_1 nls_cp437 vfat fat edac_mce_amd kvm_amd kvm irqbypass
> > amdgpu arc4 iwlmvm mac80211 snd_usb_audio uvcvideo snd_usbmidi_lib
> > videobuf2_vmalloc crct10dif_pclmul videobuf2_memops
> > snd_hda_codec_realtek videobuf2_v4l2 btusb gpu_sched snd_rawmidi
> > videobuf2_common snd_hda_codec_generic btrtl videodev crc32_pclmul
> > btbcm snd_seq_device ledtrig_audio ttm btintel ghash_clmulni_intel
> > wmi_bmof mxm_wmi snd_hda_codec_hdmi media bluetooth drm_kms_helper
> > iwlwifi snd_hda_intel drm aesni_intel snd_hda_codec joydev input_leds
> > aes_x86_64 snd_hda_core mousedev evdev crypto_simd cryptd ecdh_generic
> > led_class agpgart snd_hwdep mac_hid cdc_acm glue_helper ecc snd_pcm
> > igb syscopyarea pcspkr cfg80211 sysfillrect snd_timer sysimgblt snd
> > fb_sys_fops ccp ptp soundcore pps_core rng_core k10temp i2c_algo_bit
> > sp5100_tco dca i2c_piix4 rfkill wmi pcc_cpufreq button acpi_cpufreq
> > sch_fq_codel ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2
> > sd_mod
> > [173774.310085]  hid_generic usbhid hid crc32c_intel ahci xhci_pci
> > libahci xhci_hcd libata usbcore scsi_mod usb_common
> > [173774.310094] ---[ end trace 1f8d21980c0b3fd5 ]---
> > [173774.310097] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
> > [173774.310099] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
> > 00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
> > 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
> > 48 89
> > [173774.310100] RSP: 0018:b42e5589bde8 EFLAGS: 00010246
> > [173774.310101] RAX: b42e5589be40 RBX: 9395fd0cd908 RCX:
> > 9395fd0cd8f8
> > [173774.310102] RDX: b42e5589be40 RSI: 939b59b64f18 RDI:
> > 9395fd0cd87c
> > [173774.310103] RBP: c0930f40 R08: 0014 R09:
> > c091f100
> > [173774.310104] R10: 9399f69b0800 R11: 0001 R12:
> > 
> > [173774.310104] R13: 9395fd0cd850 R14: 0001 R15:
> > 0001
> > [173774.310106] FS:  () GS:939b7d34()
> > knlGS:
> > [173774.310107] CS:  0010 DS:  ES:  CR0: 80050033
> > [173774.310107] CR2: 7f4f64008838 CR3: 000643baa000 CR4:

Re: TTM refcount problem.

2019-10-16 Thread Christian König

On 16.10.19 at 12:09, Bas Nieuwenhuizen wrote:

On Mon, Jul 29, 2019 at 11:32 AM Christian König
 wrote:

Is this a known issue?

No, that looks like a new one to me.

Is that somehow reproducible?

I tried to find a reliable reproducer (only Vulkan CTS runs occasionally
caught it), but could not find anything better.

However this issue seems to be fixed with one of the following patches
from drm-misc-fixes:

"drm/ttm: fix handling in ttm_bo_add_mem_to_lru"
"drm/ttm: fix busy reference in ttm_mem_evict_first"

I haven't seen the issue in 100 CTS runs.


Thanks for the information.

I'm currently completely reworking the handling and trying to get rid of 
all the reference dropping which just results in a BUG().


Issues like that one will then hopefully completely disappear.

Regards,
Christian.



Thanks,
Bas


Christian.

On 29.07.19 at 10:14, Bas Nieuwenhuizen wrote:

Hi all,

I have a TTM refcount issue:

[173774.309968] [ cut here ]
[173774.309970] kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:202!
[173774.309982] invalid opcode:  [#1] PREEMPT SMP NOPTI
[173774.309985] CPU: 13 PID: 128214 Comm: kworker/13:2 Not tainted
5.2.0-rc1-g3f2e519b0974 #10
[173774.309986] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./X399 Taichi, BIOS P1.50 09/05/2017
[173774.309995] Workqueue: events ttm_bo_delayed_workqueue [ttm]
[173774.31] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
[173774.310002] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
48 89
[173774.310003] RSP: 0018:b42e5589bde8 EFLAGS: 00010246
[173774.310005] RAX: b42e5589be40 RBX: 9395fd0cd908 RCX:
9395fd0cd8f8
[173774.310006] RDX: b42e5589be40 RSI: 939b59b64f18 RDI:
9395fd0cd87c
[173774.310007] RBP: c0930f40 R08: 0014 R09:
c091f100
[173774.310008] R10: 9399f69b0800 R11: 0001 R12:

[173774.310009] R13: 9395fd0cd850 R14: 0001 R15:
0001
[173774.310010] FS:  () GS:939b7d34()
knlGS:
[173774.310011] CS:  0010 DS:  ES:  CR0: 80050033
[173774.310012] CR2: 7f4f64008838 CR3: 000643baa000 CR4:
003406e0
[173774.310013] Call Trace:
[173774.310019]  ttm_bo_cleanup_refs+0x160/0x1e0 [ttm]
[173774.310025]  ttm_bo_delayed_delete+0xa8/0x1e0 [ttm]
[173774.310029]  ttm_bo_delayed_workqueue+0x17/0x40 [ttm]
[173774.310033]  process_one_work+0x1fd/0x430
[173774.310036]  worker_thread+0x2d/0x3d0
[173774.310038]  ? process_one_work+0x430/0x430
[173774.310040]  kthread+0x112/0x130
[173774.310042]  ? kthread_create_on_node+0x60/0x60
[173774.310045]  ret_from_fork+0x22/0x40
[173774.310048] Modules linked in: fuse nct6775 hwmon_vid
nls_iso8859_1 nls_cp437 vfat fat edac_mce_amd kvm_amd kvm irqbypass
amdgpu arc4 iwlmvm mac80211 snd_usb_audio uvcvideo snd_usbmidi_lib
videobuf2_vmalloc crct10dif_pclmul videobuf2_memops
snd_hda_codec_realtek videobuf2_v4l2 btusb gpu_sched snd_rawmidi
videobuf2_common snd_hda_codec_generic btrtl videodev crc32_pclmul
btbcm snd_seq_device ledtrig_audio ttm btintel ghash_clmulni_intel
wmi_bmof mxm_wmi snd_hda_codec_hdmi media bluetooth drm_kms_helper
iwlwifi snd_hda_intel drm aesni_intel snd_hda_codec joydev input_leds
aes_x86_64 snd_hda_core mousedev evdev crypto_simd cryptd ecdh_generic
led_class agpgart snd_hwdep mac_hid cdc_acm glue_helper ecc snd_pcm
igb syscopyarea pcspkr cfg80211 sysfillrect snd_timer sysimgblt snd
fb_sys_fops ccp ptp soundcore pps_core rng_core k10temp i2c_algo_bit
sp5100_tco dca i2c_piix4 rfkill wmi pcc_cpufreq button acpi_cpufreq
sch_fq_codel ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2
sd_mod
[173774.310085]  hid_generic usbhid hid crc32c_intel ahci xhci_pci
libahci xhci_hcd libata usbcore scsi_mod usb_common
[173774.310094] ---[ end trace 1f8d21980c0b3fd5 ]---
[173774.310097] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
[173774.310099] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
48 89
[173774.310100] RSP: 0018:b42e5589bde8 EFLAGS: 00010246
[173774.310101] RAX: b42e5589be40 RBX: 9395fd0cd908 RCX:
9395fd0cd8f8
[173774.310102] RDX: b42e5589be40 RSI: 939b59b64f18 RDI:
9395fd0cd87c
[173774.310103] RBP: c0930f40 R08: 0014 R09:
c091f100
[173774.310104] R10: 9399f69b0800 R11: 0001 R12:

[173774.310104] R13: 9395fd0cd850 R14: 0001 R15:
0001
[173774.310106] FS:  () GS:939b7d34()
knlGS:
[173774.310107] CS:  0010 DS:  ES:  CR0: 80050033
[173774.310107] CR2: 7f4f64008838 CR3: 000643baa000 CR4:
003406e0
[173774.310110] note:

[PATCH -next] drm/amd/display: Make dc_link_detect_helper static

2019-10-16 Thread YueHaibing
Fix sparse warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:746:6:
 warning: symbol 'dc_link_detect_helper' was not declared. Should it be static?

Reported-by: Hulk Robot 
Signed-off-by: YueHaibing 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index fb18681..9350536 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -743,7 +743,8 @@ static bool wait_for_alt_mode(struct dc_link *link)
  * This does not create remote sinks but will trigger DM
  * to start MST detection if a branch is detected.
  */
-bool dc_link_detect_helper(struct dc_link *link, enum dc_detect_reason reason)
+static bool dc_link_detect_helper(struct dc_link *link,
+ enum dc_detect_reason reason)
 {
struct dc_sink_init_data sink_init_data = { 0 };
struct display_sink_capability sink_caps = { 0 };
-- 
2.7.4




[PATCH v4 07/11] drm/ttm: rename ttm_fbdev_mmap

2019-10-16 Thread Gerd Hoffmann
Rename ttm_fbdev_mmap to ttm_bo_mmap_obj.  Move the vm_pgoff sanity
check to amdgpu_bo_fbdev_mmap (the only ttm_fbdev_mmap user in tree).

The ttm_bo_mmap_obj function can now be used to map any buffer object.
This makes it possible to implement &drm_gem_object_funcs.mmap in gem
ttm helpers.
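
A rough sketch of how such a gem helper could wrap the renamed function
(the helper name below is an assumption, not part of this patch):

	static int drm_gem_ttm_mmap(struct drm_gem_object *gem,
				    struct vm_area_struct *vma)
	{
		struct ttm_buffer_object *bo =
			container_of(gem, struct ttm_buffer_object, base);

		return ttm_bo_mmap_obj(vma, bo);
	}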

v3: patch added to series

Signed-off-by: Gerd Hoffmann 
Acked-by: Thomas Zimmermann 
---
 include/drm/ttm/ttm_bo_api.h   | 10 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  5 -
 drivers/gpu/drm/ttm/ttm_bo_vm.c|  8 ++--
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 43c4929a2171..d2277e06316d 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -710,16 +710,14 @@ int ttm_bo_kmap(struct ttm_buffer_object *bo, unsigned 
long start_page,
 void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map);
 
 /**
- * ttm_fbdev_mmap - mmap fbdev memory backed by a ttm buffer object.
+ * ttm_bo_mmap_obj - mmap memory backed by a ttm buffer object.
  *
  * @vma:   vma as input from the fbdev mmap method.
- * @bo:The bo backing the address space. The address space will
- * have the same size as the bo, and start at offset 0.
+ * @bo:The bo backing the address space.
  *
- * This function is intended to be called by the fbdev mmap method
- * if the fbdev address space is to be backed by a bo.
+ * Maps a buffer object.
  */
-int ttm_fbdev_mmap(struct vm_area_struct *vma, struct ttm_buffer_object *bo);
+int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo);
 
 /**
  * ttm_bo_mmap - mmap out of the ttm device address space.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 1fead0e8b890..6f0b789a0b49 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1058,7 +1058,10 @@ void amdgpu_bo_fini(struct amdgpu_device *adev)
 int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
 struct vm_area_struct *vma)
 {
-   return ttm_fbdev_mmap(vma, &bo->tbo);
+   if (vma->vm_pgoff != 0)
+   return -EACCES;
+
+   return ttm_bo_mmap_obj(vma, &bo->tbo);
 }
 
 /**
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 53345c0854d5..1a9db691f954 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -479,14 +479,10 @@ int ttm_bo_mmap(struct file *filp, struct vm_area_struct 
*vma,
 }
 EXPORT_SYMBOL(ttm_bo_mmap);
 
-int ttm_fbdev_mmap(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
+int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
 {
-   if (vma->vm_pgoff != 0)
-   return -EACCES;
-
ttm_bo_get(bo);
-
ttm_bo_mmap_vma_setup(bo, vma);
return 0;
 }
-EXPORT_SYMBOL(ttm_fbdev_mmap);
+EXPORT_SYMBOL(ttm_bo_mmap_obj);
-- 
2.18.1


Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

2019-10-16 Thread Wang, Kevin(Yang)
Adding @Koenig, Christian.
Could you help me review it?

Best Regards,
Kevin


From: Wang, Kevin(Yang) 
Sent: Wednesday, October 16, 2019 11:06 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Wang, Kevin(Yang) 
Subject: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

The trace event prints the string in the wrong format
(it uses an integer/pointer type to handle a string).

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8227ebd0f511..f940526c5889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -170,7 +170,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
  __field(unsigned int, context)
  __field(unsigned int, seqno)
  __field(struct dma_fence *, fence)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -179,12 +179,12 @@ TRACE_EVENT(amdgpu_cs_ioctl,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );

 TRACE_EVENT(amdgpu_sched_run_job,
@@ -195,7 +195,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
  __string(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
  __field(unsigned int, context)
  __field(unsigned int, seqno)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -204,12 +204,12 @@ TRACE_EVENT(amdgpu_sched_run_job,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );


@@ -473,7 +473,7 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 TP_PROTO(struct amdgpu_job *sched_job, struct dma_fence *fence),
 TP_ARGS(sched_job, fence),
 TP_STRUCT__entry(
-__field(const char *,name)
+__string(ring, sched_job->base.sched->name);
  __field(uint64_t, id)
  __field(struct dma_fence *, fence)
  __field(uint64_t, ctx)
@@ -481,14 +481,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
  ),

 TP_fast_assign(
-  __entry->name = sched_job->base.sched->name;
+  __assign_str(ring, sched_job->base.sched->name)
__entry->id = sched_job->base.id;
__entry->fence = fence;
__entry->ctx = fence->context;
__entry

Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

2019-10-16 Thread Koenig, Christian
Hi Kevin,

well that copies the string into the ring buffer every time the trace event is 
called, which is not necessarily a good idea for a constant string.

Can't we avoid that somehow?

Thanks,
Christian.

On 16.10.19 at 14:01, Wang, Kevin(Yang) wrote:
add @Koenig, Christian,
could you help me review it?

Best Regards,
Kevin


From: Wang, Kevin(Yang) 
Sent: Wednesday, October 16, 2019 11:06 AM
To: amd-gfx@lists.freedesktop.org 

Cc: Wang, Kevin(Yang) 
Subject: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

the trace event print string format error.
(use integer type to handle string)

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8227ebd0f511..f940526c5889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -170,7 +170,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
  __field(unsigned int, context)
  __field(unsigned int, seqno)
  __field(struct dma_fence *, fence)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -179,12 +179,12 @@ TRACE_EVENT(amdgpu_cs_ioctl,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );

 TRACE_EVENT(amdgpu_sched_run_job,
@@ -195,7 +195,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
  __string(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
  __field(unsigned int, context)
  __field(unsigned int, seqno)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -204,12 +204,12 @@ TRACE_EVENT(amdgpu_sched_run_job,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );


@@ -473,7 +473,7 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 TP_PROTO(struct amdgpu_job *sched_job, struct dma_fence *fence),
 TP_ARGS(sched_job, fence),
 TP_STRUCT__entry(
-__field(const char *,name)
+__string(ring, sched_job->base.sched->name);
  __field(uint64_t, id)
  __field(struct dma_fence *, fence)
  __field(uint64_t, ctx)
@@ -481,14 +481,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
  

Re: [PATCH] drm/amdgpu/psp: declare PSP TA firmware

2019-10-16 Thread Deucher, Alexander
Reviewed-by: Alex Deucher 

From: amd-gfx  on behalf of chen gong 

Sent: Tuesday, October 15, 2019 10:48 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Lakha, Bhawanpreet ; Gong, Curry 

Subject: [PATCH] drm/amdgpu/psp: declare PSP TA firmware

Add PSP TA firmware declarations for raven, raven2 and picasso.

Signed-off-by: chen gong 
---
 drivers/gpu/drm/amd/amdgpu/psp_v10_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
index b96484a..b345e69 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
@@ -40,6 +40,9 @@
 MODULE_FIRMWARE("amdgpu/raven_asd.bin");
 MODULE_FIRMWARE("amdgpu/picasso_asd.bin");
 MODULE_FIRMWARE("amdgpu/raven2_asd.bin");
+MODULE_FIRMWARE("amdgpu/picasso_ta.bin");
+MODULE_FIRMWARE("amdgpu/raven2_ta.bin");
+MODULE_FIRMWARE("amdgpu/raven_ta.bin");

 static int psp_v10_0_init_microcode(struct psp_context *psp)
 {
--
2.7.4


Re: [PATCH 1/3] drm/amdgpu/uvd:Add uvd enc session bo

2019-10-16 Thread James Zhu
Thanks for your comments!

Alex has submitted a new patch; I am verifying it.

Thanks!

James Zhu

On 2019-10-16 4:59 a.m., Christian König wrote:
> Am 16.10.19 um 00:08 schrieb Zhu, James:
>> Add uvd enc session bo for uvd encode IB test.
>>
>> Signed-off-by: James Zhu 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 4 
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
>> index 5eb6328..1e39c8a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
>> @@ -67,6 +67,10 @@ struct amdgpu_uvd {
>>   unsigned    harvest_config;
>>   /* store image width to adjust nb memory state */
>>   unsigned    decode_image_width;
>> +
>> +    struct amdgpu_bo *enc_session_bo;
>> +    void *enc_session_cpu_addr;
>> +    uint64_t  enc_session_gpu_addr;
>
> Please don't keep that allocated all the time, but rather only 
> allocate it on demand during the IB test.
>
> Regards,
> Christian.
>
>>   };
>>     int amdgpu_uvd_sw_init(struct amdgpu_device *adev);
>

Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

2019-10-16 Thread Wang, Kevin(Yang)
Hi Chris,

You mentioned that this kind of case also exists in other source code, handled
the same way.
In the amdgpu_trace.h file this usage already exists in the amdgpu driver,
e.g. TRACE_EVENT(amdgpu_cs_ioctl) -> timeline:
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
  __entry->sched_job_id, __get_str(timeline), __entry->context,
  __entry->seqno, __get_str(ring), __entry->num_ibs)
Do you have a better way to do it?
Thanks.

Best Regards,
Kevin


From: Koenig, Christian 
Sent: Wednesday, October 16, 2019 8:15 PM
To: Wang, Kevin(Yang) ; amd-gfx@lists.freedesktop.org 

Subject: Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format 
error

Hi Kevin,

well that copies the string into the ring buffer every time the trace event is 
called, which is not necessarily a good idea for a constant string.

Can't we avoid that somehow?

Thanks,
Christian.

On 16.10.19 at 14:01, Wang, Kevin(Yang) wrote:
add @Koenig, Christian,
could you help me review it?

Best Regards,
Kevin


From: Wang, Kevin(Yang) 
Sent: Wednesday, October 16, 2019 11:06 AM
To: amd-gfx@lists.freedesktop.org 

Cc: Wang, Kevin(Yang) 
Subject: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

the trace event print string format error.
(use integer type to handle string)

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8227ebd0f511..f940526c5889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -170,7 +170,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
  __field(unsigned int, context)
  __field(unsigned int, seqno)
  __field(struct dma_fence *, fence)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -179,12 +179,12 @@ TRACE_EVENT(amdgpu_cs_ioctl,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );

 TRACE_EVENT(amdgpu_sched_run_job,
@@ -195,7 +195,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
  __string(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
  __field(unsigned int, context)
  __field(unsigned int, seqno)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -204,12 +204,12 @@ TRACE_EVENT(amdgpu_sched_run_job,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",

Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

2019-10-16 Thread Koenig, Christian
Hi Kevin,

> do you have a better way to do it?
Not off hand, but maybe check the trace documentation to see if there is any
better approach.

If you can't find anything the patch is Reviewed-by: Christian König 
.

Regards,
Christian.

On 16.10.19 at 15:30, Wang, Kevin(Yang) wrote:
Hi Chris,

You said that this kind of scene also existed in other source code, these has 
same method.
in amdgpu_trace.h file, this usage case is exits in amdgpu driver.
likes TRACE_EVENT(amdgpu_cs_ioctl) -> timeline :
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
  __entry->sched_job_id, __get_str(timeline), 
__entry->context,
  __entry->seqno, __get_str(ring), __entry->num_ibs)
and do you have other better way to do it?
thanks.

Best Regards,
Kevin


From: Koenig, Christian 

Sent: Wednesday, October 16, 2019 8:15 PM
To: Wang, Kevin(Yang) ; 
amd-gfx@lists.freedesktop.org 

Subject: Re: [PATCH] drm/amdgpu: fix amdgpu trace event print string format 
error

Hi Kevin,

well that copies the string into the ring buffer every time the trace event is 
called, which is not necessarily a good idea for a constant string.

Can't we avoid that somehow?

Thanks,
Christian.

On 16.10.19 at 14:01, Wang, Kevin(Yang) wrote:
add @Koenig, Christian,
could you help me review it?

Best Regards,
Kevin


From: Wang, Kevin(Yang) 
Sent: Wednesday, October 16, 2019 11:06 AM
To: amd-gfx@lists.freedesktop.org 

Cc: Wang, Kevin(Yang) 
Subject: [PATCH] drm/amdgpu: fix amdgpu trace event print string format error

the trace event print string format error.
(use integer type to handle string)

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8227ebd0f511..f940526c5889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -170,7 +170,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
  __field(unsigned int, context)
  __field(unsigned int, seqno)
  __field(struct dma_fence *, fence)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -179,12 +179,12 @@ TRACE_EVENT(amdgpu_cs_ioctl,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-  __entry->ring_name = 
to_amdgpu_ring(job->base.sched)->name;
+  __assign_str(ring, 
to_amdgpu_ring(job->base.sched)->name)
__entry->num_ibs = job->num_ibs;
),
 TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, 
ring_name=%s, num_ibs=%u",
   __entry->sched_job_id, __get_str(timeline), 
__entry->context,
- __entry->seqno, __entry->ring_name, __entry->num_ibs)
+ __entry->seqno, __get_str(ring), __entry->num_ibs)
 );

 TRACE_EVENT(amdgpu_sched_run_job,
@@ -195,7 +195,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
  __string(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
  __field(unsigned int, context)
  __field(unsigned int, seqno)
-__field(char *, ring_name)
+__string(ring, 
to_amdgpu_ring(job->base.sched)->name)
  __field(u32, num_ibs)
  ),

@@ -204,12 +204,12 @@ TRACE_EVENT(amdgpu_sched_run_job,
__assign_str(timeline, 
AMDGPU_JOB_GET_TIMELINE_NAME(job))
__entry->context = 
job->base.s_fence->finished.context;
 

[RFC] drm: Add AMD GFX9+ format modifiers.

2019-10-16 Thread Bas Nieuwenhuizen
This adds initial format modifiers for AMD GFX9 and newer GPUs.

This is particularly useful to determine if we can use DCC, and whether
we need an extra display compatible DCC metadata plane.

Design decisions:
  - Always expose a single plane
   This way everything works correctly with images with multiple planes.

  - Do not add an extra memory region in DCC for putting a bit on whether
we are in compressed state.
   A decompress on import is cheap enough if already decompressed, and
   I do think in most cases we can avoid it in advance during modifier
   negotiation. The remainder is probably not common enough to worry
   about.

  - Explicitly define the sizes as part of the modifier description instead
of using whatever the current version of radeonsi does.
   This way we can avoid dedicated buffers and we can make sure we keep
   compatibility across mesa versions. I'd like to put some tests on
   this on ac_surface.c so we can learn early in the process if things
   need to be changed. Furthermore, the lack of configurable strides on
   GFX10 means things already go wrong if we do not agree, making a
   custom stride somewhat less useful.

  - No usage of BO metadata at all for modifier usecases.
   To avoid the requirement of dedicated dma bufs per image. For
   non-modifier based interop we still use the BO metadata, since we
   need to keep compatibility with old mesa and this is used for
   depth/msaa/3d/CL etc. API interop.

  - A single FD for all planes.
   Easier in Vulkan / bindless and radeonsi is already transitioning.

  - Make a single modifier for DCN1
  It defines things uniquely given bpp, which we can assume, so adding
  more modifier values does not add clarity.

  - Not exposing the 4K and 256B tiling modes.
  These are largely only better for something like a cursor or very long
  and/or tall images. Are they worth the added complexity to save memory?
  For context, at 32bpp, tiles are 128x128 pixels.

  - For multiplane images, every plane uses the same tiling.
  On GFX9/GFX10 we can, so no need to make it complicated.

  - We use family_id + external_rev to distinguish between incompatible GPUs.
  PCI ID is not enough, as RAVEN and RAVEN2 have the same PCI device id,
  but different tiling. We might be able to find bigger equivalence
  groups for _X, but especially for DCC I would be uncomfortable making it
  shared between GPUs.

  - For DCN1 DCC, radeonsi currently uses another texelbuffer with indices
to reorder. This is not shared.
  Specific to current implementation and does not need to be shared. To
  pave the way to shader-based solution, lets keep this internal to each
  driver. This should reduce the modifier churn if any of the driver
  implementations change. (Especially as you'd want to support the old
  implementation for a while to stay compatible with old kernels not
  supporting a new modifier yet).

  - No support for rotated swizzling.
  Can be added easily later and nothing in the stack would generate it
  currently.

  - Add extra enum values in the definitions.
  This way we can easily switch on the modifier without having to pass
  around the current GPU everywhere, assuming the modifier has been
  validated (see the sketch below).
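
As a rough illustration of that last point, a consumer that has already
validated a modifier could branch on the id alone. This is a minimal
sketch (not part of the patch): it only uses the *_id values defined
below, assumes the id has already been extracted from the validated
modifier, and treats the DCC id as the one that carries the extra
display-compatible metadata plane:

	#include <stdbool.h>
	#include <drm_fourcc.h>	/* userspace-style include, sketch only */

	/* Sketch only: report whether an extra display-compatible DCC
	 * metadata plane is expected for an already-validated modifier id. */
	static bool amd_mod_id_needs_dcc_plane(unsigned int id)
	{
		switch (id) {
		case DRM_FORMAT_MOD_AMD_GFX9_64K_X_STANDARD_DCC_id:
			return true;
		case DRM_FORMAT_MOD_AMD_GFX9_64K_STANDARD_id:
		case DRM_FORMAT_MOD_AMD_GFX9_64K_DISPLAY_id:
		case DRM_FORMAT_MOD_AMD_GFX9_64K_X_STANDARD_id:
		case DRM_FORMAT_MOD_AMD_GFX9_64K_X_DISPLAY_id:
		case DRM_FORMAT_MOD_AMD_GFX10_64K_X_RENDER_id:
		default:
			return false;
		}
	}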
---

 Since my previous attempt for modifiers got bogged down on details for
 the GFX6-GFX8 modifiers in previous discussions, this only attempts to
 define modifiers for GFX9+, which is significantly simpler.

 For a final version I'd like to wait until I have written most of the
 userspace + kernelspace so we can actually test it. However, I'd
 appreciate any early feedback people are willing to give.

 Initial Mesa amd/common support + tests are available at
 https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/tree/modifiers

 I have verified on Raven that the HW actually behaves as described in the
 modifier descriptions, and plan to test on a subset of the other GPUs.

 include/uapi/drm/drm_fourcc.h | 118 ++
 1 file changed, 118 insertions(+)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 3feeaa3f987a..9bd286ab2bee 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -756,6 +756,124 @@ extern "C" {
  */
 #define DRM_FORMAT_MOD_ALLWINNER_TILED fourcc_mod_code(ALLWINNER, 1)
 
+/*
+ * AMD GFX9+ format modifiers
+ */
+
+/*
+ * enum-like values for easy switches.
+ *
+ * No fixed field-size but implementations are supposed to enforce all-zeros of
+ * unused bits during validation.
+ */
+#define DRM_FORMAT_MOD_AMD_GFX9_64K_STANDARD_id   0
+#define DRM_FORMAT_MOD_AMD_GFX9_64K_DISPLAY_id1
+#define DRM_FORMAT_MOD_AMD_GFX9_64K_X_STANDARD_id 2
+#define DRM_FORMAT_MOD_AMD_GFX9_64K_X_DISPLAY_id  3
+#define DRM_FORMAT_MOD_AMD_GFX10_64K_X_RENDER_id  4
+#define DRM_FORMAT_MOD_AMD_GFX9_64K_X_STANDARD_DCC_id 5
+#define DRM

[PATCH v3] drm/amd/display: Add MST atomic routines

2019-10-16 Thread mikita.lipski
From: Mikita Lipski 

- Adding encoder atomic check to find vcpi slots for a connector
- Using DRM helper functions to calculate PBN
- Adding connector atomic check to release vcpi slots if connector
loses CRTC
- Calculate  PBN and VCPI slots only once during atomic
check and store them on crtc_state to eliminate
redundant calculation
- Call drm_dp_mst_atomic_check to verify validity of MST topology
during state atomic check

v2: squashed previous 3 separate patches, removed DSC PBN calculation,
and added PBN and VCPI slots properties to amdgpu connector

v3:
- moved vcpi_slots and pbn properties to dm_crtc_state and dc_stream_state
- updates stream's vcpi_slots and pbn on commit
- separated patch from the DSC MST series 

Cc: Jerry Zuo 
Cc: Harry Wentland 
Cc: Nicholas Kazlauskas 
Cc: Lyude Paul 
Signed-off-by: Mikita Lipski 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 59 ++-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  4 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 42 +
 .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 32 ++
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  3 +
 5 files changed, 99 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 10cce584719f..c37c384a3365 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3811,6 +3811,9 @@ create_stream_for_sink(struct amdgpu_dm_connector 
*aconnector,
 
update_stream_signal(stream, sink);
 
+   stream->vcpi_slots = 0;
+   stream->pbn = 0;
+
if (stream->signal == SIGNAL_TYPE_HDMI_TYPE_A)
mod_build_hf_vsif_infopacket(stream, &stream->vsp_infopacket, 
false, false);
 
@@ -3889,6 +3892,8 @@ dm_crtc_duplicate_state(struct drm_crtc *crtc)
state->crc_src = cur->crc_src;
state->cm_has_degamma = cur->cm_has_degamma;
state->cm_is_degamma_srgb = cur->cm_is_degamma_srgb;
+   state->vcpi_slots = cur->vcpi_slots;
+   state->pbn = cur->pbn;
 
/* TODO Duplicate dc_stream after objects are stream object is 
flattened */
 
@@ -4587,6 +4592,38 @@ static int dm_encoder_helper_atomic_check(struct 
drm_encoder *encoder,
  struct drm_crtc_state *crtc_state,
  struct drm_connector_state 
*conn_state)
 {
+   struct drm_atomic_state *state = crtc_state->state;
+   struct drm_connector *connector = conn_state->connector;
+   struct amdgpu_dm_connector *aconnector = 
to_amdgpu_dm_connector(connector);
+   struct dm_crtc_state *dm_new_crtc_state = to_dm_crtc_state(crtc_state);
+   const struct drm_display_mode *adjusted_mode = 
&crtc_state->adjusted_mode;
+   struct drm_dp_mst_topology_mgr *mst_mgr;
+   struct drm_dp_mst_port *mst_port;
+   int clock, bpp = 0;
+
+   if (!aconnector->port || !aconnector->dc_sink)
+   return 0;
+
+   mst_port = aconnector->port;
+   mst_mgr = &aconnector->mst_port->mst_mgr;
+
+   if (!crtc_state->connectors_changed && !crtc_state->mode_changed)
+   return 0;
+
+   if(!state->duplicated) {
+   bpp = (uint8_t)connector->display_info.bpc * 3;
+   clock = adjusted_mode->clock;
+   dm_new_crtc_state->pbn = drm_dp_calc_pbn_mode(clock, bpp);
+   }
+   dm_new_crtc_state->vcpi_slots = drm_dp_atomic_find_vcpi_slots(state,
+  mst_mgr,
+  mst_port,
+  
dm_new_crtc_state->pbn);
+
+   if (dm_new_crtc_state->vcpi_slots < 0) {
+   DRM_DEBUG_ATOMIC("failed finding vcpi slots: %d\n", 
dm_new_crtc_state->vcpi_slots);
+   return dm_new_crtc_state->vcpi_slots;
+   }
return 0;
 }
 
@@ -6127,6 +6164,9 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
acrtc_state->stream->out_transfer_func;
}
 
+   acrtc_state->stream->vcpi_slots = acrtc_state->vcpi_slots;
+   acrtc_state->stream->pbn = acrtc_state->pbn;
+
acrtc_state->stream->abm_level = acrtc_state->abm_level;
if (acrtc_state->abm_level != dm_old_crtc_state->abm_level)
bundle->stream_update.abm_level = 
&acrtc_state->abm_level;
@@ -6527,7 +6567,7 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
struct dc_stream_update stream_update;
struct dc_info_packet hdr_packet;
struct dc_stream_status *status = NULL;
-   bool abm_changed, hdr_changed, scaling_changed;
+   bool abm_changed, hdr_changed, scaling_changed, 
mst_vcpi_changed;
 
memset(&dummy

Re: [PATCH] drm/amdgpu/psp: add psp memory training implementation(v3)

2019-10-16 Thread Tuikov, Luben
Reviewed-by: Luben Tuikov 

On 2019-10-15 23:56, Tianci Yin wrote:
> From: "Tianci.Yin" 
> 
> add memory training implementation code to save resume time.
> 
> Change-Id: I625794a780b11d824ab57ef39cc33b872c6dc6c9
> Reviewed-by: Alex Deucher 
> Signed-off-by: Tianci.Yin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |   9 ++
>  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 161 
>  3 files changed, 171 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 8704f93cabf2..c2b776fd82b5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -151,6 +151,7 @@ extern uint amdgpu_sdma_phase_quantum;
>  extern char *amdgpu_disable_cu;
>  extern char *amdgpu_virtual_display;
>  extern uint amdgpu_pp_feature_mask;
> +extern uint amdgpu_force_long_training;
>  extern int amdgpu_job_hang_limit;
>  extern int amdgpu_lbpw;
>  extern int amdgpu_compute_multipipe;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index da7cbee25c61..c7d086569acb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -127,6 +127,7 @@ char *amdgpu_disable_cu = NULL;
>  char *amdgpu_virtual_display = NULL;
>  /* OverDrive(bit 14) disabled by default*/
>  uint amdgpu_pp_feature_mask = 0xbfff;
> +uint amdgpu_force_long_training = 0;
>  int amdgpu_job_hang_limit = 0;
>  int amdgpu_lbpw = -1;
>  int amdgpu_compute_multipipe = -1;
> @@ -390,6 +391,14 @@ module_param_named(sched_hw_submission, 
> amdgpu_sched_hw_submission, int, 0444);
>  MODULE_PARM_DESC(ppfeaturemask, "all power features enabled (default))");
>  module_param_named(ppfeaturemask, amdgpu_pp_feature_mask, uint, 0444);
>  
> +/**
> + * DOC: forcelongtraining (uint)
> + * Force long memory training in resume.
> + * The default is zero, indicates short training in resume.
> + */
> +MODULE_PARM_DESC(forcelongtraining, "force memory long training");
> +module_param_named(forcelongtraining, amdgpu_force_long_training, uint, 
> 0444);
> +
>  /**
>   * DOC: pcie_gen_cap (uint)
>   * Override PCIE gen speed capabilities. See the CAIL flags in 
> drivers/gpu/drm/amd/include/amd_pcie.h.
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index 2ba0f68ced10..19339de0cf12 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -58,6 +58,8 @@ MODULE_FIRMWARE("amdgpu/arcturus_ta.bin");
>  #define mmRLC_GPM_UCODE_DATA_NV10 0x5b62
>  #define mmSDMA0_UCODE_ADDR_NV10  0x5880
>  #define mmSDMA0_UCODE_DATA_NV10  0x5881
> +/* memory training timeout define */
> +#define MEM_TRAIN_SEND_MSG_TIMEOUT_US 300
>  
>  static int psp_v11_0_init_microcode(struct psp_context *psp)
>  {
> @@ -902,6 +904,162 @@ static int psp_v11_0_rlc_autoload_start(struct 
> psp_context *psp)
>   return psp_rlc_autoload_start(psp);
>  }
>  
> +static int psp_v11_0_memory_training_send_msg(struct psp_context *psp, int 
> msg)
> +{
> + int ret;
> + int i;
> + uint32_t data_32;
> + int max_wait;
> + struct amdgpu_device *adev = psp->adev;
> +
> + data_32 = (psp->mem_train_ctx.c2p_train_data_offset >> 20);
> + WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36, data_32);
> + WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_35, msg);
> +
> + max_wait = MEM_TRAIN_SEND_MSG_TIMEOUT_US / adev->usec_timeout;
> + for (i = 0; i < max_wait; i++) {
> + ret = psp_wait_for(psp, SOC15_REG_OFFSET(MP0, 0, 
> mmMP0_SMN_C2PMSG_35),
> +0x8000, 0x8000, false);
> + if (ret == 0)
> + break;
> + }
> + if (i < max_wait)
> + ret = 0;
> + else
> + ret = -ETIME;
> +
> + DRM_DEBUG("%s training %s, cost %d * %dms.\n",
> +   (msg == PSP_BL__DRAM_SHORT_TRAIN) ? "short" : "long",
> +   (ret == 0) ? "succeed" : "failed",
> +   i, adev->usec_timeout/1000);
> + return ret;
> +}
> +
> +static void psp_v11_0_memory_training_fini(struct psp_context *psp)
> +{
> + struct psp_memory_training_context *ctx = &psp->mem_train_ctx;
> +
> + ctx->init = PSP_MEM_TRAIN_NOT_SUPPORT;
> + kfree(ctx->sys_cache);
> + ctx->sys_cache = NULL;
> +}
> +
> +static int psp_v11_0_memory_training_init(struct psp_context *psp)
> +{
> + int ret;
> + struct psp_memory_training_context *ctx = &psp->mem_train_ctx;
> +
> + if (ctx->init != PSP_MEM_TRAIN_RESERVE_SUCCESS) {
> + DRM_DEBUG("memory training is not supported!\n");
> + return 0;
> + }
> +
> + ctx->sys_cache = kzalloc(ctx->train_data_size, GFP_KERNEL);
> + if (ctx->sys_cache == NULL) {
> + DRM_ERROR("alloc mem_train_ctx.sys_cache failed!\n");
> + ret = -ENOM

Re: linux-next: Tree for Oct 16 (amd display)

2019-10-16 Thread Randy Dunlap
On 10/15/19 10:17 PM, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20191015:
> 

on x86_64:

../drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c: In function 
‘dcn20_populate_dml_pipes_from_context’:
../drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:1913:48: 
error: ‘struct dc_crtc_timing_flags’ has no member named ‘DSC’
   if (res_ctx->pipe_ctx[i].stream->timing.flags.DSC)
^
../drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:1914:73: 
error: ‘struct dc_crtc_timing’ has no member named ‘dsc_cfg’
pipes[pipe_cnt].dout.output_bpp = 
res_ctx->pipe_ctx[i].stream->timing.dsc_cfg.bits_per_pixel / 16.0;
 ^


Full randconfig file is attached.


-- 
~Randy
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 5.4.0-rc3 Kernel Configuration
#

#
# Compiler: gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538]
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=70401
CONFIG_CLANG_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y
CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_HEADER_TEST=y
# CONFIG_KERNEL_HEADER_TEST is not set
# CONFIG_UAPI_HEADER_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_USELIB is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_SIM=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_MSI_IOMMU=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_GENERIC_IRQ_DEBUGFS=y
# end of IRQ subsystem

CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
# CONFIG_HIGH_RES_TIMERS is not set
# end of Timers subsystem

CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_PSI=y
CONFIG_PSI_DEFAULT_DISABLED=y
# end of CPU/Task time and stats accounting

#
# RCU Subsystem
#
CONFIG_TINY_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TINY_SRCU=y
CONFIG_TASKS_RCU=y
# end of RCU Subsystem

CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=m
CONFIG_IKHEADERS=m
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y

#
# Scheduler features
#
# end of Scheduler features

CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
# CONFIG_MEMCG is not set
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
CONFIG_CGROUP_RDMA=y
CONFIG_CGROUP_FREEZER=y
# CONFIG_CGROUP_DEVICE is not set
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_SCHED_AUTOGROUP=y
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_EXPERT=y
CONFIG_MULTIUSER=y
# CONFIG_SGETMASK_SYSCALL is not set
CONFIG_SYSFS_SYSCALL=y
# CONFIG_FHANDLE is not set
CONFIG_POSIX_TIMERS=y
# CONFIG_PRINTK is not set
# CONFIG_BUG is not set
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
# CONFIG_FUTEX is not set
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
# CONFIG_TIMERFD is not set
# CONFIG_EVENTFD is not set
CONFIG_SHMEM=y
CONFIG_AIO=y
# CONFIG_IO_URING is not set
# CONFIG_ADVISE_SYSCALLS is not se

Re: [PATCH 1/3] drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)

2019-10-16 Thread James Zhu
Reviewed-by: James Zhu  for this series
Tested-by: James Zhu  for this series

James

On 2019-10-15 6:18 p.m., Alex Deucher wrote:
> We need to allocate a large enough buffer for the
> session info, otherwise the IB test can overwrite
> other memory.
>
> v2: - session info is 128K according to mesa
>  - use the same session info for create and destroy
>
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
> Signed-off-by: Alex Deucher 
> ---
>   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 31 ++-
>   1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> index 670784a78512..217084d56ab8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> @@ -206,13 +206,14 @@ static int uvd_v6_0_enc_ring_test_ring(struct 
> amdgpu_ring *ring)
>* Open up a stream for HW test
>*/
>   static int uvd_v6_0_enc_get_create_msg(struct amdgpu_ring *ring, uint32_t 
> handle,
> +struct amdgpu_bo *bo,
>  struct dma_fence **fence)
>   {
>   const unsigned ib_size_dw = 16;
>   struct amdgpu_job *job;
>   struct amdgpu_ib *ib;
>   struct dma_fence *f = NULL;
> - uint64_t dummy;
> + uint64_t addr;
>   int i, r;
>   
>   r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4, &job);
> @@ -220,15 +221,15 @@ static int uvd_v6_0_enc_get_create_msg(struct 
> amdgpu_ring *ring, uint32_t handle
>   return r;
>   
>   ib = &job->ibs[0];
> - dummy = ib->gpu_addr + 1024;
> + addr = amdgpu_bo_gpu_offset(bo);
>   
>   ib->length_dw = 0;
>   ib->ptr[ib->length_dw++] = 0x0018;
>   ib->ptr[ib->length_dw++] = 0x0001; /* session info */
>   ib->ptr[ib->length_dw++] = handle;
>   ib->ptr[ib->length_dw++] = 0x0001;
> - ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
> - ib->ptr[ib->length_dw++] = dummy;
> + ib->ptr[ib->length_dw++] = upper_32_bits(addr);
> + ib->ptr[ib->length_dw++] = addr;
>   
>   ib->ptr[ib->length_dw++] = 0x0014;
>   ib->ptr[ib->length_dw++] = 0x0002; /* task info */
> @@ -268,13 +269,14 @@ static int uvd_v6_0_enc_get_create_msg(struct 
> amdgpu_ring *ring, uint32_t handle
>*/
>   static int uvd_v6_0_enc_get_destroy_msg(struct amdgpu_ring *ring,
>   uint32_t handle,
> + struct amdgpu_bo *bo,
>   struct dma_fence **fence)
>   {
>   const unsigned ib_size_dw = 16;
>   struct amdgpu_job *job;
>   struct amdgpu_ib *ib;
>   struct dma_fence *f = NULL;
> - uint64_t dummy;
> + uint64_t addr;
>   int i, r;
>   
>   r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4, &job);
> @@ -282,15 +284,15 @@ static int uvd_v6_0_enc_get_destroy_msg(struct 
> amdgpu_ring *ring,
>   return r;
>   
>   ib = &job->ibs[0];
> - dummy = ib->gpu_addr + 1024;
> + addr = amdgpu_bo_gpu_offset(bo);
>   
>   ib->length_dw = 0;
>   ib->ptr[ib->length_dw++] = 0x0018;
>   ib->ptr[ib->length_dw++] = 0x0001; /* session info */
>   ib->ptr[ib->length_dw++] = handle;
>   ib->ptr[ib->length_dw++] = 0x0001;
> - ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
> - ib->ptr[ib->length_dw++] = dummy;
> + ib->ptr[ib->length_dw++] = upper_32_bits(addr);
> + ib->ptr[ib->length_dw++] = addr;
>   
>   ib->ptr[ib->length_dw++] = 0x0014;
>   ib->ptr[ib->length_dw++] = 0x0002; /* task info */
> @@ -327,13 +329,20 @@ static int uvd_v6_0_enc_get_destroy_msg(struct 
> amdgpu_ring *ring,
>   static int uvd_v6_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
>   {
>   struct dma_fence *fence = NULL;
> + struct amdgpu_bo *bo = NULL;
>   long r;
>   
> - r = uvd_v6_0_enc_get_create_msg(ring, 1, NULL);
> + r = amdgpu_bo_create_reserved(ring->adev, 128 * 1024, PAGE_SIZE,
> +   AMDGPU_GEM_DOMAIN_VRAM,
> +   &bo, NULL, NULL);
> + if (r)
> + return r;
> +
> + r = uvd_v6_0_enc_get_create_msg(ring, 1, bo, NULL);
>   if (r)
>   goto error;
>   
> - r = uvd_v6_0_enc_get_destroy_msg(ring, 1, &fence);
> + r = uvd_v6_0_enc_get_destroy_msg(ring, 1, bo, &fence);
>   if (r)
>   goto error;
>   
> @@ -345,6 +354,8 @@ static int uvd_v6_0_enc_ring_test_ib(struct amdgpu_ring 
> *ring, long timeout)
>   
>   error:
>   dma_fence_put(fence);
> + amdgpu_bo_unreserve(bo);
> + amdgpu_bo_unref(&bo);
>   return r;
>   }
>   
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu/display: fix build when CONFIG_DRM_AMD_DC_DSC_SUPPORT=n

2019-10-16 Thread Alex Deucher
Add proper config check.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 914e378bcda4..4f0331810696 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -1910,8 +1910,10 @@ int dcn20_populate_dml_pipes_from_context(
pipes[pipe_cnt].dout.output_bpp = output_bpc * 3;
}
 
+#ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
if (res_ctx->pipe_ctx[i].stream->timing.flags.DSC)
pipes[pipe_cnt].dout.output_bpp = 
res_ctx->pipe_ctx[i].stream->timing.dsc_cfg.bits_per_pixel / 16.0;
+#endif
 
/* todo: default max for now, until there is logic reflecting 
this in dc*/
pipes[pipe_cnt].dout.output_bpc = 12;
-- 
2.23.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking

2019-10-16 Thread Jason Gunthorpe
On Wed, Oct 16, 2019 at 10:58:02AM +0200, Christian König wrote:
> Am 15.10.19 um 20:12 schrieb Jason Gunthorpe:
> > From: Jason Gunthorpe 
> > 
> > 8 of the mmu_notifier using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
> > scif_dma, vhost, gntdev, hmm) drivers are using a common pattern where
> > they only use invalidate_range_start/end and immediately check the
> > invalidating range against some driver data structure to tell if the
> > driver is interested. Half of them use an interval_tree, the others are
> > simple linear search lists.
> > 
> > Of the ones I checked they largely seem to have various kinds of races,
> > bugs and poor implementation. This is a result of the complexity in how
> > the notifier interacts with get_user_pages(). It is extremely difficult to
> > use it correctly.
> > 
> > Consolidate all of this code together into the core mmu_notifier and
> > provide a locking scheme similar to hmm_mirror that allows the user to
> > safely use get_user_pages() and reliably know if the page list still
> > matches the mm.
> 
> That sounds really good, but could you outline for a moment how that is
> achieved?

It uses the same basic scheme as hmm and rdma odp, outlined in the
revisions to hmm.rst later on.

Basically, 

 seq = mmu_range_read_begin(&mrn);

 // This is a speculative region
 .. get_user_pages()/hmm_range_fault() ..
 // Result cannot be dereferenced

 take_lock(driver->update);
 if (mmu_range_read_retry(&mrn, seq)) {
// collision! The results are not correct
goto again
 }

 // no collision, and now under lock. Now we can de-reference the pages/etc
 // program HW
 // Now the invalidate callback is responsible to synchronize against changes
 unlock(driver->update) 

Basically, anything that was using hmm_mirror correctly transitions
over fairly trivially, just with the modification to store a sequence
number to close that race described in the hmm commit.

For something like AMD gpu I expect it to transition to use dma_fence
from the notifier for coherency right before it unlocks driver->update.
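
Spelled out a little more concretely (sketch only: the drv structure,
update_lock, pages[] and drv_program_hw() are made up for illustration;
mmu_range_read_begin()/mmu_range_read_retry() are the helpers added by
this series):

	static int drv_map_user_range(struct drv_ctx *drv, unsigned long start,
				      int npages)
	{
		unsigned long seq;
		int got;

	again:
		seq = mmu_range_read_begin(&drv->mrn);

		/* Speculative part: pages cannot be used until re-checked. */
		got = get_user_pages_fast(start, npages, FOLL_WRITE, drv->pages);
		if (got < 0)
			return got;

		mutex_lock(&drv->update_lock);
		if (mmu_range_read_retry(&drv->mrn, seq)) {
			/* An invalidation raced with us: drop and retry. */
			mutex_unlock(&drv->update_lock);
			release_pages(drv->pages, got);
			goto again;
		}

		/* No collision, and now under the lock: safe to dereference the
		 * pages and program the HW. From here on the invalidate callback
		 * is responsible for synchronizing against changes. */
		drv_program_hw(drv, drv->pages, got);
		mutex_unlock(&drv->update_lock);
		return 0;
	}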

Jason
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH -next] drm/amd/display: Make dc_link_detect_helper static

2019-10-16 Thread Alex Deucher
On Wed, Oct 16, 2019 at 8:29 AM YueHaibing  wrote:
>
> Fix sparse warning:
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:746:6:
>  warning: symbol 'dc_link_detect_helper' was not declared. Should it be 
> static?
>
> Reported-by: Hulk Robot 
> Signed-off-by: YueHaibing 

Applied.  thanks!

Alex

> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index fb18681..9350536 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -743,7 +743,8 @@ static bool wait_for_alt_mode(struct dc_link *link)
>   * This does not create remote sinks but will trigger DM
>   * to start MST detection if a branch is detected.
>   */
> -bool dc_link_detect_helper(struct dc_link *link, enum dc_detect_reason 
> reason)
> +static bool dc_link_detect_helper(struct dc_link *link,
> + enum dc_detect_reason reason)
>  {
> struct dc_sink_init_data sink_init_data = { 0 };
> struct display_sink_capability sink_caps = { 0 };
> --
> 2.7.4
>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu/display: fix build when CONFIG_DRM_AMD_DC_DSC_SUPPORT=n

2019-10-16 Thread Mikita Lipski
Reviewed-by: Mikita Lipski 

On 16.10.2019 12:13, Alex Deucher wrote:
> Add proper config check.
> 
> Signed-off-by: Alex Deucher 
> ---
>   drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index 914e378bcda4..4f0331810696 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -1910,8 +1910,10 @@ int dcn20_populate_dml_pipes_from_context(
>   pipes[pipe_cnt].dout.output_bpp = output_bpc * 3;
>   }
>   
> +#ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
>   if (res_ctx->pipe_ctx[i].stream->timing.flags.DSC)
>   pipes[pipe_cnt].dout.output_bpp = 
> res_ctx->pipe_ctx[i].stream->timing.dsc_cfg.bits_per_pixel / 16.0;
> +#endif
>   
>   /* todo: default max for now, until there is logic reflecting 
> this in dc*/
>   pipes[pipe_cnt].dout.output_bpc = 12;
> 

-- 
Thanks,
Mikita Lipski
Software Engineer, AMD
mikita.lip...@amd.com
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: AMDGPU and 16B stack alignment

2019-10-16 Thread Arvind Sankar
On Tue, Oct 15, 2019 at 06:51:26PM -0700, Nick Desaulniers wrote:
> On Tue, Oct 15, 2019 at 1:26 PM Arvind Sankar  wrote:
> >
> > On Tue, Oct 15, 2019 at 11:05:56AM -0700, Nick Desaulniers wrote:
> > > Hmmm...I would have liked to remove it outright, as it is an ABI
> > > mismatch that is likely to result in instability and non-fun-to-debug
> > > runtime issues in the future.  I suspect my patch does work for GCC
> > > 7.1+.  The question is: Do we want to either:
> > > 1. mark AMDGPU broken for GCC < 7.1, or
> > > 2. continue supporting it via stack alignment mismatch?
> > >
> > > 2 is brittle, and may break at any point in the future, but if it's
> > > working for someone it does make me feel bad to outright disable it.
> > > What I'd imagine 2 looks like is (pseudo code in a Makefile):
> > >
> > > if CC_IS_GCC && GCC_VERSION < 7.1:
> > >   set stack alignment to 16B and hope for the best
> > >
> > > So my diff would be amended to keep the stack alignment flags, but
> > > only to support GCC < 7.1.  And that assumes my change compiles with
> > > GCC 7.1+. (Looks like it does for me locally with GCC 8.3, but I would
> > > feel even more confident if someone with hardware to test on and GCC
> > > 7.1+ could boot test).
> > > --
> > > Thanks,
> > > ~Nick Desaulniers
> >
> > If we do keep it, would adding -mstackrealign make it more robust?
> > That's simple and will only add the alignment to functions that require
> > 16-byte alignment (at least on gcc).
> 
> I think there's also `-mincoming-stack-boundary=`.
> https://github.com/ClangBuiltLinux/linux/issues/735#issuecomment-540038017

Yes, but -mstackrealign looks like it's supported by clang as well.
> 
> >
> > Alternative is to use
> > __attribute__((force_align_arg_pointer)) on functions that might be
> > called from 8-byte-aligned code.
> 
> Which is hard to automate and easy to forget.  Likely a large diff to fix 
> today.

Right, this is a no-go, esp to just fix old compilers.
> 
> >
> > It looks like -mstackrealign should work from gcc 5.3 onwards.
> 
> The kernel would generally like to support GCC 4.9+.
> 
> There's plenty of different ways to keep layering on duct tape and
> bailing wire to support differing ABIs, but that's just adding
> technical debt that will have to be repaid one day.  That's why the
> cleanest solution IMO is mark the driver broken for old toolchains,
> and use a code-base-consistent stack alignment.  Bending over
> backwards to support old toolchains means accepting stack alignment
> mismatches, which is in the "unspecified behavior" ring of the
> "undefined behavior" Venn diagram.  I have the same opinion on relying
> on explicitly undefined behavior.
> 
> I'll send patches for fixing up Clang, but please consider my strong
> advice to generally avoid stack alignment mismatches, regardless of
> compiler.
> --
> Thanks,
> ~Nick Desaulniers

What I suggested was in reference to your proposal for dropping the
-mpreferred-stack-boundary=4 for modern compilers, but keeping it for
<7.1. -mstackrealign would at least let 5.3 onwards be less likely to
break (and it doesn't error before then, I think it just doesn't
actually do anything, so no worse than now at least).

Simply dropping support for <7.1 would be cleanest, yes, but it sounds
like people don't want to go that far.
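
For reference, the per-function attribute mentioned earlier in the thread
would look roughly like this (made-up function, shown only to illustrate
why doing this by hand across the driver doesn't scale):

	/* Realign the stack on entry so 16-byte SSE spills of the locals
	 * are safe even when the caller only guarantees 8-byte alignment. */
	__attribute__((force_align_arg_pointer))
	static double example_calc(double a, double b)
	{
		return a + b;
	}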


[ANNOUNCE] libdrm 2.4.100

2019-10-16 Thread Marek Olšák
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512


Anusha Srivatsa (1):
  intel: sync i915_pciids.h with kernel

Emil Velikov (1):
  *-symbols-check: use normal shell over bash

Eric Engestrom (7):
  xf86drm: dedupe `#define`s
  xf86drm: use max size of drm node name instead of arbitrary size
  xf86drm: dedupe drmGetDeviceName() logic
  meson: fix sys/mkdev.h detection on Solaris
  *-symbols-check: let meson figure out how to execute the scripts
  RELEASING: update instructions to use meson instead of autotools
  libdrm: remove autotools support

Flora Cui (3):
  tests/amdgpu: fix for dispatch/draw test
  tests/amdgpu: add gpu reset test
  tests/amdgpu: disable reset test for now

Guchun Chen (7):
  amdgpu: add gfx ras inject configuration file
  tests/amdgpu/ras: refine ras inject test
  amdgpu: add umc ras inject test configuration
  amdgpu: remove json package dependence
  amdgpu: delete test configuration file
  amdgpu: add ras inject unit test
  amdgpu: add ras feature capability check in inject test

Ilia Mirkin (1):
  tests/util: fix incorrect memset argument order

Jonathan Gray (2):
  xf86drm: test for render nodes before primary nodes
  xf86drm: open correct render node on non-linux

Le Ma (2):
  tests/amdgpu: divide dispatch test into compute and gfx
  tests/amdgpu: add the missing deactivation case for dispatch test

Lucas De Marchi (1):
  intel: sync i915_pciids.h with kernel

Marek Olšák (5):
  include: update amdgpu_drm.h
  amdgpu: add amdgpu_cs_query_reset_state2 for AMDGPU_CTX_OP_QUERY_STATE2
  Bump the version to 2.4.100
  Revert "libdrm: remove autotools support"
  Bump the version to 2.4.100 for autotools

Niclas Zeising (2):
  meson.build: Fix typo
  meson.build: Fix header detection on FreeBSD

Nirmoy Das (1):
  test/amdgpu: don't free unused bo handle

Rodrigo Vivi (2):
  intel: add the TGL 12 PCI IDs and macros
  intel: Add support for EHL

git tag: libdrm-2.4.100

https://dri.freedesktop.org/libdrm/libdrm-2.4.100.tar.bz2
MD5:  f47bc87e28198ba527e6b44ffdd62f65  libdrm-2.4.100.tar.bz2
SHA1: 9f526909aba08b5658cfba3f7fde9385cad6f3b5  libdrm-2.4.100.tar.bz2
SHA256: c77cc828186c9ceec3e56ae202b43ee99eb932b4a87255038a80e8a1060d0a5d  
libdrm-2.4.100.tar.bz2
SHA512: 
4d3a5556e650872944af52f49de395e0ce8ac9ac58530e39a34413e94dc56c231ee71b8b8de9fb944263515a922b3ebbf7ddfebeaaa91543c2604f9bcf561247
  libdrm-2.4.100.tar.bz2
PGP:  https://dri.freedesktop.org/libdrm/libdrm-2.4.100.tar.bz2.sig

https://dri.freedesktop.org/libdrm/libdrm-2.4.100.tar.gz
MD5:  c47b1718734cc661734ed63f94bc27c1  libdrm-2.4.100.tar.gz
SHA1: 2097f0b98deaff16b8f3b93cedcb5cd35291a3c1  libdrm-2.4.100.tar.gz
SHA256: 6a5337c054c0c47bc16607a21efa2b622e08030be4101ef4a241c5eb05b6619b  
libdrm-2.4.100.tar.gz
SHA512: 
b61835473c77691c4a8e67b32b9df420661e8bf8700507334b58bde5e6a402dee4aea2bec1e5b83343dd28fcb6cf9fd084064d437332f178df81c4780552595b
  libdrm-2.4.100.tar.gz
PGP:  https://dri.freedesktop.org/libdrm/libdrm-2.4.100.tar.gz.sig

-BEGIN PGP SIGNATURE-

iQEzBAEBCgAdFiEEzUfFNBo3XzO+97r6/dFdWs7w8rEFAl2njhEACgkQ/dFdWs7w
8rHl9Af+ODvdiUlbe20uOd8vBDVFgIR5Z4J8aFr/bUJ7ZxXSAytWglfY6Th9U89H
sN6UyXes9tr3OhAotBgZ2LYh1BDM18XSBIdteqQz9uiaZfw+L8OnZq3eikJ8Axlw
bCbJZtWa17KJQjFR7Cv3WozsaopKJm7A6cXHmuhk1lq9ukUBDPCQIqaqf9K+zTBQ
sIV8wliAmEK3s9tRhT3vsk12DmDfO0kUN504vOXdOjhAClDN03M0w4RXqDENG7gz
qm7eHgE7ugqkCxBPGdNfTSIoamFVOaNR/PCqzt5/6VMaW9oFKSTTgyglrVUjulE6
AZRDHPdQ9D6P84WpPtAWOug4EJhVuQ==
=NkL8
-END PGP SIGNATURE-
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/3] drm/amdgpu: enable -msse2 for GCC 7.1+ users

2019-10-16 Thread Nick Desaulniers
A final attempt at enabling sse2 for GCC users.

Originally attempted in:
commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from 
emitting libcalls to undefined SW FP routines")

Reverted due to "reported instability" in:
commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from 
emitting libcalls to undefined SW FP routines"")

Re-added just for Clang in:
commit 0f0727d971f6 ("drm/amd/display: readd -msse2 to prevent Clang from 
emitting libcalls to undefined SW FP routines")

The original report didn't have enough information to know if the GPF
was due to misalignment, but I suspect that it was. (The missing
information was the disassembly of the function at the bottom of the
trace, to see if the instruction pointer pointed to an instruction with
16B alignment memory operand requirements.  The stack trace does show
the stack was only 8B but not 16B aligned though, which makes this a
strong possibility).

Now that the stack misalignment issue has been fixed for users of GCC
7.1+, reattempt adding -msse2. This matches Clang.

It will likely never be safe to enable this for pre-GCC 7.1 AND use a
16B aligned stack in these translation units.

This is only a functional change for GCC 7.1+ users, and should be boot
tested.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487
Signed-off-by: Nick Desaulniers 
---
 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 4 +---
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 4 +---
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 4 +---
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 4 +---
 drivers/gpu/drm/amd/display/dc/dsc/Makefile   | 4 +---
 5 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile 
b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
index a1af55a86508..26c6d735cdc7 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
@@ -37,9 +37,7 @@ ifdef IS_OLD_GCC
 # GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
 # (8B stack alignment).
 calcs_ccflags += -mpreferred-stack-boundary=4
-endif
-
-ifdef CONFIG_CC_IS_CLANG
+else
 calcs_ccflags += -msse2
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
index cb0ac131f74a..63f3bddba7da 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
@@ -23,9 +23,7 @@ ifdef IS_OLD_GCC
 # GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
 # (8B stack alignment).
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -mpreferred-stack-boundary=4
-endif
-
-ifdef CONFIG_CC_IS_CLANG
+else
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -msse2
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
index f92320ddd27f..ff50ae71fe27 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
@@ -16,9 +16,7 @@ ifdef IS_OLD_GCC
 # GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
 # (8B stack alignment).
 CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -mpreferred-stack-boundary=4
-endif
-
-ifdef CONFIG_CC_IS_CLANG
+else
 CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -msse2
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index ef1bdd20b425..8df251626e22 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -37,9 +37,7 @@ ifdef IS_OLD_GCC
 # GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
 # (8B stack alignment).
 dml_ccflags += -mpreferred-stack-boundary=4
-endif
-
-ifdef CONFIG_CC_IS_CLANG
+else
 dml_ccflags += -msse2
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dsc/Makefile 
b/drivers/gpu/drm/amd/display/dc/dsc/Makefile
index 3f7840828a9f..970737217e53 100644
--- a/drivers/gpu/drm/amd/display/dc/dsc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dsc/Makefile
@@ -14,9 +14,7 @@ ifdef IS_OLD_GCC
 # GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
 # (8B stack alignment).
 dsc_ccflags += -mpreferred-stack-boundary=4
-endif
-
-ifdef CONFIG_CC_IS_CLANG
+else
 dsc_ccflags += -msse2
 endif
 
-- 
2.23.0.700.g56cf767bdb-goog



[PATCH 1/3] drm/amdgpu: fix stack alignment ABI mismatch for Clang

2019-10-16 Thread Nick Desaulniers
The x86 kernel is compiled with an 8B stack alignment via
`-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via
commit d9b0cde91c60 ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if 
supported")
or `-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are
compiled with 16B stack alignment.

Generally, the stack alignment is part of the ABI. Linking together two
different translation units with differing stack alignment is dangerous,
particularly when the translation unit with the smaller stack alignment
makes calls into the translation unit with the larger stack alignment.
While 8B aligned stacks are sometimes also 16B aligned, they are not
always.

Multiple users have reported General Protection Faults (GPF) when using
the AMDGPU driver compiled with Clang. Clang is placing objects in stack
slots assuming the stack is 16B aligned, and selecting instructions that
require 16B aligned memory operands.

At runtime, syscall handlers with 8B aligned stack call into code that
assumes 16B stack alignment.  When the stack is a multiple of 8B but not
16B, these instructions result in a GPF.
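
As an illustration only (not code from the driver), a function like the
following built with -mhard-float -msse2 and the default 16B preferred
stack boundary may spill its locals with aligned SSE moves (movaps/movapd),
which fault when the incoming stack is 8B- but not 16B-aligned:

	static double example_calc(double clock_khz, double bpp)
	{
		/* The compiler may assume this array is 16B aligned on the
		 * stack and use aligned SSE stores to initialize it. */
		double tmp[2] = { clock_khz, bpp };

		return tmp[0] * tmp[1] / 16.0;
	}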

Remove the code that added compatibility between the differing compiler
flags, as it will result in runtime GPFs when built with Clang. Cleanups
for GCC will be sent in later patches in the series.

Link: https://github.com/ClangBuiltLinux/linux/issues/735
Debugged-by: Yuxuan Shui 
Reported-by: Shirish S 
Reported-by: Yuxuan Shui 
Suggested-by: Andrew Cooper 
Signed-off-by: Nick Desaulniers 
---
 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 10 --
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 10 --
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 10 --
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 10 --
 drivers/gpu/drm/amd/display/dc/dsc/Makefile   | 10 --
 5 files changed, 20 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile 
b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
index 985633c08a26..4b1a8a08a5de 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
@@ -24,13 +24,11 @@
 # It calculates Bandwidth and Watermarks values for HW programming
 #
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+calcs_ccflags := -mhard-float -msse
 
-calcs_ccflags := -mhard-float -msse $(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+calcs_ccflags += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 calcs_ccflags += -msse2
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
index ddb8d5649e79..5fe3eb80075d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
@@ -10,13 +10,11 @@ ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
 DCN20 += dcn20_dsc.o
 endif
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o := -mhard-float -msse
 
-CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o := -mhard-float -msse 
$(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -msse2
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
index ef673bffc241..7057e20748b9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
@@ -3,13 +3,11 @@
 
 DCN21 = dcn21_hubp.o dcn21_hubbub.o dcn21_resource.o
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse
 
-CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse 
$(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -msse2
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 5b2a65b42403..1bd6e307b7f8 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -24,13 +24,11 @@
 # It provides the general basic services required by other DAL
 # subcomponents.
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-ali

[PATCH 2/3] drm/amdgpu: fix stack alignment ABI mismatch for GCC 7.1+

2019-10-16 Thread Nick Desaulniers
GCC earlier than 7.1 errors when compiling code that makes use of
`double`s and sets a stack alignment outside of the range of [2^4-2^12]:

$ cat foo.c
double foo(double x, double y) {
  return x + y;
}
$ gcc-4.9 -mpreferred-stack-boundary=3 foo.c
error: -mpreferred-stack-boundary=3 is not between 4 and 12

This is likely why the AMDGPU driver was ever compiled with a different
stack alignment (and thus different ABI) than the rest of the x86
kernel. The kernel uses 8B stack alignment, while the driver was using
16B stack alignment in a few places.

Since GCC 7.1+ doesn't error, fix the ABI mismatch for users of newer
versions of GCC.

There was discussion about whether to mark the driver broken or not for
users of GCC earlier than 7.1, but since the driver currently is
working, don't explicitly break the driver for them here.

Relying on differing stack alignment is unspecified behavior, and
brittle, and may break in the future.

This patch is no functional change for GCC users earlier than 7.1. It's
been compile tested on GCC 4.9 and 8.3 to check the correct flags. It
should be boot tested when built with GCC 7.1+.

-mincoming-stack-boundary= or -mstackrealign may help keep this code
building for pre-GCC 7.1 users.

The version check for GCC is broken into two conditionals, both because
cc-ifversion is currently GCC specific, and it simplifies a subsequent
patch.

Signed-off-by: Nick Desaulniers 
---
 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 9 +
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 9 +
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 9 +
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 9 +
 drivers/gpu/drm/amd/display/dc/dsc/Makefile   | 9 +
 5 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile 
b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
index 4b1a8a08a5de..a1af55a86508 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
@@ -27,6 +27,15 @@
 calcs_ccflags := -mhard-float -msse
 
 ifdef CONFIG_CC_IS_GCC
+ifeq ($(call cc-ifversion, -lt, 0701, y), y)
+IS_OLD_GCC = 1
+endif
+endif
+
+ifdef IS_OLD_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
+# (8B stack alignment).
 calcs_ccflags += -mpreferred-stack-boundary=4
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
index 5fe3eb80075d..cb0ac131f74a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
@@ -13,6 +13,15 @@ endif
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o := -mhard-float -msse
 
 ifdef CONFIG_CC_IS_GCC
+ifeq ($(call cc-ifversion, -lt, 0701, y), y)
+IS_OLD_GCC = 1
+endif
+endif
+
+ifdef IS_OLD_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
+# (8B stack alignment).
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -mpreferred-stack-boundary=4
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
index 7057e20748b9..f92320ddd27f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
@@ -6,6 +6,15 @@ DCN21 = dcn21_hubp.o dcn21_hubbub.o dcn21_resource.o
 CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse
 
 ifdef CONFIG_CC_IS_GCC
+ifeq ($(call cc-ifversion, -lt, 0701, y), y)
+IS_OLD_GCC = 1
+endif
+endif
+
+ifdef IS_OLD_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
+# (8B stack alignment).
 CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -mpreferred-stack-boundary=4
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 1bd6e307b7f8..ef1bdd20b425 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -27,6 +27,15 @@
 dml_ccflags := -mhard-float -msse
 
 ifdef CONFIG_CC_IS_GCC
+ifeq ($(call cc-ifversion, -lt, 0701, y), y)
+IS_OLD_GCC = 1
+endif
+endif
+
+ifdef IS_OLD_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
+# (8B stack alignment).
 dml_ccflags += -mpreferred-stack-boundary=4
 endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dsc/Makefile 
b/drivers/gpu/drm/amd/display/dc/dsc/Makefile
index 932c3055230e..3f7840828a9f 100644
--- a/drivers/gpu/drm/amd/display/dc/dsc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dsc/Makefile
@@ -4,6 +4,15 @@
 dsc_ccflags := -mhard-float -msse
 
 ifdef CONFIG_CC_IS_GCC
+ifeq ($(call cc-ifversion, -lt, 0701, y), y)
+IS_OLD_GCC = 1
+endif
+endif
+
+ifdef IS_OLD_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 c

[PATCH 0/3] drm/amdgpu: fix stack alignment ABI mismatch

2019-10-16 Thread Nick Desaulniers
The x86 kernel is compiled with an 8B stack alignment via
`-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via
commit d9b0cde91c60 ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if 
supported")
or `-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are
compiled with 16B stack alignment.

Generally, the stack alignment is part of the ABI. Linking together two
different translation units with differing stack alignment is dangerous,
particularly when the translation unit with the smaller stack alignment
makes calls into the translation unit with the larger stack alignment.
While 8B aligned stacks are sometimes also 16B aligned, they are not
always.

Multiple users have reported General Protection Faults (GPF) when using
the AMDGPU driver compiled with Clang. Clang is placing objects in stack
slots assuming the stack is 16B aligned, and selecting instructions that
require 16B aligned memory operands.

At runtime, syscall handlers with 8B aligned stack call into code that
assumes 16B stack alignment.  When the stack is a multiple of 8B but not
16B, these instructions result in a GPF.

Remove the code that added compatibility between the differing compiler
flags, as it will result in runtime GPFs when built with Clang.

The series is broken into 3 patches, the first is an important fix for
Clang for ChromeOS. The rest are attempted cleanups for GCC, but require
additional boot testing. The first patch is critical, the rest are nice
to have. I've compile tested the series with ToT Clang, GCC 4.9, and GCC
8.3 **but** I do not have hardware to test on, so I need folks with the
above compilers and relevant hardware to help test the series.

The first patch is a functional change for Clang only. It does not
change anything for any version of GCC. Yuxuan boot tested a previous
incarnation on hardware, but I've changed it enough that I think it made
sense to drop the previous tested by tag.

The second patch is a functional change for GCC 7.1+ only. It does not
affect older versions of GCC or Clang (though if someone wanted to
double check with pre-GCC 7.1 it wouldn't hurt).  It should be boot
tested on GCC 7.1+ on the relevant hardware.

The final patch is also a functional change for GCC 7.1+ only. It does
not affect older versions of GCC or Clang. It should be boot tested on
GCC 7.1+ on the relevant hardware. Theoretically, there may be an issue
with it, and it's ok to drop it. The series was intentional broken into
3 in order to allow them to be incrementally tested and accepted. It's
ok to take earlier patches without the later patches.

And finally, I do not condone linking object files of differing stack
alignments.  Idealistically, we'd mark the driver broken for pre-GCC
7.1.  Pragmatically, "if it ain't broke, don't fix it."

Nick Desaulniers (3):
  drm/amdgpu: fix stack alignment ABI mismatch for Clang
  drm/amdgpu: fix stack alignment ABI mismatch for GCC 7.1+
  drm/amdgpu: enable -msse2 for GCC 7.1+ users

 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 19 ---
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 19 ---
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 19 ---
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 19 ---
 drivers/gpu/drm/amd/display/dc/dsc/Makefile   | 19 ---
 5 files changed, 60 insertions(+), 35 deletions(-)

-- 
2.23.0.700.g56cf767bdb-goog



Re: AMDGPU and 16B stack alignment

2019-10-16 Thread Nick Desaulniers
On Wed, Oct 16, 2019 at 11:55 AM Arvind Sankar  wrote:
>
> On Tue, Oct 15, 2019 at 06:51:26PM -0700, Nick Desaulniers wrote:
> > On Tue, Oct 15, 2019 at 1:26 PM Arvind Sankar  wrote:
> > >
> > > On Tue, Oct 15, 2019 at 11:05:56AM -0700, Nick Desaulniers wrote:
> > > > Hmmm...I would have liked to remove it outright, as it is an ABI
> > > > mismatch that is likely to result in instability and non-fun-to-debug
> > > > runtime issues in the future.  I suspect my patch does work for GCC
> > > > 7.1+.  The question is: Do we want to either:
> > > > 1. mark AMDGPU broken for GCC < 7.1, or
> > > > 2. continue supporting it via stack alignment mismatch?
> > > >
> > > > 2 is brittle, and may break at any point in the future, but if it's
> > > > working for someone it does make me feel bad to outright disable it.
> > > > What I'd imagine 2 looks like is (pseudo code in a Makefile):
> > > >
> > > > if CC_IS_GCC && GCC_VERSION < 7.1:
> > > >   set stack alignment to 16B and hope for the best
> > > >
> > > > So my diff would be amended to keep the stack alignment flags, but
> > > > only to support GCC < 7.1.  And that assumes my change compiles with
> > > > GCC 7.1+. (Looks like it does for me locally with GCC 8.3, but I would
> > > > feel even more confident if someone with hardware to test on and GCC
> > > > 7.1+ could boot test).
> > > > --
> > > > Thanks,
> > > > ~Nick Desaulniers
> > >
> > > If we do keep it, would adding -mstackrealign make it more robust?
> > > That's simple and will only add the alignment to functions that require
> > > 16-byte alignment (at least on gcc).
> >
> > I think there's also `-mincoming-stack-boundary=`.
> > https://github.com/ClangBuiltLinux/linux/issues/735#issuecomment-540038017
>
> Yes, but -mstackrealign looks like it's supported by clang as well.

Good to know, but I want less duct tape, not more.

> >
> > >
> > > Alternative is to use
> > > __attribute__((force_align_arg_pointer)) on functions that might be
> > > called from 8-byte-aligned code.
> >
> > Which is hard to automate and easy to forget.  Likely a large diff to fix 
> > today.
>
> Right, this is a no-go, esp to just fix old compilers.
> >
> > >
> > > It looks like -mstackrealign should work from gcc 5.3 onwards.
> >
> > The kernel would generally like to support GCC 4.9+.
> >
> > There's plenty of different ways to keep layering on duct tape and
> > bailing wire to support differing ABIs, but that's just adding
> > technical debt that will have to be repaid one day.  That's why the
> > cleanest solution IMO is mark the driver broken for old toolchains,
> > and use a code-base-consistent stack alignment.  Bending over
> > backwards to support old toolchains means accepting stack alignment
> > mismatches, which is in the "unspecified behavior" ring of the
> > "undefined behavior" Venn diagram.  I have the same opinion on relying
> > on explicitly undefined behavior.
> >
> > I'll send patches for fixing up Clang, but please consider my strong
> > advice to generally avoid stack alignment mismatches, regardless of
> > compiler.
> > --
> > Thanks,
> > ~Nick Desaulniers
>
> What I suggested was in reference to your proposal for dropping the
> -mpreferred-stack-boundary=4 for modern compilers, but keeping it for
> <7.1. -mstackrealign would at least let 5.3 onwards be less likely to
> break (and it doesn't error before then, I think it just doesn't
> actually do anything, so no worse than now at least).
>
> Simply dropping support for <7.1 would be cleanest, yes, but it sounds
> like people don't want to go that far.

That's fair.  I've included your suggestions in the commit message of
02/03 of a series I just sent but forgot to in reply to this thread:
https://lkml.org/lkml/2019/10/16/1700

Also, I do appreciate the suggestions and understand the value of brainstorming.
-- 
Thanks,
~Nick Desaulniers


[PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Kim, Jonathan
Reads or writes to DF registers while the GPU DF is in c-states will
result in a hang. DF c-states should be disabled prior to the read or
write and re-enabled afterwards.

Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c 
b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
index 16fbd2bc8ad1..9a58416662e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
+++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
@@ -102,6 +102,9 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return 0x;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -114,6 +117,8 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
 
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
 
+   smu_set_df_cstate(&adev->smu, 1);
+
return (((ficadh_val & 0x) << 32) | ficadl_val);
 }
 
@@ -125,6 +130,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -134,8 +142,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
 
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
WREG32(data, ficadh_val);
-
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /*
@@ -153,12 +162,17 @@ static void df_v3_6_perfmon_rreg(struct amdgpu_device 
*adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
*lo_val = RREG32(data);
WREG32(address, hi_addr);
*hi_val = RREG32(data);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /*
@@ -175,12 +189,17 @@ static void df_v3_6_perfmon_wreg(struct amdgpu_device 
*adev, uint32_t lo_addr,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
WREG32(data, lo_val);
WREG32(address, hi_addr);
WREG32(data, hi_val);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /* get the number of df counters available */
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Kim, Jonathan
+ Felix

-Original Message-
From: Kim, Jonathan  
Sent: Wednesday, October 16, 2019 8:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: felix.keuhl...@amd.com; Quan, Evan ; Kim, Jonathan 
; Kim, Jonathan 
Subject: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

reads or writes to df registers while the gpu df is in c-states will result in a
hang.  df c-states should be disabled prior to reads or writes and then re-enabled
afterwards.

Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c 
b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
index 16fbd2bc8ad1..9a58416662e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
+++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
@@ -102,6 +102,9 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return 0x;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -114,6 +117,8 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
 
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
 
+   smu_set_df_cstate(&adev->smu, 1);
+
return (((ficadh_val & 0x) << 32) | ficadl_val);  }
 
@@ -125,6 +130,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -134,8 +142,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
 
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
WREG32(data, ficadh_val);
-
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /*
@@ -153,12 +162,17 @@ static void df_v3_6_perfmon_rreg(struct amdgpu_device 
*adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
*lo_val = RREG32(data);
WREG32(address, hi_addr);
*hi_val = RREG32(data);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /*
@@ -175,12 +189,17 @@ static void df_v3_6_perfmon_wreg(struct amdgpu_device 
*adev, uint32_t lo_addr,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (smu_set_df_cstate(&adev->smu, 0))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
WREG32(data, lo_val);
WREG32(address, hi_addr);
WREG32(data, hi_val);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   smu_set_df_cstate(&adev->smu, 1);
 }
 
 /* get the number of df counters available */
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Quan, Evan
Hi Jonathan,

By default, vega20 still takes the old powerplay routines, so this will not work
on Vega20.
As proposed before, logic similar to the below should be used:

	if (is_support_sw_smu(adev)) {
		r = smu_set_df_cstate(&adev->smu,
				      DF_CSTATE_DISALLOW or DF_CSTATE_ALLOW);
	} else if (adev->powerplay.pp_funcs &&
		   adev->powerplay.pp_funcs->set_df_cstate) {
		r = adev->powerplay.pp_funcs->set_df_cstate(
				adev->powerplay.pp_handle,
				DF_CSTATE_DISALLOW or DF_CSTATE_ALLOW);
	}


Regards,
Evan
> -Original Message-
> From: Kim, Jonathan 
> Sent: 2019年10月17日 8:50
> To: amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix ; Quan, Evan
> 
> Subject: RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons
> 
> + Felix
> 
> -Original Message-
> From: Kim, Jonathan 
> Sent: Wednesday, October 16, 2019 8:49 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: felix.keuhl...@amd.com; Quan, Evan ; Kim,
> Jonathan ; Kim, Jonathan
> 
> Subject: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons
> 
> read or writes to df registers when gpu df is in c-states will result in 
> hang.  df
> c-states should be disabled prior to read or writes then re-enabled after read
> or writes.
> 
> Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
> Signed-off-by: Jonathan Kim 
> ---
>  drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> index 16fbd2bc8ad1..9a58416662e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> +++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> @@ -102,6 +102,9 @@ static uint64_t df_v3_6_get_fica(struct
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return 0x;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -114,6 +117,8 @@ static uint64_t df_v3_6_get_fica(struct
> amdgpu_device *adev,
> 
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> 
> + smu_set_df_cstate(&adev->smu, 1);
> +
>   return (((ficadh_val & 0x) << 32) | ficadl_val);  }
> 
> @@ -125,6 +130,9 @@ static void df_v3_6_set_fica(struct amdgpu_device
> *adev, uint32_t ficaa_val,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -134,8 +142,9 @@ static void df_v3_6_set_fica(struct amdgpu_device
> *adev, uint32_t ficaa_val,
> 
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
>   WREG32(data, ficadh_val);
> -
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /*
> @@ -153,12 +162,17 @@ static void df_v3_6_perfmon_rreg(struct
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   *lo_val = RREG32(data);
>   WREG32(address, hi_addr);
>   *hi_val = RREG32(data);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /*
> @@ -175,12 +189,17 @@ static void df_v3_6_perfmon_wreg(struct
> amdgpu_device *adev, uint32_t lo_addr,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   WREG32(data, lo_val);
>   WREG32(address, hi_addr);
>   WREG32(data, hi_val);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /* get the number of df counters available */
> --
> 2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Kim, Jonathan
reads or writes to df registers while the gpu df is in c-states will result in
a hang.  df c-states should be disabled prior to reads or writes and then
re-enabled afterwards.

v2: use old powerplay routines for vega20

Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c 
b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
index 16fbd2bc8ad1..f403c62c944e 100644
--- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
+++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
@@ -93,6 +93,21 @@ const struct attribute_group *df_v3_6_attr_groups[] = {
NULL
 };
 
+static int df_v3_6_set_df_cstate(struct amdgpu_device *adev, int allow)
+{
+   int r = 0;
+
+   if (is_support_sw_smu(adev)) {
+   r = smu_set_df_cstate(&adev->smu, allow);
+   } else if (adev->powerplay.pp_funcs
+   && adev->powerplay.pp_funcs->set_df_cstate) {
+   r = adev->powerplay.pp_funcs->set_df_cstate(
+   adev->powerplay.pp_handle, allow);
+   }
+
+   return r;
+}
+
 static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
 uint32_t ficaa_val)
 {
@@ -102,6 +117,9 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+   return 0x;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -114,6 +132,8 @@ static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
 
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
 
+   df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
+
return (((ficadh_val & 0x) << 32) | ficadl_val);
 }
 
@@ -125,6 +145,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
WREG32(data, ficaa_val);
@@ -134,8 +157,9 @@ static void df_v3_6_set_fica(struct amdgpu_device *adev, 
uint32_t ficaa_val,
 
WREG32(address, smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
WREG32(data, ficadh_val);
-
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
 }
 
 /*
@@ -153,12 +177,17 @@ static void df_v3_6_perfmon_rreg(struct amdgpu_device 
*adev,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
*lo_val = RREG32(data);
WREG32(address, hi_addr);
*hi_val = RREG32(data);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
 }
 
 /*
@@ -175,12 +204,17 @@ static void df_v3_6_perfmon_wreg(struct amdgpu_device 
*adev, uint32_t lo_addr,
address = adev->nbio.funcs->get_pcie_index_offset(adev);
data = adev->nbio.funcs->get_pcie_data_offset(adev);
 
+   if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+   return;
+
spin_lock_irqsave(&adev->pcie_idx_lock, flags);
WREG32(address, lo_addr);
WREG32(data, lo_val);
WREG32(address, hi_addr);
WREG32(data, hi_val);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
+
+   df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
 }
 
 /* get the number of df counters available */
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Kim, Jonathan
Thanks Evan.  Resent with fixes.

-Original Message-
From: Quan, Evan  
Sent: Wednesday, October 16, 2019 9:34 PM
To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix 
Subject: RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

Hi Jonathan,

By default, vega20 still takes the old powerplay routines, so this will not work
on Vega20.
As proposed before, logic similar to the below should be used:

	if (is_support_sw_smu(adev)) {
		r = smu_set_df_cstate(&adev->smu,
				      DF_CSTATE_DISALLOW or DF_CSTATE_ALLOW);
	} else if (adev->powerplay.pp_funcs &&
		   adev->powerplay.pp_funcs->set_df_cstate) {
		r = adev->powerplay.pp_funcs->set_df_cstate(
				adev->powerplay.pp_handle,
				DF_CSTATE_DISALLOW or DF_CSTATE_ALLOW);
	}


Regards,
Evan
> -Original Message-
> From: Kim, Jonathan 
> Sent: 2019年10月17日 8:50
> To: amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix ; Quan, Evan 
> 
> Subject: RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons
> 
> + Felix
> 
> -Original Message-
> From: Kim, Jonathan 
> Sent: Wednesday, October 16, 2019 8:49 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: felix.keuhl...@amd.com; Quan, Evan ; Kim, 
> Jonathan ; Kim, Jonathan 
> Subject: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons
> 
> read or writes to df registers when gpu df is in c-states will result 
> in hang.  df c-states should be disabled prior to read or writes then 
> re-enabled after read or writes.
> 
> Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
> Signed-off-by: Jonathan Kim 
> ---
>  drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> index 16fbd2bc8ad1..9a58416662e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> +++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> @@ -102,6 +102,9 @@ static uint64_t df_v3_6_get_fica(struct 
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return 0x;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -114,6 +117,8 @@ static uint64_t df_v3_6_get_fica(struct 
> amdgpu_device *adev,
> 
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> 
> + smu_set_df_cstate(&adev->smu, 1);
> +
>   return (((ficadh_val & 0x) << 32) | ficadl_val);  }
> 
> @@ -125,6 +130,9 @@ static void df_v3_6_set_fica(struct amdgpu_device 
> *adev, uint32_t ficaa_val,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -134,8 +142,9 @@ static void df_v3_6_set_fica(struct amdgpu_device 
> *adev, uint32_t ficaa_val,
> 
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
>   WREG32(data, ficadh_val);
> -
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /*
> @@ -153,12 +162,17 @@ static void df_v3_6_perfmon_rreg(struct 
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   *lo_val = RREG32(data);
>   WREG32(address, hi_addr);
>   *hi_val = RREG32(data);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /*
> @@ -175,12 +189,17 @@ static void df_v3_6_perfmon_wreg(struct 
> amdgpu_device *adev, uint32_t lo_addr,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (smu_set_df_cstate(&adev->smu, 0))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   WREG32(data, lo_val);
>   WREG32(address, hi_addr);
>   WREG32(data, hi_val);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + smu_set_df_cstate(&adev->smu, 1);
>  }
> 
>  /* get the number of df counters available */
> --
> 2.17.1


[pull] amdgpu, radeon drm-fixes-5.4

2019-10-16 Thread Alex Deucher
Hi Dave, Daniel,

A few fixes for 5.4.  Nothing too major.

The following changes since commit 083164dbdb17c5ea4ad92c1782b59c9d75567790:

  drm/amdgpu: fix memory leak (2019-10-09 11:45:59 -0500)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux tags/drm-fixes-5.4-2019-10-16

for you to fetch changes up to d12c50857c6edc1d18aa7a60c5a4d6d943137bc0:

  drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync 
(2019-10-11 21:32:06 -0500)


drm-fixes-5.4-2019-10-16:

amdgpu:
- Powerplay fix for SMU7 parts
- Bail earlier when cik/si support is not set to 1
- Fix an SDMA issue on navi

radeon:
- revert a PPC fix which broken x86


Alex Deucher (2):
  drm/amdgpu/powerplay: fix typo in mvdd table setup
  Revert "drm/radeon: Fix EEH during kexec"

Hans de Goede (1):
  drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1

Xiaojie Yuan (1):
  drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync

 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 35 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 35 --
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |  2 +-
 .../drm/amd/powerplay/smumgr/polaris10_smumgr.c|  2 +-
 .../gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c|  2 +-
 drivers/gpu/drm/radeon/radeon_drv.c|  8 -
 6 files changed, 38 insertions(+), 46 deletions(-)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons

2019-10-16 Thread Quan, Evan
Reviewed-by: Evan Quan 

> -Original Message-
> From: Kim, Jonathan 
> Sent: 2019年10月17日 10:06
> To: amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix ; Quan, Evan
> ; Kim, Jonathan ; Kim,
> Jonathan 
> Subject: [PATCH] drm/amdgpu: disable c-states on xgmi perfmons
> 
> read or writes to df registers when gpu df is in c-states will result in
> hang.  df c-states should be disabled prior to read or writes then
> re-enabled after read or writes.
> 
> v2: use old powerplay routines for vega20
> 
> Change-Id: I6d5a83e4fe13e29c73dfb03a94fe7c611e867fec
> Signed-off-by: Jonathan Kim 
> ---
>  drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 36
> +++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> index 16fbd2bc8ad1..f403c62c944e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> +++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
> @@ -93,6 +93,21 @@ const struct attribute_group *df_v3_6_attr_groups[] =
> {
>   NULL
>  };
> 
> +static int df_v3_6_set_df_cstate(struct amdgpu_device *adev, int allow)
> +{
> + int r = 0;
> +
> + if (is_support_sw_smu(adev)) {
> + r = smu_set_df_cstate(&adev->smu, allow);
> + } else if (adev->powerplay.pp_funcs
> + && adev->powerplay.pp_funcs->set_df_cstate) {
> + r = adev->powerplay.pp_funcs->set_df_cstate(
> + adev->powerplay.pp_handle, allow);
> + }
> +
> + return r;
> +}
> +
>  static uint64_t df_v3_6_get_fica(struct amdgpu_device *adev,
>uint32_t ficaa_val)
>  {
> @@ -102,6 +117,9 @@ static uint64_t df_v3_6_get_fica(struct
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
> + return 0x;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -114,6 +132,8 @@ static uint64_t df_v3_6_get_fica(struct
> amdgpu_device *adev,
> 
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> 
> + df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
> +
>   return (((ficadh_val & 0x) << 32) | ficadl_val);
>  }
> 
> @@ -125,6 +145,9 @@ static void df_v3_6_set_fica(struct amdgpu_device
> *adev, uint32_t ficaa_val,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessAddress3);
>   WREG32(data, ficaa_val);
> @@ -134,8 +157,9 @@ static void df_v3_6_set_fica(struct amdgpu_device
> *adev, uint32_t ficaa_val,
> 
>   WREG32(address,
> smnDF_PIE_AON_FabricIndirectConfigAccessDataHi3);
>   WREG32(data, ficadh_val);
> -
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
>  }
> 
>  /*
> @@ -153,12 +177,17 @@ static void df_v3_6_perfmon_rreg(struct
> amdgpu_device *adev,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   *lo_val = RREG32(data);
>   WREG32(address, hi_addr);
>   *hi_val = RREG32(data);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
>  }
> 
>  /*
> @@ -175,12 +204,17 @@ static void df_v3_6_perfmon_wreg(struct
> amdgpu_device *adev, uint32_t lo_addr,
>   address = adev->nbio.funcs->get_pcie_index_offset(adev);
>   data = adev->nbio.funcs->get_pcie_data_offset(adev);
> 
> + if (df_v3_6_set_df_cstate(adev, DF_CSTATE_DISALLOW))
> + return;
> +
>   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
>   WREG32(address, lo_addr);
>   WREG32(data, lo_val);
>   WREG32(address, hi_addr);
>   WREG32(data, hi_val);
>   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> +
> + df_v3_6_set_df_cstate(adev, DF_CSTATE_ALLOW);
>  }
> 
>  /* get the number of df counters available */
> --
> 2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/powerplay: add lock protection for swSMU APIs

2019-10-16 Thread Quan, Evan
This is a quick and low-risk fix. Those APIs which
are exposed to other IPs, or which back the sysfs/hwmon
interfaces or DAL, will have lock protection. Meanwhile,
no lock protection is enforced for swSMU internal
APIs. Future optimization is needed.
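
The diff below mostly shows callers being updated to pass an extra bool; inside
amdgpu_smu.c each exported entry point is expected to take roughly the shape of
the sketch below.  This is a sketch only: the parameter name lock_needed and the
use of smu->mutex follow the existing swSMU code and are not copied verbatim
from this patch.

	int smu_force_clk_levels(struct smu_context *smu,
				 enum smu_clk_type clk_type,
				 uint32_t mask,
				 bool lock_needed)
	{
		int ret = 0;

		if (lock_needed)
			mutex_lock(&smu->mutex);

		/* ... existing per-ASIC ppt_funcs/smu_v11_0 call ... */

		if (lock_needed)
			mutex_unlock(&smu->mutex);

		return ret;
	}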

Change-Id: I8392652c9da1574a85acd9b171f04380f3630852
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h   |   6 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c|  23 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c  |   4 +-
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 684 --
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h| 163 +++--
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c|  15 +-
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c|  12 +-
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c |   7 +-
 drivers/gpu/drm/amd/powerplay/vega20_ppt.c|   6 +-
 10 files changed, 773 insertions(+), 153 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
index 263265245e19..28d32725285b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
@@ -912,7 +912,8 @@ int amdgpu_dpm_get_sclk(struct amdgpu_device *adev, bool 
low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_GFXCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
@@ -930,7 +931,8 @@ int amdgpu_dpm_get_mclk(struct amdgpu_device *adev, bool 
low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_UCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
index 1c5c0fd76dbf..2cfb677272af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@@ -298,12 +298,6 @@ enum amdgpu_pcie_gen {
 #define amdgpu_dpm_get_current_power_state(adev) \

((adev)->powerplay.pp_funcs->get_current_power_state((adev)->powerplay.pp_handle))
 
-#define amdgpu_smu_get_current_power_state(adev) \
-   ((adev)->smu.ppt_funcs->get_current_power_state(&((adev)->smu)))
-
-#define amdgpu_smu_set_power_state(adev) \
-   ((adev)->smu.ppt_funcs->set_power_state(&((adev)->smu)))
-
 #define amdgpu_dpm_get_pp_num_states(adev, data) \

((adev)->powerplay.pp_funcs->get_pp_num_states((adev)->powerplay.pp_handle, 
data))
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index c50d5f1e75e5..36f36b35000d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -211,7 +211,7 @@ static ssize_t amdgpu_get_dpm_state(struct device *dev,
 
if (is_support_sw_smu(adev)) {
if (adev->smu.ppt_funcs->get_current_power_state)
-   pm = amdgpu_smu_get_current_power_state(adev);
+   pm = smu_get_current_power_state(&adev->smu);
else
pm = adev->pm.dpm.user_state;
} else if (adev->powerplay.pp_funcs->get_current_power_state) {
@@ -957,7 +957,7 @@ static ssize_t amdgpu_set_pp_dpm_sclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask);
+   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask, true);
else if (adev->powerplay.pp_funcs->force_clock_level)
ret = amdgpu_dpm_force_clock_level(adev, PP_SCLK, mask);
 
@@ -1004,7 +1004,7 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_MCLK, mask);
+   ret = smu_force_clk_levels(&adev->smu, SMU_MCLK, mask, true);
else if (adev->powerplay.pp_funcs->force_clock_level)
ret = amdgpu_dpm_force_clock_level(adev, PP_MCLK, mask);
 
@@ -1044,7 +1044,7 @@ static ssize_t amdgpu_set_pp_dpm_socclk(struct device 
*dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_SOCCLK, mask);
+   ret = smu_force_clk_levels(&adev->smu, SMU_SOCCLK, mask, true);

[PATCH] drm/amd/display: Modify display link stream setup sequence.

2019-10-16 Thread Liu, Zhan
From: Zhan Liu 

[Why]
When a specific kind of connector is detected,
DC needs to set the attributes of the stream.
This step needs to be done before enabling the link,
or bugs (e.g. the display won't light up)
will be observed.

[How]
Set the attributes of the stream first, then
enable the stream.
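
In sketch form, the resulting order inside core_link_enable_stream() is (see
the diff below for the real code; argument lists are trimmed here):

	/*
	 * 1. dp_set_stream_attribute() / hdmi, dvi, lvds attribute setup
	 * 2. link enable, link training, test-pattern handling
	 * 3. link_enc->funcs->setup() and setup_stereo_sync(), moved after that
	 */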

Signed-off-by: Zhan Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 20 +--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index fb18681b502b..713caab82837 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -2745,16 +2745,6 @@ void core_link_enable_stream(
dc_is_virtual_signal(pipe_ctx->stream->signal))
return;

-   if (!dc_is_virtual_signal(pipe_ctx->stream->signal)) {
-   stream->link->link_enc->funcs->setup(
-   stream->link->link_enc,
-   pipe_ctx->stream->signal);
-   pipe_ctx->stream_res.stream_enc->funcs->setup_stereo_sync(
-   pipe_ctx->stream_res.stream_enc,
-   pipe_ctx->stream_res.tg->inst,
-   stream->timing.timing_3d_format != 
TIMING_3D_FORMAT_NONE);
-   }
-
if (dc_is_dp_signal(pipe_ctx->stream->signal))
pipe_ctx->stream_res.stream_enc->funcs->dp_set_stream_attribute(
pipe_ctx->stream_res.stream_enc,
@@ -2841,6 +2831,16 @@ void core_link_enable_stream(
CONTROLLER_DP_TEST_PATTERN_VIDEOMODE,
COLOR_DEPTH_UNDEFINED);

+   if (!dc_is_virtual_signal(pipe_ctx->stream->signal)) {
+   stream->link->link_enc->funcs->setup(
+   stream->link->link_enc,
+   pipe_ctx->stream->signal);
+   
pipe_ctx->stream_res.stream_enc->funcs->setup_stereo_sync(
+   pipe_ctx->stream_res.stream_enc,
+   pipe_ctx->stream_res.tg->inst,
+   stream->timing.timing_3d_format != 
TIMING_3D_FORMAT_NONE);
+   }
+
 #ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
if (pipe_ctx->stream->timing.flags.DSC) {
if (dc_is_dp_signal(pipe_ctx->stream->signal) ||
--
2.17.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu/display: fix compile error

2019-10-16 Thread Chen Wandun
From: Chenwandun 

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:1913:48: error: 'struct dc_crtc_timing_flags' has no member named 'DSC'
   if (res_ctx->pipe_ctx[i].stream->timing.flags.DSC)
                                                 ^
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:1914:73: error: 'struct dc_crtc_timing' has no member named 'dsc_cfg'
   pipes[pipe_cnt].dout.output_bpp = res_ctx->pipe_ctx[i].stream->timing.dsc_cfg.bits_per_pixel / 16.0;
                                                                         ^
Signed-off-by: Chenwandun 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 914e378..4f03318 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -1910,8 +1910,10 @@ int dcn20_populate_dml_pipes_from_context(
pipes[pipe_cnt].dout.output_bpp = output_bpc * 3;
}
 
+#ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
if (res_ctx->pipe_ctx[i].stream->timing.flags.DSC)
pipes[pipe_cnt].dout.output_bpp = 
res_ctx->pipe_ctx[i].stream->timing.dsc_cfg.bits_per_pixel / 16.0;
+#endif
 
/* todo: default max for now, until there is logic reflecting 
this in dc*/
pipes[pipe_cnt].dout.output_bpc = 12;
-- 
2.7.4