RE: [PATCH] drm/amdgpu: needn't set aggregated doorbell for map queue

2023-11-13 Thread Liang, Prike
[Public]

Please ignore this patch, double confirm from the stakeholder the MES map queue 
has possibility to change as an unmap queue after write the doorbell.

Regards,
--Prike

> -Original Message-
> From: Liang, Prike 
> Sent: Thursday, November 9, 2023 3:36 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Liang, Prike
> 
> Subject: [PATCH] drm/amdgpu: needn't set aggregated doorbell for map
> queue
>
> Needn't set aggregated doorbell for map queue and remove the dead code.
>
> Signed-off-by: Prike Liang 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 --
> drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 4 
>  2 files changed, 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index c8a3bf01743f..601bb6755bd3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -8220,9 +8220,6 @@ static void gfx_v10_0_ring_set_wptr_gfx(struct
> amdgpu_ring *ring)
>   WDOORBELL64(ring->doorbell_index, wptr_tmp);
>   } else {
>   WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -
> - if (*is_queue_unmap)
> - WDOORBELL64(aggregated_db_index,
> wptr_tmp);
>   }
>   } else {
>   if (ring->use_doorbell) {
> @@ -8283,9 +8280,6 @@ static void
> gfx_v10_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
>   WDOORBELL64(ring->doorbell_index, wptr_tmp);
>   } else {
>   WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -
> - if (*is_queue_unmap)
> - WDOORBELL64(aggregated_db_index,
> wptr_tmp);
>   }
>   } else {
>   /* XXX check if swapping is necessary on BE */ diff --git
> a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> index c1ff5eda8961..14633e2ceac6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> @@ -358,10 +358,6 @@ static void sdma_v5_0_ring_set_wptr(struct
> amdgpu_ring *ring)
>   DRM_DEBUG("calling WDOORBELL64(0x%08x,
> 0x%016llx)\n",
>   ring->doorbell_index, ring->wptr <<
> 2);
>   WDOORBELL64(ring->doorbell_index, ring->wptr <<
> 2);
> -
> - if (*is_queue_unmap)
> - WDOORBELL64(aggregated_db_index,
> - ring->wptr << 2);
>   }
>   } else {
>   if (ring->use_doorbell) {
> --
> 2.34.1



[PATCH] drm/amdgpu: correct the amdgpu runtime dereference usage count

2023-11-13 Thread Prike Liang
Fix the amdgpu runpm dereference usage count.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 0cacd0b9f8be..4737ada467cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -340,12 +340,12 @@ int amdgpu_display_crtc_set_config(struct drm_mode_set 
*set,
adev->have_disp_power_ref = true;
return ret;
}
-   /* if we have no active crtcs, then drop the power ref
-* we got before
+   /* if we have no active crtcs, then go to
+* drop the power ref we got before
 */
if (!active && adev->have_disp_power_ref) {
-   pm_runtime_put_autosuspend(dev->dev);
adev->have_disp_power_ref = false;
+   goto out;
}
 
 out:
-- 
2.34.1



RE: [PATCH 1/2] drm/amdgpu: correct the amdgpu runtime dereference usage count

2023-11-13 Thread Liang, Prike
[Public]

> -Original Message-
> From: Deucher, Alexander 
> Sent: Friday, November 10, 2023 5:46 AM
> To: Liang, Prike ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 1/2] drm/amdgpu: correct the amdgpu runtime
> dereference usage count
>
> [Public]
>
> > -Original Message-
> > From: Liang, Prike 
> > Sent: Thursday, November 9, 2023 2:37 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander ; Liang, Prike
> > 
> > Subject: [PATCH 1/2] drm/amdgpu: correct the amdgpu runtime
> > dereference usage count
> >
> > Fix the amdgpu runpm dereference usage count.
> >
> > Signed-off-by: Prike Liang 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 1 +
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> > index a53f436fa9f1..f6e5d9f7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> > @@ -1992,7 +1992,7 @@ static int amdgpu_debugfs_sclk_set(void *data,
> > u64 val)
> >
> >   ret = amdgpu_dpm_set_soft_freq_range(adev, PP_SCLK,
> > (uint32_t)val, (uint32_t)val);
> >   if (ret)
> > - ret = -EINVAL;
> > + goto out;
>
> I think this hunk can be dropped.  It doesn't really change anything.  Or you
> could just drop the whole ret check since we just return ret at the end
> anyway.  Not sure if changing the error code is important here or not.
>
[Prike] Thanks for pointing it out, revisit this part that seems ok for 
reassign the return value when the caller function can't return a proper error 
type.
I will keep this part as the unmodified since this has no problem for 
dereferencing the runpm usage.
> >
> >  out:
> >   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > index 0cacd0b9f8be..ff1f42ae6d8e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > @@ -346,6 +346,7 @@ int amdgpu_display_crtc_set_config(struct
> > drm_mode_set *set,
> >   if (!active && adev->have_disp_power_ref) {
> >   pm_runtime_put_autosuspend(dev->dev);
> >   adev->have_disp_power_ref = false;
> > + return ret;
> >   }
>
> I think it would be cleaner to just drop the runtime_put above and update
> the comment.  We'll just fall through to the end of the function.
>
> Alex
>
[Prike]  Do you mean something like as the following? I will submit a new patch 
for this.

-   /* if we have no active crtcs, then drop the power ref
-* we got before
+   /* if we have no active crtcs, then go to
+* drop the power ref we got before
 */
if (!active && adev->have_disp_power_ref) {
-   pm_runtime_put_autosuspend(dev->dev);
adev->have_disp_power_ref = false;
+   goto out;
}
> >
> >  out:
> > --
> > 2.34.1
>



[PATCH] drm/amdkfd: Fix sq_intr error typo

2023-11-13 Thread Yunxiang Li
Here the register field should be read from context_id1 rather than 0 as
indicated by the macro name.

Fixes: cc009e613de6 ("drm/amdkfd: Add KFD support for soc21 v3")
Signed-off-by: Yunxiang Li 
---
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
index 2a65792fd116..41fa85cfe2d2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
@@ -184,8 +184,8 @@ static void print_sq_intr_info_error(uint32_t context_id0, 
uint32_t context_id1)
REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID0, 
SH_ID),
REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID0, 
PRIV),
REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID0, 
WAVE_ID),
-   REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID1, 
SIMD_ID),
-   REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID1, 
WGP_ID));
+   REG_GET_FIELD(context_id1, SQ_INTERRUPT_WORD_ERROR_CTXID1, 
SIMD_ID),
+   REG_GET_FIELD(context_id1, SQ_INTERRUPT_WORD_ERROR_CTXID1, 
WGP_ID));
 }
 
 static void event_interrupt_poison_consumption_v11(struct kfd_node *dev,
-- 
2.34.1



Re: [PATCH 00/20] remove I2C_CLASS_DDC support

2023-11-13 Thread Wolfram Sang

> We're not in a hurry. It's just my experience with patch series' affecting
> multiple subsystems that typically the decision was to apply the full
> series via one tree. Also to avoid inquires from maintainers like:
> Shall I take it or are you going to take it?
> Of course there may be different opinions. Please advise.

Ok, then this turns out to be a negotation thing between the drm/fbdev
maintainers and me. I *can* take all the patches, of course. But since
the number of patches touching the non-i2c subsystems is high, I'd like
to hear their preference, too.



signature.asc
Description: PGP signature


[PATCH] drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()

2023-11-13 Thread Mario Limonciello
When ddc_service_construct() is called, it explicitly checks both the
link type and whether there is something on the link which will
dictate whether the pin is marked as hw_supported.

If the pin isn't set or the link is not set (such as from
unloading/reloading amdgpu in an IGT test) then fail the
amdgpu_dm_i2c_xfer() call.

Cc: sta...@vger.kernel.org
Fixes: 22676bc500c2 ("drm/amd/display: Fix dmub soft hang for PSR 1")
Link: https://github.com/fwupd/fwupd/issues/6327
Signed-off-by: Mario Limonciello 
---
v1->v2:
 * Fix a memory leak
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index adbeb2c897b5..f6b31c108180 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7463,6 +7463,9 @@ static int amdgpu_dm_i2c_xfer(struct i2c_adapter 
*i2c_adap,
int i;
int result = -EIO;
 
+   if (!ddc_service->ddc_pin || 
!ddc_service->ddc_pin->hw_info.hw_supported)
+   return result;
+
cmd.payloads = kcalloc(num, sizeof(struct i2c_payload), GFP_KERNEL);
 
if (!cmd.payloads)
-- 
2.34.1



Re: [PATCH 03/20] drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c: remove I2C_CLASS_DDC support

2023-11-13 Thread Harry Wentland
Please just use "drm/amd/display:" as tag in the commit subject.

With that fixed, this is
Acked-by: Harry Wentland 

Harry

On 2023-11-13 06:23, Heiner Kallweit wrote:
> After removal of the legacy EEPROM driver and I2C_CLASS_DDC support in
> olpc_dcon there's no i2c client driver left supporting I2C_CLASS_DDC.
> Class-based device auto-detection is a legacy mechanism and shouldn't
> be used in new code. So we can remove this class completely now.
> 
> Preferably this series should be applied via the i2c tree.
> 
> Signed-off-by: Heiner Kallweit 
> 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 6f99f6754..ae1edc6ab 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7529,7 +7529,6 @@ create_i2c(struct ddc_service *ddc_service,
>   if (!i2c)
>   return NULL;
>   i2c->base.owner = THIS_MODULE;
> - i2c->base.class = I2C_CLASS_DDC;
>   i2c->base.dev.parent = >pdev->dev;
>   i2c->base.algo = _dm_i2c_algo;
>   snprintf(i2c->base.name, sizeof(i2c->base.name), "AMDGPU DM i2c hw bus 
> %d", link_index);
> 



Re: [PATCH 00/20] remove I2C_CLASS_DDC support

2023-11-13 Thread Wolfram Sang

> Preferably this series should be applied via the i2c tree.

Are we in a hurry here, i.e. does it block further development of the
i801 smbus driver? My gut feeling says the patches should rather go via
drm and fbdev trees, but I may be convinced otherwise.



signature.asc
Description: PGP signature


Re: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

2023-11-13 Thread James Zhu


On 2023-11-13 12:04, Yat Sin, David wrote:


[AMD Official Use Only - General]


*From:* Zhu, James 
*Sent:* Monday, November 13, 2023 10:20 AM
*To:* Yat Sin, David ; Zhu, James 
; amd-gfx@lists.freedesktop.org
*Cc:* Kuehling, Felix ; Greathouse, Joseph 


*Subject:* Re: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

On 2023-11-10 14:07, Yat Sin, David wrote:

[AMD Official Use Only - General]

-Original Message-

From: Zhu, James  

Sent: Friday, November 3, 2023 9:12 AM

To:amd-gfx@lists.freedesktop.org

Cc: Kuehling, Felix  
; Greathouse, Joseph

  ; Yat Sin, 
David  ; Zhu,

James  

Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

Enable pc sampling stop.

Signed-off-by: James Zhu  

---

  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +--

-

  drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++

  2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

index 33d003ca0093..2c4ac5b4cc4b 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

@@ -108,10 +108,32 @@ static int kfd_pc_sample_start(struct

kfd_process_device *pdd,

   return 0;

  }

-static int kfd_pc_sample_stop(struct kfd_process_device *pdd)

+static int kfd_pc_sample_stop(struct kfd_process_device *pdd,

+ struct pc_sampling_entry 
*pcs_entry)

  {

- return -EINVAL;

+ bool pc_sampling_stop = false;

+

+ pcs_entry->enabled = false;

+ mutex_lock(>dev->pcs_data.mutex);

For the START/STOP/DESTROY ioctls, I think we can hold 
pdd->dev->pcs_data.mutex through the whole IOCTL. Then we would not have to 
deal with the intermediate states where the START/STOP/DESTROY are happening at the 
same time.

[JZ] pdd->dev->pcs_data.mutex is per device, not per process. In the 
future, also it will share protection within different pc sampling 
methods on the same devices. So I don't think a bigger lock here is 
good idea.
[David] I think the CREATE/START/STOP/DESTROY actions are not time 
critical. So if two processes are using the same GPU, it is ok for 
amdgpu to block the 2^nd process until amdgpu has fully completed the 
request from the 1^st process. I think we are making the code more 
complex for a use-case that would be very rare.


[JZ] IIRC, the init RFC version used bigger lock, and is questioned as 
an inefficient way,



+ pdd->dev->pcs_data.hosttrap_entry.base.active_count--;

+ if (!pdd->dev->pcs_data.hosttrap_entry.base.active_count) {

+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, true);

+ pc_sampling_stop = true;

+ }

+ mutex_unlock(>dev->pcs_data.mutex);

+ if (pc_sampling_stop) {

+ kfd_process_set_trap_pc_sampling_flag(>qpd,

+ pdd->dev-

pcs_data.hosttrap_entry.base.pc_sample_info.method,

+false);

+

+ mutex_lock(>dev->pcs_data.mutex);

+ pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0;

+ pdd->dev->pcs_data.hosttrap_entry.base.target_wave_slot = 
0;

+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, false);

+ mutex_unlock(>dev->pcs_data.mutex);

+ }

+

+ return 0;

  }

  static int kfd_pc_sample_create(struct kfd_process_device *pdd, @@ 
-251,7

+273,7 @@ int kfd_pc_sample(struct kfd_process_device *pdd,

   if (!pcs_entry->enabled)

   return -EALREADY;

   else

- return kfd_pc_sample_stop(pdd);

+ return kfd_pc_sample_stop(pdd, pcs_entry);

   }

   return -EINVAL;

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

index 613910e0d440..badaa4d68cc4 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

@@ -259,6 +259,8 @@ struct kfd_dev;

  struct kfd_dev_pc_sampling_data {

   uint32_t use_count; /* Num of PC sampling sessions */

   uint32_t active_count;  /* Num of active sessions */

+ uint32_t 

RE: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

2023-11-13 Thread Yat Sin, David
[AMD Official Use Only - General]



From: Zhu, James 
Sent: Monday, November 13, 2023 10:20 AM
To: Yat Sin, David ; Zhu, James ; 
amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Greathouse, Joseph 

Subject: Re: [PATCH 19/24] drm/amdkfd: enable pc sampling stop



On 2023-11-10 14:07, Yat Sin, David wrote:

[AMD Official Use Only - General]



-Original Message-

From: Zhu, James 

Sent: Friday, November 3, 2023 9:12 AM

To: amd-gfx@lists.freedesktop.org

Cc: Kuehling, Felix ; 
Greathouse, Joseph

; Yat Sin, David 
; Zhu,

James 

Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop



Enable pc sampling stop.



Signed-off-by: James Zhu 

---

 drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +--

-

 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 ++

 2 files changed, 27 insertions(+), 3 deletions(-)



diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

index 33d003ca0093..2c4ac5b4cc4b 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

@@ -108,10 +108,32 @@ static int kfd_pc_sample_start(struct

kfd_process_device *pdd,

  return 0;

 }



-static int kfd_pc_sample_stop(struct kfd_process_device *pdd)

+static int kfd_pc_sample_stop(struct kfd_process_device *pdd,

+ struct pc_sampling_entry *pcs_entry)

 {

- return -EINVAL;

+ bool pc_sampling_stop = false;

+

+ pcs_entry->enabled = false;

+ mutex_lock(>dev->pcs_data.mutex);

For the START/STOP/DESTROY ioctls, I think we can hold pdd->dev->pcs_data.mutex 
through the whole IOCTL. Then we would not have to deal with the intermediate 
states where the START/STOP/DESTROY are happening at the same time.
[JZ] pdd->dev->pcs_data.mutex is per device, not per process. In the future, 
also it will share protection within different pc sampling methods on the same 
devices. So I don't think a bigger lock here is good idea.
[David] I think the CREATE/START/STOP/DESTROY actions are not time critical. So 
if two processes are using the same GPU, it is ok for amdgpu to block the 2nd 
process until amdgpu has fully completed the request from the 1st process. I 
think we are making the code more complex for a use-case that would be very 
rare.


+ pdd->dev->pcs_data.hosttrap_entry.base.active_count--;

+ if (!pdd->dev->pcs_data.hosttrap_entry.base.active_count) {

+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, true);

+ pc_sampling_stop = true;

+ }

+ mutex_unlock(>dev->pcs_data.mutex);



+ if (pc_sampling_stop) {

+ kfd_process_set_trap_pc_sampling_flag(>qpd,

+ pdd->dev-

pcs_data.hosttrap_entry.base.pc_sample_info.method,

+false);

+

+ mutex_lock(>dev->pcs_data.mutex);

+ pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0;

+ pdd->dev->pcs_data.hosttrap_entry.base.target_wave_slot = 0;

+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, false);

+ mutex_unlock(>dev->pcs_data.mutex);

+ }

+

+ return 0;

 }



 static int kfd_pc_sample_create(struct kfd_process_device *pdd, @@ -251,7

+273,7 @@ int kfd_pc_sample(struct kfd_process_device *pdd,

  if (!pcs_entry->enabled)

  return -EALREADY;

  else

- return kfd_pc_sample_stop(pdd);

+ return kfd_pc_sample_stop(pdd, pcs_entry);

  }



  return -EINVAL;

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

index 613910e0d440..badaa4d68cc4 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h

@@ -259,6 +259,8 @@ struct kfd_dev;

 struct kfd_dev_pc_sampling_data {

  uint32_t use_count; /* Num of PC sampling sessions */

  uint32_t active_count;  /* Num of active sessions */

+ uint32_t target_simd;   /* target simd for trap */

+ uint32_t target_wave_slot;  /* target wave slot for trap */

  bool stop_enable;   /* pc sampling stop in process */

  struct idr pc_sampling_idr;

  struct kfd_pc_sample_info pc_sample_info;

--

2.25.1




Re: [Patch v3] drm/ttm: Schedule delayed_delete worker closer

2023-11-13 Thread Christian König

Am 11.11.23 um 14:08 schrieb Rajneesh Bhardwaj:

Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 


Reviewed-by: Christian König 


---
Changes in v3:
  * Use WQ_UNBOUND to address the warning reported by CI pipeline.

  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
  drivers/gpu/drm/ttm/ttm_device.c | 6 --
  2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
spin_unlock(>bdev->lru_lock);
  
  			INIT_WORK(>delayed_delete, ttm_bo_delayed_delete);

-   queue_work(bdev->wq, >delayed_delete);
+
+   /* Schedule the worker on the closest NUMA node. This
+* improves performance since system memory might be
+* cleared on free and that is best done on a CPU core
+* close to it.
+*/
+   queue_work_node(bdev->pool.nid, bdev->wq, 
>delayed_delete);
return;
}
  
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c

index 43e27ab77f95..bc97e3dd40f0 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -204,7 +204,8 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,
if (ret)
return ret;
  
-	bdev->wq = alloc_workqueue("ttm", WQ_MEM_RECLAIM | WQ_HIGHPRI, 16);

+   bdev->wq = alloc_workqueue("ttm",
+  WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 
16);
if (!bdev->wq) {
ttm_global_release();
return -ENOMEM;
@@ -213,7 +214,8 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,
bdev->funcs = funcs;
  
  	ttm_sys_man_init(bdev);

-   ttm_pool_init(>pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+   ttm_pool_init(>pool, dev, dev_to_node(dev), use_dma_alloc, 
use_dma32);
  
  	bdev->vma_manager = vma_manager;

spin_lock_init(>lru_lock);




Re: [PATCH v2] drm/amd/display: add a debugfs interface for the DMUB trace mask

2023-11-13 Thread Aurabindo Pillai




On 11/13/2023 9:56 AM, Hamza Mahfooz wrote:

For features that are implemented primarily in DMUB (e.g. PSR), it is
useful to be able to trace them at a DMUB level from the kernel,
especially when debugging issues. So, introduce a debugfs interface that
is able to read and set the DMUB trace mask dynamically at runtime and
document how to use it.

Cc: Alex Deucher 
Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 
---
v2: only return -ETIMEDOUT for DMUB_STATUS_TIMEOUT
---
  Documentation/gpu/amdgpu/display/dc-debug.rst | 41 
  .../gpu/amdgpu/display/trace-groups-table.csv | 29 ++
  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 97 +++
  .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 40 +++-
  4 files changed, 205 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/gpu/amdgpu/display/trace-groups-table.csv

diff --git a/Documentation/gpu/amdgpu/display/dc-debug.rst 
b/Documentation/gpu/amdgpu/display/dc-debug.rst
index 40c55a618918..817631b1dbf3 100644
--- a/Documentation/gpu/amdgpu/display/dc-debug.rst
+++ b/Documentation/gpu/amdgpu/display/dc-debug.rst
@@ -75,3 +75,44 @@ change in real-time by using something like::
  
  When reporting a bug related to DC, consider attaching this log before and

  after you reproduce the bug.
+
+DMUB Firmware Debug
+===
+
+Sometimes, dmesg logs aren't enough. This is especially true if a feature is
+implemented primarily in DMUB firmware. In such cases, all we see in dmesg when
+an issue arises is some generic timeout error. So, to get more relevant
+information, we can trace DMUB commands by enabling the relevant bits in
+`amdgpu_dm_dmub_trace_mask`.
+
+Currently, we support the tracing of the following groups:
+
+Trace Groups
+
+
+.. csv-table::
+   :header-rows: 1
+   :widths: 1, 1
+   :file: ./trace-groups-table.csv
+
+**Note: Not all ASICs support all of the listed trace groups**
+
+So, to enable just PSR tracing you can use the following command::
+
+  # echo 0x8020 > /sys/kernel/debug/dri/0/amdgpu_dm_dmub_trace_mask
+
+Then, you need to enable logging trace events to the buffer, which you can do
+using the following::
+
+  # echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+
+Lastly, after you are able to reproduce the issue you are trying to debug,
+you can disable tracing and read the trace log by using the following::
+
+  # echo 0 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+  # cat /sys/kernel/debug/dri/0/amdgpu_dm_dmub_tracebuffer
+
+So, when reporting bugs related to features such as PSR and ABM, consider
+enabling the relevant bits in the mask before reproducing the issue and
+attach the log that you obtain from the trace buffer in any bug reports that 
you
+create.
diff --git a/Documentation/gpu/amdgpu/display/trace-groups-table.csv 
b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
new file mode 100644
index ..3f6a50d1d883
--- /dev/null
+++ b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
@@ -0,0 +1,29 @@
+Name, Mask Value
+INFO, 0x1
+IRQ SVC, 0x2
+VBIOS, 0x4
+REGISTER, 0x8
+PHY DBG, 0x10
+PSR, 0x20
+AUX, 0x40
+SMU, 0x80
+MALL, 0x100
+ABM, 0x200
+ALPM, 0x400
+TIMER, 0x800
+HW LOCK MGR, 0x1000
+INBOX1, 0x2000
+PHY SEQ, 0x4000
+PSR STATE, 0x8000
+ZSTATE, 0x1
+TRANSMITTER CTL, 0x2
+PANEL CNTL, 0x4
+FAMS, 0x8
+DPIA, 0x10
+SUBVP, 0x20
+INBOX0, 0x40
+SDP, 0x400
+REPLAY, 0x800
+REPLAY RESIDENCY, 0x2000
+CURSOR INFO, 0x8000
+IPS, 0x1
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 45c972f2630d..67dea56cf583 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -2971,6 +2971,100 @@ static int allow_edp_hotplug_detection_set(void *data, 
u64 val)
return 0;
  }
  
+static int dmub_trace_mask_set(void *data, u64 val)

+{
+   struct amdgpu_device *adev = data;
+   struct dmub_srv *srv = adev->dm.dc->ctx->dmub_srv->dmub;
+   enum dmub_gpint_command cmd;
+   enum dmub_status status;
+   u64 mask = 0x;
+   u8 shift = 0;
+   u32 res;
+   int i;
+
+   if (!srv->fw_version)
+   return -EINVAL;
+
+   for (i = 0;  i < 4; i++) {
+   res = (val & mask) >> shift;
+
+   switch (i) {
+   case 0:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD0;
+   break;
+   case 1:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD1;
+   break;
+   case 2:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD2;
+   break;
+   case 3:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD3;
+   break;
+   }
+
+   status = 

Re: [PATCH v2] drm/amd/display: add a debugfs interface for the DMUB trace mask

2023-11-13 Thread Christian König

Am 13.11.23 um 15:56 schrieb Hamza Mahfooz:

For features that are implemented primarily in DMUB (e.g. PSR), it is
useful to be able to trace them at a DMUB level from the kernel,
especially when debugging issues. So, introduce a debugfs interface that
is able to read and set the DMUB trace mask dynamically at runtime and
document how to use it.

Cc: Alex Deucher 
Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 


Acked-by: Christian König 


---
v2: only return -ETIMEDOUT for DMUB_STATUS_TIMEOUT
---
  Documentation/gpu/amdgpu/display/dc-debug.rst | 41 
  .../gpu/amdgpu/display/trace-groups-table.csv | 29 ++
  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 97 +++
  .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 40 +++-
  4 files changed, 205 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/gpu/amdgpu/display/trace-groups-table.csv

diff --git a/Documentation/gpu/amdgpu/display/dc-debug.rst 
b/Documentation/gpu/amdgpu/display/dc-debug.rst
index 40c55a618918..817631b1dbf3 100644
--- a/Documentation/gpu/amdgpu/display/dc-debug.rst
+++ b/Documentation/gpu/amdgpu/display/dc-debug.rst
@@ -75,3 +75,44 @@ change in real-time by using something like::
  
  When reporting a bug related to DC, consider attaching this log before and

  after you reproduce the bug.
+
+DMUB Firmware Debug
+===
+
+Sometimes, dmesg logs aren't enough. This is especially true if a feature is
+implemented primarily in DMUB firmware. In such cases, all we see in dmesg when
+an issue arises is some generic timeout error. So, to get more relevant
+information, we can trace DMUB commands by enabling the relevant bits in
+`amdgpu_dm_dmub_trace_mask`.
+
+Currently, we support the tracing of the following groups:
+
+Trace Groups
+
+
+.. csv-table::
+   :header-rows: 1
+   :widths: 1, 1
+   :file: ./trace-groups-table.csv
+
+**Note: Not all ASICs support all of the listed trace groups**
+
+So, to enable just PSR tracing you can use the following command::
+
+  # echo 0x8020 > /sys/kernel/debug/dri/0/amdgpu_dm_dmub_trace_mask
+
+Then, you need to enable logging trace events to the buffer, which you can do
+using the following::
+
+  # echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+
+Lastly, after you are able to reproduce the issue you are trying to debug,
+you can disable tracing and read the trace log by using the following::
+
+  # echo 0 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+  # cat /sys/kernel/debug/dri/0/amdgpu_dm_dmub_tracebuffer
+
+So, when reporting bugs related to features such as PSR and ABM, consider
+enabling the relevant bits in the mask before reproducing the issue and
+attach the log that you obtain from the trace buffer in any bug reports that 
you
+create.
diff --git a/Documentation/gpu/amdgpu/display/trace-groups-table.csv 
b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
new file mode 100644
index ..3f6a50d1d883
--- /dev/null
+++ b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
@@ -0,0 +1,29 @@
+Name, Mask Value
+INFO, 0x1
+IRQ SVC, 0x2
+VBIOS, 0x4
+REGISTER, 0x8
+PHY DBG, 0x10
+PSR, 0x20
+AUX, 0x40
+SMU, 0x80
+MALL, 0x100
+ABM, 0x200
+ALPM, 0x400
+TIMER, 0x800
+HW LOCK MGR, 0x1000
+INBOX1, 0x2000
+PHY SEQ, 0x4000
+PSR STATE, 0x8000
+ZSTATE, 0x1
+TRANSMITTER CTL, 0x2
+PANEL CNTL, 0x4
+FAMS, 0x8
+DPIA, 0x10
+SUBVP, 0x20
+INBOX0, 0x40
+SDP, 0x400
+REPLAY, 0x800
+REPLAY RESIDENCY, 0x2000
+CURSOR INFO, 0x8000
+IPS, 0x1
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 45c972f2630d..67dea56cf583 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -2971,6 +2971,100 @@ static int allow_edp_hotplug_detection_set(void *data, 
u64 val)
return 0;
  }
  
+static int dmub_trace_mask_set(void *data, u64 val)

+{
+   struct amdgpu_device *adev = data;
+   struct dmub_srv *srv = adev->dm.dc->ctx->dmub_srv->dmub;
+   enum dmub_gpint_command cmd;
+   enum dmub_status status;
+   u64 mask = 0x;
+   u8 shift = 0;
+   u32 res;
+   int i;
+
+   if (!srv->fw_version)
+   return -EINVAL;
+
+   for (i = 0;  i < 4; i++) {
+   res = (val & mask) >> shift;
+
+   switch (i) {
+   case 0:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD0;
+   break;
+   case 1:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD1;
+   break;
+   case 2:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD2;
+   break;
+   case 3:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD3;
+   break;
+   }
+
+   

Re: [PATCH 22/24] drm/amdkfd: add pc sampling release when process release

2023-11-13 Thread James Zhu


On 2023-11-13 10:19, Yat Sin, David wrote:


[AMD Official Use Only - General]


*From:* Zhu, James 
*Sent:* Monday, November 13, 2023 10:13 AM
*To:* Yat Sin, David ; Zhu, James 
; amd-gfx@lists.freedesktop.org
*Cc:* Kuehling, Felix ; Greathouse, Joseph 

*Subject:* Re: [PATCH 22/24] drm/amdkfd: add pc sampling release when 
process release


On 2023-11-10 14:08, Yat Sin, David wrote:

[AMD Official Use Only - General]

-Original Message-

From: Zhu, James  

Sent: Friday, November 3, 2023 9:12 AM

To:amd-gfx@lists.freedesktop.org

Cc: Kuehling, Felix  
; Greathouse, Joseph

  ; Yat Sin, 
David  ; Zhu,

James  

Subject: [PATCH 22/24] drm/amdkfd: add pc sampling release when process

release

Add pc sampling release when process release, it will force to stop all 
activate

sessions with this process.

Signed-off-by: James Zhu  

---

  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 26

  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h |

1 +

  drivers/gpu/drm/amd/amdkfd/kfd_process.c |  3 +++

  3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

index 66670cdb813a..00d8d3f400a9 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

@@ -274,6 +274,32 @@ static int kfd_pc_sample_destroy(struct

kfd_process_device *pdd, uint32_t trace_

   return 0;

  }

+void kfd_pc_sample_release(struct kfd_process_device *pdd) {

+ struct pc_sampling_entry *pcs_entry;

+ struct idr *idp;

+ uint32_t id;

+

+ if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {

+ pr_err("PC Sampling does not support sched_policy %i",

sched_policy);

+ return;

+ }

You do not need to check the sched_policy here, already checked in 
kfd_ioctl_pc_sample(..) , so cannot have a hosttrap session if policy is NO_HWS.

  [JZ]kfd_pc_sample_release is not called from kfd_ioctl_pc_sample. It 
is in process quit process.


[David] I know. But you cannot have a PC sampling session during 
process clean-up when policy=NO_HWS because the session creation would 
have been blocked on session-create.



[JZ] good point.


+

+ /* force to release all PC sampling task for this process */

+ idp = >dev->pcs_data.hosttrap_entry.base.pc_sampling_idr;

+ mutex_lock(>dev->pcs_data.mutex);

+ idr_for_each_entry(idp, pcs_entry, id) {

+ if (pcs_entry->pdd != pdd)

+ continue;

+ mutex_unlock(>dev->pcs_data.mutex);

Can we not release the mutex here and just tell the worker thread to exit 
by setting the stop_enable bit.

I find we have a lot of places where we are acquiring/releasing the mutex 
within loops and this results in a

lot of extra states that we have to account for during the 
start/stop/destroy calls.

+ if (pcs_entry->enabled)

+ kfd_pc_sample_stop(pdd, pcs_entry);

+ kfd_pc_sample_destroy(pdd, id, pcs_entry);

+ mutex_lock(>dev->pcs_data.mutex);

+ }

+ mutex_unlock(>dev->pcs_data.mutex);

+}

+

  int kfd_pc_sample(struct kfd_process_device *pdd,

   struct kfd_ioctl_pc_sample_args 
__user

*args)  { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

index cb93909e6bd3..4ea064fdaa98 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

@@ -30,6 +30,7 @@

  int kfd_pc_sample(struct kfd_process_device *pdd,

   struct kfd_ioctl_pc_sample_args 
__user

*args);

+void kfd_pc_sample_release(struct kfd_process_device *pdd);

  void kfd_pc_sample_handler(struct work_struct *work);

  #endif /* KFD_PC_SAMPLING_H_ */

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c

b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index d22d804f180d..54f3db7eaae2 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

@@ -43,6 +43,7 @@ struct mm_struct;

  #include "kfd_svm.h"

  #include "kfd_smi_events.h"

  

Re: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

2023-11-13 Thread James Zhu


On 2023-11-10 14:07, Yat Sin, David wrote:

[AMD Official Use Only - General]


-Original Message-
From: Zhu, James
Sent: Friday, November 3, 2023 9:12 AM
To:amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Greathouse, Joseph
; Yat Sin, David; Zhu,
James
Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop

Enable pc sampling stop.

Signed-off-by: James Zhu
---
  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +--
-
  drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 ++
  2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index 33d003ca0093..2c4ac5b4cc4b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
@@ -108,10 +108,32 @@ static int kfd_pc_sample_start(struct
kfd_process_device *pdd,
   return 0;
  }

-static int kfd_pc_sample_stop(struct kfd_process_device *pdd)
+static int kfd_pc_sample_stop(struct kfd_process_device *pdd,
+ struct pc_sampling_entry *pcs_entry)
  {
- return -EINVAL;
+ bool pc_sampling_stop = false;
+
+ pcs_entry->enabled = false;
+ mutex_lock(>dev->pcs_data.mutex);

For the START/STOP/DESTROY ioctls, I think we can hold pdd->dev->pcs_data.mutex 
through the whole IOCTL. Then we would not have to deal with the intermediate states 
where the START/STOP/DESTROY are happening at the same time.
[JZ] pdd->dev->pcs_data.mutex is per device, not per process. In the 
future, also it will share protection within different pc sampling 
methods on the same devices. So I don't think a bigger lock here is good 
idea.



+ pdd->dev->pcs_data.hosttrap_entry.base.active_count--;
+ if (!pdd->dev->pcs_data.hosttrap_entry.base.active_count) {
+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, true);

+ pc_sampling_stop = true;
+ }
+ mutex_unlock(>dev->pcs_data.mutex);

+ if (pc_sampling_stop) {
+ kfd_process_set_trap_pc_sampling_flag(>qpd,
+ pdd->dev-

pcs_data.hosttrap_entry.base.pc_sample_info.method,

+false);
+
+ mutex_lock(>dev->pcs_data.mutex);
+ pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0;
+ pdd->dev->pcs_data.hosttrap_entry.base.target_wave_slot = 0;
+ WRITE_ONCE(pdd->dev-

pcs_data.hosttrap_entry.base.stop_enable, false);

+ mutex_unlock(>dev->pcs_data.mutex);
+ }
+
+ return 0;
  }

  static int kfd_pc_sample_create(struct kfd_process_device *pdd, @@ -251,7
+273,7 @@ int kfd_pc_sample(struct kfd_process_device *pdd,
   if (!pcs_entry->enabled)
   return -EALREADY;
   else
- return kfd_pc_sample_stop(pdd);
+ return kfd_pc_sample_stop(pdd, pcs_entry);
   }

   return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 613910e0d440..badaa4d68cc4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -259,6 +259,8 @@ struct kfd_dev;
  struct kfd_dev_pc_sampling_data {
   uint32_t use_count; /* Num of PC sampling sessions */
   uint32_t active_count;  /* Num of active sessions */
+ uint32_t target_simd;   /* target simd for trap */
+ uint32_t target_wave_slot;  /* target wave slot for trap */
   bool stop_enable;   /* pc sampling stop in process */
   struct idr pc_sampling_idr;
   struct kfd_pc_sample_info pc_sample_info;
--
2.25.1

RE: [PATCH 22/24] drm/amdkfd: add pc sampling release when process release

2023-11-13 Thread Yat Sin, David
[AMD Official Use Only - General]



From: Zhu, James 
Sent: Monday, November 13, 2023 10:13 AM
To: Yat Sin, David ; Zhu, James ; 
amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Greathouse, Joseph 

Subject: Re: [PATCH 22/24] drm/amdkfd: add pc sampling release when process 
release



On 2023-11-10 14:08, Yat Sin, David wrote:

[AMD Official Use Only - General]



-Original Message-

From: Zhu, James 

Sent: Friday, November 3, 2023 9:12 AM

To: amd-gfx@lists.freedesktop.org

Cc: Kuehling, Felix ; 
Greathouse, Joseph

; Yat Sin, David 
; Zhu,

James 

Subject: [PATCH 22/24] drm/amdkfd: add pc sampling release when process

release



Add pc sampling release when process release, it will force to stop all activate

sessions with this process.



Signed-off-by: James Zhu 

---

 drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 26

  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h |

1 +

 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  3 +++

 3 files changed, 30 insertions(+)



diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

index 66670cdb813a..00d8d3f400a9 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c

@@ -274,6 +274,32 @@ static int kfd_pc_sample_destroy(struct

kfd_process_device *pdd, uint32_t trace_

  return 0;

 }



+void kfd_pc_sample_release(struct kfd_process_device *pdd) {

+ struct pc_sampling_entry *pcs_entry;

+ struct idr *idp;

+ uint32_t id;

+

+ if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {

+ pr_err("PC Sampling does not support sched_policy %i",

sched_policy);

+ return;

+ }

You do not need to check the sched_policy here, already checked in 
kfd_ioctl_pc_sample(..) , so cannot have a hosttrap session if policy is NO_HWS.
  [JZ]kfd_pc_sample_release is not called from kfd_ioctl_pc_sample. It is in 
process quit process.
[David] I know. But you cannot have a PC sampling session during process 
clean-up when policy=NO_HWS because the session creation would have been 
blocked on session-create.

+

+ /* force to release all PC sampling task for this process */

+ idp = >dev->pcs_data.hosttrap_entry.base.pc_sampling_idr;

+ mutex_lock(>dev->pcs_data.mutex);

+ idr_for_each_entry(idp, pcs_entry, id) {

+ if (pcs_entry->pdd != pdd)

+ continue;

+ mutex_unlock(>dev->pcs_data.mutex);

Can we not release the mutex here and just tell the worker thread to exit by 
setting the stop_enable bit.

I find we have a lot of places where we are acquiring/releasing the mutex 
within loops and this results in a

lot of extra states that we have to account for during the start/stop/destroy 
calls.

+ if (pcs_entry->enabled)

+ kfd_pc_sample_stop(pdd, pcs_entry);

+ kfd_pc_sample_destroy(pdd, id, pcs_entry);

+ mutex_lock(>dev->pcs_data.mutex);

+ }

+ mutex_unlock(>dev->pcs_data.mutex);

+}

+

 int kfd_pc_sample(struct kfd_process_device *pdd,

  struct kfd_ioctl_pc_sample_args __user

*args)  { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

index cb93909e6bd3..4ea064fdaa98 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h

@@ -30,6 +30,7 @@



 int kfd_pc_sample(struct kfd_process_device *pdd,

  struct kfd_ioctl_pc_sample_args __user

*args);

+void kfd_pc_sample_release(struct kfd_process_device *pdd);

 void kfd_pc_sample_handler(struct work_struct *work);



 #endif /* KFD_PC_SAMPLING_H_ */

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c

b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index d22d804f180d..54f3db7eaae2 100644

--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c

+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

@@ -43,6 +43,7 @@ struct mm_struct;

 #include "kfd_svm.h"

 #include "kfd_smi_events.h"

 #include "kfd_debug.h"

+#include "kfd_pc_sampling.h"



 /*

  * List of struct kfd_process (field kfd_process).

@@ -1020,6 +1021,8 @@ static void kfd_process_destroy_pdds(struct

kfd_process *p)

  pr_debug("Releasing pdd (topology id %d) for process (pasid

0x%x)\n",

  pdd->dev->id, p->pasid);



+ kfd_pc_sample_release(pdd);

+

  kfd_process_device_destroy_cwsr_dgpu(pdd);

  kfd_process_device_destroy_ib_mem(pdd);



--

2.25.1




Re: [PATCH 22/24] drm/amdkfd: add pc sampling release when process release

2023-11-13 Thread James Zhu


On 2023-11-10 14:08, Yat Sin, David wrote:

[AMD Official Use Only - General]


-Original Message-
From: Zhu, James
Sent: Friday, November 3, 2023 9:12 AM
To:amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Greathouse, Joseph
; Yat Sin, David; Zhu,
James
Subject: [PATCH 22/24] drm/amdkfd: add pc sampling release when process
release

Add pc sampling release when process release, it will force to stop all activate
sessions with this process.

Signed-off-by: James Zhu
---
  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 26
  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h |
1 +
  drivers/gpu/drm/amd/amdkfd/kfd_process.c |  3 +++
  3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index 66670cdb813a..00d8d3f400a9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
@@ -274,6 +274,32 @@ static int kfd_pc_sample_destroy(struct
kfd_process_device *pdd, uint32_t trace_
   return 0;
  }

+void kfd_pc_sample_release(struct kfd_process_device *pdd) {
+ struct pc_sampling_entry *pcs_entry;
+ struct idr *idp;
+ uint32_t id;
+
+ if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+ pr_err("PC Sampling does not support sched_policy %i",
sched_policy);
+ return;
+ }

You do not need to check the sched_policy here, already checked in 
kfd_ioctl_pc_sample(..) , so cannot have a hosttrap session if policy is NO_HWS.
[JZ]kfd_pc_sample_release is not called from kfd_ioctl_pc_sample. It is 
in process quit process.

+
+ /* force to release all PC sampling task for this process */
+ idp = >dev->pcs_data.hosttrap_entry.base.pc_sampling_idr;
+ mutex_lock(>dev->pcs_data.mutex);
+ idr_for_each_entry(idp, pcs_entry, id) {
+ if (pcs_entry->pdd != pdd)
+ continue;
+ mutex_unlock(>dev->pcs_data.mutex);

Can we not release the mutex here and just tell the worker thread to exit by 
setting the stop_enable bit.
I find we have a lot of places where we are acquiring/releasing the mutex 
within loops and this results in a
lot of extra states that we have to account for during the start/stop/destroy 
calls.

+ if (pcs_entry->enabled)
+ kfd_pc_sample_stop(pdd, pcs_entry);
+ kfd_pc_sample_destroy(pdd, id, pcs_entry);
+ mutex_lock(>dev->pcs_data.mutex);
+ }
+ mutex_unlock(>dev->pcs_data.mutex);
+}
+
  int kfd_pc_sample(struct kfd_process_device *pdd,
   struct kfd_ioctl_pc_sample_args __user
*args)  { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h
index cb93909e6bd3..4ea064fdaa98 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h
@@ -30,6 +30,7 @@

  int kfd_pc_sample(struct kfd_process_device *pdd,
   struct kfd_ioctl_pc_sample_args __user
*args);
+void kfd_pc_sample_release(struct kfd_process_device *pdd);
  void kfd_pc_sample_handler(struct work_struct *work);

  #endif /* KFD_PC_SAMPLING_H_ */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d22d804f180d..54f3db7eaae2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -43,6 +43,7 @@ struct mm_struct;
  #include "kfd_svm.h"
  #include "kfd_smi_events.h"
  #include "kfd_debug.h"
+#include "kfd_pc_sampling.h"

  /*
   * List of struct kfd_process (field kfd_process).
@@ -1020,6 +1021,8 @@ static void kfd_process_destroy_pdds(struct
kfd_process *p)
   pr_debug("Releasing pdd (topology id %d) for process (pasid
0x%x)\n",
   pdd->dev->id, p->pasid);

+ kfd_pc_sample_release(pdd);
+
   kfd_process_device_destroy_cwsr_dgpu(pdd);
   kfd_process_device_destroy_ib_mem(pdd);

--
2.25.1

RE: [PATCH v2] drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV (v2)

2023-11-13 Thread Dhume, Samir
[AMD Official Use Only - General]

Reviewed-by: Samir Dhume 

-Original Message-
From: Lu, Victor Cheng Chi (Victor) 
Sent: Friday, November 10, 2023 5:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Dhume, Samir ; Lu, Victor Cheng Chi (Victor) 

Subject: [PATCH v2] drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under 
SRIOV (v2)

MC_VM_AGP_* registers should not be programmed by guest driver.

v2: move early return outside of loop

Signed-off-by: Victor Lu 
---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
index ea142611be1c..9b0146732e13 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
@@ -130,6 +130,9 @@ static void mmhub_v1_8_init_system_aperture_regs(struct 
amdgpu_device *adev)
uint64_t value;
int i;

+   if (amdgpu_sriov_vf(adev))
+   return;
+
inst_mask = adev->aid_mask;
for_each_inst(i, inst_mask) {
/* Program the AGP BAR */
@@ -139,9 +142,6 @@ static void mmhub_v1_8_init_system_aperture_regs(struct 
amdgpu_device *adev)
WREG32_SOC15(MMHUB, i, regMC_VM_AGP_TOP,
 adev->gmc.agp_end >> 24);

-   if (amdgpu_sriov_vf(adev))
-   return;
-
/* Program the system aperture low logical page number. */
WREG32_SOC15(MMHUB, i, regMC_VM_SYSTEM_APERTURE_LOW_ADDR,
min(adev->gmc.fb_start, adev->gmc.agp_start) >> 18);
--
2.34.1



[PATCH v2] drm/amd/display: add a debugfs interface for the DMUB trace mask

2023-11-13 Thread Hamza Mahfooz
For features that are implemented primarily in DMUB (e.g. PSR), it is
useful to be able to trace them at a DMUB level from the kernel,
especially when debugging issues. So, introduce a debugfs interface that
is able to read and set the DMUB trace mask dynamically at runtime and
document how to use it.

Cc: Alex Deucher 
Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 
---
v2: only return -ETIMEDOUT for DMUB_STATUS_TIMEOUT
---
 Documentation/gpu/amdgpu/display/dc-debug.rst | 41 
 .../gpu/amdgpu/display/trace-groups-table.csv | 29 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 97 +++
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 40 +++-
 4 files changed, 205 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/gpu/amdgpu/display/trace-groups-table.csv

diff --git a/Documentation/gpu/amdgpu/display/dc-debug.rst 
b/Documentation/gpu/amdgpu/display/dc-debug.rst
index 40c55a618918..817631b1dbf3 100644
--- a/Documentation/gpu/amdgpu/display/dc-debug.rst
+++ b/Documentation/gpu/amdgpu/display/dc-debug.rst
@@ -75,3 +75,44 @@ change in real-time by using something like::
 
 When reporting a bug related to DC, consider attaching this log before and
 after you reproduce the bug.
+
+DMUB Firmware Debug
+===
+
+Sometimes, dmesg logs aren't enough. This is especially true if a feature is
+implemented primarily in DMUB firmware. In such cases, all we see in dmesg when
+an issue arises is some generic timeout error. So, to get more relevant
+information, we can trace DMUB commands by enabling the relevant bits in
+`amdgpu_dm_dmub_trace_mask`.
+
+Currently, we support the tracing of the following groups:
+
+Trace Groups
+
+
+.. csv-table::
+   :header-rows: 1
+   :widths: 1, 1
+   :file: ./trace-groups-table.csv
+
+**Note: Not all ASICs support all of the listed trace groups**
+
+So, to enable just PSR tracing you can use the following command::
+
+  # echo 0x8020 > /sys/kernel/debug/dri/0/amdgpu_dm_dmub_trace_mask
+
+Then, you need to enable logging trace events to the buffer, which you can do
+using the following::
+
+  # echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+
+Lastly, after you are able to reproduce the issue you are trying to debug,
+you can disable tracing and read the trace log by using the following::
+
+  # echo 0 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
+  # cat /sys/kernel/debug/dri/0/amdgpu_dm_dmub_tracebuffer
+
+So, when reporting bugs related to features such as PSR and ABM, consider
+enabling the relevant bits in the mask before reproducing the issue and
+attach the log that you obtain from the trace buffer in any bug reports that 
you
+create.
diff --git a/Documentation/gpu/amdgpu/display/trace-groups-table.csv 
b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
new file mode 100644
index ..3f6a50d1d883
--- /dev/null
+++ b/Documentation/gpu/amdgpu/display/trace-groups-table.csv
@@ -0,0 +1,29 @@
+Name, Mask Value
+INFO, 0x1
+IRQ SVC, 0x2
+VBIOS, 0x4
+REGISTER, 0x8
+PHY DBG, 0x10
+PSR, 0x20
+AUX, 0x40
+SMU, 0x80
+MALL, 0x100
+ABM, 0x200
+ALPM, 0x400
+TIMER, 0x800
+HW LOCK MGR, 0x1000
+INBOX1, 0x2000
+PHY SEQ, 0x4000
+PSR STATE, 0x8000
+ZSTATE, 0x1
+TRANSMITTER CTL, 0x2
+PANEL CNTL, 0x4
+FAMS, 0x8
+DPIA, 0x10
+SUBVP, 0x20
+INBOX0, 0x40
+SDP, 0x400
+REPLAY, 0x800
+REPLAY RESIDENCY, 0x2000
+CURSOR INFO, 0x8000
+IPS, 0x1
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 45c972f2630d..67dea56cf583 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -2971,6 +2971,100 @@ static int allow_edp_hotplug_detection_set(void *data, 
u64 val)
return 0;
 }
 
+static int dmub_trace_mask_set(void *data, u64 val)
+{
+   struct amdgpu_device *adev = data;
+   struct dmub_srv *srv = adev->dm.dc->ctx->dmub_srv->dmub;
+   enum dmub_gpint_command cmd;
+   enum dmub_status status;
+   u64 mask = 0x;
+   u8 shift = 0;
+   u32 res;
+   int i;
+
+   if (!srv->fw_version)
+   return -EINVAL;
+
+   for (i = 0;  i < 4; i++) {
+   res = (val & mask) >> shift;
+
+   switch (i) {
+   case 0:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD0;
+   break;
+   case 1:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD1;
+   break;
+   case 2:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD2;
+   break;
+   case 3:
+   cmd = DMUB_GPINT__SET_TRACE_BUFFER_MASK_WORD3;
+   break;
+   }
+
+   status = dmub_srv_send_gpint_command(srv, cmd, res, 30);
+
+   if (status 

RE: [PATCH 00/20] DC Patches Nov 08, 2023

2023-11-13 Thread Wheeler, Daniel
[Public]

Hi all,

This week this patchset was tested on the following systems:
* Lenovo ThinkBook T13s Gen4 with AMD Ryzen 5 6600U
* MSI Gaming X Trio RX 6800
* Gigabyte Gaming OC RX 7900 XTX

These systems were tested on the following display/connection types:
* eDP, (1080p 60hz [5650U]) (1920x1200 60hz [6600U]) (2560x1600 
120hz[6600U])
* VGA and DVI (1680x1050 60hz [DP to VGA/DVI, USB-C to VGA/DVI])
* DP/HDMI/USB-C (1440p 170hz, 4k 60hz, 4k 144hz, 4k 240hz [Includes 
USB-C to DP/HDMI adapters])
* Thunderbolt (LG Ultrafine 5k)
* MST (Startech MST14DP123DP [DP to 3x DP] and 2x 4k 60Hz displays)
* DSC (with Cable Matters 101075 [DP to 3x DP] with 3x 4k60 displays, 
and HP Hook G2 with 1 4k60 display)
* USB 4 (Kensington SD5700T and 1x 4k 60Hz display)
* PCON (Club3D CAC-1085 and 1x 4k 144Hz display [at 4k 120HZ, as that 
is the max the adapter supports])

The testing is a mix of automated and manual tests. Manual testing includes 
(but is not limited to):
* Changing display configurations and settings
* Benchmark testing
* Feature testing (Freesync, etc.)

Automated testing includes (but is not limited to):
* Script testing (scripts to automate some of the manual checks)
* IGT testing

The patchset consists of the amd-staging-drm-next branch (Head commit - 
7bb6e298b738 drm/amdkfd: Move TLB flushing logic into amdgpu) with new patches 
added on top of it.

Tested on Ubuntu 22.04.3, on Wayland and X11, using KDE Plasma and Gnome.


Tested-by: Daniel Wheeler 


Thank you,

Dan Wheeler
Sr. Technologist | AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
amd.com

-Original Message-
From: Hung, Alex 
Sent: Wednesday, November 8, 2023 1:44 PM
To: amd-gfx@lists.freedesktop.org
Cc: Wentland, Harry ; Li, Sun peng (Leo) 
; Siqueira, Rodrigo ; Pillai, 
Aurabindo ; Li, Roman ; Lin, Wayne 
; Wang, Chao-kai (Stylon) ; Gutierrez, 
Agustin ; Chung, ChiaHsuan (Tom) 
; Wu, Hersen ; Zuo, Jerry 
; Hung, Alex ; Wheeler, Daniel 

Subject: [PATCH 00/20] DC Patches Nov 08, 2023

This DC patchset brings improvements in multiple areas. In summary, we 
highlight:

* Add missing chips for HDCP
* Add new command to disable replay timing resync
* Fix encoder disable logic
* Enable DSC Flag in MST Mode Validation
* Change the DMCUB mailbox memory location from FB to inbox
* Add disable timeout option
* Negate IPS allow and commit bits
* Enable DCN clock gating for DCN35
* Prefer currently used OTG master when acquiring free pipe
* Try to acquire a free OTG master not used in cur ctx first
* Clear dpcd_sink_ext_caps if not set
* Enable fast plane updates on DCN3.2 and above
* Add null checks for 8K60 lightup
* Refactor resource into component directory
* Fix DSC not Enabled on Direct MST Sink
* Guard against invalid RPTR/WPTR being set
* Enable CM low mem power optimization
* Fix a debugfs null pointer error

Cc: Daniel Wheeler 

Anthony Koo (1):
  drm/amd/display: Add new command to disable replay timing resync

Aric Cyr (1):
  drm/amd/display: Promote DC to 3.2.260

Aurabindo Pillai (1):
  drm/amd/display: Fix a debugfs null pointer error

Daniel Miess (1):
  drm/amd/display: Enable DCN clock gating for DCN35

Duncan Ma (2):
  drm/amd/display: Negate IPS allow and commit bits
  drm/amd/display: Add disable timeout option

Fangzhi Zuo (2):
  drm/amd/display: Fix DSC not Enabled on Direct MST Sink
  drm/amd/display: Enable DSC Flag in MST Mode Validation

Krunoslav Kovac (1):
  drm/amd/display: Send PQ bit in AMD VSIF

Lewis Huang (1):
  drm/amd/display: Change the DMCUB mailbox memory location from FB to
inbox

Mounika Adhuri (1):
  drm/amd/display: Refactor resource into component directory

Muhammad Ahmed (1):
  drm/amd/display: Add null checks for 8K60 lightup

Nicholas Kazlauskas (1):
  drm/amd/display: Guard against invalid RPTR/WPTR being set

Nicholas Susanto (1):
  drm/amd/display: Fix encoder disable logic

Paul Hsieh (1):
  drm/amd/display: Clear dpcd_sink_ext_caps if not set

Rodrigo Siqueira (1):
  drm/amd/display: Add missing chips for HDCP

Tianci Yin (1):
  drm/amd/display: Enable fast plane updates on DCN3.2 and above

Wenjing Liu (2):
  drm/amd/display: Try to acquire a free OTG master not used in cur ctx
first
  drm/amd/display: Prefer currently used OTG master when acquiring free
pipe

Yihan Zhu (1):
  drm/amd/display: Enable CM low mem power optimization

 drivers/gpu/drm/amd/display/Makefile  |   1 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  21 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |   6 +-
 .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  63 +++---
 drivers/gpu/drm/amd/display/dc/Makefile   |   5 +-
 .../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c  |  18 +-
 .../amd/display/dc/clk_mgr/dcn35/dcn35_smu.c  |   3 +
 

Re: [PATCH 2/2] drm/virtio: Modify RESOURCE_GET_LAYOUT ioctl

2023-11-13 Thread Dmitry Osipenko
On 11/10/23 10:16, Julia Zhang wrote:
> Modify RESOURCE_GET_LAYOUT ioctl to handle the use case that query
> correct stride for guest linear resource before it is created.
> 
> Signed-off-by: Julia Zhang 
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h   | 26 --
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 47 --
>  drivers/gpu/drm/virtio/virtgpu_vq.c| 35 +++
>  include/uapi/drm/virtgpu_drm.h |  6 ++--
>  include/uapi/linux/virtio_gpu.h|  8 ++---
>  5 files changed, 66 insertions(+), 56 deletions(-)

1. Please squash this all into a single patch. For upstream kernel it's
not acceptable to have subsequent commits modifying previous commits. To
commit message add your s-o-b, your co-developed-by tags and a brief
comment explaining changes you've done to the original patch.

Signed-off-by: Daniel Stone 
Co-developed-by: Julia Zhang  # query correct
stride for guest linear resource before it's created
Signed-off-by: Julia Zhang 

2. Make sure that patch passes `scripts/checkpatch.pl`

3. Add link to the commit message for the relevant Mesa MR that makes
use of the new ioctl. The MR should be already merged or ready to be merged.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/123456

-- 
Best regards,
Dmitry




RE: [PATCH 1/1] drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini

2023-11-13 Thread Zhang, Hawking
[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Ma, Le 
Sent: Monday, November 13, 2023 18:36
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: [PATCH 1/1] drm/amdgpu: finalizing mem_partitions at the end of GMC v9 
sw_fini

The valid num_mem_partitions is required during ttm pool fini, thus move the 
cleanup at the end of the function.

Signed-off-by: Le Ma 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index bde25eb4ed8e..c1f2f166f064 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -2170,8 +2170,6 @@ static int gmc_v9_0_sw_fini(void *handle)

if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3))
amdgpu_gmc_sysfs_fini(adev);
-   adev->gmc.num_mem_partitions = 0;
-   kfree(adev->gmc.mem_partitions);

amdgpu_gmc_ras_fini(adev);
amdgpu_gem_force_release(adev);
@@ -2185,6 +2183,9 @@ static int gmc_v9_0_sw_fini(void *handle)
amdgpu_bo_free_kernel(>gmc.pdb0_bo, NULL, >gmc.ptr_pdb0);
amdgpu_bo_fini(adev);

+   adev->gmc.num_mem_partitions = 0;
+   kfree(adev->gmc.mem_partitions);
+
return 0;
 }

--
2.38.1



[PATCH 1/1] drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini

2023-11-13 Thread Le Ma
The valid num_mem_partitions is required during ttm pool fini, thus move the
cleanup at the end of the function.

Signed-off-by: Le Ma 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index bde25eb4ed8e..c1f2f166f064 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -2170,8 +2170,6 @@ static int gmc_v9_0_sw_fini(void *handle)
 
if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3))
amdgpu_gmc_sysfs_fini(adev);
-   adev->gmc.num_mem_partitions = 0;
-   kfree(adev->gmc.mem_partitions);
 
amdgpu_gmc_ras_fini(adev);
amdgpu_gem_force_release(adev);
@@ -2185,6 +2183,9 @@ static int gmc_v9_0_sw_fini(void *handle)
amdgpu_bo_free_kernel(>gmc.pdb0_bo, NULL, >gmc.ptr_pdb0);
amdgpu_bo_fini(adev);
 
+   adev->gmc.num_mem_partitions = 0;
+   kfree(adev->gmc.mem_partitions);
+
return 0;
 }
 
-- 
2.38.1



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Simon Ser






On Monday, November 13th, 2023 at 11:15, Pekka Paalanen  
wrote:


> 
> 
> On Mon, 13 Nov 2023 09:44:15 +
> Simon Ser cont...@emersion.fr wrote:
> 
> > On Monday, November 13th, 2023 at 10:38, Pekka Paalanen ppaala...@gmail.com 
> > wrote:
> > 
> > > On Mon, 13 Nov 2023 09:18:39 +
> > > Simon Ser cont...@emersion.fr wrote:
> > > 
> > > > On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr 
> > > > wrote:
> > > > 
> > > > > > > > > > > > > > +An atomic commit with the flag 
> > > > > > > > > > > > > > DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> > > > > > > > > > > > > > +effectively change only the FB_ID property on any 
> > > > > > > > > > > > > > planes. No-operation changes
> > > > > > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > > > > > During the hackfest in Brno, it was mentioned that 
> > > > > > > > > > > > > > a commit which re-sets the same FB_ID could 
> > > > > > > > > > > > > > actually have an effect with VRR: It could trigger 
> > > > > > > > > > > > > > scanout of the next frame before vertical blank has 
> > > > > > > > > > > > > > reached its maximum duration. Some kind of 
> > > > > > > > > > > > > > mechanism is required for this in order to allow 
> > > > > > > > > > > > > > user space to perform low frame rate compensation.
> > > > > > > > > > > > 
> > > > > > > > > > > > Xaver tested this hypothesis in a flipping the same fb 
> > > > > > > > > > > > in a VRR monitor
> > > > > > > > > > > > and it worked as expected, so this shouldn't be a 
> > > > > > > > > > > > concern.
> > > > > > > > > > > > Right, so it must have some effect. It cannot be simply 
> > > > > > > > > > > > ignored like in
> > > > > > > > > > > > the proposed doc wording. Do we special-case re-setting 
> > > > > > > > > > > > the same FB_ID
> > > > > > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > > > > > There's an effect in the refresh rate, the image won't 
> > > > > > > > > > > > change but it
> > > > > > > > > > > > will report that a flip had happened asynchronously so 
> > > > > > > > > > > > the reported
> > > > > > > > > > > > framerate will be increased. Maybe an additional 
> > > > > > > > > > > > wording could be like:
> > > > > > > > > > 
> > > > > > > > > > Flipping to the same FB_ID will result in a immediate flip 
> > > > > > > > > > as if it was
> > > > > > > > > > changing to a different one, with no effect on the image 
> > > > > > > > > > but effecting
> > > > > > > > > > the reported frame rate.
> > > > > > > > > 
> > > > > > > > > Re-setting FB_ID to its current value is a special case 
> > > > > > > > > regardless of
> > > > > > > > > PAGE_FLIP_ASYNC, is it not?
> > > > > > > > 
> > > > > > > > No. The rule has so far been that all side effects are observed
> > > > > > > > even if you flip to the same fb. And that is one of my 
> > > > > > > > annoyances
> > > > > > > > with this proposal. The rules will now be different for async 
> > > > > > > > flips
> > > > > > > > vs. everything else.
> > > > > > > 
> > > > > > > Well with the patches the async page-flip case is exactly the 
> > > > > > > same as
> > > > > > > the non-async page-flip case. In both cases, if a FB_ID is 
> > > > > > > included in
> > > > > > > an atomic commit then the side effects are triggered even if the 
> > > > > > > property
> > > > > > > value didn't change. The rules are the same for everything.
> > > > > > 
> > > > > > I see it only checking if FB_ID changes or not. If it doesn't
> > > > > > change then the implication is that the side effects will in
> > > > > > fact be skipped as not all planes may even support async flips.
> > > > > 
> > > > > Hm right. So the problem is that setting any prop = same value as
> > > > > previous one will result in a new page-flip for asynchronous 
> > > > > page-flips,
> > > > > but will not result in any side-effect for asynchronous page-flips.
> > > > > 
> > > > > Does it actually matter though? For async page-flips, I don't think 
> > > > > this
> > > > > would result in any actual difference in behavior?
> > > 
> > > Hi Simon,
> > > 
> > > a fly-by question...
> > > 
> > > > To sum this up, here is a matrix of behavior as seen by user-space:
> > > > 
> > > > - Sync atomic page-flip
> > > > - Set FB_ID to different value: programs hw for page-flip, sends uevent
> > > > - Set FB_ID to same value: same (important for VRR)
> > > > - Set another plane prop to same value: same
> > > > - Set another plane prop to different value: maybe rejected if modeset 
> > > > required
> > > > - Async atomic page-flip
> > > > - Set FB_ID to different value: updates hw with new FB address, sends
> > > > immediate uevent
> > > > - Set FB_ID to same value: same (no-op for the hw)
> > > 
> > > It should not be a no-op for the hw, because the hw might be in the
> > > middle of a VRR front-porch waiting period, and the commit needs to end
> > > the waiting immediately rather than time out?
> > 
> > I'm not sure
> 
> 

Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Pekka Paalanen
On Mon, 13 Nov 2023 09:44:15 +
Simon Ser  wrote:

> On Monday, November 13th, 2023 at 10:38, Pekka Paalanen  
> wrote:
> 
> > On Mon, 13 Nov 2023 09:18:39 +
> > Simon Ser cont...@emersion.fr wrote:
> >   
> > > On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr 
> > > wrote:
> > >   
> > > > > > > > > > > > > +An atomic commit with the flag 
> > > > > > > > > > > > > DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> > > > > > > > > > > > > +effectively change only the FB_ID property on any 
> > > > > > > > > > > > > planes. No-operation changes
> > > > > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > > > > During the hackfest in Brno, it was mentioned that a 
> > > > > > > > > > > > > commit which re-sets the same FB_ID could actually 
> > > > > > > > > > > > > have an effect with VRR: It could trigger scanout of 
> > > > > > > > > > > > > the next frame before vertical blank has reached its 
> > > > > > > > > > > > > maximum duration. Some kind of mechanism is required 
> > > > > > > > > > > > > for this in order to allow user space to perform low 
> > > > > > > > > > > > > frame rate compensation.  
> > > > > > > > > > > 
> > > > > > > > > > > Xaver tested this hypothesis in a flipping the same fb in 
> > > > > > > > > > > a VRR monitor
> > > > > > > > > > > and it worked as expected, so this shouldn't be a concern.
> > > > > > > > > > > Right, so it must have some effect. It cannot be simply 
> > > > > > > > > > > ignored like in
> > > > > > > > > > > the proposed doc wording. Do we special-case re-setting 
> > > > > > > > > > > the same FB_ID
> > > > > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > > > > There's an effect in the refresh rate, the image won't 
> > > > > > > > > > > change but it
> > > > > > > > > > > will report that a flip had happened asynchronously so 
> > > > > > > > > > > the reported
> > > > > > > > > > > framerate will be increased. Maybe an additional wording 
> > > > > > > > > > > could be like:  
> > > > > > > > > 
> > > > > > > > > Flipping to the same FB_ID will result in a immediate flip as 
> > > > > > > > > if it was
> > > > > > > > > changing to a different one, with no effect on the image but 
> > > > > > > > > effecting
> > > > > > > > > the reported frame rate.  
> > > > > > > > 
> > > > > > > > Re-setting FB_ID to its current value is a special case 
> > > > > > > > regardless of
> > > > > > > > PAGE_FLIP_ASYNC, is it not?  
> > > > > > > 
> > > > > > > No. The rule has so far been that all side effects are observed
> > > > > > > even if you flip to the same fb. And that is one of my annoyances
> > > > > > > with this proposal. The rules will now be different for async 
> > > > > > > flips
> > > > > > > vs. everything else.  
> > > > > > 
> > > > > > Well with the patches the async page-flip case is exactly the same 
> > > > > > as
> > > > > > the non-async page-flip case. In both cases, if a FB_ID is included 
> > > > > > in
> > > > > > an atomic commit then the side effects are triggered even if the 
> > > > > > property
> > > > > > value didn't change. The rules are the same for everything.  
> > > > > 
> > > > > I see it only checking if FB_ID changes or not. If it doesn't
> > > > > change then the implication is that the side effects will in
> > > > > fact be skipped as not all planes may even support async flips.  
> > > > 
> > > > Hm right. So the problem is that setting any prop = same value as
> > > > previous one will result in a new page-flip for asynchronous page-flips,
> > > > but will not result in any side-effect for asynchronous page-flips.
> > > > 
> > > > Does it actually matter though? For async page-flips, I don't think this
> > > > would result in any actual difference in behavior?  
> > 
> > 
> > Hi Simon,
> > 
> > a fly-by question...
> >   
> > > To sum this up, here is a matrix of behavior as seen by user-space:
> > > 
> > > - Sync atomic page-flip
> > > - Set FB_ID to different value: programs hw for page-flip, sends uevent
> > > - Set FB_ID to same value: same (important for VRR)
> > > - Set another plane prop to same value: same
> > > - Set another plane prop to different value: maybe rejected if modeset 
> > > required
> > > - Async atomic page-flip
> > > - Set FB_ID to different value: updates hw with new FB address, sends
> > > immediate uevent
> > > - Set FB_ID to same value: same (no-op for the hw)  
> > 
> > It should not be a no-op for the hw, because the hw might be in the
> > middle of a VRR front-porch waiting period, and the commit needs to end
> > the waiting immediately rather than time out?  
> 
> I'm not sure 

Would people not want to use async commits to trigger new VRR scanout
cycles without content updates?

I seem to recall previous comments that switching between sync and
async commit modes may take a moment (intel's one last sync flip), so
using sync once in a while then using async otherwise is probably not a
good idea.

> > > - Set another plane 

Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Michel Dänzer
On 11/13/23 10:47, Simon Ser wrote:
> On Monday, November 13th, 2023 at 10:41, Michel Dänzer 
>  wrote:
> 
>> On 11/13/23 10:18, Simon Ser wrote:
>>
>>> On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr wrote:
>>>
> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is 
> allowed to
> +effectively change only the FB_ID property on any planes. 
> No-operation changes
> +are ignored as always. [...]
> During the hackfest in Brno, it was mentioned that a commit which 
> re-sets the same FB_ID could actually have an effect with VRR: It 
> could trigger scanout of the next frame before vertical blank has 
> reached its maximum duration. Some kind of mechanism is required 
> for this in order to allow user space to perform low frame rate 
> compensation.
>>>
>>> Xaver tested this hypothesis in a flipping the same fb in a VRR 
>>> monitor
>>> and it worked as expected, so this shouldn't be a concern.
>>> Right, so it must have some effect. It cannot be simply ignored 
>>> like in
>>> the proposed doc wording. Do we special-case re-setting the same 
>>> FB_ID
>>> as "not a no-op" or "not ignored" or some other way?
>>> There's an effect in the refresh rate, the image won't change but it
>>> will report that a flip had happened asynchronously so the reported
>>> framerate will be increased. Maybe an additional wording could be 
>>> like:
>
> Flipping to the same FB_ID will result in a immediate flip as if it 
> was
> changing to a different one, with no effect on the image but effecting
> the reported frame rate.

 Re-setting FB_ID to its current value is a special case regardless of
 PAGE_FLIP_ASYNC, is it not?
>>>
>>> No. The rule has so far been that all side effects are observed
>>> even if you flip to the same fb. And that is one of my annoyances
>>> with this proposal. The rules will now be different for async flips
>>> vs. everything else.
>>
>> Well with the patches the async page-flip case is exactly the same as
>> the non-async page-flip case. In both cases, if a FB_ID is included in
>> an atomic commit then the side effects are triggered even if the property
>> value didn't change. The rules are the same for everything.
>
> I see it only checking if FB_ID changes or not. If it doesn't
> change then the implication is that the side effects will in
> fact be skipped as not all planes may even support async flips.

 Hm right. So the problem is that setting any prop = same value as
 previous one will result in a new page-flip for asynchronous page-flips,
 but will not result in any side-effect for asynchronous page-flips.

 Does it actually matter though? For async page-flips, I don't think this
 would result in any actual difference in behavior?
>>>
>>> To sum this up, here is a matrix of behavior as seen by user-space:
>>>
>>> - Sync atomic page-flip
>>> - Set FB_ID to different value: programs hw for page-flip, sends uevent
>>> - Set FB_ID to same value: same (important for VRR)
>>> - Set another plane prop to same value: same
>>
>> A page flip is programmed even if FB_ID isn't touched?
> 
> I believe so. Set CRTC_X on a plane to the same value as before, and the
> CRTC gets implicitly included in the atomic commit?
> 
>>> - Set another plane prop to different value: maybe rejected if modeset 
>>> required
>>> - Async atomic page-flip
>>> - Set FB_ID to different value: updates hw with new FB address, sends
>>> immediate uevent
>>> - Set FB_ID to same value: same (no-op for the hw)
>>
>> No-op implies it doesn't trigger scanning out a frame with VRR, if
>> scanout is currently in vertical blank. Is that the case? If so, async
>> flips can't reliably trigger scanning out a frame with VRR.
> 
> By no-op I mean that the hw is programmed for an immediate async flip
> with the same buffer addr as the previous one. So this doesn't actually
> change anything.

If a flip is programmed to the HW, it's not a no-op any more than in the sync 
case, in particular not with VRR.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v2] drm/amdgpu: Address member 'ring' not described in 'amdgpu_ vce, uvd_entity_init()'

2023-11-13 Thread Christian König

Am 13.11.23 um 06:31 schrieb Srinivasan Shanmugam:

Fixes the following:

drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:237: warning: Function parameter or 
member 'ring' not described in 'amdgpu_vce_entity_init'
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:405: warning: Function parameter or 
member 'ring' not described in 'amdgpu_uvd_entity_init'

Cc: Christian König 
Cc: Alex Deucher 
Cc: "Pan, Xinhui" 
Signed-off-by: Srinivasan Shanmugam 


Reviewed-by: Christian König 


---

Updated ring variable description to "amdgpu_ring pointer to check" (Alex)

  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 1 +
  2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 65949cc7abb9..07d930339b07 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -398,6 +398,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
   * amdgpu_uvd_entity_init - init entity
   *
   * @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring pointer to check
   *
   * Initialize the entity used for handle management in the kernel driver.
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 0954447f689d..59acf424a078 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -230,6 +230,7 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
   * amdgpu_vce_entity_init - init entity
   *
   * @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring pointer to check
   *
   * Initialize the entity used for handle management in the kernel driver.
   */




Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Simon Ser
On Monday, November 13th, 2023 at 10:41, Michel Dänzer 
 wrote:

> On 11/13/23 10:18, Simon Ser wrote:
> 
> > On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr wrote:
> > 
> > > > > > > > > > > > +An atomic commit with the flag 
> > > > > > > > > > > > DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> > > > > > > > > > > > +effectively change only the FB_ID property on any 
> > > > > > > > > > > > planes. No-operation changes
> > > > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > > > During the hackfest in Brno, it was mentioned that a 
> > > > > > > > > > > > commit which re-sets the same FB_ID could actually have 
> > > > > > > > > > > > an effect with VRR: It could trigger scanout of the 
> > > > > > > > > > > > next frame before vertical blank has reached its 
> > > > > > > > > > > > maximum duration. Some kind of mechanism is required 
> > > > > > > > > > > > for this in order to allow user space to perform low 
> > > > > > > > > > > > frame rate compensation.
> > > > > > > > > > 
> > > > > > > > > > Xaver tested this hypothesis in a flipping the same fb in a 
> > > > > > > > > > VRR monitor
> > > > > > > > > > and it worked as expected, so this shouldn't be a concern.
> > > > > > > > > > Right, so it must have some effect. It cannot be simply 
> > > > > > > > > > ignored like in
> > > > > > > > > > the proposed doc wording. Do we special-case re-setting the 
> > > > > > > > > > same FB_ID
> > > > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > > > There's an effect in the refresh rate, the image won't 
> > > > > > > > > > change but it
> > > > > > > > > > will report that a flip had happened asynchronously so the 
> > > > > > > > > > reported
> > > > > > > > > > framerate will be increased. Maybe an additional wording 
> > > > > > > > > > could be like:
> > > > > > > > 
> > > > > > > > Flipping to the same FB_ID will result in a immediate flip as 
> > > > > > > > if it was
> > > > > > > > changing to a different one, with no effect on the image but 
> > > > > > > > effecting
> > > > > > > > the reported frame rate.
> > > > > > > 
> > > > > > > Re-setting FB_ID to its current value is a special case 
> > > > > > > regardless of
> > > > > > > PAGE_FLIP_ASYNC, is it not?
> > > > > > 
> > > > > > No. The rule has so far been that all side effects are observed
> > > > > > even if you flip to the same fb. And that is one of my annoyances
> > > > > > with this proposal. The rules will now be different for async flips
> > > > > > vs. everything else.
> > > > > 
> > > > > Well with the patches the async page-flip case is exactly the same as
> > > > > the non-async page-flip case. In both cases, if a FB_ID is included in
> > > > > an atomic commit then the side effects are triggered even if the 
> > > > > property
> > > > > value didn't change. The rules are the same for everything.
> > > > 
> > > > I see it only checking if FB_ID changes or not. If it doesn't
> > > > change then the implication is that the side effects will in
> > > > fact be skipped as not all planes may even support async flips.
> > > 
> > > Hm right. So the problem is that setting any prop = same value as
> > > previous one will result in a new page-flip for asynchronous page-flips,
> > > but will not result in any side-effect for asynchronous page-flips.
> > > 
> > > Does it actually matter though? For async page-flips, I don't think this
> > > would result in any actual difference in behavior?
> > 
> > To sum this up, here is a matrix of behavior as seen by user-space:
> > 
> > - Sync atomic page-flip
> > - Set FB_ID to different value: programs hw for page-flip, sends uevent
> > - Set FB_ID to same value: same (important for VRR)
> > - Set another plane prop to same value: same
> 
> A page flip is programmed even if FB_ID isn't touched?

I believe so. Set CRTC_X on a plane to the same value as before, and the
CRTC gets implicitly included in the atomic commit?

> > - Set another plane prop to different value: maybe rejected if modeset 
> > required
> > - Async atomic page-flip
> > - Set FB_ID to different value: updates hw with new FB address, sends
> > immediate uevent
> > - Set FB_ID to same value: same (no-op for the hw)
> 
> No-op implies it doesn't trigger scanning out a frame with VRR, if
> scanout is currently in vertical blank. Is that the case? If so, async
> flips can't reliably trigger scanning out a frame with VRR.

By no-op I mean that the hw is programmed for an immediate async flip
with the same buffer addr as the previous one. So this doesn't actually
change anything.


Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Simon Ser
On Monday, November 13th, 2023 at 10:38, Pekka Paalanen  
wrote:

> On Mon, 13 Nov 2023 09:18:39 +
> Simon Ser cont...@emersion.fr wrote:
> 
> > On Monday, October 23rd, 2023 at 10:25, Simon Ser cont...@emersion.fr wrote:
> > 
> > > > > > > > > > > > +An atomic commit with the flag 
> > > > > > > > > > > > DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> > > > > > > > > > > > +effectively change only the FB_ID property on any 
> > > > > > > > > > > > planes. No-operation changes
> > > > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > > > During the hackfest in Brno, it was mentioned that a 
> > > > > > > > > > > > commit which re-sets the same FB_ID could actually have 
> > > > > > > > > > > > an effect with VRR: It could trigger scanout of the 
> > > > > > > > > > > > next frame before vertical blank has reached its 
> > > > > > > > > > > > maximum duration. Some kind of mechanism is required 
> > > > > > > > > > > > for this in order to allow user space to perform low 
> > > > > > > > > > > > frame rate compensation.
> > > > > > > > > > 
> > > > > > > > > > Xaver tested this hypothesis in a flipping the same fb in a 
> > > > > > > > > > VRR monitor
> > > > > > > > > > and it worked as expected, so this shouldn't be a concern.
> > > > > > > > > > Right, so it must have some effect. It cannot be simply 
> > > > > > > > > > ignored like in
> > > > > > > > > > the proposed doc wording. Do we special-case re-setting the 
> > > > > > > > > > same FB_ID
> > > > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > > > There's an effect in the refresh rate, the image won't 
> > > > > > > > > > change but it
> > > > > > > > > > will report that a flip had happened asynchronously so the 
> > > > > > > > > > reported
> > > > > > > > > > framerate will be increased. Maybe an additional wording 
> > > > > > > > > > could be like:
> > > > > > > > 
> > > > > > > > Flipping to the same FB_ID will result in a immediate flip as 
> > > > > > > > if it was
> > > > > > > > changing to a different one, with no effect on the image but 
> > > > > > > > effecting
> > > > > > > > the reported frame rate.
> > > > > > > 
> > > > > > > Re-setting FB_ID to its current value is a special case 
> > > > > > > regardless of
> > > > > > > PAGE_FLIP_ASYNC, is it not?
> > > > > > 
> > > > > > No. The rule has so far been that all side effects are observed
> > > > > > even if you flip to the same fb. And that is one of my annoyances
> > > > > > with this proposal. The rules will now be different for async flips
> > > > > > vs. everything else.
> > > > > 
> > > > > Well with the patches the async page-flip case is exactly the same as
> > > > > the non-async page-flip case. In both cases, if a FB_ID is included in
> > > > > an atomic commit then the side effects are triggered even if the 
> > > > > property
> > > > > value didn't change. The rules are the same for everything.
> > > > 
> > > > I see it only checking if FB_ID changes or not. If it doesn't
> > > > change then the implication is that the side effects will in
> > > > fact be skipped as not all planes may even support async flips.
> > > 
> > > Hm right. So the problem is that setting any prop = same value as
> > > previous one will result in a new page-flip for asynchronous page-flips,
> > > but will not result in any side-effect for asynchronous page-flips.
> > > 
> > > Does it actually matter though? For async page-flips, I don't think this
> > > would result in any actual difference in behavior?
> 
> 
> Hi Simon,
> 
> a fly-by question...
> 
> > To sum this up, here is a matrix of behavior as seen by user-space:
> > 
> > - Sync atomic page-flip
> > - Set FB_ID to different value: programs hw for page-flip, sends uevent
> > - Set FB_ID to same value: same (important for VRR)
> > - Set another plane prop to same value: same
> > - Set another plane prop to different value: maybe rejected if modeset 
> > required
> > - Async atomic page-flip
> > - Set FB_ID to different value: updates hw with new FB address, sends
> > immediate uevent
> > - Set FB_ID to same value: same (no-op for the hw)
> 
> It should not be a no-op for the hw, because the hw might be in the
> middle of a VRR front-porch waiting period, and the commit needs to end
> the waiting immediately rather than time out?

I'm not sure 

> > - Set another plane prop to same value: ignored, sends immediate uevent
> > (special codepath)
> 
> If the sync case says "same", then shouldn't this say "same" as well to
> be consistent?

Okay, I think I chose my words badly. By "same" I meant "same as
previous item in the list".

Here I tried to be more explicit and explain why it's the same behavior.
We have a special path in the kernel code that ignores the change, but
the effective result is that it doesn't differ from the sync case.

Here's a fixed matrix where I don't use confusing words:

- Sync atomic page-flip
  - Set FB_ID to different value: programs hw for page-flip, sends uevent
  - 

Re: [PATCH] drm/amdgpu: Address member 'ring' not described in 'amdgpu_ vce, uvd_entity_init()'

2023-11-13 Thread Christian König

Am 12.11.23 um 05:25 schrieb Srinivasan Shanmugam:

Fixes the following:

drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:237: warning: Function parameter or 
member 'ring' not described in 'amdgpu_vce_entity_init'
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:405: warning: Function parameter or 
member 'ring' not described in 'amdgpu_uvd_entity_init'

Cc: Christian König 
Cc: Alex Deucher 
Cc: "Pan, Xinhui" 
Signed-off-by: Srinivasan Shanmugam 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 1 +
  2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 65949cc7abb9..354317c9e47a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -398,6 +398,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
   * amdgpu_uvd_entity_init - init entity
   *
   * @adev: amdgpu_device pointer
+ * @ring: UVD scheduling entity on ring


Better use something like "hardware ring the entity might want to use".

Christian.


   *
   * Initialize the entity used for handle management in the kernel driver.
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 0954447f689d..578c5c90448f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -230,6 +230,7 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
   * amdgpu_vce_entity_init - init entity
   *
   * @adev: amdgpu_device pointer
+ * @ring: VCE scheduling entity on ring
   *
   * Initialize the entity used for handle management in the kernel driver.
   */




Re: [PATCH] drm/ttm: set max_active to recommened default

2023-11-13 Thread Christian König

Am 11.11.23 um 14:11 schrieb Rajneesh Bhardwaj:

To maximize per cpu execution context for the work items, use the
recommended settings i.e. WQ_DFL_ACTIVE(256). There is no apparent
reason to throttle to 16 while process tear down.


Well big NAK to this. During process tear down it can be that hundreds 
of BOs are released at the same time.


We really don't want to start a kernel thread for each of them just to 
wait for it to be idle.


Regards,
Christian.



Signed-off-by: Rajneesh Bhardwaj 
---
  drivers/gpu/drm/ttm/ttm_device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index bc97e3dd40f0..5443c0f19213 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -205,7 +205,7 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,
return ret;
  
  	bdev->wq = alloc_workqueue("ttm",

-  WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 
16);
+  WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
if (!bdev->wq) {
ttm_global_release();
return -ENOMEM;




Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Michel Dänzer
On 11/13/23 10:18, Simon Ser wrote:
> On Monday, October 23rd, 2023 at 10:25, Simon Ser  wrote:
> 
>>> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed 
>>> to
>>> +effectively change only the FB_ID property on any planes. 
>>> No-operation changes
>>> +are ignored as always. [...]
>>> During the hackfest in Brno, it was mentioned that a commit which 
>>> re-sets the same FB_ID could actually have an effect with VRR: It 
>>> could trigger scanout of the next frame before vertical blank has 
>>> reached its maximum duration. Some kind of mechanism is required 
>>> for this in order to allow user space to perform low frame rate 
>>> compensation.
>
> Xaver tested this hypothesis in a flipping the same fb in a VRR 
> monitor
> and it worked as expected, so this shouldn't be a concern.
> Right, so it must have some effect. It cannot be simply ignored like 
> in
> the proposed doc wording. Do we special-case re-setting the same FB_ID
> as "not a no-op" or "not ignored" or some other way?
> There's an effect in the refresh rate, the image won't change but it
> will report that a flip had happened asynchronously so the reported
> framerate will be increased. Maybe an additional wording could be 
> like:
>>>
>>> Flipping to the same FB_ID will result in a immediate flip as if it was
>>> changing to a different one, with no effect on the image but effecting
>>> the reported frame rate.
>>
>> Re-setting FB_ID to its current value is a special case regardless of
>> PAGE_FLIP_ASYNC, is it not?
>
> No. The rule has so far been that all side effects are observed
> even if you flip to the same fb. And that is one of my annoyances
> with this proposal. The rules will now be different for async flips
> vs. everything else.

 Well with the patches the async page-flip case is exactly the same as
 the non-async page-flip case. In both cases, if a FB_ID is included in
 an atomic commit then the side effects are triggered even if the property
 value didn't change. The rules are the same for everything.
>>>
>>> I see it only checking if FB_ID changes or not. If it doesn't
>>> change then the implication is that the side effects will in
>>> fact be skipped as not all planes may even support async flips.
>>
>> Hm right. So the problem is that setting any prop = same value as
>> previous one will result in a new page-flip for asynchronous page-flips,
>> but will not result in any side-effect for asynchronous page-flips.
>>
>> Does it actually matter though? For async page-flips, I don't think this
>> would result in any actual difference in behavior?
> 
> To sum this up, here is a matrix of behavior as seen by user-space:
> 
> - Sync atomic page-flip
>   - Set FB_ID to different value: programs hw for page-flip, sends uevent
>   - Set FB_ID to same value: same (important for VRR)
>   - Set another plane prop to same value: same

A page flip is programmed even if FB_ID isn't touched?


>   - Set another plane prop to different value: maybe rejected if modeset 
> required
> - Async atomic page-flip
>   - Set FB_ID to different value: updates hw with new FB address, sends
> immediate uevent
>   - Set FB_ID to same value: same (no-op for the hw)

No-op implies it doesn't trigger scanning out a frame with VRR, if scanout is 
currently in vertical blank. Is that the case? If so, async flips can't 
reliably trigger scanning out a frame with VRR.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Pekka Paalanen
On Mon, 13 Nov 2023 09:18:39 +
Simon Ser  wrote:

> On Monday, October 23rd, 2023 at 10:25, Simon Ser  wrote:
> 
> > > > > > > > > > > +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC 
> > > > > > > > > > > is allowed to
> > > > > > > > > > > +effectively change only the FB_ID property on any 
> > > > > > > > > > > planes. No-operation changes
> > > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > > During the hackfest in Brno, it was mentioned that a 
> > > > > > > > > > > commit which re-sets the same FB_ID could actually have 
> > > > > > > > > > > an effect with VRR: It could trigger scanout of the next 
> > > > > > > > > > > frame before vertical blank has reached its maximum 
> > > > > > > > > > > duration. Some kind of mechanism is required for this in 
> > > > > > > > > > > order to allow user space to perform low frame rate 
> > > > > > > > > > > compensation.  
> > > > > > > > > 
> > > > > > > > > Xaver tested this hypothesis in a flipping the same fb in a 
> > > > > > > > > VRR monitor
> > > > > > > > > and it worked as expected, so this shouldn't be a concern.
> > > > > > > > > Right, so it must have some effect. It cannot be simply 
> > > > > > > > > ignored like in
> > > > > > > > > the proposed doc wording. Do we special-case re-setting the 
> > > > > > > > > same FB_ID
> > > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > > There's an effect in the refresh rate, the image won't change 
> > > > > > > > > but it
> > > > > > > > > will report that a flip had happened asynchronously so the 
> > > > > > > > > reported
> > > > > > > > > framerate will be increased. Maybe an additional wording 
> > > > > > > > > could be like:  
> > > > > > > 
> > > > > > > Flipping to the same FB_ID will result in a immediate flip as if 
> > > > > > > it was
> > > > > > > changing to a different one, with no effect on the image but 
> > > > > > > effecting
> > > > > > > the reported frame rate.  
> > > > > > 
> > > > > > Re-setting FB_ID to its current value is a special case regardless 
> > > > > > of
> > > > > > PAGE_FLIP_ASYNC, is it not?  
> > > > > 
> > > > > No. The rule has so far been that all side effects are observed
> > > > > even if you flip to the same fb. And that is one of my annoyances
> > > > > with this proposal. The rules will now be different for async flips
> > > > > vs. everything else.  
> > > > 
> > > > Well with the patches the async page-flip case is exactly the same as
> > > > the non-async page-flip case. In both cases, if a FB_ID is included in
> > > > an atomic commit then the side effects are triggered even if the 
> > > > property
> > > > value didn't change. The rules are the same for everything.  
> > > 
> > > I see it only checking if FB_ID changes or not. If it doesn't
> > > change then the implication is that the side effects will in
> > > fact be skipped as not all planes may even support async flips.  
> > 
> > Hm right. So the problem is that setting any prop = same value as
> > previous one will result in a new page-flip for asynchronous page-flips,
> > but will not result in any side-effect for asynchronous page-flips.
> > 
> > Does it actually matter though? For async page-flips, I don't think this
> > would result in any actual difference in behavior?  

Hi Simon,

a fly-by question...

> To sum this up, here is a matrix of behavior as seen by user-space:
> 
> - Sync atomic page-flip
>   - Set FB_ID to different value: programs hw for page-flip, sends uevent
>   - Set FB_ID to same value: same (important for VRR)
>   - Set another plane prop to same value: same
>   - Set another plane prop to different value: maybe rejected if modeset 
> required
> - Async atomic page-flip
>   - Set FB_ID to different value: updates hw with new FB address, sends
> immediate uevent
>   - Set FB_ID to same value: same (no-op for the hw)

It should not be a no-op for the hw, because the hw might be in the
middle of a VRR front-porch waiting period, and the commit needs to end
the waiting immediately rather than time out?

>   - Set another plane prop to same value: ignored, sends immediate uevent
> (special codepath)

If the sync case says "same", then shouldn't this say "same" as well to
be consistent?

>   - Set another plane prop to different value: always rejected
> 
> To me sync and async look consistent.


Thanks,
pq


pgp046Y78ir6V.pgp
Description: OpenPGP digital signature


Re: [PATCH v6 6/6] drm/doc: Define KMS atomic state set

2023-11-13 Thread Simon Ser
On Monday, October 23rd, 2023 at 10:25, Simon Ser  wrote:

> > > > > > > > > > +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is 
> > > > > > > > > > allowed to
> > > > > > > > > > +effectively change only the FB_ID property on any planes. 
> > > > > > > > > > No-operation changes
> > > > > > > > > > +are ignored as always. [...]
> > > > > > > > > > During the hackfest in Brno, it was mentioned that a commit 
> > > > > > > > > > which re-sets the same FB_ID could actually have an effect 
> > > > > > > > > > with VRR: It could trigger scanout of the next frame before 
> > > > > > > > > > vertical blank has reached its maximum duration. Some kind 
> > > > > > > > > > of mechanism is required for this in order to allow user 
> > > > > > > > > > space to perform low frame rate compensation.
> > > > > > > > 
> > > > > > > > Xaver tested this hypothesis in a flipping the same fb in a VRR 
> > > > > > > > monitor
> > > > > > > > and it worked as expected, so this shouldn't be a concern.
> > > > > > > > Right, so it must have some effect. It cannot be simply ignored 
> > > > > > > > like in
> > > > > > > > the proposed doc wording. Do we special-case re-setting the 
> > > > > > > > same FB_ID
> > > > > > > > as "not a no-op" or "not ignored" or some other way?
> > > > > > > > There's an effect in the refresh rate, the image won't change 
> > > > > > > > but it
> > > > > > > > will report that a flip had happened asynchronously so the 
> > > > > > > > reported
> > > > > > > > framerate will be increased. Maybe an additional wording could 
> > > > > > > > be like:
> > > > > > 
> > > > > > Flipping to the same FB_ID will result in a immediate flip as if it 
> > > > > > was
> > > > > > changing to a different one, with no effect on the image but 
> > > > > > effecting
> > > > > > the reported frame rate.
> > > > > 
> > > > > Re-setting FB_ID to its current value is a special case regardless of
> > > > > PAGE_FLIP_ASYNC, is it not?
> > > > 
> > > > No. The rule has so far been that all side effects are observed
> > > > even if you flip to the same fb. And that is one of my annoyances
> > > > with this proposal. The rules will now be different for async flips
> > > > vs. everything else.
> > > 
> > > Well with the patches the async page-flip case is exactly the same as
> > > the non-async page-flip case. In both cases, if a FB_ID is included in
> > > an atomic commit then the side effects are triggered even if the property
> > > value didn't change. The rules are the same for everything.
> > 
> > I see it only checking if FB_ID changes or not. If it doesn't
> > change then the implication is that the side effects will in
> > fact be skipped as not all planes may even support async flips.
> 
> Hm right. So the problem is that setting any prop = same value as
> previous one will result in a new page-flip for asynchronous page-flips,
> but will not result in any side-effect for asynchronous page-flips.
> 
> Does it actually matter though? For async page-flips, I don't think this
> would result in any actual difference in behavior?

To sum this up, here is a matrix of behavior as seen by user-space:

- Sync atomic page-flip
  - Set FB_ID to different value: programs hw for page-flip, sends uevent
  - Set FB_ID to same value: same (important for VRR)
  - Set another plane prop to same value: same
  - Set another plane prop to different value: maybe rejected if modeset 
required
- Async atomic page-flip
  - Set FB_ID to different value: updates hw with new FB address, sends
immediate uevent
  - Set FB_ID to same value: same (no-op for the hw)
  - Set another plane prop to same value: ignored, sends immediate uevent
(special codepath)
  - Set another plane prop to different value: always rejected

To me sync and async look consistent.


RE: [PATCH 2/2] drm/amdgpu: add amdgpu runpm usage trace for separate funcs

2023-11-13 Thread Liang, Prike
[Public]

> -Original Message-
> From: Deucher, Alexander 
> Sent: Friday, November 10, 2023 5:49 AM
> To: Liang, Prike ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 2/2] drm/amdgpu: add amdgpu runpm usage trace for
> separate funcs
>
> [Public]
>
> > -Original Message-
> > From: Liang, Prike 
> > Sent: Thursday, November 9, 2023 2:37 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander ; Liang, Prike
> > 
> > Subject: [PATCH 2/2] drm/amdgpu: add amdgpu runpm usage trace for
> > separate funcs
> >
> > Add trace for amdgpu runpm separate funcs usage and this will help
> > debugging on the case of runpm usage missed to dereference.
> > In the normal case the runpm usage count referred by one kind of
> > functionality pairwise and usage should be changed from 1 to 0,
> > otherwise there will be an issue in the amdgpu runpm usage dereference.
> >
> > Signed-off-by: Prike Liang 
>
> Looks good.  Not sure if you want to add tracepoints to the other call sites 
> as
> well.  These are probably the trickiest however.
>
> Acked-by: Alex Deucher 
>
[Prike] Thanks for the review, now the trace points only added to the amdgpu 
functions which are referencing and dereferencing amdgpu runpm usage separately 
and from the checking that seems no more separate functions need the trace 
point at the moment.

> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  4 
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c   |  7 ++-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   | 15 +++
> >  3 files changed, 25 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > index e7e87a3b2601..decbbe3d4f06 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > @@ -42,6 +42,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include "amdgpu_trace.h"
> >
> >  /**
> >   * amdgpu_dma_buf_attach - _buf_ops.attach implementation @@
> -
> > 63,6 +64,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf
> *dmabuf,
> >   attach->peer2peer = false;
> >
> >   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(1, __func__);
> >   if (r < 0)
> >   goto out;
> >
> > @@ -70,6 +72,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf
> > *dmabuf,
> >
> >  out:
> >   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(0, __func__);
> >   return r;
> >  }
> >
> > @@ -90,6 +93,7 @@ static void amdgpu_dma_buf_detach(struct dma_buf
> > *dmabuf,
> >
> >   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> >   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(0, __func__);
> >  }
> >
> >  /**
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > index 709a2c1b9d63..1026a9fa0c0f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > @@ -183,6 +183,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring,
> > struct dma_fence **f, struct amd
> >   amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
> >  seq, flags | AMDGPU_FENCE_FLAG_INT);
> >   pm_runtime_get_noresume(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(1, __func__);
> >   ptr = >fence_drv.fences[seq & ring-
> > >fence_drv.num_fences_mask];
> >   if (unlikely(rcu_dereference_protected(*ptr, 1))) {
> >   struct dma_fence *old;
> > @@ -286,8 +287,11 @@ bool amdgpu_fence_process(struct amdgpu_ring
> > *ring)
> >   seq != ring->fence_drv.sync_seq)
> >   amdgpu_fence_schedule_fallback(ring);
> >
> > - if (unlikely(seq == last_seq))
> > + if (unlikely(seq == last_seq)) {
> > + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(0, __func__);
> >   return false;
> > + }
> >
> >   last_seq &= drv->num_fences_mask;
> >   seq &= drv->num_fences_mask;
> > @@ -310,6 +314,7 @@ bool amdgpu_fence_process(struct amdgpu_ring
> > *ring)
> >   dma_fence_put(fence);
> >   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> >   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> > + trace_amdgpu_runpm_reference_dumps(0, __func__);
> >   } while (last_seq != seq);
> >
> >   return true;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> > index 2fd1bfb35916..5d4792645540 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
> > @@ -554,6 +554,21 @@ TRACE_EVENT(amdgpu_reset_reg_dumps,
> > __entry->value)
> >  );
> >
> > 

Re: Radeon regression in 6.6 kernel

2023-11-13 Thread Phillip Susi
Bagas Sanjaya  writes:

> Please show the full bisect log, and also tell why these commits are
> skipped.

Two of them would not compile and one would not boot.

Here's the log.

# bad: [4bbdb725a36b0d235f3b832bd0c1e885f0442d9f] Merge tag 
'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
# good: [94f6f0550c625fab1f373bb86a6669b45e9748b3] Linux 6.6-rc5
git bisect start 'HEAD' 'v6.6-rc5'
# good: [8bc9e6515183935fa0cccaf67455c439afe4982b] Merge tag 
'devicetree-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect good 8bc9e6515183935fa0cccaf67455c439afe4982b
# bad: [431f1051884e38d2a5751e4731d69b2ff289ee56] Merge tag 'leds-next-6.7' of 
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
git bisect bad 431f1051884e38d2a5751e4731d69b2ff289ee56
# bad: [0364249d2073c32c5214f02866999ce940bc35a2] Merge tag 
'for-6.7/dm-changes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
git bisect bad 0364249d2073c32c5214f02866999ce940bc35a2
# good: [27442758e9b4e083bef3f164a1739475c01f3202] Merge tag 
'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux 
into drm-next
git bisect good 27442758e9b4e083bef3f164a1739475c01f3202
# bad: [631808095a82e6b6f8410a95f8b12b8d0d38b161] Merge tag 
'amd-drm-next-6.7-2023-10-27' of https://gitlab.freedesktop.org/agd5f/linux 
into drm-next
git bisect bad 631808095a82e6b6f8410a95f8b12b8d0d38b161
# skip: [0ecf4aa32b7896b9160688bdbd20153dc06a50fb] Merge tag 
'amd-drm-next-6.7-2023-10-20' of https://gitlab.freedesktop.org/agd5f/linux 
into drm-next
git bisect skip 0ecf4aa32b7896b9160688bdbd20153dc06a50fb
# good: [74ce0f3873821f12391bcf5469d81583d34f4c6c] accel/ivpu: Fix verbose 
version of REG_POLL macros
git bisect good 74ce0f3873821f12391bcf5469d81583d34f4c6c
# skip: [3f5ba636d6987ddffeaa056dea1c524da63912f3] Merge tag 
'drm-msm-next-2023-10-17' of https://gitlab.freedesktop.org/drm/msm into 
drm-next
git bisect skip 3f5ba636d6987ddffeaa056dea1c524da63912f3
# good: [a3cd664e7f971b0f33acb3ba790c142669cd34e5] accel/ivpu: Print IPC type 
string instead of number
git bisect good a3cd664e7f971b0f33acb3ba790c142669cd34e5
# good: [b0b0d811eac6b4c52cb9ad632fa6384cf48869e7] drm/mediatek: Fix coverity 
issue with unintentional integer overflow
git bisect good b0b0d811eac6b4c52cb9ad632fa6384cf48869e7
# good: [808b43fa7e56e94563b86af2703ba88ee156e3c2] drm/i915/dp_mst: Set 
connector DSC capabilities and decompression AUX
git bisect good 808b43fa7e56e94563b86af2703ba88ee156e3c2
# skip: [11ae5eb516b656e8a0e4efbea90ea24c152a346d] Merge tag 
'topic/vmemdup-user-array-2023-10-24-1' of 
git://anongit.freedesktop.org/drm/drm into drm-next
git bisect skip 11ae5eb516b656e8a0e4efbea90ea24c152a346d
# good: [a67f7a0b18c09d5b62eafb6d5c2f54e6f6ea6cf1] drm/amd/display: Update SDP 
VSC colorimetry from DP test automation request
git bisect good a67f7a0b18c09d5b62eafb6d5c2f54e6f6ea6cf1
# good: [dd3dd9829bf9a4ecd55482050745efdd9f7f97fc] drm/amdgpu: Remove unused 
variables from amdgpu_show_fdinfo
git bisect good dd3dd9829bf9a4ecd55482050745efdd9f7f97fc
# good: [b1abb484417ec8edd68df0c9bf8cb1c1fc035fd2] drm/ci: force-enable 
CONFIG_MSM_MMCC_8996 as built-in
git bisect good b1abb484417ec8edd68df0c9bf8cb1c1fc035fd2
# good: [5fa8f128462c5b3b20576b12286dca7fe95b3af1] drm/ci: increase i915 job 
timeout to 1h30m
git bisect good 5fa8f128462c5b3b20576b12286dca7fe95b3af1
# good: [3ddba96b0d7e714dee4db5aed4f7d413be43b4ba] MAINTAINERS: drm/ci: add 
entries for xfail files
git bisect good 3ddba96b0d7e714dee4db5aed4f7d413be43b4ba
# bad: [b70438004a14f4d0f9890b3297cd66248728546c] drm/amdgpu: move buffer funcs 
setting up a level
git bisect bad b70438004a14f4d0f9890b3297cd66248728546c
# skip: [56e449603f0ac580700621a356d35d5716a62ce5] drm/sched: Convert the GPU 
scheduler to variable number of run-queues
git bisect skip 56e449603f0ac580700621a356d35d5716a62ce5
# skip: [c07bf1636f0005f9eb7956404490672286ea59d3] MAINTAINERS: Update the GPU 
Scheduler email
git bisect skip c07bf1636f0005f9eb7956404490672286ea59d3
# only skipped commits left to test
# possible first bad commit: [b70438004a14f4d0f9890b3297cd66248728546c] 
drm/amdgpu: move buffer funcs setting up a level
# possible first bad commit: [c07bf1636f0005f9eb7956404490672286ea59d3] 
MAINTAINERS: Update the GPU Scheduler email
# possible first bad commit: [56e449603f0ac580700621a356d35d5716a62ce5] 
drm/sched: Convert the GPU scheduler to variable number of run-queues