[PATCH] drm/amdkfd: To flush tlb for MMHUB of GFX9 series

2022-06-20 Thread Ji, Ruili
From: Ruili Ji 

amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305)
amdgpu: in page starting at address 0x7ff990003000 from IH client 0x12 (VMC)
amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051
amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
amdgpu: MORE_FAULTS: 0x1
amdgpu: WALKER_ERROR: 0x0
amdgpu: PERMISSION_FAULTS: 0x5
amdgpu: MAPPING_ERROR: 0x0
amdgpu: RW: 0x1

When KFD allocates memory, nothing triggers a TLB flush for MMHUB0,
which results in the MMHUB0 page fault shown above.
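
The individual fields in the log can be recovered from the raw status word
0x00840051. The sketch below assumes a bit layout (MORE_FAULTS at bit 0,
WALKER_ERROR at bits 1-3, PERMISSION_FAULTS at bits 4-7, MAPPING_ERROR at
bit 8, client ID at bits 9-17, RW at bit 18, VMID at bits 20-23) that
reproduces the values printed above; the layout is an assumption, and the
struct and function names are illustrative, not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields (not a driver type). */
struct fault_status {
	unsigned more_faults;
	unsigned walker_error;
	unsigned permission_faults;
	unsigned mapping_error;
	unsigned cid;
	unsigned rw;
	unsigned vmid;
};

/* Extract 'width' bits starting at bit 'shift'. */
static unsigned field(uint32_t v, unsigned shift, unsigned width)
{
	assert(width > 0 && width < 32);
	return (v >> shift) & ((1u << width) - 1u);
}

/* Decode a status word using the assumed bit positions listed above. */
static struct fault_status decode_fault_status(uint32_t status)
{
	struct fault_status f;

	f.more_faults       = field(status, 0, 1);
	f.walker_error      = field(status, 1, 3);
	f.permission_faults = field(status, 4, 4);
	f.mapping_error     = field(status, 8, 1);
	f.cid               = field(status, 9, 9);
	f.rw                = field(status, 18, 1);
	f.vmid              = field(status, 20, 4);
	return f;
}
```

Under that assumed layout, the decoded VMID also matches the vmid:8 reported
on the first line of the log.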

Signed-off-by: Ruili Ji 
Change-Id: I97786f02849dd047703d6e8feff53916b307715c
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 1d0c9762ebfb..12fc822c0a92 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -739,7 +739,8 @@ int amdgpu_amdkfd_flush_gpu_tlb_pasid(struct amdgpu_device 
*adev,
 {
bool all_hub = false;
 
-   if (adev->family == AMDGPU_FAMILY_AI)
+   if (adev->family == AMDGPU_FAMILY_AI
+   || adev->family == AMDGPU_FAMILY_RV)
all_hub = true;
 
return amdgpu_gmc_flush_gpu_tlb_pasid(adev, pasid, flush_type, all_hub);
-- 
2.25.1
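
The effect of the change can be sketched as a small predicate. The enum
values below are placeholders rather than the kernel's AMDGPU_FAMILY_*
constants, and the helper name is hypothetical; the point is only that
RV-family parts now take the same all-hub flush path as AI-family parts.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder family IDs; the real AMDGPU_FAMILY_* values differ. */
enum gpu_family { FAMILY_OTHER, FAMILY_AI, FAMILY_RV };

/* Hypothetical helper mirroring the patched condition. */
static bool needs_all_hub_flush(enum gpu_family family)
{
	return family == FAMILY_AI || family == FAMILY_RV;
}
```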



RE: [PATCH] drm/amdkfd: correct sdma queue number of sdma 6.0.1

2022-06-20 Thread Huang, Tim
Reviewed-by: Tim Huang 

-Original Message-
From: Zhang, Yifan 
Sent: Monday, June 20, 2022 4:42 PM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Ray ; Deucher, Alexander 
; Huang, Tim ; Zhang, Yifan 

Subject: [PATCH] drm/amdkfd: correct sdma queue number of sdma 6.0.1

sdma 6.0.1 has 8 queues instead of 2.

Fixes: 2f68559102cb (drm/amdkfd: add GC 11.0.1 KFD support)
Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index bf4200457772..c8fee0dbfdcb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -75,7 +75,6 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd)
case IP_VERSION(5, 2, 3):/* YELLOW_CARP */
case IP_VERSION(5, 2, 6):/* GC 10.3.6 */
case IP_VERSION(5, 2, 7):/* GC 10.3.7 */
-   case IP_VERSION(6, 0, 1):
kfd->device_info.num_sdma_queues_per_engine = 2;
break;
case IP_VERSION(4, 2, 0):/* VEGA20 */
@@ -90,6 +89,7 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd)
case IP_VERSION(5, 2, 4):/* DIMGREY_CAVEFISH */
case IP_VERSION(5, 2, 5):/* BEIGE_GOBY */
case IP_VERSION(6, 0, 0):
+   case IP_VERSION(6, 0, 1):
case IP_VERSION(6, 0, 2):
kfd->device_info.num_sdma_queues_per_engine = 8;
break;
--
2.35.1
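
To make the regrouping concrete, here is a reduced sketch of the lookup
after the patch. The IP_VERSION packing shown is an assumption about how
the macro encodes major/minor/revision, the function name is hypothetical,
and only a few of the cases from the diff are reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing: major above bit 16, minor in bits 8-15, revision in bits 0-7. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical reduced lookup; returns 0 for versions not listed here. */
static unsigned sdma_queues_per_engine(uint32_t sdma_ip)
{
	switch (sdma_ip) {
	case IP_VERSION(5, 2, 3): /* YELLOW_CARP */
	case IP_VERSION(5, 2, 6):
	case IP_VERSION(5, 2, 7):
		return 2;
	case IP_VERSION(6, 0, 0):
	case IP_VERSION(6, 0, 1): /* moved into this group by the patch */
	case IP_VERSION(6, 0, 2):
		return 8;
	default:
		return 0;
	}
}
```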



RE: [PATCH] drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()

2022-06-20 Thread Chen, Guchun
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Monday, June 20, 2022 10:54 PM
To: Deucher, Alexander 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: fix adev variable used in 
amdgpu_device_gpu_recover()

Ping?

Alex

On Thu, Jun 16, 2022 at 5:12 PM Alex Deucher  wrote:
>
> Use the correct adev variable for the drm_fb_helper in 
> amdgpu_device_gpu_recover().  Noticed by inspection.
>
> Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of 
> setting up AMD own's.")
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2b92281dd0c1..eacecc672a4d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5186,7 +5186,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>  */
> amdgpu_unregister_gpu_instance(tmp_adev);
>
> -   
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
> +   
> + drm_fb_helper_set_suspend_unlocked(adev_to_drm(tmp_adev)->fb_helper, 
> + true);
>
> /* disable ras on ALL IPs */
> if (!need_emergency_restart &&
> --
> 2.35.3
>


RE: [PATCH] amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+

2022-06-20 Thread VURDIGERENATARAJ, CHANDAN
Hi Alex,

I think this was pushed earlier by Harry.
Not sure why it did not get merged.
https://www.spinics.net/lists/stable/msg543116.html has the history.

BR,
Chandan V N

>Applied.  Thanks!
>
>Alex
>
>On Wed, Jun 15, 2022 at 9:21 PM Joshua Ashton  wrote:
>>
>> For DCN20 and above, the code that actually hooks up the provided 
>> input_color_space got lost at some point.
>>
>> Fixes COLOR_ENCODING and COLOR_RANGE doing nothing on DCN20+.
>> Tested using Steam Remote Play Together + gamescope.
>>
>> Signed-off-by: Joshua Ashton 
>> ---
>>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c   | 3 +++
>>  drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c | 3 +++
>>  drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c   | 3 +++
>>  3 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c 
>> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
>> index 970b65efeac1..eaa7032f0f1a 100644
>> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
>> @@ -212,6 +212,9 @@ static void dpp2_cnv_setup (
>> break;
>> }
>>
>> +   /* Set default color space based on format if none is given. */
>> +   color_space = input_color_space ? input_color_space : 
>> + color_space;
>> +
>> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
>> alpha_2bit_lut->lut0);
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
>> alpha_2bit_lut->lut1); diff --git 
>> a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c 
>> b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
>> index 8b6505b7dca8..f50ab961bc17 100644
>> --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
>> @@ -153,6 +153,9 @@ static void dpp201_cnv_setup(
>> break;
>> }
>>
>> +   /* Set default color space based on format if none is given. */
>> +   color_space = input_color_space ? input_color_space : 
>> + color_space;
>> +
>> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
>> alpha_2bit_lut->lut0);
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
>> alpha_2bit_lut->lut1); diff --git 
>> a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c 
>> b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
>> index 9cca59bf2ae0..3c77949b8110 100644
>> --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
>> @@ -294,6 +294,9 @@ void dpp3_cnv_setup (
>> break;
>> }
>>
>> +   /* Set default color space based on format if none is given. */
>> +   color_space = input_color_space ? input_color_space : 
>> + color_space;
>> +
>> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
>> alpha_2bit_lut->lut0);
>> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
>> alpha_2bit_lut->lut1);
>> --
>> 2.36.1
>>


Re: [PATCH 2/2] drm/radeon: Drop CONFIG_BACKLIGHT_CLASS_DEVICE ifdefs

2022-06-20 Thread Alex Deucher
Applied the series.  Thanks,

Alex

On Mon, Jun 20, 2022 at 5:44 AM Hans de Goede  wrote:
>
> The DRM_RADEON Kconfig code contains:
>
> select BACKLIGHT_CLASS_DEVICE
>
> So the condition these ifdefs test for is always true; drop them.
>
> Signed-off-by: Hans de Goede 
> ---
>  drivers/gpu/drm/radeon/atombios_encoders.c  | 14 --
>  drivers/gpu/drm/radeon/radeon_acpi.c|  2 --
>  drivers/gpu/drm/radeon/radeon_legacy_encoders.c | 15 ---
>  drivers/gpu/drm/radeon/radeon_mode.h|  4 
>  4 files changed, 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/atombios_encoders.c 
> b/drivers/gpu/drm/radeon/atombios_encoders.c
> index f82577dc25e8..160a309e1048 100644
> --- a/drivers/gpu/drm/radeon/atombios_encoders.c
> +++ b/drivers/gpu/drm/radeon/atombios_encoders.c
> @@ -143,8 +143,6 @@ atombios_set_backlight_level(struct radeon_encoder 
> *radeon_encoder, u8 level)
> }
>  }
>
> -#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
> defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
> -
>  static u8 radeon_atom_bl_level(struct backlight_device *bd)
>  {
> u8 level;
> @@ -293,18 +291,6 @@ static void radeon_atom_backlight_exit(struct 
> radeon_encoder *radeon_encoder)
> }
>  }
>
> -#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */
> -
> -void radeon_atom_backlight_init(struct radeon_encoder *encoder)
> -{
> -}
> -
> -static void radeon_atom_backlight_exit(struct radeon_encoder *encoder)
> -{
> -}
> -
> -#endif
> -
>  static bool radeon_atom_mode_fixup(struct drm_encoder *encoder,
>const struct drm_display_mode *mode,
>struct drm_display_mode *adjusted_mode)
> diff --git a/drivers/gpu/drm/radeon/radeon_acpi.c 
> b/drivers/gpu/drm/radeon/radeon_acpi.c
> index 1baef7b493de..b603c0b77075 100644
> --- a/drivers/gpu/drm/radeon/radeon_acpi.c
> +++ b/drivers/gpu/drm/radeon/radeon_acpi.c
> @@ -391,7 +391,6 @@ static int radeon_atif_handler(struct radeon_device *rdev,
>
> radeon_set_backlight_level(rdev, enc, 
> req.backlight_level);
>
> -#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
> defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
> if (rdev->is_atom_bios) {
> struct radeon_encoder_atom_dig *dig = 
> enc->enc_priv;
> backlight_force_update(dig->bl_dev,
> @@ -401,7 +400,6 @@ static int radeon_atif_handler(struct radeon_device *rdev,
> backlight_force_update(dig->bl_dev,
>
> BACKLIGHT_UPDATE_HOTKEY);
> }
> -#endif
> }
> }
> if (req.pending & ATIF_DGPU_DISPLAY_EVENT) {
> diff --git a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c 
> b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
> index d2180f5c80fa..1d207c76f53e 100644
> --- a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
> +++ b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
> @@ -320,8 +320,6 @@ radeon_legacy_set_backlight_level(struct radeon_encoder 
> *radeon_encoder, u8 leve
> radeon_legacy_lvds_update(&radeon_encoder->base, dpms_mode);
>  }
>
> -#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
> defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
> -
>  static uint8_t radeon_legacy_lvds_level(struct backlight_device *bd)
>  {
> struct radeon_backlight_privdata *pdata = bl_get_data(bd);
> @@ -495,19 +493,6 @@ static void radeon_legacy_backlight_exit(struct 
> radeon_encoder *radeon_encoder)
> }
>  }
>
> -#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */
> -
> -void radeon_legacy_backlight_init(struct radeon_encoder *encoder)
> -{
> -}
> -
> -static void radeon_legacy_backlight_exit(struct radeon_encoder *encoder)
> -{
> -}
> -
> -#endif
> -
> -
>  static void radeon_lvds_enc_destroy(struct drm_encoder *encoder)
>  {
> struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
> diff --git a/drivers/gpu/drm/radeon/radeon_mode.h 
> b/drivers/gpu/drm/radeon/radeon_mode.h
> index 3485e7f142e9..b34cffc162e2 100644
> --- a/drivers/gpu/drm/radeon/radeon_mode.h
> +++ b/drivers/gpu/drm/radeon/radeon_mode.h
> @@ -281,15 +281,11 @@ struct radeon_mode_info {
>
>  #define RADEON_MAX_BL_LEVEL 0xFF
>
> -#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
> defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
> -
>  struct radeon_backlight_privdata {
> struct radeon_encoder *encoder;
> uint8_t negative;
>  };
>
> -#endif
> -
>  #define MAX_H_CODE_TIMING_LEN 32
>  #define MAX_V_CODE_TIMING_LEN 32
>
> --
> 2.36.0
>


[PATCH] Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"

2022-06-20 Thread Alex Deucher
This reverts commit 92020e81ddbeac351ea4a19bcf01743f32b9c800.

This causes stuttering and timeouts with DMCUB for some users,
so revert it until we understand why and can safely enable it
to save power.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887
Signed-off-by: Alex Deucher 
Cc: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c   | 1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ---
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index b4cf8717f554..89011bae7588 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -320,6 +320,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
if (!amdgpu_device_has_dc_support(adev)) {
if (!adev->enable_virtual_display)
/* Disable vblank IRQs aggressively for power-saving */
+   /* XXX: can this be enabled for DC? */
adev_to_drm(adev)->vblank_disable_immediate = true;
 
r = drm_vblank_init(adev_to_drm(adev), 
adev->mode_info.num_crtc);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c2bc7db85d7e..24959cb85c48 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4293,9 +4293,6 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
}
}
 
-   /* Disable vblank IRQs aggressively for power-saving. */
-   adev_to_drm(adev)->vblank_disable_immediate = true;
-
/* loops over all connectors on the board */
for (i = 0; i < link_cnt; i++) {
struct dc_link *link = NULL;
-- 
2.35.3



Re: [PATCH] drm/amd/display: Remove unused variable 'abo'

2022-06-20 Thread Alex Deucher
I sent out the same patch last week.  I just pushed it to drm-misc-next.

Thanks!

Alex

On Sat, Jun 18, 2022 at 1:38 AM Simon Ser  wrote:
>
> Reviewed-by: Simon Ser 


Re: [PATCH] amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+

2022-06-20 Thread Alex Deucher
Applied.  Thanks!

Alex

On Wed, Jun 15, 2022 at 9:21 PM Joshua Ashton  wrote:
>
> For DCN20 and above, the code that actually hooks up the provided
> input_color_space got lost at some point.
>
> Fixes COLOR_ENCODING and COLOR_RANGE doing nothing on DCN20+.
> Tested using Steam Remote Play Together + gamescope.
>
> Signed-off-by: Joshua Ashton 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c   | 3 +++
>  drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c | 3 +++
>  drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c   | 3 +++
>  3 files changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c 
> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
> index 970b65efeac1..eaa7032f0f1a 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
> @@ -212,6 +212,9 @@ static void dpp2_cnv_setup (
> break;
> }
>
> +   /* Set default color space based on format if none is given. */
> +   color_space = input_color_space ? input_color_space : color_space;
> +
> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
> alpha_2bit_lut->lut0);
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
> alpha_2bit_lut->lut1);
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c 
> b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
> index 8b6505b7dca8..f50ab961bc17 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
> @@ -153,6 +153,9 @@ static void dpp201_cnv_setup(
> break;
> }
>
> +   /* Set default color space based on format if none is given. */
> +   color_space = input_color_space ? input_color_space : color_space;
> +
> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
> alpha_2bit_lut->lut0);
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
> alpha_2bit_lut->lut1);
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c 
> b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
> index 9cca59bf2ae0..3c77949b8110 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
> @@ -294,6 +294,9 @@ void dpp3_cnv_setup (
> break;
> }
>
> +   /* Set default color space based on format if none is given. */
> +   color_space = input_color_space ? input_color_space : color_space;
> +
> if (is_2bit == 1 && alpha_2bit_lut != NULL) {
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, 
> alpha_2bit_lut->lut0);
> REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, 
> alpha_2bit_lut->lut1);
> --
> 2.36.1
>
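
The added line is a fallback: use the caller's color space when one is
supplied, otherwise keep the default derived from the surface format. A
minimal sketch, assuming that 0 means "not specified"; the enum and
function names are illustrative, not the DC types.

```c
#include <assert.h>

/* Illustrative values; 0 stands for "no color space given". */
enum color_space { CS_UNKNOWN = 0, CS_SRGB, CS_YCBCR709, CS_YCBCR709_LIMITED };

/* Prefer the explicit input; fall back to the format-derived default. */
static enum color_space pick_color_space(enum color_space input,
					 enum color_space format_default)
{
	return input ? input : format_default;
}
```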


[PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change.

2022-06-20 Thread Andrey Grodzovsky
Align refcount behaviour for amdgpu_job embedded HW fence with
classic pointer style HW fences by increasing refcount each
time emit is called so amdgpu code doesn't need to make workarounds
using amdgpu_job.job_run_counter to keep the HW fence refcount balanced.

Also, since the previous patch resumed setting s_fence->parent to NULL
in drm_sched_stop, switch to directly checking whether job->hw_fence is
signaled to short-circuit the reset if it has already signaled.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Yiqing Yao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 23 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  7 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|  4 
 4 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 513c57f839d8..447bd92c4856 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -684,6 +684,8 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
goto err_ib_sched;
}
 
+   /* Drop the initial kref_init count (see drm_sched_main as example) */
+   dma_fence_put(f);
ret = dma_fence_wait(f, false);
 
 err_ib_sched:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c99541685804..f9718119834f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5009,16 +5009,28 @@ static void amdgpu_device_recheck_guilty_jobs(
 
/* clear job's guilty and depend the folowing step to decide 
the real one */
drm_sched_reset_karma(s_job);
-   /* for the real bad job, it will be resubmitted twice, adding a 
dma_fence_get
-* to make sure fence is balanced */
-   dma_fence_get(s_job->s_fence->parent);
drm_sched_resubmit_jobs_ext(&ring->sched, 1);
 
+   if (!s_job->s_fence->parent) {
+   DRM_WARN("Failed to get a HW fence for job!");
+   continue;
+   }
+
ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, 
ring->sched.timeout);
if (ret == 0) { /* timeout */
DRM_ERROR("Found the real bad job! ring:%s, 
job_id:%llx\n",
ring->sched.name, s_job->id);
 
+
+   /* Clear this failed job from fence array */
+   amdgpu_fence_driver_clear_job_fences(ring);
+
+   /* Since the job won't signal and we go for
+* another resubmit drop this parent pointer
+*/
+   dma_fence_put(s_job->s_fence->parent);
+   s_job->s_fence->parent = NULL;
+
/* set guilty */
drm_sched_increase_karma(s_job);
 retry:
@@ -5047,7 +5059,6 @@ static void amdgpu_device_recheck_guilty_jobs(
 
/* got the hw fence, signal finished fence */
atomic_dec(ring->sched.score);
-   dma_fence_put(s_job->s_fence->parent);
dma_fence_get(&s_job->s_fence->finished);
dma_fence_signal(&s_job->s_fence->finished);
dma_fence_put(&s_job->s_fence->finished);
@@ -5220,8 +5231,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 *
 * job->base holds a reference to parent fence
 */
-   if (job && job->base.s_fence->parent &&
-   dma_fence_is_signaled(job->base.s_fence->parent)) {
+   if (job && (job->hw_fence.ops != NULL) &&
+   dma_fence_is_signaled(&job->hw_fence)) {
job_signaled = true;
dev_info(adev->dev, "Guilty job already signaled, skipping HW 
reset");
goto skip_hw_reset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index d6d54ba4c185..9bd4e18212fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -164,11 +164,16 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f, struct amd
if (job && job->job_run_counter) {
/* reinit seq for resubmitted jobs */
fence->seqno = seq;
+   /* TO be inline with external fence creation and other drivers 
*/
+   dma_fence_get(fence);
} else {
-   if (job)
+   if (job) {
dma_fence_init(fence, &amdgpu_job_fence_ops,
   &ring->fence_drv.lock,
   adev->fence_context + ring->idx, seq);
+   /* Against remove in amdgpu_job_{free, free_cb} */
+   dma_fence_get(f

[PATCH 4/5] drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'

2022-06-20 Thread Andrey Grodzovsky
Problem:
The reverted patch caused a negative refcount, as described in [1]. In that
case the parent fence had not signaled by the time of drm_sched_stop, so the
job was kept in the pending list. The assumption was that such fences will
not signal, so the fence was put to account for the s_fence->parent
refcount. But for amdgpu, which has an embedded HW fence (always the same
parent fence), drm_sched_fence_release_scheduled was always called and
would drop the parent fence count once more. For jobs that never signaled,
this imbalance was masked by a refcount bug in
amdgpu_fence_driver_clear_job_fences, which did not drop the refcount on
fences removed from the fence driver's fences array (to balance the get
taken on insertion into the array in amdgpu_fence_emit).

Fix:
Revert this patch. Setting s_job->s_fence->parent to NULL as before
prevents the extra refcount drop in amdgpu when
drm_sched_fence_release_scheduled is called on job release.

Also align the behaviour of drm_sched_resubmit_jobs_ext with that of
drm_sched_main when submitting jobs: take a refcount for the new parent
fence pointer and drop the refcount from the original kref_init at HW
fence creation (or fake new HW fence creation in amdgpu; see the next patch).

[1] - 
https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4...@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3

Signed-off-by: Andrey Grodzovsky 
Tested-by: Yiqing Yao 
---
 drivers/gpu/drm/scheduler/sched_main.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index b81fceb0b8a2..b38394f5694f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -419,6 +419,11 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
if (s_job->s_fence->parent &&
dma_fence_remove_callback(s_job->s_fence->parent,
  &s_job->cb)) {
+   /* Revert drm/sched: Keep s_fence->parent pointer, no
+* need anymore for amdgpu and creates only troubles
+*/
+   dma_fence_put(s_job->s_fence->parent);
+   s_job->s_fence->parent = NULL;
atomic_dec(&sched->hw_rq_count);
} else {
/*
@@ -548,7 +553,6 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler 
*sched, int max)
if (found_guilty && s_job->s_fence->scheduled.context == 
guilty_context)
dma_fence_set_error(&s_fence->finished, -ECANCELED);
 
-   dma_fence_put(s_job->s_fence->parent);
fence = sched->ops->run_job(s_job);
i++;
 
@@ -558,7 +562,11 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler 
*sched, int max)
 
s_job->s_fence->parent = NULL;
} else {
-   s_job->s_fence->parent = fence;
+
+   s_job->s_fence->parent = dma_fence_get(fence);
+
+   /* Drop for orignal kref_init */
+   dma_fence_put(fence);
}
}
 }
@@ -952,6 +960,9 @@ static int drm_sched_main(void *param)
 
if (!IS_ERR_OR_NULL(fence)) {
s_fence->parent = dma_fence_get(fence);
+   /* Drop for original kref_init of the fence */
+   dma_fence_put(fence);
+
r = dma_fence_add_callback(fence, &sched_job->cb,
   drm_sched_job_done_cb);
if (r == -ENOENT)
@@ -959,7 +970,6 @@ static int drm_sched_main(void *param)
else if (r)
DRM_DEV_ERROR(sched->dev, "fence add callback 
failed (%d)\n",
  r);
-   dma_fence_put(fence);
} else {
if (IS_ERR(fence))
dma_fence_set_error(&s_fence->finished, 
PTR_ERR(fence));
-- 
2.25.1
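
The reference handoff described in the Fix section can be traced with a toy
model. This is a user-space sketch, not the dma_fence or scheduler API; all
types and helpers are hypothetical, and the release path is modeled as
skipping the put when no parent is attached.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a refcounted HW fence and a scheduler job. */
struct toy_fence { int refcount; };
struct toy_job { struct toy_fence *parent; };

static struct toy_fence *toy_get(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount++;
	return f;
}

static void toy_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Submit: the job takes its own reference, then the creation reference is dropped. */
static void submit(struct toy_job *job, struct toy_fence *hw_fence)
{
	job->parent = toy_get(hw_fence);
	toy_put(hw_fence);
}

/* Stop: give back the job's reference and forget the pointer. */
static void stop(struct toy_job *job)
{
	if (job->parent) {
		toy_put(job->parent);
		job->parent = NULL;
	}
}

/* Release: puts only if a parent is still attached, so no double drop after stop(). */
static void release(struct toy_job *job)
{
	if (job->parent)
		toy_put(job->parent);
	job->parent = NULL;
}
```

In this model, leaving the parent pointer set in stop() would let release()
drop the same reference a second time, which is the kind of underflow the
Problem section describes.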



[PATCH 3/5] drm/amdgpu: Prevent race between late signaled fences and GPU reset.

2022-06-20 Thread Andrey Grodzovsky
Problem:
After we start handling timed-out jobs, we assume their fences won't be
signaled, but we cannot be sure, and sometimes they fire late. We need
to prevent concurrent accesses to the fence array from
amdgpu_fence_driver_clear_job_fences during GPU reset and amdgpu_fence_process
from a late EOP interrupt.

Fix:
Before accessing the fence array during GPU reset, disable the EOP interrupt
and flush all pending interrupt handlers for the amdgpu device's interrupt line.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 26 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2b92281dd0c1..c99541685804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4605,6 +4605,8 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
*adev,
amdgpu_virt_fini_data_exchange(adev);
}
 
+   amdgpu_fence_driver_isr_toggle(adev, true);
+
/* block all schedulers and reset given job's ring */
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
@@ -4620,6 +4622,8 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
*adev,
amdgpu_fence_driver_force_completion(ring);
}
 
+   amdgpu_fence_driver_isr_toggle(adev, false);
+
if (job && job->vm)
drm_sched_increase_karma(&job->base);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index a9ae3beaa1d3..d6d54ba4c185 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -532,6 +532,32 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device 
*adev)
}
 }
 
+void amdgpu_fence_driver_isr_toggle(struct amdgpu_device *adev, bool stop)
+{
+   int i;
+
+   for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+   struct amdgpu_ring *ring = adev->rings[i];
+
+   if (!ring || !ring->fence_drv.initialized || 
!ring->fence_drv.irq_src)
+   continue;
+
+   if (stop)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
+   else
+   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+   ring->fence_drv.irq_type);
+   }
+
+   /* TODO Only waits for irq handlers on other CPUs, maybe local_irq_save
+* local_irq_local_irq_restore are needed here for local interrupts ?
+*
+*/
+   if (stop)
+   synchronize_irq(adev->irq.irq);
+}
+
 void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)
 {
unsigned int i, j;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7d89a52091c0..82c178a9033a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -143,6 +143,7 @@ signed long amdgpu_fence_wait_polling(struct amdgpu_ring 
*ring,
  uint32_t wait_seq,
  signed long timeout);
 unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
+void amdgpu_fence_driver_isr_toggle(struct amdgpu_device *adev, bool stop);
 
 /*
  * Rings.
-- 
2.25.1
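
The toggle can be pictured as dropping and retaking one enable reference on
each ring's interrupt source around the critical section. The following is a
single-threaded toy model with hypothetical types; it does not use the amdgpu
IRQ API and cannot represent the wait for in-flight handlers that
synchronize_irq provides in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy interrupt source: delivers interrupts only while enable_count > 0. */
struct toy_irq_src { int enable_count; };

struct toy_ring {
	struct toy_irq_src *irq_src;
	int initialized;
	int fences_processed;
};

static void toy_irq_get(struct toy_irq_src *src)
{
	src->enable_count++;
}

static void toy_irq_put(struct toy_irq_src *src)
{
	assert(src->enable_count > 0);
	src->enable_count--;
}

/* stop == true drops one enable reference per usable ring; false retakes it. */
static void isr_toggle(struct toy_ring *rings, size_t n, bool stop)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_ring *ring = &rings[i];

		if (!ring->initialized || !ring->irq_src)
			continue;
		if (stop)
			toy_irq_put(ring->irq_src);
		else
			toy_irq_get(ring->irq_src);
	}
}

/* A late EOP interrupt touches the ring only while its source is enabled. */
static void late_eop(struct toy_ring *ring)
{
	if (ring->irq_src && ring->irq_src->enable_count > 0)
		ring->fences_processed++;
}
```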



[PATCH 2/5] drm/amdgpu: Add put fence in amdgpu_fence_driver_clear_job_fences

2022-06-20 Thread Andrey Grodzovsky
This function should drop the fence refcount when it extracts the
fence from the fence array, just as it's done in amdgpu_fence_process.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 957437a5558c..a9ae3beaa1d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -595,8 +595,10 @@ void amdgpu_fence_driver_clear_job_fences(struct 
amdgpu_ring *ring)
for (i = 0; i <= ring->fence_drv.num_fences_mask; i++) {
ptr = &ring->fence_drv.fences[i];
old = rcu_dereference_protected(*ptr, 1);
-   if (old && old->ops == &amdgpu_job_fence_ops)
+   if (old && old->ops == &amdgpu_job_fence_ops) {
RCU_INIT_POINTER(*ptr, NULL);
+   dma_fence_put(old);
+   }
}
 }
 
-- 
2.25.1
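
The rule being restored is that every occupied slot in the fence array owns
one reference, so clearing a slot has to give that reference back. A toy
sketch with hypothetical types (plain pointers instead of RCU, an int flag
instead of comparing fence ops):

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4

/* Toy fence: is_job_fence stands in for the ops comparison in the patch. */
struct toy_fence {
	int refcount;
	int is_job_fence;
};

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Clear every job fence from the array and drop the slot's reference. */
static int clear_job_fences(struct toy_fence *slots[NUM_SLOTS])
{
	int cleared = 0;

	for (size_t i = 0; i < NUM_SLOTS; i++) {
		struct toy_fence *old = slots[i];

		if (old && old->is_job_fence) {
			slots[i] = NULL;
			toy_fence_put(old); /* the put this patch adds */
			cleared++;
		}
	}
	return cleared;
}
```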



[PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence

2022-06-20 Thread Andrey Grodzovsky
Problem:
In amdgpu_job_submit_direct, the refcount should drop by 2,
but it drops by only 1.

amdgpu_ib_sched->emit -> refcount 1 from first fence init
dma_fence_get -> refcount 2
dma_fence_put -> refcount 1

Fix:
Add put for external_hw_fence in amdgpu_job_free/free_cb

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 10aa073600d4..58568fdde2d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -152,8 +152,10 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
 /* only put the hw fence if has embedded fence */
if (job->hw_fence.ops != NULL)
dma_fence_put(&job->hw_fence);
-   else
+   else {
+   dma_fence_put(job->external_hw_fence);
kfree(job);
+   }
 }
 
 void amdgpu_job_free(struct amdgpu_job *job)
@@ -165,8 +167,10 @@ void amdgpu_job_free(struct amdgpu_job *job)
/* only put the hw fence if has embedded fence */
if (job->hw_fence.ops != NULL)
dma_fence_put(&job->hw_fence);
-   else
+   else {
+   dma_fence_put(job->external_hw_fence);
kfree(job);
+   }
 }
 
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
-- 
2.25.1
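
The counts in the Problem section can be replayed with a toy reference
counter. This is a user-space model, not the dma_fence API; the struct and
helpers are hypothetical stand-ins, and the extra_put flag models the put
added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted fence. */
struct toy_fence {
	int refcount;
	bool released;
};

static void toy_fence_init(struct toy_fence *f)
{
	f->refcount = 1; /* initial reference from creation */
	f->released = false;
}

static void toy_fence_get(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount++;
}

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->released = true;
}

/* Replays the sequence from the commit message and reports whether the fence is freed. */
static bool direct_submit_releases_fence(bool extra_put)
{
	struct toy_fence f;

	toy_fence_init(&f);  /* emit: refcount 1 */
	toy_fence_get(&f);   /* refcount 2 */
	toy_fence_put(&f);   /* refcount 1 */
	if (extra_put)
		toy_fence_put(&f); /* put in job free: refcount 0 */
	return f.released;
}
```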



[PATCH 0/5] Rework amdgpu HW fence refcount and update scheduler parent fence refcount.

2022-06-20 Thread Andrey Grodzovsky
Yiqing raised a problem of negative fence refcount for resubmitted jobs
in amdgpu and suggested a workaround in [1]. I took a look myself and
discovered some deeper problems in both the amdgpu and scheduler code.

Yiqing helped with testing the new code and also drew a detailed refcount
and flow tracing diagram of the parent (HW) fence life cycle and refcount
under various cases for the proposed patchset at [2].

[1] - 
https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4...@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3
[2] - 
https://drive.google.com/file/d/1yEoeW6OQC9WnwmzFW6NBLhFP_jD0xcHm/view?usp=sharing

Andrey Grodzovsky (5):
  drm/amdgpu: Fix possible refcount leak for release of
external_hw_fence
  drm/amdgpu: Add put fence in amdgpu_fence_driver_clear_job_fences
  drm/amdgpu: Prevent race between late signaled fences and GPU reset.
  drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'
  drm/amdgpu: Follow up change to previous drm scheduler change.

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 37 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
 drivers/gpu/drm/scheduler/sched_main.c | 16 --
 6 files changed, 78 insertions(+), 17 deletions(-)

-- 
2.25.1



Using generic fbdev helpers breaks hibernation

2022-06-20 Thread Alex Deucher
Maybe someone more familiar with the generic drm fbdev helpers can
help me understand why they don't work with hibernation, at least with
AMD GPUs.  We converted amdgpu to use the generic helpers instead of
rolling our own in this patch[1], but it seems to have broken
hibernation[2].  amdgpu has always set mode_config.prefer_shadow = 1,
but that seems to be the cause of the hibernation breakage with the
generic helpers.  I've been staring at the code for a while now but I
can't see why this fails.  Any pointers?

Thanks,

Alex

[1] - 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=087451f372bf76d971184caa258807b7c35aac8f
[2] - https://bugzilla.kernel.org/show_bug.cgi?id=216119


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-06-20 Thread Alex Deucher
On Sat, Jun 18, 2022 at 7:27 PM Guenter Roeck  wrote:
>
> ppc:allmodconfig builds fail with the following error.
>
> powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> uses hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> uses soft float
> powerpc64-linux-ld:
> failed to merge target specific data of file
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> uses hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> uses soft float
> powerpc64-linux-ld:
> failed to merge target specific data of
> file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> powerpc64-linux-ld:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> uses hard float,
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> uses soft float
> powerpc64-linux-ld:
> failed to merge target specific data of file
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
>
> The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> 64-bit outline-only KASAN support") which adds support for KASAN. This
> commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> KCOV_ENABLE_COMPARISONS are no longer enabled. As a result, new files are
> compiled which lack the selection of hard-float.
>
> Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
> Cc: Michael Ellerman 
> Cc: Daniel Axtens 
> Signed-off-by: Guenter Roeck 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn31/Makefile  | 4 
>  drivers/gpu/drm/amd/display/dc/dcn315/Makefile | 4 
>  drivers/gpu/drm/amd/display/dc/dcn316/Makefile | 4 
>  3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> index ec041e3cda30..74be02114ae4 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> @@ -15,6 +15,10 @@ DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o 
> dcn31_init.o dcn31_hubp.o
> dcn31_apg.o dcn31_hpo_dp_stream_encoder.o dcn31_hpo_dp_link_encoder.o 
> \
> dcn31_afmt.o dcn31_vpg.o
>
> +ifdef CONFIG_PPC64
> +CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -mhard-float -maltivec
> +endif

This stuff was all moved as part of the FP rework in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=26f4712aedbdf4b9f5e3888a50a2a4b130ee4a9b
@Siqueira, Rodrigo, @Melissa Wen, @Dhillon, Jasdeep can you take a look to
understand why this is necessary?  If we add back the PPC flags, I think we
need to add back the x86 ones as well.

Alex

> +
>  AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_DCN31)
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> index 59381d24800b..1395c1ced8c5 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> @@ -25,6 +25,10 @@
>
>  DCN315 = dcn315_resource.o
>
> +ifdef CONFIG_PPC64
> +CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o := -mhard-float -maltivec
> +endif
> +
>  AMD_DAL_DCN315 = $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_DCN315)
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> index 819d44a9439b..c3d2dd78f1e2 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> @@ -25,6 +25,10 @@
>
>  DCN316 = dcn316_resource.o
>
> +ifdef CONFIG_PPC64
> +CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o := -mhard-float -maltivec
> +endif
> +
>  AMD_DAL_DCN316 = $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_DCN316)
> --
> 2.35.1
>


Re: [PATCH] drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"

2022-06-20 Thread Alex Deucher
Acked-by: Alex Deucher 

On Thu, Jun 16, 2022 at 11:48 AM Limonciello, Mario
 wrote:
>
> [Public]
>
> + people associated with original patch being reverted for comments
>
> > -Original Message-
> > From: Limonciello, Mario 
> > Sent: Wednesday, June 15, 2022 17:30
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Limonciello, Mario ; Aaron Ma
> > 
> > Subject: [PATCH] drm/amd: Revert "drm/amd/display: keep eDP Vdd on
> > when eDP stream is already enabled"
> >
> > A variety of Lenovo machines with Rembrandt APUs and OLED panels have
> > stopped showing the display at login.  This behavior clears up after
> > leaving it idle and moving the mouse or touching keyboard.
> >
> > It was bisected to be caused by commit 559e2655220d ("drm/amd/display:
> > keep eDP Vdd on when eDP stream is already enabled").  Revert this commit
> > to fix the issue.
> >
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2047
> > Reported-by: Aaron Ma 
> > Fixes: 559e2655220d ("drm/amd/display: keep eDP Vdd on when eDP stream
> > is already enabled")
> > Signed-off-by: Mario Limonciello 
> > ---
> >  .../display/dc/dce110/dce110_hw_sequencer.c   | 24 ++-
> >  1 file changed, 2 insertions(+), 22 deletions(-)
> >
> > diff --git
> > a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > index 7eff7811769d..5f2afa5b4814 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > @@ -1766,29 +1766,9 @@ void dce110_enable_accelerated_mode(struct dc
> > *dc, struct dc_state *context)
> >   break;
> >   }
> >   }
> > -
> > - /*
> > -  * TO-DO: So far the code logic below only addresses single
> > eDP case.
> > -  * For dual eDP case, there are a few things that need to be
> > -  * implemented first:
> > -  *
> > -  * 1. Change the fastboot logic above, so eDP link[0 or 1]'s
> > -  * stream[0 or 1] will all be checked.
> > -  *
> > -  * 2. Change keep_edp_vdd_on to an array, and maintain
> > keep_edp_vdd_on
> > -  * for each eDP.
> > -  *
> > -  * Once above 2 things are completed, we can then change
> > the logic below
> > -  * correspondingly, so dual eDP case will be fully covered.
> > -  */
> > -
> > - // We are trying to enable eDP, don't power down VDD if
> > eDP stream is existing
> > - if ((edp_stream_num == 1 && edp_streams[0] != NULL) ||
> > can_apply_edp_fast_boot) {
> > + // We are trying to enable eDP, don't power down VDD
> > + if (can_apply_edp_fast_boot)
> >   keep_edp_vdd_on = true;
> > - DC_LOG_EVENT_LINK_TRAINING("Keep eDP Vdd
> > on\n");
> > - } else {
> > - DC_LOG_EVENT_LINK_TRAINING("No eDP stream
> > enabled, turn eDP Vdd off\n");
> > - }
> >   }
> >
> >   // Check seamless boot support
> > --
> > 2.34.1


Re: Performance drop using deinterlace_vaapi on 5.19-rcX

2022-06-20 Thread Thomas Voegtle

On Mon, 20 Jun 2022, Christian König wrote:


On 20.06.22 at 13:40, Thomas Voegtle wrote:

 On Mon, 20 Jun 2022, Christian König wrote:


 Hi Thomas,

 [moving vger to bcc]

 mhm, sounds like something isn't running in parallel any more.

 We usually don't test the multimedia engines for this but we do test
 gfx+compute, so I'm really wondering what goes wrong here.

 Could you run some tests for me? In addition to that, I'm going to raise
 that issue with our multimedia guys later today.


 Yes, I can run some tests for you. Which tests?


Try this as root:

echo 1 > /sys/kernel/debug/tracing/events/dma_fence/dma_fence_init/enable
echo 1 > /sys/kernel/debug/tracing/events/dma_fence/dma_fence_signaled/enable
cat /sys/kernel/debug/tracing/trace_pipe > trace.log

Then start the encoding in another shell; after it has completed, cancel the cat
with Ctrl+C and save the log file.


Do this once with the old kernel and once with the new one.



   https://32h.de/tv/5.18.0-i5-trace.log.bz2
   https://32h.de/tv/5.19.0-rc3-i5-trace.log.bz2


I hope I have done this correctly.
All necessary tracing things switched on?

I want to add that this is a headless machine. No monitor connected.



Re: drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"

2022-06-20 Thread Mark Pearson
On 6/15/22 18:30, Mario Limonciello wrote:
> A variety of Lenovo machines with Rembrandt APUs and OLED panels have
> stopped showing the display at login.  This behavior clears up after
> leaving it idle and moving the mouse or touching keyboard.
> 
> It was bisected to be caused by commit 559e2655220d ("drm/amd/display:
> keep eDP Vdd on when eDP stream is already enabled").  Revert this commit
> to fix the issue.
> 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2047
> Reported-by: Aaron Ma 
> Fixes: 559e2655220d ("drm/amd/display: keep eDP Vdd on when eDP stream is 
> already enabled")
> Signed-off-by: Mario Limonciello 
> ---
>  .../display/dc/dce110/dce110_hw_sequencer.c   | 24 ++-
>  1 file changed, 2 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> index 7eff7811769d..5f2afa5b4814 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> @@ -1766,29 +1766,9 @@ void dce110_enable_accelerated_mode(struct dc *dc, 
> struct dc_state *context)
>   break;
>   }
>   }
> -
> - /*
> -  * TO-DO: So far the code logic below only addresses single eDP 
> case.
> -  * For dual eDP case, there are a few things that need to be
> -  * implemented first:
> -  *
> -  * 1. Change the fastboot logic above, so eDP link[0 or 1]'s
> -  * stream[0 or 1] will all be checked.
> -  *
> -  * 2. Change keep_edp_vdd_on to an array, and maintain 
> keep_edp_vdd_on
> -  * for each eDP.
> -  *
> -  * Once above 2 things are completed, we can then change the 
> logic below
> -  * correspondingly, so dual eDP case will be fully covered.
> -  */
> -
> - // We are trying to enable eDP, don't power down VDD if eDP 
> stream is existing
> - if ((edp_stream_num == 1 && edp_streams[0] != NULL) || 
> can_apply_edp_fast_boot) {
> + // We are trying to enable eDP, don't power down VDD
> + if (can_apply_edp_fast_boot)
>   keep_edp_vdd_on = true;
> - DC_LOG_EVENT_LINK_TRAINING("Keep eDP Vdd on\n");
> - } else {
> - DC_LOG_EVENT_LINK_TRAINING("No eDP stream enabled, turn 
> eDP Vdd off\n");
> - }
>   }
>  
>   // Check seamless boot support

Verified this fixed the issue on my Lenovo Z13 (R5 Pro 6650U) when
applied to a build of 5.18 from Linus's tree.
Thanks Mario!

Tested-by: markpear...@lenovo.com




Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-20 Thread Dmitry Osipenko
On 6/19/22 20:53, Rob Clark wrote:
...
>> +static unsigned long
>> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
>> +struct shrink_control *sc)
>> +{
>> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
>> to_drm_shrinker(shrinker);
>> +   struct drm_gem_shmem_object *shmem;
>> +   unsigned long count = 0;
>> +
>> +   if (!mutex_trylock(&gem_shrinker->lock))
>> +   return 0;
>> +
>> +   list_for_each_entry(shmem, &gem_shrinker->lru_evictable, madv_list) {
>> +   count += shmem->base.size;
>> +
>> +   if (count >= SHRINK_EMPTY)
>> +   break;
>> +   }
>> +
>> +   mutex_unlock(&gem_shrinker->lock);
> 
> As I mentioned on other thread, count_objects, being approximate but
> lockless and fast is the important thing.  Otherwise when you start
> hitting the shrinker on many threads, you end up serializing them all,
> even if you have no pages to return to the system at that point.

Daniel's point for dropping the lockless variant was that we're already
in trouble if we're hitting shrinker too often and extra optimizations
won't bring much benefit to us.

Alright, I'll add back the lockless variant (or will use your
drm_gem_lru) in the next revision. The code difference is very small
after all.

...
>> +   /* prevent racing with the dma-buf importing/exporting */
>> +   if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
>> +   *lock_contention |= true;
>> +   goto resv_unlock;
>> +   }
> 
> I'm not sure this is a good idea to serialize on object_name_lock.
> Purgeable buffers should never be shared (imported or exported).  So
> at best you are avoiding evicting and immediately swapping back in, in
> a rare case, at the cost of serializing multiple threads trying to
> reclaim pages in parallel.

The object_name_lock shouldn't cause contention in practice. But objects
are also pinned on attachment, hence maybe this lock is indeed
unnecessary.. I'll re-check it.

-- 
Best regards,
Dmitry
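
The lockless count_objects variant discussed above could look roughly like the
userspace sketch below. It is only an illustration under stated assumptions:
struct demo_shrinker and the demo_* helpers are hypothetical names, C11 atomics
stand in for the kernel's atomic helpers, and the running total is assumed to
be updated wherever objects enter or leave the evictable LRU. It is not the
code from the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the shrinker state; not the kernel struct. */
struct demo_shrinker {
	/* Running total of evictable size, kept in sync on LRU add/remove. */
	atomic_ulong evictable_size;
};

/* Called (with whatever lock protects the LRU) when an object becomes evictable. */
static void demo_lru_add(struct demo_shrinker *s, size_t size)
{
	atomic_fetch_add_explicit(&s->evictable_size, size,
				  memory_order_relaxed);
}

/* Called when an object leaves the evictable LRU. */
static void demo_lru_del(struct demo_shrinker *s, size_t size)
{
	atomic_fetch_sub_explicit(&s->evictable_size, size,
				  memory_order_relaxed);
}

/*
 * Approximate, lockless count: one relaxed load, no mutex and no list walk,
 * so many threads can call it concurrently without serializing.
 */
static unsigned long demo_count_objects(struct demo_shrinker *s)
{
	return atomic_load_explicit(&s->evictable_size, memory_order_relaxed);
}
```

The trade-off is that the value may be slightly stale by the time the caller
acts on it, which matches the "approximate but lockless and fast" goal stated
in the review comment.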


Re: [PATCH] drm/amdkfd: correct sdma queue number of sdma 6.0.1

2022-06-20 Thread Alex Deucher
On Mon, Jun 20, 2022 at 4:48 AM Yifan Zhang  wrote:
>
> sdma 6.0.1 has 8 queues instead of 2.
>
> Fixes: 2f68559102cb (drm/amdkfd: add GC 11.0.1 KFD support)
> Signed-off-by: Yifan Zhang 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index bf4200457772..c8fee0dbfdcb 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -75,7 +75,6 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev 
> *kfd)
> case IP_VERSION(5, 2, 3):/* YELLOW_CARP */
> case IP_VERSION(5, 2, 6):/* GC 10.3.6 */
> case IP_VERSION(5, 2, 7):/* GC 10.3.7 */
> -   case IP_VERSION(6, 0, 1):
> kfd->device_info.num_sdma_queues_per_engine = 2;
> break;
> case IP_VERSION(4, 2, 0):/* VEGA20 */
> @@ -90,6 +89,7 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev 
> *kfd)
> case IP_VERSION(5, 2, 4):/* DIMGREY_CAVEFISH */
> case IP_VERSION(5, 2, 5):/* BEIGE_GOBY */
> case IP_VERSION(6, 0, 0):
> +   case IP_VERSION(6, 0, 1):
> case IP_VERSION(6, 0, 2):
> kfd->device_info.num_sdma_queues_per_engine = 8;
> break;
> --
> 2.35.1
>
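
For readers skimming the diff, the effect of moving the case label can be
sketched as a small standalone function. This is an illustration only:
DEMO_IP_VERSION and demo_sdma_queues_per_engine are hypothetical names, the
version encoding is an assumption, and only the versions visible in the quoted
hunks are listed.

```c
#include <stdint.h>

/* Assumed packing of major/minor/revision; illustrative, not the kernel macro. */
#define DEMO_IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Returns the per-engine SDMA queue count, or 0 for versions not listed here. */
static unsigned int demo_sdma_queues_per_engine(uint32_t sdma_ver)
{
	switch (sdma_ver) {
	case DEMO_IP_VERSION(5, 2, 3): /* YELLOW_CARP */
	case DEMO_IP_VERSION(5, 2, 6): /* GC 10.3.6 */
	case DEMO_IP_VERSION(5, 2, 7): /* GC 10.3.7 */
		return 2;
	case DEMO_IP_VERSION(5, 2, 4): /* DIMGREY_CAVEFISH */
	case DEMO_IP_VERSION(5, 2, 5): /* BEIGE_GOBY */
	case DEMO_IP_VERSION(6, 0, 0):
	case DEMO_IP_VERSION(6, 0, 1): /* moved into this group by the patch */
	case DEMO_IP_VERSION(6, 0, 2):
		return 8;
	default:
		return 0; /* not shown in the quoted hunks */
	}
}
```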


Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-20 Thread Rob Clark

On Thu, May 26, 2022 at 4:55 PM Dmitry Osipenko
 wrote:
>
> Introduce a common DRM SHMEM shrinker framework that allows reducing
> code duplication among DRM drivers by replacing their custom shrinker
> implementations with the generic shrinker.
>
> In order to start using DRM SHMEM shrinker drivers should:
>
> 1. Implement new evict() shmem object callback.
> 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> 3. Use drm_gem_shmem_set_purgeable(shmem) and similar API functions to
>    activate shrinking of shmem GEMs.
>
> This patch is based on ideas borrowed from Rob Clark's MSM shrinker,
> Thomas Zimmermann's variant of the SHMEM shrinker and Intel's i915 shrinker.
>
> Signed-off-by: Daniel Almeida 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c| 540 --
>  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
>  drivers/gpu/drm/virtio/virtgpu_drv.h  |   3 +
>  include/drm/drm_device.h  |   4 +
>  include/drm/drm_gem_shmem_helper.h|  87 ++-
>  5 files changed, 594 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 555fe212bd98..4cd0b5913492 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -126,6 +126,42 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct 
> drm_device *dev, size_t
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
>
> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   return (shmem->madv >= 0) && shmem->evict &&
> +   shmem->eviction_enabled && shmem->pages_use_count &&
> +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> +   !shmem->base.import_attach && shmem->sgt && !shmem->evicted;
> +}
> +
> +static void
> +drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
> +{
> +   struct drm_gem_object *obj = &shmem->base;
> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> obj->dev->shmem_shrinker;
> +
> +   dma_resv_assert_held(shmem->base.resv);
> +
> +   if (!gem_shrinker || obj->import_attach)
> +   return;
> +
> +   mutex_lock(&gem_shrinker->lock);
> +
> +   if (drm_gem_shmem_is_evictable(shmem) ||
> +   drm_gem_shmem_is_purgeable(shmem))
> +   list_move_tail(&shmem->madv_list, 
> &gem_shrinker->lru_evictable);
> +   else if (shmem->madv < 0)
> +   list_del_init(&shmem->madv_list);
> +   else if (shmem->evicted)
> +   list_move_tail(&shmem->madv_list, &gem_shrinker->lru_evicted);
> +   else if (!shmem->pages)
> +   list_del_init(&shmem->madv_list);
> +   else
> +   list_move_tail(&shmem->madv_list, &gem_shrinker->lru_pinned);
> +
> +   mutex_unlock(&gem_shrinker->lock);
> +}
> +
>  /**
>   * drm_gem_shmem_free - Free resources associated with a shmem GEM object
>   * @shmem: shmem GEM object to free
> @@ -142,6 +178,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> } else {
> dma_resv_lock(shmem->base.resv, NULL);
>
> +   /* take out shmem GEM object from the memory shrinker */
> +   drm_gem_shmem_madvise(shmem, -1);
> +
> WARN_ON(shmem->vmap_use_count);
>
> if (shmem->sgt) {
> @@ -150,7 +189,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
> sg_free_table(shmem->sgt);
> kfree(shmem->sgt);
> }
> -   if (shmem->pages)
> +   if (shmem->pages_use_count)
> drm_gem_shmem_put_pages(shmem);
>
> WARN_ON(shmem->pages_use_count);
> @@ -163,18 +202,82 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> *shmem)
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
>
> -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +/**
> + * drm_gem_shmem_set_evictable() - Make GEM evictable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be evicted. Initially eviction is
> + * disabled for all GEMs. If GEM was purged, then -ENOMEM is returned.
> + *
> + * Returns:
> + * 0 on success or a negative error code on failure.
> + */
> +int drm_gem_shmem_set_evictable(struct drm_gem_shmem_object *shmem)
> +{
> +   dma_resv_lock(shmem->base.resv, NULL);
> +
> +   if (shmem->madv < 0)
> +   return -ENOMEM;
> +
> +   shmem->eviction_enabled = true;
> +
> +   dma_resv_unlock(shmem->base.resv);
> +
> +   return 0;
> +}
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_set_evictable);
> +
> +/**
> + * drm_gem_shmem_set_purgeable() - Make GEM purgeable by memory shrinker
> + * @shmem: shmem GEM object
> + *
> + * Tell memory shrinker that this GEM can be purged. Initially purging is
> + * disabled for al

Re: [PATCH 16/31] drm/amd/display: refactor function transmitter_to_phy_id

2022-06-20 Thread Rodrigo Siqueira Jordao




On 2022-06-17 15:51, Nathan Chancellor wrote:

Hi Rodrigo,

On Fri, Jun 17, 2022 at 03:34:57PM -0400, Rodrigo Siqueira wrote:

From: Nicholas Choi 

[Why & How]
Since we only need transmitter value in function transmitter_to_phy_id().
Replace argument struct dc_link with enum transmitter.

Reviewed-by: Chao-kai Wang 
Acked-by: Alan Liu 
Reviewed-by: Nicholas Kazlauskas 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Alex Deucher 


How did I end up in the signoff chain for a patch I have never seen up
until this point? That should definitely be cleaned up.

Additionally, this commit message doesn't really seem to line up with
the change. It says that "struct dc_link" is being replaced with "enum
transmitter", when it is really the reverse, and that only the
transmitter value is needed, which is already the case, right? I guess
this is so that you can use DC_ERROR(), which requires a dc_ctx
variable? It is not immediately obvious from the commit message so that
should be clarified as well.


Hi Nathan,

Thanks for reporting this error; it looks like our scripts have some 
issues. I'll take a look at that.


About this patch, I'll drop it.

Thanks
Siqueira


Cheers,
Nathan


---
  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 43b55bc6e2db..58882d42eff5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3185,8 +3185,11 @@ bool dc_link_get_psr_state(const struct dc_link *link, 
enum dc_psr_state *state)
  }
  
  static inline enum physical_phy_id

-transmitter_to_phy_id(enum transmitter transmitter_value)
+transmitter_to_phy_id(struct dc_link *link)
  {
+   struct dc_context *dc_ctx = link->ctx;
+   enum transmitter transmitter_value = link->link_enc->transmitter;
+
switch (transmitter_value) {
case TRANSMITTER_UNIPHY_A:
return PHYLD_0;
@@ -3213,8 +3216,7 @@ transmitter_to_phy_id(enum transmitter transmitter_value)
case TRANSMITTER_UNKNOWN:
return PHYLD_UNKNOWN;
default:
-   WARN_ONCE(1, "Unknown transmitter value %d\n",
- transmitter_value);
+   DC_ERROR("Unknown transmitter value %d\n", transmitter_value);
return PHYLD_UNKNOWN;
}
  }
@@ -3331,7 +,7 @@ bool dc_link_setup_psr(struct dc_link *link,
psr_context->phyType = PHY_TYPE_UNIPHY;
/*PhyId is associated with the transmitter id*/
psr_context->smuPhyId =
-   transmitter_to_phy_id(link->link_enc->transmitter);
+   transmitter_to_phy_id(link);
  
  	psr_context->crtcTimingVerticalTotal = stream->timing.v_total;

psr_context->vsync_rate_hz = div64_u64(div64_u64((stream->
--
2.25.1






Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-20 Thread Rob Clark
On Mon, Jun 20, 2022 at 7:09 AM Dmitry Osipenko
 wrote:
>
> On 6/19/22 20:53, Rob Clark wrote:
> ...
> >> +static unsigned long
> >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> >> +struct shrink_control *sc)
> >> +{
> >> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> >> to_drm_shrinker(shrinker);
> >> +   struct drm_gem_shmem_object *shmem;
> >> +   unsigned long count = 0;
> >> +
> >> +   if (!mutex_trylock(&gem_shrinker->lock))
> >> +   return 0;
> >> +
> >> +   list_for_each_entry(shmem, &gem_shrinker->lru_evictable, 
> >> madv_list) {
> >> +   count += shmem->base.size;
> >> +
> >> +   if (count >= SHRINK_EMPTY)
> >> +   break;
> >> +   }
> >> +
> >> +   mutex_unlock(&gem_shrinker->lock);
> >
> > As I mentioned on other thread, count_objects, being approximate but
> > lockless and fast is the important thing.  Otherwise when you start
> > hitting the shrinker on many threads, you end up serializing them all,
> > even if you have no pages to return to the system at that point.
>
> Daniel's point for dropping the lockless variant was that we're already
> in trouble if we're hitting shrinker too often and extra optimizations
> won't bring much benefit to us.

At least with zram swap (which I highly recommend using even if you
are not using a physical swap file/partition), swapin/out is actually
quite fast.  And if you are leaning on zram swap to fit 8GB of chrome
browser on a 4GB device, the shrinker gets hit quite a lot.  Lower
spec (4GB RAM) chromebooks can be under constant memory pressure and
can quite easily get into a situation where you are hitting the
shrinker on many threads simultaneously.  So it is pretty important
for all shrinkers in the system (not just drm driver) to be as
concurrent as possible.  As long as you avoid serializing reclaim on
all the threads, performance can still be quite good, but if you don't,
performance will fall off a cliff.

jfwiw, we are seeing pretty good results (iirc 40-70% increase in open
tab counts) with the combination of eviction + multigen LRU[1] +
sizing zram swap to be 2x physical RAM

[1] https://lwn.net/Articles/856931/

> Alright, I'll add back the lockless variant (or will use your
> drm_gem_lru) in the next revision. The code difference is very small
> after all.
>
> ...
> >> +   /* prevent racing with the dma-buf importing/exporting */
> >> +   if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
> >> +   *lock_contention |= true;
> >> +   goto resv_unlock;
> >> +   }
> >
> > I'm not sure this is a good idea to serialize on object_name_lock.
> > Purgeable buffers should never be shared (imported or exported).  So
> > at best you are avoiding evicting and immediately swapping back in, in
> > a rare case, at the cost of serializing multiple threads trying to
> > reclaim pages in parallel.
>
> The object_name_lock shouldn't cause contention in practice. But objects
> are also pinned on attachment, hence maybe this lock is indeed
> unnecessary.. I'll re-check it.

I'm not worried about contention with export/import/etc, but
contention between multiple threads hitting the shrinker in parallel.
I guess since you are using trylock, it won't *block* the other
threads hitting shrinker, but they'll just end up looping in
do_shrink_slab() because they are hitting contention.

I'd have to do some experiments to see how it works out in practice,
but my gut feel is that it isn't a good idea

BR,
-R

> --
> Best regards,
> Dmitry


Re: [PATCH] drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()

2022-06-20 Thread Alex Deucher
Ping?

Alex

On Thu, Jun 16, 2022 at 5:12 PM Alex Deucher  wrote:
>
> Use the correct adev variable for the drm_fb_helper in
> amdgpu_device_gpu_recover().  Noticed by inspection.
>
> Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting 
> up AMD own's.")
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2b92281dd0c1..eacecc672a4d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5186,7 +5186,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>  */
> amdgpu_unregister_gpu_instance(tmp_adev);
>
> -   
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
> +   
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(tmp_adev)->fb_helper, true);
>
> /* disable ras on ALL IPs */
> if (!need_emergency_restart &&
> --
> 2.35.3
>
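
The class of bug fixed by this one-liner can be sketched in isolation. This is
a simplified illustration, assuming the changed line runs once per device on a
reset list; struct demo_device and demo_suspend_all are hypothetical names, not
amdgpu code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device on the reset list. */
struct demo_device {
	int id;
	int fb_suspended;
};

/*
 * Suspend the framebuffer helper of every device on the list. `requester`
 * plays the role of `adev` (the device that triggered recovery) and `tmp`
 * the role of `tmp_adev` (the device handled in the current iteration).
 */
static void demo_suspend_all(struct demo_device *requester,
			     struct demo_device *list, size_t n)
{
	(void)requester; /* deliberately not used inside the loop */

	for (size_t i = 0; i < n; i++) {
		struct demo_device *tmp = &list[i];

		/* Act on the per-iteration device, not on the requester. */
		tmp->fb_suspended = 1;
	}
}
```

Using the requester inside the loop would suspend the same device n times and
leave the others untouched, which is easy to miss on single-GPU systems.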


RE: [PATCH v2 2/2] drm/amdkfd: Free queue after unmap queue success

2022-06-20 Thread Sider, Graham
[Public]

Reviewed-by: Graham Sider 

-Original Message-
From: Yang, Philip  
Sent: Friday, June 17, 2022 3:55 PM
To: amd-gfx@lists.freedesktop.org
Cc: Sider, Graham ; Yang, Philip 
Subject: [PATCH v2 2/2] drm/amdkfd: Free queue after unmap queue success

Free the queue sysfs entries and doorbell, and remove the queue from the queue 
list, only after the queue has been unmapped or removed from MES successfully. 
Otherwise the application may destroy the queue again, causing the kernel 
warnings or crash backtrace below.

For outstanding queues that the application either forgot to destroy or failed 
to destroy, kfd_process_notifier_release will remove the queue sysfs entries and 
kfd_process_wq_release will free the queue doorbell.

v2: decrement_queue_count for MES queue

 refcount_t: underflow; use-after-free.
 WARNING: CPU: 7 PID: 3053 at lib/refcount.c:28
  Call Trace:
   kobject_put+0xd6/0x1a0
   kfd_procfs_del_queue+0x27/0x30 [amdgpu]
   pqm_destroy_queue+0xeb/0x240 [amdgpu]
   kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
   kfd_ioctl+0x27d/0x500 [amdgpu]
   do_syscall_64+0x35/0x80

 WARNING: CPU: 2 PID: 3053 at 
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:400
  Call Trace:
   deallocate_doorbell.isra.0+0x39/0x40 [amdgpu]
   destroy_queue_cpsch+0xb3/0x270 [amdgpu]
   pqm_destroy_queue+0x108/0x240 [amdgpu]
   kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
   kfd_ioctl+0x27d/0x500 [amdgpu]

 general protection fault, probably for non-canonical address
0xdead0108:
 Call Trace:
  pqm_destroy_queue+0xf0/0x200 [amdgpu]
  kfd_ioctl_destroy_queue+0x2f/0x60 [amdgpu]
  kfd_ioctl+0x19b/0x600 [amdgpu]

Signed-off-by: Philip Yang 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 28 +++
 .../amd/amdkfd/kfd_process_queue_manager.c|  2 +-
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 21aeb05b17db..213246a5b4e4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1872,6 +1872,22 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
 
}
 
+   if (q->properties.is_active) {
+   if (!dqm->dev->shared_resources.enable_mes) {
+   retval = execute_queues_cpsch(dqm,
+ 
KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
+   if (retval == -ETIME)
+   qpd->reset_wavefronts = true;
+   } else {
+   retval = remove_queue_mes(dqm, q, qpd);
+   }
+
+   if (retval)
+   goto failed_unmap_queue;
+
+   decrement_queue_count(dqm, qpd, q);
+   }
+
mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
q->properties.type)];
 
@@ -1885,17 +1901,6 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
 
list_del(&q->list);
qpd->queue_count--;
-   if (q->properties.is_active) {
-   if (!dqm->dev->shared_resources.enable_mes) {
-   decrement_queue_count(dqm, qpd, q);
-   retval = execute_queues_cpsch(dqm,
- 
KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
-   if (retval == -ETIME)
-   qpd->reset_wavefronts = true;
-   } else {
-   retval = remove_queue_mes(dqm, q, qpd);
-   }
-   }
 
/*
 * Unconditionally decrement this counter, regardless of the queue's
@@ -1912,6 +1917,7 @@ static int destroy_queue_cpsch(struct device_queue_manager *dqm,
 
return retval;
 
+failed_unmap_queue:
 failed_try_destroy_debugged_queue:
 
dqm_unlock(dqm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index dc00484ff484..99f2a6412201 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -419,7 +419,6 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, 
unsigned int qid)
}
 
if (pqn->q) {
-   kfd_procfs_del_queue(pqn->q);
dqm = pqn->q->device->dqm;
retval = dqm->ops.destroy_queue(dqm, &pdd->qpd, pqn->q);
if (retval) {
@@ -439,6 +438,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, 
unsigned int qid)
if (dev->shared_resources.enable_mes)
amdgpu_amdkfd_free_gtt_mem(dev->adev,
   pqn->q->gang_ctx_bo);
+   kfd_procfs_del_queue(pqn->q);
uninit_queue(pqn->q);
}
 
--
2.35.1
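
The ordering this patch aims for, unmap first and release bookkeeping only on
success, can be sketched in a few lines of standalone C. This is a simplified
model under stated assumptions: struct demo_queue, demo_unmap_fn and the
demo_* functions are hypothetical and do not correspond to KFD symbols.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical queue state; each flag models one resource from the patch. */
struct demo_queue {
	bool active;       /* queue is mapped on the hardware scheduler */
	bool on_list;      /* queue is still tracked in the queue list */
	bool has_doorbell; /* doorbell is still allocated */
};

/* Hypothetical unmap hook; returns 0 on success or a negative errno. */
typedef int (*demo_unmap_fn)(struct demo_queue *q);

static int demo_destroy_queue(struct demo_queue *q, demo_unmap_fn unmap)
{
	int ret;

	if (q->active) {
		ret = unmap(q);
		if (ret)
			return ret; /* keep all bookkeeping for a later retry */
		q->active = false;
	}

	/* Reached only once the hardware no longer references the queue. */
	q->has_doorbell = false;
	q->on_list = false;
	return 0;
}

/* Two trivial hooks for exercising both paths. */
static int demo_unmap_ok(struct demo_queue *q)
{
	(void)q;
	return 0;
}

static int demo_unmap_timeout(struct demo_queue *q)
{
	(void)q;
	return -ETIME;
}
```

Because a failed call leaves the queue fully intact, a second destroy attempt
sees consistent state instead of freeing the same resources twice.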


RE: [PATCH v2 1/2] drm/amdkfd: Add queue to MES if it becomes active

2022-06-20 Thread Sider, Graham
[Public]

Reviewed-by: Graham Sider 

-Original Message-
From: Yang, Philip  
Sent: Friday, June 17, 2022 3:55 PM
To: amd-gfx@lists.freedesktop.org
Cc: Sider, Graham ; Yang, Philip 
Subject: [PATCH v2 1/2] drm/amdkfd: Add queue to MES if it becomes active

We remove the user queue from MES scheduler to update queue properties.
If the queue becomes active after updating, add the user queue to MES 
scheduler, to be able to handle command packet submission.

v2: don't break pqm_set_gws

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e1797657b04c..21aeb05b17db 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -811,7 +811,6 @@ static int update_queue(struct device_queue_manager *dqm, 
struct queue *q,
struct mqd_manager *mqd_mgr;
struct kfd_process_device *pdd;
bool prev_active = false;
-   bool add_queue = false;
 
dqm_lock(dqm);
pdd = kfd_get_process_device_data(q->device, q->process);
@@ -887,7 +886,7 @@ static int update_queue(struct device_queue_manager *dqm, 
struct queue *q,
if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) {
if (!dqm->dev->shared_resources.enable_mes)
retval = map_queues_cpsch(dqm);
-   else if (add_queue)
+   else if (q->properties.is_active)
retval = add_queue_mes(dqm, q, &pdd->qpd);
} else if (q->properties.is_active &&
 (q->properties.type == KFD_QUEUE_TYPE_COMPUTE ||
--
2.35.1
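
The decision this patch changes can be modelled outside the driver. The sketch below is only an illustration under stated assumptions: `struct toy_queue`, `toy_remove_from_mes()`, `toy_add_to_mes()` and `toy_update_queue()` are hypothetical stand-ins, not KFD functions, and the removal condition is a guess at the surrounding code that the hunks do not show. What it does take from the patch is that the re-add depends only on whether the queue is active after the update.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's queue; only what the sketch needs. */
struct toy_queue {
	bool is_active;	/* plays the role of q->properties.is_active */
	bool on_mes;	/* whether the queue is currently on the scheduler */
};

/* Hypothetical helpers standing in for the real remove/add calls. */
static void toy_remove_from_mes(struct toy_queue *q) { q->on_mes = false; }
static void toy_add_to_mes(struct toy_queue *q) { q->on_mes = true; }

/*
 * Update flow as described in the commit message: take the queue off the
 * scheduler, apply the new properties, then add it back if (and only if)
 * it is active after the update.
 */
static void toy_update_queue(struct toy_queue *q, bool new_active)
{
	/* Assumption: only a queue that is on the scheduler needs removing. */
	if (q->on_mes)
		toy_remove_from_mes(q);

	q->is_active = new_active;

	/* The patched condition: look at the post-update state. */
	if (q->is_active)
		toy_add_to_mes(q);
}
```

With this shape, a queue that starts inactive and becomes active through the update ends up on the scheduler, and a queue that is deactivated stays off it.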


Re: [PATCH] drm/amdgpu: add LSDMA block for LSDMA v6.0.1

2022-06-20 Thread Alex Deucher
On Mon, Jun 20, 2022 at 7:45 AM Yifan Zhang  wrote:
>
> This patch adds LSDMA ip block for LSDMA v6.0.1.
>
> Signed-off-by: Yifan Zhang 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 11bbd76c581c..37234c2998d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -2333,6 +2333,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
> *adev)
>
> switch (adev->ip_versions[LSDMA_HWIP][0]) {
> case IP_VERSION(6, 0, 0):
> +   case IP_VERSION(6, 0, 1):
> case IP_VERSION(6, 0, 2):
> adev->lsdma.funcs = &lsdma_v6_0_funcs;
> break;
> --
> 2.35.1
>


RE: [PATCH 00/31] DC Patches June 17, 2022

2022-06-20 Thread Wheeler, Daniel
[Public]

Hi all,
 
This week this patchset was tested on the following systems:
 
HP Envy 360, with Ryzen 5 4500U
Lenovo Thinkpad T14s Gen2, with AMD Ryzen 5 5650U 
Sapphire Pulse RX5700XT 
Reference AMD RX6800
Engineering board with Ryzen 9 5900H
 
These systems were tested on the following display types: 
eDP, (1080p 60hz [4500U, 5650U, 5900H])
VGA and DVI (1680x1050 60HZ [DP to VGA/DVI, USB-C to DVI/VGA])
DP/HDMI/USB-C (1440p 170hz, 4k 60hz, 4k 144hz [Includes USB-C to DP/HDMI 
adapters])
 
MST tested with Startech MST14DP123DP and 2x 4k 60Hz displays
DSC tested with Cable Matters 101075 (DP to 3x DP), and 201375 (USB-C to 3x DP) 
with 3x 4k60 displays
 
The testing is a mix of automated and manual tests. Manual testing includes 
(but is not limited to):
Changing display configurations and settings
Benchmark testing
Feature testing (Freesync, etc.)
 
Automated testing includes (but is not limited to):
Script testing (scripts to automate some of the manual checks)
IGT testing
 
The patchset consists of the amd-staging-drm-next branch (Head commit - 
daa21bfa14f16caef5b7d8f8938a1334c620aaf1) with new patches added on top of it. 
This branch is used for both Ubuntu and Chrome OS testing (ChromeOS on a 
bi-weekly basis).

 
Tested on Ubuntu 22.04
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Technologist | AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
amd.com

-Original Message-
From: amd-gfx  On Behalf Of Rodrigo 
Siqueira
Sent: June 17, 2022 3:35 PM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon) ; Li, Sun peng (Leo) 
; Wentland, Harry ; Zhuo, Qingqing 
(Lillian) ; Siqueira, Rodrigo 
; Li, Roman ; Chiu, Solomon 
; Zuo, Jerry ; Pillai, Aurabindo 
; Mahfooz, Hamza ; Lin, Wayne 
; Lakha, Bhawanpreet ; Gutierrez, 
Agustin ; Kotarac, Pavle 
Subject: [PATCH 00/31] DC Patches June 17, 2022

This DC patchset brings improvements in multiple areas. In summary, we
have:

- Remove unnecessary code;
- Small fixes (compilation warnings, typos, etc);
- Improvements in the DPMS code;
- Fix eDP issues;
- Improvements in the MST code.

Thanks
Siqueira

Alvin Lee (2):
  drm/amd/display: Update DPPCLK programming sequence
  drm/amd/display: Update SW state correctly for FCLK

Aric Cyr (2):
  drm/amd/display: Change initializer to single brace
  drm/amd/display: 3.2.191

Chaitanya Dhere (1):
  drm/amd/display: Implement a pme workaround function

Cruise Hung (1):
  drm/amd/display: Remove compiler warning

Dmytro Laktyushkin (1):
  drm/amd/display: Fix in dp link-training when updating payload
allocation table

George Shen (5):
  drm/amd/display: Fix in overriding DP drive settings
  drm/amd/display: Fix typo in override_lane_settings
  drm/amd/display: Handle downstream LTTPR with fixed VS sequence
  drm/amd/display: Remove unused vendor specific w/a
  drm/amd/display: Fix divide-by-zero in DPPCLK and DISPCLK calculation

Ian Chen (1):
  drm/amd/display: Drop unnecessary detect link code

JinZe.Xu (1):
  drm/amd/display: Change HDMI judgement condition.

Nicholas Choi (1):
  drm/amd/display: refactor function transmitter_to_phy_id

Qingqing Zhuo (1):
  drm/amd/display: Fix DC warning at driver load

Rodrigo Siqueira (4):
  drm/amd/display: Check minimum disp_clk and dpp_clk debug option
  drm/amd/display: Get VCO frequency from registers
  drm/amd/display: Update hook dcn32_funcs
  drm/amd/display: Drop duplicate define

Saaem Rizvi (1):
  drm/amd/display: Add SMU logging code

Sung Joon Kim (2):
  drm/amd/display: Fix eDP not light up on resume
  drm/amd/display: Turn off internal backlight when plugging external
monitor

Wayne Lin (4):
  drm/amd/display: Revert "drm/amd/display: Add flag to detect dpms
force off during HPD"
  drm/amd/display: Revert "drm/amd/display: turn DPMS off on connector
unplug"
  drm/amd/display: Release remote dc_sink under mst scenario
  drm/amd/display: Take emulated dc_sink into account for HDCP

Wenjing Liu (3):
  drm/amd/display: Enrich the log in MST payload update
  drm/amd/display: rename lane_settings to hw_lane_settings
  drm/amd/display: extract update stream allocation to link_hwss

hersen wu (1):
  drm/amd/display: add mst port output bw check

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  57 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   5 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |   8 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c|   1 +
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  18 +-
 .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  70 +-
 .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |   4 +
 .../dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c   |  12 +
 .../display/dc/clk_mgr/dcn301/dcn301_smu.c|  12 +
 .../amd/display/dc/clk_mgr/dcn31/dcn31_smu.c  |   8 +
 .../dc/clk_mgr/dcn315/dcn315_clk_mgr.c|   2 +-
 .../display/dc/clk_mgr/dcn315/dcn315_smu.c|   8 +
 ..

Re: Performance drop using deinterlace_vaapi on 5.19-rcX

2022-06-20 Thread Christian König

Am 20.06.22 um 13:40 schrieb Thomas Voegtle:

On Mon, 20 Jun 2022, Christian König wrote:


Hi Thomas,

[moving vger to bcc]

mhm, sounds like something isn't running in parallel any more.

We usually don't test the multimedia engines for this but we do test 
gfx+compute, so I'm really wondering what goes wrong here.


Could you run some tests for me? In addition to that, I'm going to 
raise that issue with our multimedia guys later today.


Yes, I can run some tests for you. Which tests?


Try this as root:

echo 1 > /sys/kernel/debug/tracing/events/dma_fence/dma_fence_init/enable
echo 1 > /sys/kernel/debug/tracing/events/dma_fence/dma_fence_signaled/enable

cat /sys/kernel/debug/tracing/trace_pipe > trace.log

Then start the encoding in another shell; after it has completed, cancel the 
cat with Ctrl+C and save the log file.


Do this once with the old kernel and once with the new one.

Regards,
Christian.




  Thomas




Re: Performance drop using deinterlace_vaapi on 5.19-rcX

2022-06-20 Thread Thomas Voegtle

On Mon, 20 Jun 2022, Christian König wrote:


Hi Thomas,

[moving vger to bcc]

mhm, sounds like something isn't running in parallel any more.

We usually don't test the multimedia engines for this but we do test 
gfx+compute, so I'm really wondering what goes wrong here.


Could you run some tests for me? In addition to that, I'm going to raise that 
issue with our multimedia guys later today.


Yes, I can run some tests for you. Which tests?


  Thomas


[PATCH 2/2] drm/radeon: Drop CONFIG_BACKLIGHT_CLASS_DEVICE ifdefs

2022-06-20 Thread Hans de Goede
The DRM_RADEON Kconfig code contains:

select BACKLIGHT_CLASS_DEVICE

So the condition these ifdefs test for is always true, drop them.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/radeon/atombios_encoders.c  | 14 --
 drivers/gpu/drm/radeon/radeon_acpi.c|  2 --
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c | 15 ---
 drivers/gpu/drm/radeon/radeon_mode.h|  4 
 4 files changed, 35 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_encoders.c 
b/drivers/gpu/drm/radeon/atombios_encoders.c
index f82577dc25e8..160a309e1048 100644
--- a/drivers/gpu/drm/radeon/atombios_encoders.c
+++ b/drivers/gpu/drm/radeon/atombios_encoders.c
@@ -143,8 +143,6 @@ atombios_set_backlight_level(struct radeon_encoder 
*radeon_encoder, u8 level)
}
 }
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 static u8 radeon_atom_bl_level(struct backlight_device *bd)
 {
u8 level;
@@ -293,18 +291,6 @@ static void radeon_atom_backlight_exit(struct 
radeon_encoder *radeon_encoder)
}
 }
 
-#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */
-
-void radeon_atom_backlight_init(struct radeon_encoder *encoder)
-{
-}
-
-static void radeon_atom_backlight_exit(struct radeon_encoder *encoder)
-{
-}
-
-#endif
-
 static bool radeon_atom_mode_fixup(struct drm_encoder *encoder,
   const struct drm_display_mode *mode,
   struct drm_display_mode *adjusted_mode)
diff --git a/drivers/gpu/drm/radeon/radeon_acpi.c 
b/drivers/gpu/drm/radeon/radeon_acpi.c
index 1baef7b493de..b603c0b77075 100644
--- a/drivers/gpu/drm/radeon/radeon_acpi.c
+++ b/drivers/gpu/drm/radeon/radeon_acpi.c
@@ -391,7 +391,6 @@ static int radeon_atif_handler(struct radeon_device *rdev,
 
radeon_set_backlight_level(rdev, enc, 
req.backlight_level);
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
if (rdev->is_atom_bios) {
struct radeon_encoder_atom_dig *dig = 
enc->enc_priv;
backlight_force_update(dig->bl_dev,
@@ -401,7 +400,6 @@ static int radeon_atif_handler(struct radeon_device *rdev,
backlight_force_update(dig->bl_dev,
   BACKLIGHT_UPDATE_HOTKEY);
}
-#endif
}
}
if (req.pending & ATIF_DGPU_DISPLAY_EVENT) {
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c 
b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
index d2180f5c80fa..1d207c76f53e 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
@@ -320,8 +320,6 @@ radeon_legacy_set_backlight_level(struct radeon_encoder 
*radeon_encoder, u8 leve
radeon_legacy_lvds_update(&radeon_encoder->base, dpms_mode);
 }
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 static uint8_t radeon_legacy_lvds_level(struct backlight_device *bd)
 {
struct radeon_backlight_privdata *pdata = bl_get_data(bd);
@@ -495,19 +493,6 @@ static void radeon_legacy_backlight_exit(struct 
radeon_encoder *radeon_encoder)
}
 }
 
-#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */
-
-void radeon_legacy_backlight_init(struct radeon_encoder *encoder)
-{
-}
-
-static void radeon_legacy_backlight_exit(struct radeon_encoder *encoder)
-{
-}
-
-#endif
-
-
 static void radeon_lvds_enc_destroy(struct drm_encoder *encoder)
 {
struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h 
b/drivers/gpu/drm/radeon/radeon_mode.h
index 3485e7f142e9..b34cffc162e2 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -281,15 +281,11 @@ struct radeon_mode_info {
 
 #define RADEON_MAX_BL_LEVEL 0xFF
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 struct radeon_backlight_privdata {
struct radeon_encoder *encoder;
uint8_t negative;
 };
 
-#endif
-
 #define MAX_H_CODE_TIMING_LEN 32
 #define MAX_V_CODE_TIMING_LEN 32
 
-- 
2.36.0
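
Why the removed `#else` branches were dead code can be shown with a small stand-alone sketch. It assumes, as the commit message states, that the driver's Kconfig entry selects BACKLIGHT_CLASS_DEVICE, so one of `CONFIG_BACKLIGHT_CLASS_DEVICE` (built-in) or `CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE` (module) is defined in every build of the driver. The macro is defined by hand here, and `toy_backlight_init()` is a hypothetical name used only for illustration.

```c
#include <stdbool.h>

/*
 * Defined by hand for this sketch. In a kernel build the generated config
 * header provides it whenever the option is selected.
 */
#define CONFIG_BACKLIGHT_CLASS_DEVICE 1

#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || \
	defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)

/* Real implementation branch: always the one compiled once selected. */
static bool toy_backlight_init(void)
{
	return true;
}

#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */

/* Empty stub: never compiled when the option is selected. */
static bool toy_backlight_init(void)
{
	return false;
}

#endif
```

Because the condition always holds, dropping the `#if`/`#else`/`#endif` lines and the stubs leaves the compiled result unchanged.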



[PATCH 1/2] drm/amdgpu: Drop CONFIG_BACKLIGHT_CLASS_DEVICE ifdefs

2022-06-20 Thread Hans de Goede
The DRM_AMDGPU Kconfig code contains:

select BACKLIGHT_CLASS_DEVICE

So the condition these ifdefs test for is always true, drop them.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c  |  6 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  4 
 drivers/gpu/drm/amd/amdgpu/atombios_encoders.c| 14 --
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +--
 4 files changed, 1 insertion(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 98ac53ee6bb5..130060834b4e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -66,9 +66,7 @@ struct amdgpu_atif {
struct amdgpu_atif_notifications notifications;
struct amdgpu_atif_functions functions;
struct amdgpu_atif_notification_cfg notification_cfg;
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
struct backlight_device *bd;
-#endif
struct amdgpu_dm_backlight_caps backlight_caps;
 };
 
@@ -436,7 +434,6 @@ static int amdgpu_atif_handler(struct amdgpu_device *adev,
DRM_DEBUG_DRIVER("ATIF: %d pending SBIOS requests\n", count);
 
if (req.pending & ATIF_PANEL_BRIGHTNESS_CHANGE_REQUEST) {
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
if (atif->bd) {
DRM_DEBUG_DRIVER("Changing brightness to %d\n",
 req.backlight_level);
@@ -447,7 +444,6 @@ static int amdgpu_atif_handler(struct amdgpu_device *adev,
 */
backlight_device_set_brightness(atif->bd, 
req.backlight_level);
}
-#endif
}
 
if (req.pending & ATIF_DGPU_DISPLAY_EVENT) {
@@ -849,7 +845,6 @@ int amdgpu_acpi_init(struct amdgpu_device *adev)
 {
struct amdgpu_atif *atif = &amdgpu_acpi_priv.atif;
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
if (atif->notifications.brightness_change) {
if (amdgpu_device_has_dc_support(adev)) {
 #if defined(CONFIG_DRM_AMD_DC)
@@ -876,7 +871,6 @@ int amdgpu_acpi_init(struct amdgpu_device *adev)
}
}
}
-#endif
adev->acpi_nb.notifier_call = amdgpu_acpi_event;
register_acpi_notifier(&adev->acpi_nb);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index f80b4838cea1..dbe2904e015b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -349,15 +349,11 @@ struct amdgpu_mode_info {
 
 #define AMDGPU_MAX_BL_LEVEL 0xFF
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 struct amdgpu_backlight_privdata {
struct amdgpu_encoder *encoder;
uint8_t negative;
 };
 
-#endif
-
 struct amdgpu_atom_ss {
uint16_t percentage;
uint16_t percentage_divider;
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c 
b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
index 8158677302fe..e2056fbbc750 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
@@ -120,8 +120,6 @@ amdgpu_atombios_encoder_set_backlight_level(struct 
amdgpu_encoder *amdgpu_encode
}
 }
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) || 
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 static u8 amdgpu_atombios_encoder_backlight_level(struct backlight_device *bd)
 {
u8 level;
@@ -263,18 +261,6 @@ amdgpu_atombios_encoder_fini_backlight(struct 
amdgpu_encoder *amdgpu_encoder)
}
 }
 
-#else /* !CONFIG_BACKLIGHT_CLASS_DEVICE */
-
-void amdgpu_atombios_encoder_init_backlight(struct amdgpu_encoder *encoder)
-{
-}
-
-void amdgpu_atombios_encoder_fini_backlight(struct amdgpu_encoder *encoder)
-{
-}
-
-#endif
-
 bool amdgpu_atombios_encoder_is_digital(struct drm_encoder *encoder)
 {
struct amdgpu_encoder *amdgpu_encoder = to_amdgpu_encoder(encoder);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6def02dec82d..87c1f7190752 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3868,9 +3868,6 @@ static int amdgpu_dm_mode_config_init(struct 
amdgpu_device *adev)
 #define AMDGPU_DM_DEFAULT_MAX_BACKLIGHT 255
 #define AUX_BL_DEFAULT_TRANSITION_TIME_MS 50
 
-#if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
-   defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
-
 static void amdgpu_dm_update_backlight_caps(struct amdgpu_display_manager *dm,
int bl_idx)
 {
@@ -4081,7 +4078,6 @@ amdgpu_dm_reg

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-20 Thread Alistair Popple


Oded Gabbay  writes:

> On Mon, Jun 20, 2022 at 3:33 AM Alistair Popple  wrote:
>>
>>
>> Oded Gabbay  writes:
>>
>> > On Fri, Jun 17, 2022 at 8:20 PM Sierra Guiza, Alejandro (Alex)
>> >  wrote:
>> >>
>> >>
>> >> On 6/17/2022 4:40 AM, David Hildenbrand wrote:
>> >> > On 31.05.22 22:00, Alex Sierra wrote:
>> >> >> Device memory that is cache coherent from device and CPU point of view.
>> >> >> This is used on platforms that have an advanced system bus (like CAPI
>> >> >> or CXL). Any page of a process can be migrated to such memory. However,
>> >> >> no one should be allowed to pin such memory so that it can always be
>> >> >> evicted.
>> >> >>
>> >> >> Signed-off-by: Alex Sierra 
>> >> >> Acked-by: Felix Kuehling 
>> >> >> Reviewed-by: Alistair Popple 
>> >> >> [hch: rebased ontop of the refcount changes,
>> >> >>removed is_dev_private_or_coherent_page]
>> >> >> Signed-off-by: Christoph Hellwig 
>> >> >> ---
>> >> >>   include/linux/memremap.h | 19 +++
>> >> >>   mm/memcontrol.c  |  7 ---
>> >> >>   mm/memory-failure.c  |  8 ++--
>> >> >>   mm/memremap.c| 10 ++
>> >> >>   mm/migrate_device.c  | 16 +++-
>> >> >>   mm/rmap.c|  5 +++--
>> >> >>   6 files changed, 49 insertions(+), 16 deletions(-)
>> >> >>
>> >> >> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>> >> >> index 8af304f6b504..9f752ebed613 100644
>> >> >> --- a/include/linux/memremap.h
>> >> >> +++ b/include/linux/memremap.h
>> >> >> @@ -41,6 +41,13 @@ struct vmem_altmap {
>> >> >>* A more complete discussion of unaddressable memory may be found in
>> >> >>* include/linux/hmm.h and Documentation/vm/hmm.rst.
>> >> >>*
>> >> >> + * MEMORY_DEVICE_COHERENT:
>> >> >> + * Device memory that is cache coherent from device and CPU point of 
>> >> >> view. This
>> >> >> + * is used on platforms that have an advanced system bus (like CAPI 
>> >> >> or CXL). A
>> >> >> + * driver can hotplug the device memory using ZONE_DEVICE and with 
>> >> >> that memory
>> >> >> + * type. Any page of a process can be migrated to such memory. 
>> >> >> However no one
>> >> > Any page might not be right, I'm pretty sure. ... just thinking about 
>> >> > special pages
>> >> > like vdso, shared zeropage, ... pinned pages ...
>> >>
>> >> Hi David,
>> >>
>> >> Yes, I think you're right. This type does not cover all special pages.
>> >> I need to correct that on the cover letter.
>> >> Pinned pages are allowed as long as they're not long term pinned.
>> >>
>> >> Regards,
>> >> Alex Sierra
>> >
>> > What if I want to hotplug this device's coherent memory, but I do
>> > *not* want the OS
>> > to migrate any page to it ?
>> > I want to fully-control what resides on this memory, as I can consider
>> > this memory
>> > "expensive". i.e. I don't have a lot of it, I want to use it for
>> > specific purposes and
>> > I don't want the OS to start using it when there is some memory pressure in
>> > the system.
>>
>> This is exactly what MEMORY_DEVICE_COHERENT is for. Device coherent
>> pages are only allocated by a device driver and exposed to user-space by
>> a driver migrating pages to them with migrate_vma. The OS can't just
>> start using them due to memory pressure for example.
>>
>>  - Alistair
> Thanks for the explanation.
>
> I guess the commit message confused me a bit, especially these two sentences:
>
> "Any page of a process can be migrated to such memory. However no one should 
> be
> allowed to pin such memory so that it can always be evicted."
>
> I read them as if the OS is free to choose which pages are migrated to
> this memory,
> and anything is eligible for migration to that memory (and that's why
> we also don't
> allow it to pin memory there).
>
> If we are not allowed to pin anything there, can the device driver
> decide to disable
> any option for oversubscription of this memory area ?

I'm not sure I follow your thinking on how oversubscription would work
here, however all allocations are controlled by the driver. So if a
device's coherent memory is full a driver would be unable to migrate
pages to that device until pages are freed by the OS due to being
unmapped or the driver evicts pages by migrating them back to normal CPU
memory.

Pinning of pages is allowed, and could prevent such migrations. However
this patch series prevents device coherent pages from being pinned
long-term (i.e. with FOLL_LONGTERM), so it should always be able to evict
pages eventually.
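
As a rough illustration of the rule just described, the following toy model distinguishes short-term from long-term pins. It is a sketch under stated assumptions, not kernel code: `enum toy_page_kind`, `TOY_PIN_LONGTERM` and `toy_can_pin_in_place()` are hypothetical names, and how the series actually enforces the rule is not shown in this message.

```c
#include <stdbool.h>

/* Hypothetical page kinds; not the kernel's memory types. */
enum toy_page_kind {
	TOY_PAGE_SYSTEM,
	TOY_PAGE_DEVICE_COHERENT,
};

/* Stands in for FOLL_LONGTERM in this sketch. */
#define TOY_PIN_LONGTERM 0x1u

/*
 * Returns true if a pin with the given flags may be taken on the page
 * where it currently resides. Only the combination "device coherent page
 * plus long-term pin" is refused; short-term pins are allowed everywhere.
 */
static bool toy_can_pin_in_place(enum toy_page_kind kind, unsigned int flags)
{
	if (kind == TOY_PAGE_DEVICE_COHERENT && (flags & TOY_PIN_LONGTERM))
		return false;
	return true;
}
```

Since only short-term pins can land on device coherent pages in this model, every pin is eventually dropped and the driver can evict the page afterwards.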

> Let's assume the user uses this memory area for doing p2p with other
> CXL devices.
> In that case, I wouldn't want the driver/OS to migrate pages in and
> out of that memory...

The OS will not migrate pages in or out (although it may free them if no
longer required), but a driver might choose to. So at the moment it's
really up to the driver to implement what you want in this regard.

> So either I should let the user pin those pages, or prevent him from
> doing (accidently or n

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-20 Thread Oded Gabbay
On Mon, Jun 20, 2022 at 11:50 AM Alistair Popple  wrote:
>
>
> Oded Gabbay  writes:
>
> > On Mon, Jun 20, 2022 at 3:33 AM Alistair Popple  wrote:
> >>
> >>
> >> Oded Gabbay  writes:
> >>
> >> > On Fri, Jun 17, 2022 at 8:20 PM Sierra Guiza, Alejandro (Alex)
> >> >  wrote:
> >> >>
> >> >>
> >> >> On 6/17/2022 4:40 AM, David Hildenbrand wrote:
> >> >> > On 31.05.22 22:00, Alex Sierra wrote:
> >> >> >> Device memory that is cache coherent from device and CPU point of 
> >> >> >> view.
> >> >> >> This is used on platforms that have an advanced system bus (like CAPI
> >> >> >> or CXL). Any page of a process can be migrated to such memory. 
> >> >> >> However,
> >> >> >> no one should be allowed to pin such memory so that it can always be
> >> >> >> evicted.
> >> >> >>
> >> >> >> Signed-off-by: Alex Sierra 
> >> >> >> Acked-by: Felix Kuehling 
> >> >> >> Reviewed-by: Alistair Popple 
> >> >> >> [hch: rebased ontop of the refcount changes,
> >> >> >>removed is_dev_private_or_coherent_page]
> >> >> >> Signed-off-by: Christoph Hellwig 
> >> >> >> ---
> >> >> >>   include/linux/memremap.h | 19 +++
> >> >> >>   mm/memcontrol.c  |  7 ---
> >> >> >>   mm/memory-failure.c  |  8 ++--
> >> >> >>   mm/memremap.c| 10 ++
> >> >> >>   mm/migrate_device.c  | 16 +++-
> >> >> >>   mm/rmap.c|  5 +++--
> >> >> >>   6 files changed, 49 insertions(+), 16 deletions(-)
> >> >> >>
> >> >> >> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> >> >> >> index 8af304f6b504..9f752ebed613 100644
> >> >> >> --- a/include/linux/memremap.h
> >> >> >> +++ b/include/linux/memremap.h
> >> >> >> @@ -41,6 +41,13 @@ struct vmem_altmap {
> >> >> >>* A more complete discussion of unaddressable memory may be found 
> >> >> >> in
> >> >> >>* include/linux/hmm.h and Documentation/vm/hmm.rst.
> >> >> >>*
> >> >> >> + * MEMORY_DEVICE_COHERENT:
> >> >> >> + * Device memory that is cache coherent from device and CPU point 
> >> >> >> of view. This
> >> >> >> + * is used on platforms that have an advanced system bus (like CAPI 
> >> >> >> or CXL). A
> >> >> >> + * driver can hotplug the device memory using ZONE_DEVICE and with 
> >> >> >> that memory
> >> >> >> + * type. Any page of a process can be migrated to such memory. 
> >> >> >> However no one
> >> >> > Any page might not be right, I'm pretty sure. ... just thinking about 
> >> >> > special pages
> >> >> > like vdso, shared zeropage, ... pinned pages ...
> >> >>
> >> >> Hi David,
> >> >>
> >> >> Yes, I think you're right. This type does not cover all special pages.
> >> >> I need to correct that on the cover letter.
> >> >> Pinned pages are allowed as long as they're not long term pinned.
> >> >>
> >> >> Regards,
> >> >> Alex Sierra
> >> >
> >> > What if I want to hotplug this device's coherent memory, but I do
> >> > *not* want the OS
> >> > to migrate any page to it ?
> >> > I want to fully-control what resides on this memory, as I can consider
> >> > this memory
> >> > "expensive". i.e. I don't have a lot of it, I want to use it for
> >> > specific purposes and
> >> > I don't want the OS to start using it when there is some memory pressure 
> >> > in
> >> > the system.
> >>
> >> This is exactly what MEMORY_DEVICE_COHERENT is for. Device coherent
> >> pages are only allocated by a device driver and exposed to user-space by
> >> a driver migrating pages to them with migrate_vma. The OS can't just
> >> start using them due to memory pressure for example.
> >>
> >>  - Alistair
> > Thanks for the explanation.
> >
> > I guess the commit message confused me a bit, especially these two 
> > sentences:
> >
> > "Any page of a process can be migrated to such memory. However no one 
> > should be
> > allowed to pin such memory so that it can always be evicted."
> >
> > I read them as if the OS is free to choose which pages are migrated to
> > this memory,
> > and anything is eligible for migration to that memory (and that's why
> > we also don't
> > allow it to pin memory there).
> >
> > If we are not allowed to pin anything there, can the device driver
> > decide to disable
> > any option for oversubscription of this memory area ?
>
> I'm not sure I follow your thinking on how oversubscription would work
> here, however all allocations are controlled by the driver. So if a
> device's coherent memory is full a driver would be unable to migrate
> pages to that device until pages are freed by the OS due to being
> unmapped or the driver evicts pages by migrating them back to normal CPU
> memory.
>
> Pinning of pages is allowed, and could prevent such migrations. However
> this patch series prevents device coherent pages from being pinned
> longterm (ie. with FOLL_LONGTERM), so it should always be able to evict
> pages eventually.
>
> > Let's assume the user uses this memory area for doing p2p with other
> > CXL devices.
> > In that case, I wouldn't want the driver/OS to migrate pages in and
> 

[PATCH] drm/amdgpu: add LSDMA block for LSDMA v6.0.1

2022-06-20 Thread Yifan Zhang
This patch adds LSDMA ip block for LSDMA v6.0.1.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 11bbd76c581c..37234c2998d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -2333,6 +2333,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
 
switch (adev->ip_versions[LSDMA_HWIP][0]) {
case IP_VERSION(6, 0, 0):
+   case IP_VERSION(6, 0, 1):
case IP_VERSION(6, 0, 2):
adev->lsdma.funcs = &lsdma_v6_0_funcs;
break;
-- 
2.35.1



Re: Performance drop using deinterlace_vaapi on 5.19-rcX

2022-06-20 Thread Christian König

Hi Thomas,

[moving vger to bcc]

mhm, sounds like something isn't running in parallel any more.

We usually don't test the multimedia engines for this but we do test 
gfx+compute, so I'm really wondering what goes wrong here.


Could you run some tests for me? In addition to that, I'm going to raise 
that issue with our multimedia guys later today.


Thanks for the info,
Christian.

Am 18.06.22 um 18:13 schrieb Thomas Voegtle:


Hello,

I noticed a performance drop encoding an MPEG file to an H.264 video using
the vaapi option deinterlace_vaapi on a Haswell i5-4570 with Linux
5.19-rc1.

A 10 minute long video normally takes 41s to convert; now with 5.19-rc1
it takes about 2m 36s.

My ffmpeg line is:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi -i test.vdr -vf 'deinterlace_vaapi' -c:v
h264_vaapi

Removing the option deinterlace_vaapi shows no difference in 
performance between 5.18 and 5.19-rcX.



I bisected this down to:

commit 047a1b877ed48098bed71fcfb1d4891e1b54441d
Author: Christian König 
Date:   Tue Nov 23 09:33:07 2021 +0100

    dma-buf & drm/amdgpu: remove dma_resv workaround


and wasn't able to revert this one on top of 5.19-rcX.

I tried the predecessor commit:

commit 73511edf8b196e6f1ccda0fdf294ff57aa2dc9db (HEAD)
Author: Christian König 
Date:   Tue Nov 9 11:08:18 2021 +0100

    dma-buf: specify usage while adding fences to dma_resv obj v7

which is fine.

Using ffmpeg 5.0.1 with libva 2.10.0 and intel vaapi driver 2.4.1


 Best regards,

    Thomas




Re: radeon driver warning

2022-06-20 Thread Christian König

Am 17.06.22 um 16:22 schrieb John Garry:

Hi Christian,


Am 17.06.22 um 14:01 schrieb John Garry:

On 17/06/2022 12:57, Christian König wrote:


And/Or compile out the warning when "warnings = errors"?


That should be doable I think.


ok, if something can be done then I would appreciate it. I do a lot of 
randconfig builds as part of my upstream process and anything 
breaking is a bit of a pain.


I've just double checked the code and we have already wrapped the 
warning into "#ifndef CONFIG_COMPILE_TEST".


Yes



So the question is why does your random config not set 
CONFIG_COMPILE_TEST?


My randconfig does not have CONFIG_COMPILE_TEST set - see attached. 
AFAIK randconfig does not always set CONFIG_COMPILE_TEST.


Mhm, we could probably change the ifdef. But a random configuration 
which doesn't set CONFIG_COMPILE_TEST sounds like a bug to me as well.


Going to provide a patch for changing the ifdef, but not sure when I 
will have time for that.


Regards,
Christian.



Thanks,
John




[PATCH] drm/amdkfd: correct sdma queue number of sdma 6.0.1

2022-06-20 Thread Yifan Zhang
sdma 6.0.1 has 8 queues instead of 2.

Fixes: 2f68559102cb (drm/amdkfd: add GC 11.0.1 KFD support)
Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index bf4200457772..c8fee0dbfdcb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -75,7 +75,6 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd)
case IP_VERSION(5, 2, 3):/* YELLOW_CARP */
case IP_VERSION(5, 2, 6):/* GC 10.3.6 */
case IP_VERSION(5, 2, 7):/* GC 10.3.7 */
-   case IP_VERSION(6, 0, 1):
kfd->device_info.num_sdma_queues_per_engine = 2;
break;
case IP_VERSION(4, 2, 0):/* VEGA20 */
@@ -90,6 +89,7 @@ static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd)
case IP_VERSION(5, 2, 4):/* DIMGREY_CAVEFISH */
case IP_VERSION(5, 2, 5):/* BEIGE_GOBY */
case IP_VERSION(6, 0, 0):
+   case IP_VERSION(6, 0, 1):
case IP_VERSION(6, 0, 2):
kfd->device_info.num_sdma_queues_per_engine = 8;
break;
-- 
2.35.1
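
The effect of moving the case label can be checked with a small stand-alone sketch. It covers only the versions visible in the two hunks above; `toy_sdma_queues_per_engine()` is a hypothetical helper, and the `IP_VERSION` packing shown here is assumed to match the driver's macro.

```c
#include <stdint.h>

/* Assumed packing of major/minor/revision into a single integer. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/*
 * Hypothetical helper: queues per SDMA engine for the versions visible in
 * the hunks, after the patch is applied. Returns 0 for any version the
 * sketch does not cover.
 */
static unsigned int toy_sdma_queues_per_engine(uint32_t sdma_ver)
{
	switch (sdma_ver) {
	case IP_VERSION(5, 2, 3):	/* YELLOW_CARP */
	case IP_VERSION(5, 2, 6):	/* GC 10.3.6 */
	case IP_VERSION(5, 2, 7):	/* GC 10.3.7 */
		return 2;
	case IP_VERSION(5, 2, 4):	/* DIMGREY_CAVEFISH */
	case IP_VERSION(5, 2, 5):	/* BEIGE_GOBY */
	case IP_VERSION(6, 0, 0):
	case IP_VERSION(6, 0, 1):	/* moved into this group by the patch */
	case IP_VERSION(6, 0, 2):
		return 8;
	default:
		return 0;		/* not covered by this sketch */
	}
}
```

Calling `toy_sdma_queues_per_engine(IP_VERSION(6, 0, 1))` returns 8 in this model, while the versions left in the first group still return 2.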



Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-20 Thread Alistair Popple


Oded Gabbay  writes:

> On Fri, Jun 17, 2022 at 8:20 PM Sierra Guiza, Alejandro (Alex)
>  wrote:
>>
>>
>> On 6/17/2022 4:40 AM, David Hildenbrand wrote:
>> > On 31.05.22 22:00, Alex Sierra wrote:
>> >> Device memory that is cache coherent from device and CPU point of view.
>> >> This is used on platforms that have an advanced system bus (like CAPI
>> >> or CXL). Any page of a process can be migrated to such memory. However,
>> >> no one should be allowed to pin such memory so that it can always be
>> >> evicted.
>> >>
>> >> Signed-off-by: Alex Sierra 
>> >> Acked-by: Felix Kuehling 
>> >> Reviewed-by: Alistair Popple 
>> >> [hch: rebased ontop of the refcount changes,
>> >>removed is_dev_private_or_coherent_page]
>> >> Signed-off-by: Christoph Hellwig 
>> >> ---
>> >>   include/linux/memremap.h | 19 +++
>> >>   mm/memcontrol.c  |  7 ---
>> >>   mm/memory-failure.c  |  8 ++--
>> >>   mm/memremap.c| 10 ++
>> >>   mm/migrate_device.c  | 16 +++-
>> >>   mm/rmap.c|  5 +++--
>> >>   6 files changed, 49 insertions(+), 16 deletions(-)
>> >>
>> >> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>> >> index 8af304f6b504..9f752ebed613 100644
>> >> --- a/include/linux/memremap.h
>> >> +++ b/include/linux/memremap.h
>> >> @@ -41,6 +41,13 @@ struct vmem_altmap {
>> >>* A more complete discussion of unaddressable memory may be found in
>> >>* include/linux/hmm.h and Documentation/vm/hmm.rst.
>> >>*
>> >> + * MEMORY_DEVICE_COHERENT:
>> >> + * Device memory that is cache coherent from device and CPU point of view. This
>> >> + * is used on platforms that have an advanced system bus (like CAPI or CXL). A
>> >> + * driver can hotplug the device memory using ZONE_DEVICE and with that memory
>> >> + * type. Any page of a process can be migrated to such memory. However no one
>> > Any page might not be right, I'm pretty sure. ... just thinking about
>> > special pages like vdso, shared zeropage, ... pinned pages ...
>>
>> Hi David,
>>
>> Yes, I think you're right. This type does not cover all special pages.
>> I need to correct that on the cover letter.
>> Pinned pages are allowed as long as they're not long term pinned.
>>
>> Regards,
>> Alex Sierra
>
> What if I want to hotplug this device's coherent memory, but I do *not*
> want the OS to migrate any page to it?
> I want to fully control what resides on this memory, as I can consider
> this memory "expensive", i.e. I don't have a lot of it, I want to use it
> for specific purposes, and I don't want the OS to start using it when
> there is some memory pressure in the system.

This is exactly what MEMORY_DEVICE_COHERENT is for. Device coherent
pages are only allocated by a device driver and exposed to user-space by
a driver migrating pages to them with migrate_vma. The OS can't just
start using them due to memory pressure for example.

 - Alistair

> Oded
>
>>
>> >
>> >> + * should be allowed to pin such memory so that it can always be evicted.
>> >> + *
>> >>* MEMORY_DEVICE_FS_DAX:
>> >>* Host memory that has similar access semantics as System RAM i.e. DMA
>> >>* coherent and supports page pinning. In support of coordinating page
>> >> @@ -61,6 +68,7 @@ struct vmem_altmap {
>> >>   enum memory_type {
>> >>  /* 0 is reserved to catch uninitialized type fields */
>> >>  MEMORY_DEVICE_PRIVATE = 1,
>> >> +MEMORY_DEVICE_COHERENT,
>> >>  MEMORY_DEVICE_FS_DAX,
>> >>  MEMORY_DEVICE_GENERIC,
>> >>  MEMORY_DEVICE_PCI_P2PDMA,
>> >> @@ -143,6 +151,17 @@ static inline bool folio_is_device_private(const struct folio *folio)
>> > In general, this LGTM, and it should be correct with PageAnonExclusive I think.
>> >
>> >
>> > However, where exactly is pinning forbidden?
>>
>> Long-term pinning is forbidden since it would interfere with the device
>> memory manager owning the device-coherent pages (e.g. evictions in TTM).
>> However, normal pinning is allowed on this device type.
>>
>> Regards,
>> Alex Sierra
>>
>> >
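
The rules stated in this thread can be summarised in a small stand-alone
C sketch. It is a toy model only: the enum is copied from the quoted hunk,
while every toy_* name is hypothetical and none of this is kernel code. It
encodes just the two statements made above, namely that only driver
migration places pages in device-coherent memory, and that long-term
pinning is refused while normal pinning is allowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Enum as it reads after the quoted hunk is applied. */
enum memory_type {
	/* 0 is reserved to catch uninitialized type fields */
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

/* Hypothetical names below, for illustration only. */
enum toy_pin_kind { TOY_PIN_SHORT_TERM, TOY_PIN_LONG_TERM };
enum toy_placer { TOY_PLACER_DRIVER_MIGRATION, TOY_PLACER_OS_MEMORY_PRESSURE };

/* Per the thread: normal pinning is fine, long-term pinning is not. */
static bool toy_coherent_pin_allowed(enum toy_pin_kind kind)
{
	return kind != TOY_PIN_LONG_TERM;
}

/* Per the thread: only a driver migrating pages populates this memory. */
static bool toy_coherent_placement_allowed(enum toy_placer who)
{
	return who == TOY_PLACER_DRIVER_MIGRATION;
}

/* Exercises the two rules and the numeric value the new entry receives. */
static void toy_coherent_self_check(void)
{
	assert(MEMORY_DEVICE_COHERENT == 2);
	assert(toy_coherent_pin_allowed(TOY_PIN_SHORT_TERM));
	assert(!toy_coherent_pin_allowed(TOY_PIN_LONG_TERM));
	assert(toy_coherent_placement_allowed(TOY_PLACER_DRIVER_MIGRATION));
	assert(!toy_coherent_placement_allowed(TOY_PLACER_OS_MEMORY_PRESSURE));
}
```

The sketch says nothing about pinning rules for the other memory types,
and it does not show where the real kernel enforces these checks.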


Performance drop using deinterlace_vaapi on 5.19-rcX

2022-06-20 Thread Thomas Voegtle


Hello,

I noticed a performance drop when encoding an MPEG file to an H.264 video
using the vaapi option deinterlace_vaapi on a Haswell i5-4570 with Linux
5.19-rc1.

A 10-minute video normally takes 41s to convert; with 5.19-rc1 it now
takes about 2m 36s.

My ffmpeg line is:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi -i test.vdr -vf 'deinterlace_vaapi' -c:v
h264_vaapi

Removing the option deinterlace_vaapi shows no difference in performance 
between 5.18 and 5.19-rcX.



I bisected this down to:

commit 047a1b877ed48098bed71fcfb1d4891e1b54441d
Author: Christian König 
Date:   Tue Nov 23 09:33:07 2021 +0100

dma-buf & drm/amdgpu: remove dma_resv workaround


and I wasn't able to revert it on top of 5.19-rcX.

I tried the predecessor commit:

commit 73511edf8b196e6f1ccda0fdf294ff57aa2dc9db (HEAD)
Author: Christian König 
Date:   Tue Nov 9 11:08:18 2021 +0100

dma-buf: specify usage while adding fences to dma_resv obj v7

which is fine.

I am using ffmpeg 5.0.1 with libva 2.10.0 and the Intel vaapi driver 2.4.1.


 Best regards,

Thomas

[PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-06-20 Thread Guenter Roeck
ppc:allmodconfig builds fail with the following error.

powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
uses hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
uses soft float
powerpc64-linux-ld:
failed to merge target specific data of file
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
uses hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
uses soft float
powerpc64-linux-ld:
failed to merge target specific data of
file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
powerpc64-linux-ld:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
uses hard float,
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
uses soft float
powerpc64-linux-ld:
failed to merge target specific data of file
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o

The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
64-bit outline-only KASAN support") which adds support for KASAN. This
commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
KCOV_ENABLE_COMPARISONS are no longer enabled. As a result, new files are
compiled which lack the selection of hard-float.

Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
Cc: Michael Ellerman 
Cc: Daniel Axtens 
Signed-off-by: Guenter Roeck 
---
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile  | 4 
 drivers/gpu/drm/amd/display/dc/dcn315/Makefile | 4 
 drivers/gpu/drm/amd/display/dc/dcn316/Makefile | 4 
 3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index ec041e3cda30..74be02114ae4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -15,6 +15,10 @@ DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o dcn31_hubp.o
dcn31_apg.o dcn31_hpo_dp_stream_encoder.o dcn31_hpo_dp_link_encoder.o \
dcn31_afmt.o dcn31_vpg.o
 
+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -mhard-float -maltivec
+endif
+
 AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_DCN31)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
index 59381d24800b..1395c1ced8c5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
@@ -25,6 +25,10 @@
 
 DCN315 = dcn315_resource.o
 
+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o := -mhard-float -maltivec
+endif
+
 AMD_DAL_DCN315 = $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_DCN315)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
index 819d44a9439b..c3d2dd78f1e2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
@@ -25,6 +25,10 @@
 
 DCN316 = dcn316_resource.o
 
+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o := -mhard-float -maltivec
+endif
+
 AMD_DAL_DCN316 = $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_DCN316)
-- 
2.35.1