Re: [PATCH 3/3] drm/amdgpu: wire up the can_remove() callback

2024-02-09 Thread Daniel Vetter
On Tue, Feb 06, 2024 at 07:42:49PM +0100, Christian König wrote:
> Am 06.02.24 um 15:29 schrieb Daniel Vetter:
> > On Fri, Feb 02, 2024 at 03:40:03PM -0800, Greg Kroah-Hartman wrote:
> > > On Fri, Feb 02, 2024 at 05:25:56PM -0500, Hamza Mahfooz wrote:
> > > > Removing an amdgpu device that still has user space references allocated
> > > > to it causes undefined behaviour.
> > > Then fix that please.  There should not be anything special about your
> > > hardware that all of the tens of thousands of other devices can't handle
> > > today.
> > > 
> > > What happens when I yank your device out of a system with a pci hotplug
> > > bus?  You can't prevent that either, so this should not be any different
> > > at all.
> > > 
> > > sorry, but please, just fix your driver.
> > fwiw Christian König from amd already rejected this too, I have no idea
> > why this was submitted
> 
> Well that was my fault.
> 
> I commented on an internal bug tracker that if sysfs bind/unbind is a
> different code path from PCI remove/re-scan we could try to reject it.
> 
> Turned out it isn't a different code path.

Yeah it's exactly the same code, and removing the sysfs stuff means we
can't test hotunplug without physically hot-unplugging stuff anymore. So
really not great - if one is buggy so is the other, and sysfs allows us to
control the timing a lot better to hit specific issues.
-Sima

> >   since the very elaborate plan I developed with a
> > bunch of amd folks was to fix the various lifetime lolz we still have in
> > drm. We unfortunately export the world of internal objects to userspace as
> > uabi objects with dma_buf, dma_fence and everything else, but it's all
> > fixable and we have the plan even documented:
> > 
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-hot-unplug
> > 
> > So yeah anything that isn't that plan of record is very much no-go for drm
> > drivers. Unless we change that plan of course, but that needs a
> > documentation patch first and a big discussion.
> > 
> > Aside from an absolute massive pile of kernel-internal refcounting bugs
> > the really big one we agreed on after a lot of discussion is that SIGBUS
> > on dma-buf mmaps is no-go for drm drivers, because it would break way too
> > much userspace in ways which are simply not fixable (since sig handlers
> > are shared in a process, which means the gl/vk driver cannot use it).
> > 
> > Otherwise it's bog standard "fix the kernel bugs" work, just a lot of it.
> 
> Ignoring a few memory leaks because of messed up refcounting we actually got
> that working quite nicely.
> 
> At least hot unplug / hot add seems to be working rather reliably in our
> internal testing.
> 
> So it can't be that messed up.
> 
> Regards,
> Christian.
> 
> > 
> > Cheers, Sima
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-09 Thread Daniel Vetter
On Thu, Feb 08, 2024 at 11:57:11AM +0200, Jani Nikula wrote:
> On Wed, 07 Feb 2024, Mario Limonciello  wrote:
> > Some manufacturers have intentionally put an EDID that differs from
> > the EDID on the internal panel on laptops.  Drivers can call this
> > helper to attempt to fetch the EDID from the BIOS's ACPI _DDC method.
> >
> > Signed-off-by: Mario Limonciello 
> > ---
> >  drivers/gpu/drm/Kconfig|  5 +++
> >  drivers/gpu/drm/drm_edid.c | 77 ++
> >  include/drm/drm_edid.h |  1 +
> >  3 files changed, 83 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 6ec33d36f3a4..ec2bb71e8b36 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -21,6 +21,11 @@ menuconfig DRM
> > select KCMP
> > select VIDEO_CMDLINE
> > select VIDEO_NOMODESET
> > +   select ACPI_VIDEO if ACPI
> > +   select BACKLIGHT_CLASS_DEVICE if ACPI
> > +   select INPUT if ACPI
> > +   select X86_PLATFORM_DEVICES if ACPI && X86
> > +   select ACPI_WMI if ACPI && X86
> 
> I think I'll defer to drm maintainers on whether this is okay or
> something to be avoided.

Uh yeah this is a bit much, and select just messes with everything. Just
#ifdef this in the code with a dummy alternative, if users configure their
kernel without acpi but need it, they get to keep all the pieces.

Alternatively make a DRM_ACPI_HELPERS symbol, but imo a Kconfig for every
function is also not great. And just using #ifdef in the code also works
for CONFIG_OF, which is exactly the same thing for platforms using dt to
describe hw.

Also I'd expect ACPI code to already provide dummy functions if ACPI is
disabled, so you probably don't even need all that much #ifdef in the code.

What we defo can't do is select platform/hw stuff just because you enable
CONFIG_DRM.
-Sima
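
The "#ifdef with a dummy alternative" approach suggested above could look roughly like the sketch below. It is a userspace mock-up, not the actual patch: CONFIG_ACPI is the real Kconfig symbol, but the structs are reduced to opaque types and demo_edid_read_acpi() is a hypothetical name chosen for illustration.

```c
#include <stddef.h>

/* Opaque stand-ins for the kernel types (illustration only). */
struct drm_edid;
struct drm_connector;

#ifdef CONFIG_ACPI
/* Real implementation, built only when ACPI support is enabled. */
const struct drm_edid *demo_edid_read_acpi(struct drm_connector *connector);
#else
/*
 * Dummy alternative: callers compile and link unchanged and are simply
 * told that no EDID is available.
 */
static inline const struct drm_edid *
demo_edid_read_acpi(struct drm_connector *connector)
{
	(void)connector;
	return NULL;
}
#endif
```

With this shape no "select" lines are needed in Kconfig; a kernel built without ACPI just never finds an ACPI-provided EDID.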

> 
> 
> > help
> >   Kernel-level support for the Direct Rendering Infrastructure (DRI)
> >   introduced in XFree86 4.0. If you say Y here, you need to select
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 923c4423151c..c649b4f9fd8e 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -28,6 +28,7 @@
> >   * DEALINGS IN THE SOFTWARE.
> >   */
> >  
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -2188,6 +2189,49 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned 
> > int block, size_t len)
> > return ret == xfers ? 0 : -1;
> >  }
> >  
> > +/**
> > + * drm_do_probe_acpi_edid() - get EDID information via ACPI _DDC
> > + * @data: struct drm_device
> > + * @buf: EDID data buffer to be filled
> > + * @block: 128 byte EDID block to start fetching from
> > + * @len: EDID data buffer length to fetch
> > + *
> > + * Try to fetch EDID information by calling acpi_video_get_edid() function.
> > + *
> > + * Return: 0 on success or error code on failure.
> > + */
> > +static int
> > +drm_do_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t len)
> > +{
> > +   struct drm_device *ddev = data;
> > +   struct acpi_device *acpidev = ACPI_COMPANION(ddev->dev);
> > +   unsigned char start = block * EDID_LENGTH;
> > +   void *edid;
> > +   int r;
> > +
> > +   if (!acpidev)
> > +   return -ENODEV;
> > +
> > +   /* fetch the entire edid from BIOS */
> > +   r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, &edid);
> > +   if (r < 0) {
> > +   DRM_DEBUG_KMS("Failed to get EDID from ACPI: %d\n", r);
> > +   return -EINVAL;
> > +   }
> > +   if (len > r || start > r || start + len > r) {
> > +   r = -EINVAL;
> > +   goto cleanup;
> > +   }
> > +
> > +   memcpy(buf, edid + start, len);
> > +   r = 0;
> > +
> > +cleanup:
> > +   kfree(edid);
> > +
> > +   return r;
> > +}
> > +
> >  static void connector_bad_edid(struct drm_connector *connector,
> >const struct edid *edid, int num_blocks)
> >  {
> > @@ -2643,6 +2687,39 @@ struct edid *drm_get_edid(struct drm_connector 
> > *connector,
> >  }
> >  EXPORT_SYMBOL(drm_get_edid);
> >  
> > +/**
> > + * drm_get_acpi_edid - get EDID data, if available
> 
> I'd prefer all the new EDID API to be named drm_edid_*. Makes a clean
> break from the old API, and is more consistent.
> 
> So perhaps drm_edid_read_acpi() to be in line with all the other struct
> drm_edid based EDID reading functions.
> 
> > + * @connector: connector we're probing
> > + *
> > + * Use the BIOS to attempt to grab EDID data if possible.
> > + *
> > + * The returned pointer must be freed using drm_edid_free().
> > + *
> > + * Return: Pointer to valid EDID or NULL if we couldn't find any.
> > + */
> > +const struct drm_edid *drm_get_acpi_edid(struct drm_connector *connector)
> > +{
> > +   const struct drm_edid *drm_edid;
> > +
> > +   switch (connector->connector_type) {
> > +   case DRM_MODE_CONNECTOR_LVDS:
> > +   case DRM_MODE_CONNECTOR_eDP:
> > +   break;
> > +   default:
> > +   re

Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Arunpravin Paneer Selvam




On 2/8/2024 7:47 PM, Matthew Auld wrote:

On 08/02/2024 13:47, Arunpravin Paneer Selvam wrote:

Hi Matthew,

On 2/8/2024 7:00 PM, Matthew Auld wrote:

On 07/02/2024 17:44, Arunpravin Paneer Selvam wrote:

A few users have observed display corruption when booting
to KDE Plasma or playing games. We have root-caused the
problem: whenever alloc_range() couldn't find the required
memory blocks, the function was returning SUCCESS in some
corner cases.


Can you please give an example here?

In the try-hard contiguous allocation, suppose for example that the requested
memory is 1024 pages. It might pick the highest and last block (of size 512
pages) in the freelist, where no more space exists in the total address range.
In this kind of corner case, alloc_range was returning success even though the
allocated size was less than the requested size. Hence in
try_hard_contiguous_allocation we do not proceed to the LHS allocation, and we
return with only the RHS allocation of 512 pages. This leads to display
corruption in many use cases (I think mainly when a contiguous huge buffer is
requested), mainly on APU platforms.


Ok, I guess the other thing is doing:

lhs_offset = drm_buddy_block_offset(block) - lhs_size;

I presume it's possible for block_offset < lhs_size here, which might
be funny?

Yes, it seems it is possible. I will modify the lhs_offset calculation and
send the patch for review.
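
To make the concern about block_offset < lhs_size concrete: both values are unsigned 64-bit, so the subtraction would wrap around to a huge offset rather than go negative. The sketch below shows one way to guard against that. The helper name demo_lhs_offset() is hypothetical, and how the real follow-up patch handles this case is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): compute the start of the
 * left-hand-side range, refusing to wrap below zero.
 */
static bool demo_lhs_offset(uint64_t block_offset, uint64_t lhs_size,
			    uint64_t *lhs_offset)
{
	if (block_offset < lhs_size)
		return false;	/* the u64 subtraction would wrap around */

	*lhs_offset = block_offset - lhs_size;
	return true;
}
```

A caller would treat a false return as "no room on the left" and fail the contiguous allocation instead of using a wrapped offset.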


Thanks,
Arun.




Thanks,
Arun.


The right approach is for the function to return -ENOSPC
if the total allocated size is less than the required
size.

Gitlab ticket link - 
https://gitlab.freedesktop.org/drm/amd/-/issues/3097
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory 
allocation")
Signed-off-by: Arunpravin Paneer Selvam 


Tested-by: Mario Limonciello 
---
  drivers/gpu/drm/drm_buddy.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f57e6d74fb0e..c1a99bf4dffd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
  } while (1);
    list_splice_tail(&allocated, blocks);
+
+    if (total_allocated < size) {
+    err = -ENOSPC;
+    goto err_free;
+    }
+
  return 0;
    err_undo:






RE: [PATCH] drm/amd/display: Fix && vs || typos

2024-02-09 Thread Koo, Anthony
[AMD Official Use Only - General]

Reviewed-by: Anthony Koo 

Looks good, my mistake for not noticing this!

Thanks,
Anthony

-----Original Message-----
From: Dan Carpenter 
Sent: Friday, February 9, 2024 8:03 AM
To: SHANMUGAM, SRINIVASAN 
Cc: Wentland, Harry ; Li, Sun peng (Leo) 
; Siqueira, Rodrigo ; Deucher, 
Alexander ; Koenig, Christian 
; Pan, Xinhui ; David Airlie 
; Daniel Vetter ; Kazlauskas, Nicholas 
; Koo, Anthony ; Pavic, Josip 
; Huang, Leon ; Adhuri, Mounika 
; Huang, Lewis ; 
amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; 
linux-ker...@vger.kernel.org; kernel-janit...@vger.kernel.org
Subject: [PATCH] drm/amd/display: Fix && vs || typos

These ANDs should be ORs or it will lead to a NULL dereference.

Fixes: fb5a3d037082 ("drm/amd/display: Add NULL test for 'timing generator' in 
'dcn21_set_pipe()'")
Fixes: 886571d217d7 ("drm/amd/display: Fix 'panel_cntl' could be null in 
'dcn21_set_backlight_level()'")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
index 5c7f380a84f9..7252f5f781f0 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
@@ -211,7 +211,7 @@ void dcn21_set_pipe(struct pipe_ctx *pipe_ctx)
struct dmcu *dmcu = pipe_ctx->stream->ctx->dc->res_pool->dmcu;
uint32_t otg_inst;

-   if (!abm && !tg && !panel_cntl)
+   if (!abm || !tg || !panel_cntl)
return;

otg_inst = tg->inst;
@@ -245,7 +245,7 @@ bool dcn21_set_backlight_level(struct pipe_ctx *pipe_ctx,
struct panel_cntl *panel_cntl = pipe_ctx->stream->link->panel_cntl;
uint32_t otg_inst;

-   if (!abm && !tg && !panel_cntl)
+   if (!abm || !tg || !panel_cntl)
return false;

otg_inst = tg->inst;
--
2.43.0



[PATCH] drm/amd/display: Fix && vs || typos

2024-02-09 Thread Dan Carpenter
These ANDs should be ORs or it will lead to a NULL dereference.

Fixes: fb5a3d037082 ("drm/amd/display: Add NULL test for 'timing generator' in 
'dcn21_set_pipe()'")
Fixes: 886571d217d7 ("drm/amd/display: Fix 'panel_cntl' could be null in 
'dcn21_set_backlight_level()'")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
index 5c7f380a84f9..7252f5f781f0 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
@@ -211,7 +211,7 @@ void dcn21_set_pipe(struct pipe_ctx *pipe_ctx)
struct dmcu *dmcu = pipe_ctx->stream->ctx->dc->res_pool->dmcu;
uint32_t otg_inst;
 
-   if (!abm && !tg && !panel_cntl)
+   if (!abm || !tg || !panel_cntl)
return;
 
otg_inst = tg->inst;
@@ -245,7 +245,7 @@ bool dcn21_set_backlight_level(struct pipe_ctx *pipe_ctx,
struct panel_cntl *panel_cntl = pipe_ctx->stream->link->panel_cntl;
uint32_t otg_inst;
 
-   if (!abm && !tg && !panel_cntl)
+   if (!abm || !tg || !panel_cntl)
return false;
 
otg_inst = tg->inst;
-- 
2.43.0
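
Why the && form fails as a NULL guard can be seen in isolation. The sketch below is a userspace illustration with hypothetical function names: the AND form only triggers the early return when all three pointers are NULL, so a single NULL pointer slips through and is dereferenced later.

```c
#include <stdbool.h>
#include <stddef.h>

/* Buggy guard: true only when ALL three pointers are NULL. */
static bool demo_guard_and(const void *abm, const void *tg,
			   const void *panel_cntl)
{
	return !abm && !tg && !panel_cntl;
}

/* Fixed guard: true as soon as ANY of the pointers is NULL. */
static bool demo_guard_or(const void *abm, const void *tg,
			  const void *panel_cntl)
{
	return !abm || !tg || !panel_cntl;
}
```

With tg == NULL and the other two valid, the AND guard does not bail out, and the following "otg_inst = tg->inst;" would dereference NULL; the OR guard returns early.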



[PATCH] drm/amd/display: Fix && vs || in 'edp_set_replay_allow_active()'

2024-02-09 Thread Srinivasan Shanmugam
AND should be OR or it will lead to a NULL dereference.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:895
 edp_set_replay_allow_active() error: we previously assumed 'replay' could be 
null (see line 887)

Fixes: c7ddc0a800bc ("drm/amd/display: Add Functions to enable Freesync Panel 
Replay")
Cc: Bhawanpreet Lakha 
Cc: Harry Wentland 
Cc: Rodrigo Siqueira 
Cc: Aurabindo Pillai 
Signed-off-by: Srinivasan Shanmugam 
---
 .../drm/amd/display/dc/link/protocols/link_edp_panel_control.c  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/drivers/gpu/drm/amd/display/dc/link/protocols/link_edp_panel_control.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_edp_panel_control.c
index 443215b96308..77648228ec60 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_edp_panel_control.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_edp_panel_control.c
@@ -884,7 +884,7 @@ bool edp_set_replay_allow_active(struct dc_link *link, 
const bool *allow_active,
struct dmub_replay *replay = dc->res_pool->replay;
unsigned int panel_inst;
 
-   if (replay == NULL && force_static)
+   if (!replay || force_static)
return false;
 
if (!dc_get_edp_link_panel_inst(dc, link, &panel_inst))
-- 
2.34.1



[PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Arunpravin Paneer Selvam
A few users have observed display corruption when booting
to KDE Plasma or playing games. We have root-caused the
problem: whenever alloc_range() couldn't find the required
memory blocks, the function was returning SUCCESS in some
corner cases.

The right approach is for the function to return -ENOSPC
if the total allocated size is less than the required
size.

Cc:   # 6.7+
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
Tested-by: Mario Limonciello 
Link: 
https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
Reviewed-by: Matthew Auld 
Signed-off-by: Arunpravin Paneer Selvam 
---
 drivers/gpu/drm/drm_buddy.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f57e6d74fb0e..c1a99bf4dffd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
} while (1);
 
list_splice_tail(&allocated, blocks);
+
+   if (total_allocated < size) {
+   err = -ENOSPC;
+   goto err_free;
+   }
+
return 0;
 
 err_undo:
-- 
2.25.1
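
The effect of the added check can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not drm_buddy: demo_alloc_range() is a hypothetical function that just sums block sizes from a list, standing in for the real block walk.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model (hypothetical, not drm_buddy): take free blocks in order
 * until the request is covered or the list runs out.
 */
static int demo_alloc_range(const uint64_t *free_blocks, size_t nr_blocks,
			    uint64_t size, uint64_t *total_allocated)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nr_blocks && total < size; i++)
		total += free_blocks[i];

	*total_allocated = total;

	/* The check the patch adds: a short allocation is a failure. */
	if (total < size)
		return -ENOSPC;

	return 0;
}
```

With a single 512-page block and a 1024-page request, the model returns -ENOSPC instead of reporting success with half the memory, which is the corner case described in the commit message. The real function additionally frees the partially allocated blocks on this path.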



RE: [PATCH] drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()'

2024-02-09 Thread Li, Roman
[Public]

> -----Original Message-----
> From: SHANMUGAM, SRINIVASAN 
> Sent: Tuesday, February 6, 2024 11:55 PM
> To: Siqueira, Rodrigo ; Pillai, Aurabindo
> 
> Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
> ; Li, Roman 
> Subject: [PATCH] drm/amd/display: Fix possible buffer overflow in
> 'find_dcfclk_for_voltage()'
>
> The 'find_dcfclk_for_voltage()' function loops over
> VG_NUM_SOC_VOLTAGE_LEVELS (which is 8), but the size of the DcfClocks
> array is VG_NUM_DCFCLK_DPM_LEVELS (which is 7).
>
> When the loop variable i reaches 7, the function tries to access
> clock_table->DcfClocks[7]. However, since the size of the DcfClocks array
> is 7, the valid indices are 0 to 6. Index 7 is beyond the end of the array,
> leading to a buffer overflow.
>
> Fixes the below:
> drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.c:550
> find_dcfclk_for_voltage() error: buffer overflow 'clock_table->DcfClocks' 7 <= 7

I recommend mentioning that this is a static analysis tool error.
With that:
Reviewed-by: Roman Li 

>
> Fixes: 3a83e4e64bb1 ("drm/amd/display: Add dcn3.01 support to DC (v2)")
> Cc: Roman Li 
> Cc: Rodrigo Siqueira 
> Cc: Aurabindo Pillai 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> index a5489fe6875f..aa9fd1dc550a 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> @@ -546,6 +546,8 @@ static unsigned int find_dcfclk_for_voltage(const
> struct vg_dpm_clocks *clock_ta
>   int i;
>
>   for (i = 0; i < VG_NUM_SOC_VOLTAGE_LEVELS; i++) {
> + if (i >= VG_NUM_DCFCLK_DPM_LEVELS)
> + break;
>   if (clock_table->SocVoltage[i] == voltage)
>   return clock_table->DcfClocks[i];
>   }
> --
> 2.34.1



RE: [PATCH] drm/amd/display: Initialize 'wait_time_microsec' variable in link_dp_training_dpia.c

2024-02-09 Thread Li, Roman
[Public]

Reviewed-by: Roman Li 

> -----Original Message-----
> From: amd-gfx  On Behalf Of
> Srinivasan Shanmugam
> Sent: Tuesday, February 6, 2024 11:55 PM
> To: Siqueira, Rodrigo ; Pillai, Aurabindo
> 
> Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
> ; Liu, Wenjing
> 
> Subject: [PATCH] drm/amd/display: Initialize 'wait_time_microsec' variable in
> link_dp_training_dpia.c
>
> wait_time_microsec = max(wait_time_microsec, (uint32_t)
> DPIA_CLK_SYNC_DELAY);
>
> The above line assigns the maximum of 'wait_time_microsec' and
> 'DPIA_CLK_SYNC_DELAY' to wait_time_microsec. However,
> 'wait_time_microsec' has not been assigned a value before this line, so
> initialize 'wait_time_microsec' at the point of declaration.
>
> Fixes the below:
> drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training_dpia.c:697
> dpia_training_eq_non_transparent() error: uninitialized symbol
> 'wait_time_microsec'.
>
> Fixes: 630168a97314 ("drm/amd/display: move dp link training logic to
> link_dp_training")
> Cc: Wenjing Liu 
> Cc: Rodrigo Siqueira 
> Cc: Aurabindo Pillai 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  .../drm/amd/display/dc/link/protocols/link_dp_training_dpia.c   | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git
> a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_dpia.c
> b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_dpia.c
> index e8dda44b23cb..5d36bab0029c 100644
> --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_dpia.c
> +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_dpia.c
> @@ -619,7 +619,7 @@ static enum link_training_result
> dpia_training_eq_non_transparent(
>   uint32_t retries_eq = 0;
>   enum dc_status status;
>   enum dc_dp_training_pattern tr_pattern;
> - uint32_t wait_time_microsec;
> + uint32_t wait_time_microsec = 0;
>   enum dc_lane_count lane_count = lt_settings->link_settings.lane_count;
>   union lane_align_status_updated dpcd_lane_status_updated = {0};
>   union lane_status dpcd_lane_status[LANE_COUNT_DP_MAX] = {0};
> --
> 2.34.1
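
The pattern being fixed, a variable that feeds max() while being assigned only on some paths, can be shown in a small userspace sketch. Everything here is hypothetical: the function names, the have_dpcd_value flag and the DEMO_CLK_SYNC_DELAY value are assumptions for illustration and are not taken from the driver.

```c
#include <stdint.h>

#define DEMO_CLK_SYNC_DELAY 16000u	/* assumed value, illustration only */

static uint32_t demo_max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Return the delay to wait, never less than the sync delay. */
static uint32_t demo_wait_time(int have_dpcd_value, uint32_t dpcd_value)
{
	uint32_t wait_time_microsec = 0;	/* defined on every path */

	if (have_dpcd_value)
		wait_time_microsec = dpcd_value;

	return demo_max_u32(wait_time_microsec, DEMO_CLK_SYNC_DELAY);
}
```

Initializing to 0 is harmless in this shape because the max() clamps the result to the sync delay whenever no larger value was read.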



RE: [PATCH] drm/amd/display: Fix possible use of uninitialized 'max_chunks_fbc_mode' in 'calculate_bandwidth()'

2024-02-09 Thread Li, Roman
[Public]

Reviewed-by: Roman Li 

> -----Original Message-----
> From: amd-gfx  On Behalf Of
> Srinivasan Shanmugam
> Sent: Tuesday, February 6, 2024 11:55 PM
> To: Siqueira, Rodrigo ; Pillai, Aurabindo
> 
> Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
> ; Wentland, Harry
> ; Deucher, Alexander
> 
> Subject: [PATCH] drm/amd/display: Fix possible use of uninitialized
> 'max_chunks_fbc_mode' in 'calculate_bandwidth()'
>
> 'max_chunks_fbc_mode' is declared without an initializer and is only
> assigned a value under a specific condition, in the following lines:
>
> if (data->fbc_en[i] == 1) {
>   max_chunks_fbc_mode = 128 - dmif_chunk_buff_margin;
> }
>
> If 'data->fbc_en[i]' is not equal to 1 for any i, max_chunks_fbc_mode is
> still uninitialized when it is used after the for loop.
>
> Ensure that 'max_chunks_fbc_mode' is properly initialized before it's used.
> Initialize it to a default value at its declaration so that it has a value
> on all possible control flow paths.
>
> Thus fixing the below:
> drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:914
> calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.
> drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:917
> calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.
>
> Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
> Cc: Harry Wentland 
> Cc: Alex Deucher 
> Cc: Rodrigo Siqueira 
> Cc: Aurabindo Pillai 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
> b/drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
> index f2dfa96f9ef5..39530b2ea495 100644
> --- a/drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
> +++ b/drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
> @@ -94,7 +94,7 @@ static void calculate_bandwidth(
>   const uint32_t s_high = 7;
>   const uint32_t dmif_chunk_buff_margin = 1;
>
> - uint32_t max_chunks_fbc_mode;
> + uint32_t max_chunks_fbc_mode = 0;
>   int32_t num_cursor_lines;
>
>   int32_t i, j, k;
> --
> 2.34.1



[PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Arunpravin Paneer Selvam
A few users have observed display corruption when booting
to KDE Plasma or playing games. We have root-caused the
problem: whenever alloc_range() couldn't find the required
memory blocks, the function was returning SUCCESS in some
corner cases.

The right approach is for the function to return -ENOSPC
if the total allocated size is less than the required
size.

Cc:   # 6.7+
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
Tested-by: Mario Limonciello 
Link: 
https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
Acked-by: Christian König 
Reviewed-by: Matthew Auld 
Signed-off-by: Arunpravin Paneer Selvam 
---
 drivers/gpu/drm/drm_buddy.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f57e6d74fb0e..c1a99bf4dffd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
} while (1);
 
list_splice_tail(&allocated, blocks);
+
+   if (total_allocated < size) {
+   err = -ENOSPC;
+   goto err_free;
+   }
+
return 0;
 
 err_undo:
-- 
2.25.1



Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-09 Thread Mario Limonciello

On 2/9/2024 05:07, Daniel Vetter wrote:

On Thu, Feb 08, 2024 at 11:57:11AM +0200, Jani Nikula wrote:

On Wed, 07 Feb 2024, Mario Limonciello  wrote:

Some manufacturers have intentionally put an EDID that differs from
the EDID on the internal panel on laptops.  Drivers can call this
helper to attempt to fetch the EDID from the BIOS's ACPI _DDC method.

Signed-off-by: Mario Limonciello 
---
  drivers/gpu/drm/Kconfig|  5 +++
  drivers/gpu/drm/drm_edid.c | 77 ++
  include/drm/drm_edid.h |  1 +
  3 files changed, 83 insertions(+)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 6ec33d36f3a4..ec2bb71e8b36 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -21,6 +21,11 @@ menuconfig DRM
select KCMP
select VIDEO_CMDLINE
select VIDEO_NOMODESET
+   select ACPI_VIDEO if ACPI
+   select BACKLIGHT_CLASS_DEVICE if ACPI
+   select INPUT if ACPI
+   select X86_PLATFORM_DEVICES if ACPI && X86
+   select ACPI_WMI if ACPI && X86


I think I'll defer to drm maintainers on whether this is okay or
something to be avoided.


Uh yeah this is a bit much, and select just messes with everything. Just
#ifdef this in the code with a dummy alternative, if users configure their
kernel without acpi but need it, they get to keep all the pieces.

Alternatively make a DRM_ACPI_HELPERS symbol, but imo a Kconfig for every
function is also not great. And just using #ifdef in the code also works
for CONFIG_OF, which is exactly the same thing for platforms using dt to
describe hw.

Also I'd expect ACPI code to already provide dummy functions if ACPI is
disabled, so you probably don't even need all that much #ifdef in the code.

What we defo can't do is select platform/hw stuff just because you enable
CONFIG_DRM.
-Sima


The problem was with linking.  I'll experiment with #ifdef for the next 
version.








help
  Kernel-level support for the Direct Rendering Infrastructure (DRI)
  introduced in XFree86 4.0. If you say Y here, you need to select
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 923c4423151c..c649b4f9fd8e 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -28,6 +28,7 @@
   * DEALINGS IN THE SOFTWARE.
   */
  
+#include 

  #include 
  #include 
  #include 
@@ -2188,6 +2189,49 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned int 
block, size_t len)
return ret == xfers ? 0 : -1;
  }
  
+/**

+ * drm_do_probe_acpi_edid() - get EDID information via ACPI _DDC
+ * @data: struct drm_device
+ * @buf: EDID data buffer to be filled
+ * @block: 128 byte EDID block to start fetching from
+ * @len: EDID data buffer length to fetch
+ *
+ * Try to fetch EDID information by calling acpi_video_get_edid() function.
+ *
+ * Return: 0 on success or error code on failure.
+ */
+static int
+drm_do_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t len)
+{
+   struct drm_device *ddev = data;
+   struct acpi_device *acpidev = ACPI_COMPANION(ddev->dev);
+   unsigned char start = block * EDID_LENGTH;
+   void *edid;
+   int r;
+
+   if (!acpidev)
+   return -ENODEV;
+
+   /* fetch the entire edid from BIOS */
+   r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, &edid);
+   if (r < 0) {
+   DRM_DEBUG_KMS("Failed to get EDID from ACPI: %d\n", r);
+   return -EINVAL;
+   }
+   if (len > r || start > r || start + len > r) {
+   r = -EINVAL;
+   goto cleanup;
+   }
+
+   memcpy(buf, edid + start, len);
+   r = 0;
+
+cleanup:
+   kfree(edid);
+
+   return r;
+}
+
  static void connector_bad_edid(struct drm_connector *connector,
   const struct edid *edid, int num_blocks)
  {
@@ -2643,6 +2687,39 @@ struct edid *drm_get_edid(struct drm_connector 
*connector,
  }
  EXPORT_SYMBOL(drm_get_edid);
  
+/**

+ * drm_get_acpi_edid - get EDID data, if available


I'd prefer all the new EDID API to be named drm_edid_*. Makes a clean
break from the old API, and is more consistent.

So perhaps drm_edid_read_acpi() to be in line with all the other struct
drm_edid based EDID reading functions.


+ * @connector: connector we're probing
+ *
+ * Use the BIOS to attempt to grab EDID data if possible.
+ *
+ * The returned pointer must be freed using drm_edid_free().
+ *
+ * Return: Pointer to valid EDID or NULL if we couldn't find any.
+ */
+const struct drm_edid *drm_get_acpi_edid(struct drm_connector *connector)
+{
+   const struct drm_edid *drm_edid;
+
+   switch (connector->connector_type) {
+   case DRM_MODE_CONNECTOR_LVDS:
+   case DRM_MODE_CONNECTOR_eDP:
+   break;
+   default:
+   return NULL;
+   }
+
+   if (connector->force == DRM_FORCE_OFF)
+   return NULL;
+
+   drm_edid = drm_edid_read_custom(connector, drm_do_prob

Re: [PATCH] drm/amd/display: Fix && vs || typos

2024-02-09 Thread Hamza Mahfooz

On 2/9/24 08:02, Dan Carpenter wrote:

These ANDs should be ORs or it will lead to a NULL dereference.

Fixes: fb5a3d037082 ("drm/amd/display: Add NULL test for 'timing generator' in 
'dcn21_set_pipe()'")
Fixes: 886571d217d7 ("drm/amd/display: Fix 'panel_cntl' could be null in 
'dcn21_set_backlight_level()'")
Signed-off-by: Dan Carpenter 


Applied, thanks!


---
  drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
index 5c7f380a84f9..7252f5f781f0 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
@@ -211,7 +211,7 @@ void dcn21_set_pipe(struct pipe_ctx *pipe_ctx)
struct dmcu *dmcu = pipe_ctx->stream->ctx->dc->res_pool->dmcu;
uint32_t otg_inst;
  
-	if (!abm && !tg && !panel_cntl)

+   if (!abm || !tg || !panel_cntl)
return;
  
  	otg_inst = tg->inst;

@@ -245,7 +245,7 @@ bool dcn21_set_backlight_level(struct pipe_ctx *pipe_ctx,
struct panel_cntl *panel_cntl = pipe_ctx->stream->link->panel_cntl;
uint32_t otg_inst;
  
-	if (!abm && !tg && !panel_cntl)

+   if (!abm || !tg || !panel_cntl)
return false;
  
  	otg_inst = tg->inst;

--
Hamza



Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Daniel Vetter
On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> A few users have observed display corruption when booting
> to KDE Plasma or playing games. We have root-caused the
> problem: whenever alloc_range() couldn't find the required
> memory blocks, the function was returning SUCCESS in some
> corner cases.
> 
> The right approach is for the function to return -ENOSPC
> if the total allocated size is less than the required
> size.
> 
> Cc:   # 6.7+
> Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> Tested-by: Mario Limonciello 
> Link: 
> https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
> Acked-by: Christian König 
> Reviewed-by: Matthew Auld 
> Signed-off-by: Arunpravin Paneer Selvam 

A new unit test for this would be most excellent - these kinds of missed edge
cases are exactly what kunit is for. Can you please follow up with one, since
we don't want to hold up the bugfix any longer?
-Sima

> ---
>  drivers/gpu/drm/drm_buddy.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index f57e6d74fb0e..c1a99bf4dffd 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
>   } while (1);
>  
>   list_splice_tail(&allocated, blocks);
> +
> + if (total_allocated < size) {
> + err = -ENOSPC;
> + goto err_free;
> + }
> +
>   return 0;
>  
>  err_undo:
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
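
The check added by the patch can be shown with a small user-space model.
This is a sketch under assumptions, not the drm_buddy code: toy_pool and
toy_alloc_range() are hypothetical names, and the pool is just an array of
block sizes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 8

/* Hypothetical stand-in for the set of free blocks in a range. */
struct toy_pool {
	size_t free_blocks[TOY_MAX_BLOCKS];	/* sizes of the free blocks */
	size_t nr_free;
};

/*
 * Take blocks from the front of the pool until @size is covered or the
 * pool runs dry. Returns 0 on success and -ENOSPC when the request could
 * only be covered in part; in that case the pool is left untouched.
 */
static int toy_alloc_range(struct toy_pool *pool, size_t size, size_t *allocated)
{
	size_t total_allocated = 0;
	size_t taken = 0;
	size_t i;

	assert(pool->nr_free <= TOY_MAX_BLOCKS);

	while (total_allocated < size && taken < pool->nr_free)
		total_allocated += pool->free_blocks[taken++];

	/* The corner case from the patch: a short allocation is a failure. */
	if (total_allocated < size) {
		*allocated = 0;
		return -ENOSPC;
	}

	/* Commit: drop the taken blocks from the front of the pool. */
	for (i = taken; i < pool->nr_free; i++)
		pool->free_blocks[i - taken] = pool->free_blocks[i];
	pool->nr_free -= taken;
	*allocated = total_allocated;
	return 0;
}
```

A unit test for the corner case would ask for more than the pool holds and
expect -ENOSPC with nothing handed out, then ask for an amount that fits and
expect success.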


Re: [PATCH] drm/radeon/ni: Fix wrong firmware size logging in ni_init_microcode()

2024-02-09 Thread Alex Deucher
Applied.  Thanks!

On Tue, Feb 6, 2024 at 11:48 AM Nikita Zhandarovich
 wrote:
>
> Clean up a typo in pr_err() erroneously printing NI MC 'rdev->mc_fw->size'
> during SMC firmware load. Log 'rdev->smc_fw->size' instead.
>
> Found by Linux Verification Center (linuxtesting.org) with static
> analysis tool SVACE.
>
> Fixes: 6596afd48af4 ("drm/radeon/kms: add dpm support for btc (v3)")
> Signed-off-by: Nikita Zhandarovich 
> ---
>  drivers/gpu/drm/radeon/ni.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
> index 927e5f42e97d..3e48cbb522a1 100644
> --- a/drivers/gpu/drm/radeon/ni.c
> +++ b/drivers/gpu/drm/radeon/ni.c
> @@ -813,7 +813,7 @@ int ni_init_microcode(struct radeon_device *rdev)
> err = 0;
> } else if (rdev->smc_fw->size != smc_req_size) {
> pr_err("ni_mc: Bogus length %zu in firmware \"%s\"\n",
> -  rdev->mc_fw->size, fw_name);
> +  rdev->smc_fw->size, fw_name);
> err = -EINVAL;
> }
> }
> --
> 2.25.1
>


Re: [PATCH] drm/amd/display: fix NULL checks for adev->dm.dc in amdgpu_dm_fini()

2024-02-09 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Feb 6, 2024 at 11:51 AM Nikita Zhandarovich
 wrote:
>
> Since 'adev->dm.dc' in amdgpu_dm_fini() might turn out to be NULL
> before the call to dc_enable_dmub_notifications(), check
> beforehand to ensure there will not be a possible NULL-ptr-deref
> there.
>
> Also, since commit 1e88eb1b2c25 ("drm/amd/display: Drop
> CONFIG_DRM_AMD_DC_HDCP") there are two separate checks for NULL in
> 'adev->dm.dc' before dc_deinit_callbacks() and dc_dmub_srv_destroy().
> Clean up by combining them all under one 'if'.
>
> Found by Linux Verification Center (linuxtesting.org) with static
> analysis tool SVACE.
>
> Fixes: 81927e2808be ("drm/amd/display: Support for DMUB AUX")
> Signed-off-by: Nikita Zhandarovich 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c| 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index d292f290cd6e..46ac3e6f42bb 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1938,17 +1938,15 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
> adev->dm.hdcp_workqueue = NULL;
> }
>
> -   if (adev->dm.dc)
> +   if (adev->dm.dc) {
> dc_deinit_callbacks(adev->dm.dc);
> -
> -   if (adev->dm.dc)
> dc_dmub_srv_destroy(&adev->dm.dc->ctx->dmub_srv);
> -
> -   if (dc_enable_dmub_notifications(adev->dm.dc)) {
> -   kfree(adev->dm.dmub_notify);
> -   adev->dm.dmub_notify = NULL;
> -   destroy_workqueue(adev->dm.delayed_hpd_wq);
> -   adev->dm.delayed_hpd_wq = NULL;
> +   if (dc_enable_dmub_notifications(adev->dm.dc)) {
> +   kfree(adev->dm.dmub_notify);
> +   adev->dm.dmub_notify = NULL;
> +   destroy_workqueue(adev->dm.delayed_hpd_wq);
> +   adev->dm.delayed_hpd_wq = NULL;
> +   }
> }
>
> if (adev->dm.dmub_bo)
> --
> 2.25.1
>
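
The cleanup described in the commit message, one NULL guard covering every
step that needs the pointer, can be sketched as follows. The toy_dc and
toy_dm types and the helper names are hypothetical stand-ins, not the amdgpu
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the display core and display manager state. */
struct toy_dc {
	bool dmub_notifications;
};

struct toy_dm {
	struct toy_dc *dc;
	int callbacks_live;
	int notify_live;
};

/* Dereferences @dc, so callers must never pass NULL. */
static bool toy_notifications_enabled(const struct toy_dc *dc)
{
	assert(dc != NULL);
	return dc->dmub_notifications;
}

/* One guard covers every teardown step that needs a valid dc pointer. */
static void toy_dm_fini(struct toy_dm *dm)
{
	if (dm->dc) {
		dm->callbacks_live = 0;
		if (toy_notifications_enabled(dm->dc))
			dm->notify_live = 0;
	}
}
```

Nesting the notification check inside the guard is what keeps the helper
from ever seeing a NULL pointer.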


Re: [PATCH v3 3/9] drm/ci: mediatek: Add job to test panfrost and powervr GPU driver

2024-02-09 Thread Helen Koike




On 30/01/2024 12:03, Vignesh Raman wrote:

For mediatek mt8173, the GPU driver is powervr and for mediatek
mt8183, the GPU driver is panfrost. So add support in drm-ci to
test panfrost and powervr GPU driver for mediatek SOCs and update
xfails. Powervr driver was merged in linux kernel, but there's no
mediatek support yet. So disable the mt8173-gpu job which uses
powervr driver.

Add panfrost specific tests to testlist and skip KMS tests for
panfrost driver since it is not a KMS driver. Also update
the MAINTAINERS file to include xfails for panfrost driver.

Signed-off-by: Vignesh Raman 


Hi Vignesh, thanks for your work.

I'm still wondering about a few things, please check below.


---

v2:
   - Add panfrost and PVR GPU jobs for mediatek SOC with new xfails, add xfail
 entry to MAINTAINERS.


Maybe we should review how the xfails files are named. I think they 
should start with the DRIVER_NAME instead of GPU_VERSION.


For instance, consider the following job:

mediatek:mt8183-gpu:
  extends:
- .mt8183
  variables:
GPU_VERSION: mediatek-mt8183-gpu
DRIVER_NAME: panfrost

And we have mediatek-mt8183-gpu-skips.txt

If there is an error, we want to notify the panfrost driver maintainers 
(and maybe not the mediatek driver maintainers), so MAINTAINERS file 
doesn't correspond to this.


How about a naming __ ?

powervr_mediatek-mt8173_gpu-skips.txt
mediatek_mediatek-mt8173_display-skips.txt
panfrost_mediatek-mt8183_gpu-skips.txt
mediatek_mediatek-mt8183_display-skips.txt
...

What do you think?

Thanks
Helen





v3:
   - Add panfrost specific tests to testlist and skip KMS tests for
 panfrost driver since it is not a KMS driver and update xfails.
 Update the MAINTAINERS file to include xfails for panfrost driver.
 Add the job name in GPU_VERSION and use it for xfail file names instead
 of using DRIVER_NAME.

---
  MAINTAINERS|  1 +
  drivers/gpu/drm/ci/test.yml| 18 ++
  drivers/gpu/drm/ci/testlist.txt| 16 
  .../ci/xfails/mediatek-mt8183-gpu-skips.txt|  2 ++
  4 files changed, 37 insertions(+)
  create mode 100644 drivers/gpu/drm/ci/xfails/mediatek-mt8183-gpu-skips.txt

diff --git a/MAINTAINERS b/MAINTAINERS
index 9d959a6881f7..bcdc17d1aa26 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1645,6 +1645,7 @@ L:dri-de...@lists.freedesktop.org
  S:Supported
  T:git git://anongit.freedesktop.org/drm/drm-misc
  F:Documentation/gpu/panfrost.rst
+F: drivers/gpu/drm/ci/xfails/panfrost*
  F:drivers/gpu/drm/panfrost/
  F:include/uapi/drm/panfrost_drm.h
  
diff --git a/drivers/gpu/drm/ci/test.yml b/drivers/gpu/drm/ci/test.yml

index 0cd44e6ea18b..e153c5a7ad80 100644
--- a/drivers/gpu/drm/ci/test.yml
+++ b/drivers/gpu/drm/ci/test.yml
@@ -299,6 +299,17 @@ amdgpu:stoney:
  DEVICE_TYPE: mt8183-kukui-jacuzzi-juniper-sku16
  RUNNER_TAG: mesa-ci-x86-64-lava-mt8183-kukui-jacuzzi-juniper-sku16
  
+mediatek:mt8173-gpu:

+  extends:
+- .mt8173
+  variables:
+GPU_VERSION: mediatek-mt8173-gpu
+DRIVER_NAME: powervr
+  rules:
+# TODO: powervr driver was merged in linux kernel, but there's no mediatek support yet
+# Remove the rule once mediatek support is added for powervr
+- when: never
+
  mediatek:mt8173-display:
extends:
  - .mt8173
@@ -306,6 +317,13 @@ mediatek:mt8173-display:
  GPU_VERSION: mediatek-mt8173-display
  DRIVER_NAME: mediatek
  
+mediatek:mt8183-gpu:

+  extends:
+- .mt8183
+  variables:
+GPU_VERSION: mediatek-mt8183-gpu
+DRIVER_NAME: panfrost
+
  mediatek:mt8183-display:
extends:
  - .mt8183
diff --git a/drivers/gpu/drm/ci/testlist.txt b/drivers/gpu/drm/ci/testlist.txt
index eaeb751bb0ad..772fc025b1f8 100644
--- a/drivers/gpu/drm/ci/testlist.txt
+++ b/drivers/gpu/drm/ci/testlist.txt
@@ -2959,3 +2959,19 @@ msm_submit@invalid-duplicate-bo-submit
  msm_submit@invalid-cmd-idx-submit
  msm_submit@invalid-cmd-type-submit
  msm_submit@valid-submit
+panfrost_get_param@base-params
+panfrost_get_param@get-bad-param
+panfrost_get_param@get-bad-padding
+panfrost_gem_new@gem-new-4096
+panfrost_gem_new@gem-new-0
+panfrost_gem_new@gem-new-zeroed
+panfrost_prime@gem-prime-import
+panfrost_submit@pan-submit
+panfrost_submit@pan-submit-error-no-jc
+panfrost_submit@pan-submit-error-bad-in-syncs
+panfrost_submit@pan-submit-error-bad-bo-handles
+panfrost_submit@pan-submit-error-bad-requirements
+panfrost_submit@pan-submit-error-bad-out-sync
+panfrost_submit@pan-reset
+panfrost_submit@pan-submit-and-close
+panfrost_submit@pan-unhandled-pagefault
diff --git a/drivers/gpu/drm/ci/xfails/mediatek-mt8183-gpu-skips.txt 
b/drivers/gpu/drm/ci/xfails/mediatek-mt8183-gpu-skips.txt
new file mode 100644
index ..2ea09d1648bc
--- /dev/null
+++ b/drivers/gpu/drm/ci/xfails/mediatek-mt8183-gpu-skips.txt
@@ -0,0 +1,2 @@
+# Panfrost is not a KMS driver, so skip the KMS tests
+kms_.*


Re: [PATCH v3 1/9] drm/ci: arm64.config: Enable CONFIG_DRM_ANALOGIX_ANX7625

2024-02-09 Thread Helen Koike




On 30/01/2024 12:03, Vignesh Raman wrote:

Enable CONFIG_DRM_ANALOGIX_ANX7625 in the arm64 defconfig to get
display driver probed on the mt8183-kukui-jacuzzi-juniper machine.

arch/arm64/configs/defconfig has CONFIG_DRM_ANALOGIX_ANX7625=m,
but drm-ci doesn't have an initrd with modules, so add
CONFIG_DRM_ANALOGIX_ANX7625=y in CI arm64 config.


Couldn't you load the module as it is done on
https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/ci/igt_runner.sh#n35 
?


This is not a blocker, in any case

Acked-by: Helen Koike 

Thanks
Helen



Signed-off-by: Vignesh Raman 
---

v2:
   - No changes

v3:
   - No changes

---
  drivers/gpu/drm/ci/arm64.config | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ci/arm64.config b/drivers/gpu/drm/ci/arm64.config
index 8dbce9919a57..37d23fd7a367 100644
--- a/drivers/gpu/drm/ci/arm64.config
+++ b/drivers/gpu/drm/ci/arm64.config
@@ -187,6 +187,7 @@ CONFIG_MTK_DEVAPC=y
  CONFIG_PWM_MTK_DISP=y
  CONFIG_MTK_CMDQ=y
  CONFIG_REGULATOR_DA9211=y
+CONFIG_DRM_ANALOGIX_ANX7625=y
  
  # For nouveau.  Note that DRM must be a module so that it's loaded after NFS is up to provide the firmware.

  CONFIG_ARCH_TEGRA=y


Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Arunpravin Paneer Selvam

Hi Daniel,

On 2/9/2024 11:34 PM, Daniel Vetter wrote:

On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:

A few users have observed display corruption when booting
the machine to KDE Plasma or playing games. We have root
caused the problem: whenever alloc_range() couldn't
find the required memory blocks, the function returned
SUCCESS in some of the corner cases.

The right approach is to return -ENOSPC when the total
allocated size is less than the required size.

Cc:   # 6.7+
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
Tested-by: Mario Limonciello 
Link: 
https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
Acked-by: Christian König 
Reviewed-by: Matthew Auld 
Signed-off-by: Arunpravin Paneer Selvam 

A new unit test for this would be most excellent - these kinds of missed edge
cases are exactly what kunit is for. Can you please follow up with one, since
we don't want to hold up the bugfix any longer?

Matthew Auld has added a new unit test for this case. Please let us know 
if this will suffice.

https://patchwork.freedesktop.org/patch/577497/?series=129671&rev=1

Thanks,
Arun.

-Sima


---
  drivers/gpu/drm/drm_buddy.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f57e6d74fb0e..c1a99bf4dffd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
} while (1);
  
  	list_splice_tail(&allocated, blocks);

+
+   if (total_allocated < size) {
+   err = -ENOSPC;
+   goto err_free;
+   }
+
return 0;
  
  err_undo:

--
2.25.1





Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-09 Thread Daniel Vetter
On Fri, Feb 09, 2024 at 09:34:13AM -0600, Mario Limonciello wrote:
> On 2/9/2024 05:07, Daniel Vetter wrote:
> > On Thu, Feb 08, 2024 at 11:57:11AM +0200, Jani Nikula wrote:
> > > On Wed, 07 Feb 2024, Mario Limonciello  wrote:
> > > > Some manufacturers have intentionally put an EDID that differs from
> > > > the EDID on the internal panel on laptops.  Drivers can call this
> > > > helper to attempt to fetch the EDID from the BIOS's ACPI _DDC method.
> > > > 
> > > > Signed-off-by: Mario Limonciello 
> > > > ---
> > > >   drivers/gpu/drm/Kconfig|  5 +++
> > > >   drivers/gpu/drm/drm_edid.c | 77 ++
> > > >   include/drm/drm_edid.h |  1 +
> > > >   3 files changed, 83 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > > > index 6ec33d36f3a4..ec2bb71e8b36 100644
> > > > --- a/drivers/gpu/drm/Kconfig
> > > > +++ b/drivers/gpu/drm/Kconfig
> > > > @@ -21,6 +21,11 @@ menuconfig DRM
> > > > select KCMP
> > > > select VIDEO_CMDLINE
> > > > select VIDEO_NOMODESET
> > > > +   select ACPI_VIDEO if ACPI
> > > > +   select BACKLIGHT_CLASS_DEVICE if ACPI
> > > > +   select INPUT if ACPI
> > > > +   select X86_PLATFORM_DEVICES if ACPI && X86
> > > > +   select ACPI_WMI if ACPI && X86
> > > 
> > > I think I'll defer to drm maintainers on whether this is okay or
> > > something to be avoided.
> > 
> > Uh yeah this is a bit much, and select just messes with everything. Just
> > #ifdef this in the code with a dummy alternative, if users configure their
> > kernel without acpi but need it, they get to keep all the pieces.
> > 
> > Alternatively make a DRM_ACPI_HELPERS symbol, but imo a Kconfig for every
> > function is also not great. And just using #ifdef in the code also works
> > for CONFIG_OF, which is exactly the same thing for platforms using dt to
> > describe hw.
> > 
> > Also I'd expect ACPI code to already provide dummy functions if ACPI is
> > not enabled, so you probably don't even need all that much #ifdef in the
> > code.
> > 
> > What we defo can't do is select platform/hw stuff just because you enable
> > CONFIG_DRM.
> > -Sima
> 
> The problem was with linking.  I'll experiment with #ifdef for the next
> version.

Ah yes, if e.g. acpi is a module but drm is built-in then it will compile,
but not link.

You need

depends on (ACPI || ACPI=n)

for this. Looks a bit funny but works for all combinations.

Since this gets messy it might be useful to have a DRM_ACPI_HELPERS Kconfig
that controls all this.
-Sima
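
The "#ifdef with a dummy alternative" approach suggested earlier in the
thread looks roughly like this in user space. It is a sketch under
assumptions: TOY_CONFIG_ACPI is a hypothetical macro standing in for
CONFIG_ACPI, and toy_get_acpi_edid() and toy_probe_edid() are made-up names.
The built-in versus module link problem is a Kconfig matter and is not
modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Build with -DTOY_CONFIG_ACPI to select the "real" path. Without it the
 * dummy alternative below is compiled instead.
 */
#ifdef TOY_CONFIG_ACPI
static int toy_get_acpi_edid(unsigned char *buf, size_t len)
{
	size_t i;

	/* Placeholder for the real firmware query. */
	for (i = 0; i < len; i++)
		buf[i] = 0;
	return 0;
}
#else
/* Dummy alternative: same signature, always reports "no device". */
static inline int toy_get_acpi_edid(unsigned char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -ENODEV;
}
#endif

/*
 * Callers stay free of #ifdef and only look at the return code.
 * Returns 1 when an EDID was fetched and 0 when the caller should fall back.
 */
static int toy_probe_edid(unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	return toy_get_acpi_edid(buf, len) == 0 ? 1 : 0;
}
```

The point of the stub is that the call site compiles in both configurations,
so the conditional lives in one place only.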

> 
> > 
> > > 
> > > 
> > > > help
> > > >   Kernel-level support for the Direct Rendering Infrastructure 
> > > > (DRI)
> > > >   introduced in XFree86 4.0. If you say Y here, you need to 
> > > > select
> > > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > > > index 923c4423151c..c649b4f9fd8e 100644
> > > > --- a/drivers/gpu/drm/drm_edid.c
> > > > +++ b/drivers/gpu/drm/drm_edid.c
> > > > @@ -28,6 +28,7 @@
> > > >* DEALINGS IN THE SOFTWARE.
> > > >*/
> > > > +#include 
> > > >   #include 
> > > >   #include 
> > > >   #include 
> > > > @@ -2188,6 +2189,49 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, 
> > > > unsigned int block, size_t len)
> > > > return ret == xfers ? 0 : -1;
> > > >   }
> > > > +/**
> > > > + * drm_do_probe_acpi_edid() - get EDID information via ACPI _DDC
> > > > + * @data: struct drm_device
> > > > + * @buf: EDID data buffer to be filled
> > > > + * @block: 128 byte EDID block to start fetching from
> > > > + * @len: EDID data buffer length to fetch
> > > > + *
> > > > + * Try to fetch EDID information by calling acpi_video_get_edid() 
> > > > function.
> > > > + *
> > > > + * Return: 0 on success or error code on failure.
> > > > + */
> > > > +static int
> > > > +drm_do_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t 
> > > > len)
> > > > +{
> > > > +   struct drm_device *ddev = data;
> > > > +   struct acpi_device *acpidev = ACPI_COMPANION(ddev->dev);
> > > > +   unsigned char start = block * EDID_LENGTH;
> > > > +   void *edid;
> > > > +   int r;
> > > > +
> > > > +   if (!acpidev)
> > > > +   return -ENODEV;
> > > > +
> > > > +   /* fetch the entire edid from BIOS */
> > > > +   r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, 
> > > > &edid);
> > > > +   if (r < 0) {
> > > > +   DRM_DEBUG_KMS("Failed to get EDID from ACPI: %d\n", r);
> > > > +   return -EINVAL;
> > > > +   }
> > > > +   if (len > r || start > r || start + len > r) {
> > > > +   r = -EINVAL;
> > > > +   goto cleanup;
> > > > +   }
> > > > +
> > > > +   memcpy(buf, edid + start, len);
> > > > +   r = 0;
> > > > +
> > > > +cleanup:
> > > > +   kfree(edid);
> > > > +
> > > > +   return r;
> > > > +}
> > > > +
> > > >   static void connector_bad_edid(struct drm_connector *conn

Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Daniel Vetter
On Sat, Feb 10, 2024 at 12:06:58AM +0530, Arunpravin Paneer Selvam wrote:
> Hi Daniel,
> 
> On 2/9/2024 11:34 PM, Daniel Vetter wrote:
> > On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> > > A few users have observed display corruption when booting
> > > the machine to KDE Plasma or playing games. We have root
> > > caused the problem: whenever alloc_range() couldn't
> > > find the required memory blocks, the function returned
> > > SUCCESS in some of the corner cases.
> > > 
> > > The right approach is to return -ENOSPC when the total
> > > allocated size is less than the required size.
> > > 
> > > Cc:   # 6.7+
> > > Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> > > Tested-by: Mario Limonciello 
> > > Link: 
> > > https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
> > > Acked-by: Christian König 
> > > Reviewed-by: Matthew Auld 
> > > Signed-off-by: Arunpravin Paneer Selvam 
> > A new unit test for this would be most excellent - these kinds of missed edge
> > cases are exactly what kunit is for. Can you please follow up with one, since
> > we don't want to hold up the bugfix any longer?
> Matthew Auld has added a new unit test for this case. Please let us know if
> this will suffice.
> https://patchwork.freedesktop.org/patch/577497/?series=129671&rev=1

Ah yeah, might be best to submit them both together as one series (you
just need to add your own signed-off-by if you resend other people's
patches). That way bots can pick it up together, since new testcase and
bugfix only make sense together.
-Sima

> 
> Thanks,
> Arun.
> > -Sima
> > 
> > > ---
> > >   drivers/gpu/drm/drm_buddy.c | 6 ++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> > > index f57e6d74fb0e..c1a99bf4dffd 100644
> > > --- a/drivers/gpu/drm/drm_buddy.c
> > > +++ b/drivers/gpu/drm/drm_buddy.c
> > > @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
> > >   } while (1);
> > >   list_splice_tail(&allocated, blocks);
> > > +
> > > + if (total_allocated < size) {
> > > + err = -ENOSPC;
> > > + goto err_free;
> > > + }
> > > +
> > >   return 0;
> > >   err_undo:
> > > -- 
> > > 2.25.1
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] drm/amdgpu: respect the abmlevel module parameter value if it is set

2024-02-09 Thread Hamza Mahfooz
Currently, if the abmlevel module parameter is set, it is possible for
user space to override the ABM level at some point after boot. However,
that is undesirable because it means that we aren't respecting the
user's wishes with regard to the level that they want to use. So,
prevent user space from changing the ABM level if the module parameter
is set to a non-auto value.

Signed-off-by: Hamza Mahfooz 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 11 ++-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 ++-
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1291b8eb9dff..f5c8187e0d58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -196,7 +196,7 @@ extern int amdgpu_smu_pptable_id;
 extern uint amdgpu_dc_feature_mask;
 extern uint amdgpu_dc_debug_mask;
 extern uint amdgpu_dc_visual_confirm;
-extern uint amdgpu_dm_abm_level;
+extern int amdgpu_dm_abm_level;
 extern int amdgpu_backlight;
 extern int amdgpu_damage_clips;
 extern struct amdgpu_mgpu_info mgpu_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6ef7f22c1152..af7fae7907d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -849,12 +849,13 @@ module_param_named(visualconfirm, 
amdgpu_dc_visual_confirm, uint, 0444);
  * the ABM algorithm, with 1 being the least reduction and 4 being the most
  * reduction.
  *
- * Defaults to 0, or disabled. Userspace can still override this level later
- * after boot.
+ * Defaults to -1, or disabled. Userspace can only override this level after
+ * boot if it's set to auto.
  */
-uint amdgpu_dm_abm_level;
-MODULE_PARM_DESC(abmlevel, "ABM level (0 = off (default), 1-4 = backlight 
reduction level) ");
-module_param_named(abmlevel, amdgpu_dm_abm_level, uint, 0444);
+int amdgpu_dm_abm_level = -1;
+MODULE_PARM_DESC(abmlevel,
+"ABM level (0 = off, 1-4 = backlight reduction level, -1 auto 
(default))");
+module_param_named(abmlevel, amdgpu_dm_abm_level, int, 0444);
 
 int amdgpu_backlight = -1;
 MODULE_PARM_DESC(backlight, "Backlight control (0 = pwm, 1 = aux, -1 auto 
(default))");
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index fbe2aa40c21a..a5b3330879f3 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6513,7 +6513,8 @@ static void amdgpu_dm_connector_unregister(struct 
drm_connector *connector)
 {
struct amdgpu_dm_connector *amdgpu_dm_connector = 
to_amdgpu_dm_connector(connector);
 
-   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP)
+   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP &&
+   amdgpu_dm_abm_level < 0)
sysfs_remove_group(&connector->kdev->kobj, &amdgpu_group);
 
drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux);
@@ -6577,9 +6578,12 @@ void amdgpu_dm_connector_funcs_reset(struct 
drm_connector *connector)
state->vcpi_slots = 0;
state->pbn = 0;
 
-   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP)
-   state->abm_level = amdgpu_dm_abm_level ?:
-   ABM_LEVEL_IMMEDIATE_DISABLE;
+   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP) {
+   if (amdgpu_dm_abm_level <= 0)
+   state->abm_level = ABM_LEVEL_IMMEDIATE_DISABLE;
+   else
+   state->abm_level = amdgpu_dm_abm_level;
+   }
 
__drm_atomic_helper_connector_reset(connector, &state->base);
}
@@ -6617,7 +6621,8 @@ amdgpu_dm_connector_late_register(struct drm_connector 
*connector)
to_amdgpu_dm_connector(connector);
int r;
 
-   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP) {
+   if (connector->connector_type == DRM_MODE_CONNECTOR_eDP &&
+   amdgpu_dm_abm_level < 0) {
r = sysfs_create_group(&connector->kdev->kobj,
   &amdgpu_group);
if (r)
-- 
2.43.0
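
The three-way meaning of the parameter after this patch (-1 auto, 0 off,
1-4 a fixed level) can be summarised in a small user-space sketch. The names
below are hypothetical, and TOY_ABM_DISABLE only stands in for
ABM_LEVEL_IMMEDIATE_DISABLE; its value here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for ABM_LEVEL_IMMEDIATE_DISABLE. */
#define TOY_ABM_DISABLE 255

/* Initial level for an eDP connector: auto (-1) and off (0) start disabled. */
static int toy_initial_abm_level(int param)
{
	assert(param >= -1 && param <= 4);

	if (param <= 0)
		return TOY_ABM_DISABLE;
	return param;
}

/*
 * User space may change the level only while the parameter is left at auto.
 * In the patch this corresponds to registering the sysfs group only when
 * the parameter is negative.
 */
static bool toy_userspace_may_override(int param)
{
	return param < 0;
}
```

Note that 0 and -1 give the same initial level but differ in whether user
space can change it later.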



[pull] amdgpu, amdkfd, radeon drm-next-6.9

2024-02-09 Thread Alex Deucher
Hi Dave, Sima,

New stuff for 6.9.

The following changes since commit d7643fe6fb76edb1f2f1497bf5e8b8f4774b5129:

  drm/amd/display: Avoid enum conversion warning (2024-01-15 18:35:07 -0500)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-6.9-2024-02-09

for you to fetch changes up to d5597444032b2f5c8624918fb5b29be5bba78a3c:

  drm/amdgpu: Fix HDP flush for VFs on nbio v7.9 (2024-02-07 12:26:24 -0500)


amd-drm-next-6.9-2024-02-09:

amdgpu:
- Validate DMABuf imports in compute VMs
- Add RAS ACA framework
- PSP 13 fixes
- Misc code cleanups
- Replay fixes
- Atom interpreter PS, WS bounds checking
- DML2 fixes
- Audio fixes
- DCN 3.5 Z state fixes
- Remove deprecated ida_simple usage
- UBSAN fixes
- RAS fixes
- Enable seq64 infrastructure
- DC color block enablement
- Documentation updates
- DC documentation updates
- DMCUB updates
- S3 fixes
- VCN 4.0.5 fixes
- DP MST fixes
- SR-IOV fixes

amdkfd:
- Validate DMABuf imports in compute VMs
- SVM fixes
- Trap handler updates

radeon:
- Atom interpreter PS, WS bounds checking
- Misc code cleanups

UAPI:
- Bump KFD version so UMDs know that the fixes that enable the management of
  VA mappings in compute VMs using the GEM_VA ioctl for DMABufs exported from 
KFD are present
- Add INFO query for input power.  This matches the existing INFO query for 
average
  power.  Used in gaming HUDs, etc.
  Example userspace: 
https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_power


Alex Deucher (8):
  drm/amdgpu: add new INFO IOCTL query for input power
  drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c
  drm/amdgpu/pptable: convert some variable sized arrays to [] style
  drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
  drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
  drm/amdgpu: convert some variable sized arrays to [] style
  drm/amdgpu: update documentation on new chips
  drm/amdgpu: fix typo in parameter description

Alexander Richards (2):
  drm/amdgpu: check PS, WS index
  drm/radeon: check PS, WS index

Allen Pan (2):
  drm/amd/display: Add NULL-checks in dml2 assigned pipe search
  drm/amd/display: correct static screen event mask

Alvin Lee (6):
  drm/amd/display: Add Replay IPS register for DMUB command table
  drm/amd/display: Ensure populate uclk in bb construction
  drm/amd/display: For FPO and SubVP/DRR configs program vmin/max sel
  drm/amd/display: Populate invalid split index to be 0xF
  Revert "drm/amd/display: For FPO and SubVP/DRR configs program vmin/max 
sel"
  drm/amd/display: Update phantom pipe enable / disable sequence

Anthony Koo (2):
  drm/amd/display: [FW Promotion] Release 0.0.201.0
  drm/amd/display: [FW Promotion] Release 0.0.202.0

Aric Cyr (5):
  drm/amd/display: Promote DAL to 3.2.268
  drm/amd/display: Promote DAL to 3.2.269
  drm/amd/display: Unify optimize_required flags and VRR adjustments
  drm/amd/display: 3.2.270
  drm/amd/display: 3.2.271

Arunpravin Paneer Selvam (1):
  drm/amdgpu: Enable seq64 manager and fix bugs

Camille Cho (1):
  drm/amd/display: correct comment in set_default_brightness_aux()

Candice Li (3):
  drm/amdgpu: Do bad page retirement for deferred errors
  drm/amdgpu: Log deferred error separately
  drm/amd/pm: Retrieve UMC ODECC error count from aca bank

Charlene Liu (6):
  drm/amd/display: Add logging resource checks
  drm/amd/display: Update P010 scaling cap
  drm/amd/display: Revert "Rework DC Z10 restore"
  Revert "drm/amd/display: initialize all the dpm level's stutter latency"
  drm/amd/display: fix USB-C flag update after enc10 feature init
  drm/amd/display: fix DP audio settings

Christian König (1):
  drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"

Christophe JAILLET (2):
  drm/amd/display: Fix a switch statement in 
populate_dml_output_cfg_from_stream_state()
  drm/amdgpu: Remove usage of the deprecated ida_simple_xx() API

ChunTao Tso (1):
  drm/amd/display: Replay + IPS + ABM in Full Screen VPB

David McFarland (1):
  drm/amd: Don't init MEC2 firmware when it fails to load

Dillon Varone (1):
  drm/amd/display: Init link enc resources in dc_state only if res_pool 
presents

Dmytro Laktyushkin (2):
  drm/amd/display: Fix dml2 assigned pipe search
  drm/amd/display: Fix DPSTREAM CLK on and off sequence

Eric Yang (1):
  drm/amd/display: fix invalid reg access on DCN35 FPGA

Ethan Bitnun (2):
  drm/amd/display: Add delay before logging clks from hw
  drm/amd/display: Adjust set_p_state calls to fix logging

Fangzhi Zuo (2):
  drm/amd/display: Fix dcn35 8k30 Underflow/Corruption Issue
  drm/amd/display: Fix MST Null Ptr for RV

Felix Kuehling (3):
  drm/amdgpu: 

[PATCH 1/2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-09 Thread Rajneesh Bhardwaj
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarios.

Suggested-by: Joseph Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
* Incorporate review feedback from Felix from
  https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg102840.html
  and split one of the suggested gfx11 changes into a separate patch.


 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c| 9 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 42d881809dc7..697b6d530d12 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -303,6 +303,15 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
update_cu_mask(mm, mqd, minfo, 0);
set_priority(m, q);
 
+   if (minfo && KFD_GC_VERSION(mm->dev) >= IP_VERSION(9, 4, 2)) {
+   if (minfo->update_flag & UPDATE_FLAG_IS_GWS)
+   m->compute_resource_limits |=
+   COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   else
+   m->compute_resource_limits &=
+   ~COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   }
+
q->is_active = QUEUE_IS_ACTIVE(*q);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 677281c0793e..65b504813576 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -532,6 +532,7 @@ struct queue_properties {
 enum mqd_update_flag {
UPDATE_FLAG_DBG_WA_ENABLE = 1,
UPDATE_FLAG_DBG_WA_DISABLE = 2,
+   UPDATE_FLAG_IS_GWS = 3, /* quirk for gfx9 IP */
 };
 
 struct mqd_update_info {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 43eff221eae5..4858112f9a53 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -95,6 +95,7 @@ void kfd_process_dequeue_from_device(struct 
kfd_process_device *pdd)
 int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
void *gws)
 {
+   struct mqd_update_info minfo = {0};
struct kfd_node *dev = NULL;
struct process_queue_node *pqn;
struct kfd_process_device *pdd;
@@ -146,9 +147,10 @@ int pqm_set_gws(struct process_queue_manager *pqm, 
unsigned int qid,
}
 
pdd->qpd.num_gws = gws ? dev->adev->gds.gws_size : 0;
+   minfo.update_flag = gws ? UPDATE_FLAG_IS_GWS : 0;
 
return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm,
-   pqn->q, NULL);
+   pqn->q, &minfo);
 }
 
 void kfd_process_dequeue_from_all_devices(struct kfd_process *p)
-- 
2.34.1



[PATCH 2/2] drm/amdgpu: Fix implicit assumption in gfx11 debug flags

2024-02-09 Thread Rajneesh Bhardwaj
The gfx11 debug flags mask is currently set with an implicit assumption that
no other mqd update flags exist. Fix this now that the previous patch
introduces the UPDATE_FLAG_IS_GWS flag.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index d722cbd31783..826bc4f6c8a7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
@@ -55,8 +55,8 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m = get_mqd(mqd);
 
if (has_wa_flag) {
-   uint32_t wa_mask = minfo->update_flag == 
UPDATE_FLAG_DBG_WA_ENABLE ?
-   0x : 0x;
+   uint32_t wa_mask =
+   (minfo->update_flag & UPDATE_FLAG_DBG_WA_ENABLE) ? 
0x : 0x;
 
m->compute_static_thread_mgmt_se0 = wa_mask;
m->compute_static_thread_mgmt_se1 = wa_mask;
-- 
2.34.1
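
One thing worth checking when moving from "==" to "&" is whether the flag
values are distinct bits. The sketch below uses hypothetical enum names. The
first enum mirrors the sequential values posted in patch 1/2. The second is
an assumed one-bit-per-flag alternative, not something taken from the
series.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as posted in patch 1/2: sequential, so 3 shares bits with 1 and 2. */
enum toy_flag_posted {
	POSTED_DBG_WA_ENABLE = 1,
	POSTED_DBG_WA_DISABLE = 2,
	POSTED_IS_GWS = 3,
};

/*
 * Assumed alternative: one bit per flag, so flags can be OR-ed together
 * and tested independently.
 */
enum toy_flag_bits {
	BIT_DBG_WA_ENABLE = 1 << 0,
	BIT_DBG_WA_DISABLE = 1 << 1,
	BIT_IS_GWS = 1 << 2,
};

/* Mask test, in the style of the new gfx11 code. */
static bool toy_has_flag(unsigned int update_flag, unsigned int flag)
{
	assert(flag != 0);
	return (update_flag & flag) != 0;
}
```

With the posted values, an update_flag of 3 also passes the mask test for
the debug workaround enable flag. With one bit per flag the two tests are
independent. Whether the posted value is intended is not clear from the
patch itself.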