[PATCH] drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl

2022-08-25 Thread YuBiao Wang
Hello,

This patch has been reviewed by Andrey and Christian and pushed into the
bringup temp branch. It needs to be cherry-picked to drm-next, too. Does
anyone have any comments on this patch?

Thanks,
Yubiao Wang


[Why]
In amdgpu_cs_ioctl, amdgpu_job_free could be performed earlier if there
is an -ERESTARTSYS error. In this case, job->hw_fence may not be
initialized yet. Putting hw_fence during amdgpu_job_free could lead to a
use-after-free warning.

[How]
Check if drm_sched_job_init is performed before job_free by checking
s_fence.

v2: Check hw_fence.ops instead since it could be NULL if fence is not
initialized. Reverse the condition since !=NULL check is discouraged in
kernel.

Signed-off-by: YuBiao Wang 
Reviewed-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 8f51adf3b329..1062b7ed74ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -162,7 +162,10 @@ void amdgpu_job_free(struct amdgpu_job *job)
 	amdgpu_sync_free(&job->sync);
 	amdgpu_sync_free(&job->sched_sync);
 
-	dma_fence_put(&job->hw_fence);
+	if (!job->hw_fence.ops)
+		kfree(job);
+	else
+		dma_fence_put(&job->hw_fence);
 }
 
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
-- 
2.25.1
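
The v2 check can be illustrated outside the kernel. The following is a minimal userspace sketch, not the amdgpu code: `struct mock_fence`, `struct mock_job`, `mock_fence_init()`, `mock_fence_put()` and `mock_job_free()` are hypothetical stand-ins. It assumes, as the commit message states, that the embedded fence's `ops` pointer stays NULL until the fence is initialized, and it further assumes that dropping the last fence reference is what releases the job.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for dma_fence and amdgpu_job; not kernel types. */
struct mock_fence_ops { const char *name; };

struct mock_fence {
	const struct mock_fence_ops *ops; /* NULL until the fence is initialized */
	int refcount;
};

struct mock_job {
	struct mock_fence hw_fence; /* embedded, first member of the job */
};

static const struct mock_fence_ops mock_ops = { "mock" };

/* Number of jobs released so far, kept only so the sketch can be checked. */
static int jobs_freed;

static struct mock_job *mock_job_alloc(void)
{
	/* calloc zeroes the job, so hw_fence.ops starts out NULL */
	return calloc(1, sizeof(struct mock_job));
}

static void mock_fence_init(struct mock_fence *f)
{
	f->ops = &mock_ops;
	f->refcount = 1;
}

/* Drop one reference; the containing job is released with the last one. */
static void mock_fence_put(struct mock_fence *f)
{
	if (--f->refcount == 0) {
		/* hw_fence is the first member, so this cast recovers the job */
		free((struct mock_job *)f);
		jobs_freed++;
	}
}

static void mock_job_free(struct mock_job *job)
{
	if (!job->hw_fence.ops) {
		/* Fence never initialized: no refcount to drop, free directly. */
		free(job);
		jobs_freed++;
	} else {
		mock_fence_put(&job->hw_fence);
	}
}
```

Calling `mock_job_free()` on a freshly allocated job takes the direct-free branch, while a job whose fence went through `mock_fence_init()` is released through the reference drop. Either way the job is freed exactly once.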



Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-25 Thread Nathan Chancellor
Hi AMD folks,

Top posting because it might not have been obvious but I was looking for
your feedback on this message (which can be viewed on lore.kernel.org if
you do not have the original [1]) so that we can try to get this fixed
in some way for 6.0/6.1. If my approach is not welcome, please consider
suggesting another one or looking to see if this is something you all
could look into.

[1]: https://lore.kernel.org/Yv5h0rb3AgTZLVJv@dev-arch.thelio-3990X/

Cheers,
Nathan

On Thu, Aug 18, 2022 at 08:59:14AM -0700, Nathan Chancellor wrote:
> Hi Arnd,
> 
> Doubling back around to this now since I think this is the only thing
> breaking x86_64 allmodconfig with clang 11 through 15.
> 
> On Fri, Aug 05, 2022 at 09:32:13PM +0200, Arnd Bergmann wrote:
> > On Fri, Aug 5, 2022 at 8:02 PM Nathan Chancellor  wrote:
> > > On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> > > > On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland wrote:
> > > > While splitting out sub-functions can help reduce the maximum stack
> > > > usage, it seems that in this case it makes the actual problem worse:
> > > > I see 2168 bytes for the combined
> > > > dml32_ModeSupportAndSystemConfigurationFull(), but marking
> > > > mode_support_configuration() as noinline gives me 1992 bytes
> > > > for the outer function plus 384 bytes for the inner one. So it does
> > > > avoid the warning (barely), but not the problem that the warning tries
> > > > to point out.
> > >
> > > I haven't had a chance to take a look at splitting things up yet, would
> > > you recommend a different approach?
> > 
> > Splitting up large functions can help when you have large local variables
> > that are used in different parts of the function, and the split gets the
> > compiler to reuse stack locations.
> > 
> > I think in this particular function, the problem isn't actually local
> > variables but either pushing variables on the stack for argument passing,
> > or something that causes the compiler to run out of registers so it
> > has to spill registers to the stack.
> > 
> > In either case, one has to actually look at the generated output
> > and then try to rearrange the code so this does not happen.
> > 
> > One thing to try would be to condense a function call like
> > 
> > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> >     &v->dummy_vars.dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport,
> >     mode_lib->vba.USRRetrainingRequiredFinal,
> >     mode_lib->vba.UsesMALLForPStateChange,
> >     mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
> >     mode_lib->vba.NumberOfActiveSurfaces,
> >     mode_lib->vba.MaxLineBufferLines,
> >     mode_lib->vba.LineBufferSizeFinal,
> >     mode_lib->vba.WritebackInterfaceBufferSize,
> >     mode_lib->vba.DCFCLK,
> >     mode_lib->vba.ReturnBW,
> >     mode_lib->vba.SynchronizeTimingsFinal,
> >     mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
> >     mode_lib->vba.DRRDisplay,
> >     v->dpte_group_bytes,
> >     v->meta_row_height,
> >     v->meta_row_height_chroma,
> >     v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
> >     mode_lib->vba.WritebackChunkSize,
> >     mode_lib->vba.SOCCLK,
> >     v->DCFCLKDeepSleep,
> >     mode_lib->vba.DETBufferSizeY,
> >     mode_lib->vba.DETBufferSizeC,
> >     mode_lib->vba.SwathHeightY,
> >     mode_lib->vba.SwathHeightC,
> >     mode_lib->vba.LBBitPerPixel,
> >     v->SwathWidthY,
> >     v->SwathWidthC,
> >     mode_lib->vba.HRatio,
> >     mode_lib->vba.HRatioChroma,
> >     mode_lib->vba.vtaps,
> >     mode_lib->vba.VTAPsChroma,
> >     mode_lib->vba.VRatio,
> >     mode_lib->vba.VRatioChroma,
> >     mode_lib->vba.HTotal,
> >     mode_lib->vba.VTotal,
> >     mode_lib->vba.VActive,
> >     mode_lib->vba.PixelClock,
> >     mode_lib->vba.BlendingAndTiming,
> >     /* more arguments */);
> > 
> > into calling conventions that take a pointer to 'mode_lib->vba' and another
> > one to 'v', so these are no longer passed on the stack individually.
> 
> So I took a whack at reducing this function's number of parameters and
> ended up with the attached patch. I basically just removed any
> parameters that were 

Re: [PATCH] gpu: move from strlcpy with unused retval to strscpy

2022-08-25 Thread Jernej Škrabec
On Thursday, 18 August 2022 at 23:00:07 CEST, Wolfram Sang wrote:
> Follow the advice of the below link and prefer 'strscpy' in this
> subsystem. Conversion is 1:1 because the return value is not used.
> Generated by a coccinelle script.
> 
> Link:
> https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmk
> n...@mail.gmail.com/
> Signed-off-by: Wolfram Sang
> 

Acked-by: Jernej Skrabec 

Best regards,
Jernej

> ---
>  drivers/gpu/drm/amd/amdgpu/atom.c   | 2 +-
>  drivers/gpu/drm/amd/pm/legacy-dpm/legacy_dpm.c  | 2 +-
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c | 6 +++---
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c   | 2 +-
>  drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
>  drivers/gpu/drm/display/drm_dp_mst_topology.c   | 2 +-
>  drivers/gpu/drm/drm_mipi_dsi.c  | 2 +-
>  drivers/gpu/drm/i2c/tda998x_drv.c   | 2 +-
>  drivers/gpu/drm/i915/selftests/i915_perf.c  | 2 +-
>  drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c | 2 +-
>  drivers/gpu/drm/radeon/radeon_atombios.c| 4 ++--
>  drivers/gpu/drm/radeon/radeon_combios.c | 4 ++--
>  drivers/gpu/drm/rockchip/inno_hdmi.c| 2 +-
>  drivers/gpu/drm/rockchip/rk3066_hdmi.c  | 2 +-
>  drivers/gpu/drm/sun4i/sun4i_hdmi_i2c.c  | 2 +-
>  15 files changed, 19 insertions(+), 19 deletions(-)





Re: [PATCH 4/4] amd/display: indicate support for atomic async page-flips on DCN

2022-08-25 Thread Alex Deucher
On Wed, Aug 24, 2022 at 11:09 AM Simon Ser  wrote:
>
> amdgpu_dm_commit_planes already sets the flip_immediate flag for
> async page-flips. This flag is used to set the UNP_FLIP_CONTROL
> register. Thus, no additional change is required to handle async
> page-flips with the atomic uAPI.
>
> Note, async page-flips are still unsupported on DCE with the atomic
> uAPI. The mode_set_base callbacks unconditionally set the
> GRPH_SURFACE_UPDATE_H_RETRACE_EN field of the GRPH_FLIP_CONTROL
> register to 0, which disables async page-flips.

Can you elaborate a bit on this? We don't use hsync flips at all, even
in non-atomic, as far as I recall.  The hardware can also do immediate
flips which take effect as soon as you update the base address
register which is what we use for async updates today IIRC.

Alex

>
> Signed-off-by: Simon Ser 
> Cc: Daniel Vetter 
> Cc: Joshua Ashton 
> Cc: Melissa Wen 
> Cc: Alex Deucher 
> Cc: Harry Wentland 
> Cc: Nicholas Kazlauskas 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index ef816bf295eb..9ab01c58bedb 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3804,7 +3804,6 @@ static int amdgpu_dm_mode_config_init(struct 
> amdgpu_device *adev)
> adev_to_drm(adev)->mode_config.prefer_shadow = 0;
> /* indicates support for immediate flip */
> adev_to_drm(adev)->mode_config.async_page_flip = true;
> -   adev_to_drm(adev)->mode_config.atomic_async_page_flip_not_supported = 
> true;
>
> adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
>
> --
> 2.37.2
>
>


[pull] amdgpu, amdkfd, radeon drm-fixes-6.0

2022-08-25 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.0.  Mainly fixes for new IPs added in 6.0.

The following changes since commit b1fb6b87ed55ced458b322ea10cf0d0ab151e01b:

  Merge tag 'amd-drm-fixes-6.0-2022-08-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes (2022-08-19 09:45:22 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.0-2022-08-25

for you to fetch changes up to b8983d42524f10ac6bf35bbce6a7cc8e45f61e04:

  drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly (2022-08-25 13:54:35 -0400)


amd-drm-fixes-6.0-2022-08-25:

amdgpu:
- GFX 11.0 fixes
- PSP XGMI handling fixes
- GFX9 fix for compute-only IPs
- Drop duplicated function call
- Fix warning due to missing header
- NBIO 7.7 fixes
- DCN 3.1.4 fixes
- SDMA 6.0 fixes
- SMU 13.0 fixes
- Arcturus GPUVM page table fix
- MMHUB 1.0 fix

amdkfd:
- GC 10.3.7 fix

radeon:
- Delayed work flush fix


Candice Li (1):
  drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup.

Evan Quan (1):
  drm/amd/pm: update SMU 13.0.0 driver_if header

Likun Gao (1):
  drm/amdgpu: add MGCG perfmon setting for gfx11

Maíra Canal (1):
  drm/amd/display: Include missing header

Mukul Joshi (1):
  drm/amdgpu: Fix page table setup on Arcturus

Prike Liang (1):
  drm/amdkfd: Fix isa version for the GC 10.3.7

Qu Huang (1):
  drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly

Roman Li (1):
  drm/amd/display: enable PCON support for dcn314

Tim Huang (5):
  drm/amdgpu: enable GFXOFF allow control for GC IP v11.0.1
  drm/amdgpu: add TX_POWER_CTRL_1 macro definitions for NBIO IP v7.7.0
  drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support
  drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating
  drm/amdgpu: add sdma instance check for gfx11 CGCG

YiPeng Chai (2):
  drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini
  drm/amdgpu: fix hive reference leak when adding xgmi device

Zhenneng Li (1):
  drm/radeon: add a force flush to delay work when radeon

shaoyunl (1):
  drm/amdgpu: Remove the additional kfd pre reset call for sriov

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c|  3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 24 ---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |  3 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c|  1 +
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c| 12 +++-
 drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 78 ++
 drivers/gpu/drm/amd/amdgpu/soc21.c | 22 --
 drivers/gpu/drm/amd/amdkfd/kfd_device.c|  6 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c|  5 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h|  8 ---
 .../drm/amd/display/dc/dcn314/dcn314_resource.c|  1 +
 .../amd/include/asic_reg/nbio/nbio_7_7_0_offset.h  |  2 +
 .../amd/include/asic_reg/nbio/nbio_7_7_0_sh_mask.h | 13 
 .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 31 +
 drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h   |  2 +-
 drivers/gpu/drm/radeon/radeon_device.c |  3 +
 18 files changed, 173 insertions(+), 47 deletions(-)


Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Bjorn Helgaas
[+cc Greg, no action needed yet, just FYI that stable will want these]

On Thu, Aug 25, 2022 at 02:28:19PM +0530, Lijo Lazar wrote:
> HDP flush is used early in the init sequence as part of memory controller
> block initialization. Hence remapping of HDP registers needed for flush
> needs to happen earlier.
> 
> This also fixes the AER error reported as Unsupported Request during
> driver load.

I would say something like:

  This prevents writes to unimplemented space, which would cause
  Unsupported Request errors.  Prior to 8795e182b02d ("PCI/portdrv:
  Don't disable AER reporting in get_port_device_capability()"), these
  errors occurred but were ignored.

The write is the error; AER is just the reporting mechanism.

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

We need a cc: stable because 8795e182b02d ("PCI/portdrv: Don't disable
AER reporting in get_port_device_capability()") has already been
backported to at least these stable kernels:

  5.10.137 5.15.61 5.18.18 5.19.2

and these fixes need to go there as well.  So add something like this:

  Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
  cc: sta...@vger.kernel.org

It's not that there was something wrong with 8795e182b02d and these
patches fix it; it's just that 8795e182b02d *exposed* an amdgpu
problem that was there all along.  But we need some way to connect
with it.

> Reported-by: Tom Seewald 
> Signed-off-by: Lijo Lazar 


Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-25 Thread Bjorn Helgaas
On Thu, Aug 25, 2022 at 10:18:28AM +0200, Christian König wrote:
> On 25.08.22 at 09:54, Lazar, Lijo wrote:
> > On 8/25/2022 1:04 PM, Christian König wrote:
> > > On 25.08.22 at 08:40, Stefan Roese wrote:
> > > > On 24.08.22 16:45, Tom Seewald wrote:
> > > > > On Wed, Aug 24, 2022 at 12:11 AM Lazar, Lijo
> > > > >  wrote:
> > > > > > Unfortunately, I don't have any NV platforms to test. Attached is an
> > > > > > 'untested-patch' based on your trace logs.
> > > > > ...
> > > > 
> > > > I did not follow this thread in depth, but FWICT the bug is solved now
> > > > with this patch. So is it correct, that the now fully enabled AER
> > > > support in the PCI subsystem in v6.0 helped detecting a bug in the AMD
> > > > GPU driver?
> > > 
> > > It looks like it, but I'm not 100% sure about the rationale behind it.
> > > 
> > > Lijo can you explain more on this?
> > 
> > From the trace, during gmc hw_init it takes this route -
> > 
> > gart_enable -> amdgpu_gtt_mgr_recover -> amdgpu_gart_invalidate_tlb ->
> > amdgpu_device_flush_hdp -> amdgpu_asic_flush_hdp (non-ring based HDP
> > flush)
> > 
> > HDP flush is done using remapped offset which is MMIO_REG_HOLE_OFFSET
> > (0x8 - PAGE_SIZE)
> > 
> > WREG32_NO_KIQ((adev->rmmio_remap.reg_offset +
> > KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0);
> > 
> > However, the remapping is not yet done at this point. It's done at a
> > later point during common block initialization. Access to the unmapped
> > offset '(0x8 - PAGE_SIZE)' seems to come back as unsupported request
> > and reported through AER.
> 
> That's interesting behavior. So far AER always indicated some kind of
> transmission error.
> 
> When that happens as well on unmapped areas of the MMIO BAR then we need to
> keep that in mind.

AER can log many different kinds of errors, some related to hardware
issues and some related to software.

PCI writes are normally posted and get no response, so AER is the main
way to find out about writes to unimplemented addresses.

Reads do get a response, of course, and reads to unimplemented
addresses cause errors that most hardware turns into a ~0 data return
(in addition to reporting via AER if enabled).

Bjorn
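
The difference between posted writes and reads described above can be modeled in a few lines. This is a userspace sketch under stated assumptions, not PCI or amdgpu code: `mock_read32()`, `mock_write32()`, `mock_read_may_have_failed()` and the four-register window are hypothetical, and the error counter only stands in for what AER would log when reporting is enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 4-register MMIO window; not a real device. */
#define MOCK_NUM_REGS 4u

static uint32_t mock_regs[MOCK_NUM_REGS];

/* Stand-in for the Unsupported Request errors AER would report. */
static unsigned int mock_unsupported_requests;

static uint32_t mock_read32(uint32_t index)
{
	if (index >= MOCK_NUM_REGS) {
		mock_unsupported_requests++;
		return UINT32_MAX; /* failed read comes back as all ones */
	}
	return mock_regs[index];
}

static void mock_write32(uint32_t index, uint32_t val)
{
	if (index >= MOCK_NUM_REGS) {
		/* Posted write: the caller gets no status back at all. */
		mock_unsupported_requests++;
		return;
	}
	mock_regs[index] = val;
}

/* All ones may also be legitimate register content, hence "may". */
static bool mock_read_may_have_failed(uint32_t val)
{
	return val == UINT32_MAX;
}
```

In this model a bad write is visible only through the counter, which mirrors why the amdgpu write to the not-yet-remapped offset went unnoticed until AER reporting was enabled.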


Re: [PATCH] drm/amd: fix potential memory leak

2022-08-25 Thread Alex Deucher
On Tue, Aug 23, 2022 at 3:29 AM Bernard Zhao  wrote:
>
> This patch fixes a potential memory leak (clk_src) when the function
> runs into the last return NULL.
>
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> index 85f32206a766..76f263846c6b 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> @@ -1715,6 +1715,7 @@ static struct clock_source *dcn30_clock_source_create(
> }
>
> BREAK_TO_DEBUGGER();
> +   free(clk_src);

This should be kfree().  Fixed up locally.

Alex

> return NULL;
>  }
>
> --
> 2.33.1
>


Re: [PATCH] drm/amd: remove possible condition with no effect (if == else)

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

On Tue, Aug 23, 2022 at 3:30 AM Bernard Zhao  wrote:
>
> This patch fixes a cocci warning:
> drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c:1816:6-8:
> WARNING: possible condition with no effect (if == else).
>
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> index 85f32206a766..dccc9794e6a2 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> @@ -1813,8 +1813,6 @@ static bool dcn314_resource_construct(
>
> if (dc->ctx->dce_environment == DCE_ENV_PRODUCTION_DRV)
> dc->debug = debug_defaults_drv;
> -   else if (dc->ctx->dce_environment == DCE_ENV_FPGA_MAXIMUS)
> -   dc->debug = debug_defaults_diags;
> else
> dc->debug = debug_defaults_diags;
> // Init the vm_helper
> --
> 2.33.1
>


Re: [PATCH] drm/amd: remove possible condition with no effect (if == else)

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Aug 23, 2022 at 3:01 AM Bernard Zhao  wrote:
>
> This patch fixes a cocci warning:
> drivers/gpu/drm/amd/display/dc/core/dc.c:3335:2-4: WARNING:
> possible condition with no effect (if == else).
>
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c | 9 ++---
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index aeecca68dea7..2d4c44083d79 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -3332,13 +3332,8 @@ static void commit_planes_for_stream(struct dc *dc,
> /* Since phantom pipe programming is moved to 
> post_unlock_program_front_end,
>  * move the SubVP lock to after the phantom pipes have been 
> setup
>  */
> -   if (should_lock_all_pipes && 
> dc->hwss.interdependent_update_lock) {
> -   if (dc->hwss.subvp_pipe_control_lock)
> -   dc->hwss.subvp_pipe_control_lock(dc, context, 
> false, should_lock_all_pipes, NULL, subvp_prev_use);
> -   } else {
> -   if (dc->hwss.subvp_pipe_control_lock)
> -   dc->hwss.subvp_pipe_control_lock(dc, context, 
> false, should_lock_all_pipes, NULL, subvp_prev_use);
> -   }
> +   if (dc->hwss.subvp_pipe_control_lock)
> +   dc->hwss.subvp_pipe_control_lock(dc, context, false, 
> should_lock_all_pipes, NULL, subvp_prev_use);
> return;
> }
>
> --
> 2.33.1
>


Re: [PATCH] drm/amd: fix potential memory leak

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Aug 23, 2022 at 2:36 AM Bernard Zhao  wrote:
>
> This patch fixes a potential memory leak (clk_src) when the function
> runs into the last return NULL.
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> index 85f32206a766..c7bb76a2a8c2 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> @@ -1643,6 +1643,7 @@ static struct clock_source *dcn31_clock_source_create(
> }
>
> BREAK_TO_DEBUGGER();
> +   kfree(clk_src);
> return NULL;
>  }
>
> --
> 2.33.1
>


Re: [PATCH] drm/amd: fix potential memory leak

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

On Tue, Aug 23, 2022 at 3:29 AM Bernard Zhao  wrote:
>
> This patch fixes a potential memory leak (clk_src) when the function
> runs into the last return NULL.
>
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> index 85f32206a766..76f263846c6b 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
> @@ -1715,6 +1715,7 @@ static struct clock_source *dcn30_clock_source_create(
> }
>
> BREAK_TO_DEBUGGER();
> +   free(clk_src);
> return NULL;
>  }
>
> --
> 2.33.1
>


Re: [PATCH] drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Aug 23, 2022 at 3:15 AM  wrote:
>
> From: Qu Huang 
>
> The mmVM_L2_CNTL3 register is not assigned an initial value
>
> Signed-off-by: Qu Huang 
> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> index 1da2ec692057e..b8a987a032a8e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> @@ -176,6 +176,7 @@ static void mmhub_v1_0_init_cache_regs(struct 
> amdgpu_device *adev)
> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
> WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL2, tmp);
>
> +   tmp = mmVM_L2_CNTL3_DEFAULT;
> if (adev->gmc.translate_further) {
> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 12);
> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3,
> --
> 2.31.1
>
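
The bug class addressed by this patch, a read-modify-write helper applied to a scratch variable that still holds the previous register's value, can be sketched in userspace. Everything below is hypothetical: the field position, `MOCK_CNTL3_DEFAULT` and the helper names are invented for illustration and do not reflect the real VM_L2_CNTL3 layout or default.

```c
#include <stdint.h>

/* Hypothetical layout: a 5-bit BANK_SELECT field in bits 0-4. */
#define MOCK_BANK_SELECT_MASK 0x1Fu
/* Hypothetical reset value; not the real mmVM_L2_CNTL3_DEFAULT. */
#define MOCK_CNTL3_DEFAULT    0x80100000u

/* Replace only the BANK_SELECT field, keeping all other bits of reg. */
static uint32_t mock_set_bank_select(uint32_t reg, uint32_t val)
{
	return (reg & ~MOCK_BANK_SELECT_MASK) | (val & MOCK_BANK_SELECT_MASK);
}

/* Buggy pattern: the other bits are whatever the scratch value held. */
static uint32_t mock_cntl3_from_stale(uint32_t stale_tmp)
{
	return mock_set_bank_select(stale_tmp, 12);
}

/* Fixed pattern: start from the register's default, then set fields. */
static uint32_t mock_cntl3_from_default(void)
{
	uint32_t tmp = MOCK_CNTL3_DEFAULT;

	return mock_set_bank_select(tmp, 12);
}
```

Because the field helper preserves every bit outside the field, the result is only well defined when the starting value is, which is what the added `tmp = mmVM_L2_CNTL3_DEFAULT;` line provides.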


RE: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-25 Thread Joshi, Mukul
[AMD Official Use Only - General]



> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, August 25, 2022 11:26 AM
> To: Joshi, Mukul 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> 
> [CAUTION: External Email]
> 
> On Thu, Aug 25, 2022 at 10:49 AM Joshi, Mukul 
> wrote:
> >
> > [AMD Official Use Only - General]
> >
> >
> >
> > > -Original Message-
> > > From: Alex Deucher 
> > > Sent: Thursday, August 25, 2022 9:33 AM
> > > To: Joshi, Mukul 
> > > Cc: amd-gfx@lists.freedesktop.org
> > > Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> > >
> > > [CAUTION: External Email]
> > >
> > > On Mon, Aug 22, 2022 at 11:53 AM Mukul Joshi 
> > > wrote:
> > > >
> > > > When translate_further is enabled, page table depth needs to be
> > > > updated. This was missing on Arcturus MMHUB init. This was causing
> > > > address translations to fail for SDMA user-mode queues.
> > > >
> > >
> > > Do other mmhub implementations need a similar fix?  It looks like
> > > some of them are missing similar changes.
> > >
> >
> > I am not sure if there is a plan to enable translate_further on other ASICs.
> > For now, its only enabled for Arcturus, Aldebaran and Raven.
> > If we plan to enable it on other ASICs, then yes the other mmhub
> > implementations would need similar changes.
> 
> It would be nice to fix them up preemptively so that if we ever enable it on
> another asic, it will just work.
> 
Sure, I can take a look at all mmhub and gfxhub implementations and send out a
patch for the ones that are missing this page table setup change when
translate_further is enabled.

Regards,
Mukul

> Alex
> 
> 
> >
> > Regards,
> > Mukul
> >
> > > Alex
> > >
> > > > Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to
> > > > extend
> > > UTCL2 reach"
> > > > Signed-off-by: Mukul Joshi 
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > index 6e0145b2b408..445cb06b9d26 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > @@ -295,9 +295,17 @@ static void
> > > > mmhub_v9_4_disable_identity_aperture(struct amdgpu_device
> *adev,
> > > > static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device
> > > *adev, int hubid)  {
> > > > struct amdgpu_vmhub *hub = &adev-
> >vmhub[AMDGPU_MMHUB_0];
> > > > +   unsigned int num_level, block_size;
> > > > uint32_t tmp;
> > > > int i;
> > > >
> > > > +   num_level = adev->vm_manager.num_level;
> > > > +   block_size = adev->vm_manager.block_size;
> > > > +   if (adev->gmc.translate_further)
> > > > +   num_level -= 1;
> > > > +   else
> > > > +   block_size -= 9;
> > > > +
> > > > for (i = 0; i <= 14; i++) {
> > > > tmp = RREG32_SOC15_OFFSET(MMHUB, 0,
> > > mmVML2VC0_VM_CONTEXT1_CNTL,
> > > > hubid *
> > > > MMHUB_INSTANCE_REGISTER_OFFSET
> > > > + i); @@ -305,7 +313,7 @@ static void
> > > mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int
> hubid)
> > > > ENABLE_CONTEXT, 1);
> > > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > > PAGE_TABLE_DEPTH,
> > > > -   adev->vm_manager.num_level);
> > > > +   num_level);
> > > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > > 
> > > > RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > > @@
> > > > -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct
> > > amdgpu_device *adev, int hubid)
> > > > 
> > > > EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > > PAGE_TABLE_BLOCK_SIZE,
> > > > -   adev->vm_manager.block_size - 9);
> > > > +   block_size);
> > > > /* Send no-retry XNACK on fault to suppress VM fault 
> > > > storm. */
> > > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > >
> > > > RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> > > > --
> > > > 2.35.1
> > > >


Re: [PATCH] drm: amd: amdgpu: ACPI: Add comment about ACPI_FADT_LOW_POWER_S0

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Thu, Aug 25, 2022 at 3:58 AM Limonciello, Mario
 wrote:
>
> On 8/24/2022 12:32, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> >
> > According to the ACPI specification [1], the ACPI_FADT_LOW_POWER_S0
> > flag merely means that it is better to use low-power S0 idle on the
> > given platform than S3 (provided that the latter is supported) and it
> > doesn't preclude using either of them (which of them will be used
> > depends on the choices made by user space).
> >
> > However, on some systems that flag is used to indicate whether or not
> > to enable special firmware mechanics allowing the system to save more
> > energy when suspended to idle.  If that flag is unset, doing so is
> > generally risky.
> >
> > Accordingly, add a comment to explain the ACPI_FADT_LOW_POWER_S0 check
> > in amdgpu_acpi_is_s0ix_active(), the purpose of which is otherwise
> > somewhat unclear.
> >
> > Link: https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt # [1]
> > Signed-off-by: Rafael J. Wysocki 
>
> Reviewed-by: Mario Limonciello 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c |6 ++
> >   1 file changed, 6 insertions(+)
> >
> > Index: linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > ===
> > --- linux-pm.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -1066,6 +1066,12 @@ bool amdgpu_acpi_is_s0ix_active(struct a
> >   (pm_suspend_target_state != PM_SUSPEND_TO_IDLE))
> >   return false;
> >
> > + /*
> > +  * If ACPI_FADT_LOW_POWER_S0 is not set in the FADT, it is generally
> > +  * risky to do any special firmware-related preparations for entering
> > +  * S0ix even though the system is suspending to idle, so return false
> > +  * in that case.
> > +  */
> >   if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) {
> >   dev_warn_once(adev->dev,
> > "Power consumption will be higher as BIOS has 
> > not been configured for suspend-to-idle.\n"
> >
> >
> >
>


Re: [PATCH] drm/radeon: use time_after(a,b) to replace "a>b"

2022-08-25 Thread Alex Deucher
Applied.  Thanks!

Alex

On Wed, Aug 24, 2022 at 10:40 PM Yu Zhe  wrote:
>
> time_after() deals with timer wrapping correctly.
>
> Signed-off-by: Yu Zhe 
> ---
>  drivers/gpu/drm/radeon/radeon_pm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
> b/drivers/gpu/drm/radeon/radeon_pm.c
> index e765abcb3b01..04c693ca419a 100644
> --- a/drivers/gpu/drm/radeon/radeon_pm.c
> +++ b/drivers/gpu/drm/radeon/radeon_pm.c
> @@ -1899,7 +1899,7 @@ static void radeon_dynpm_idle_work_handler(struct 
> work_struct *work)
>  * to false since we want to wait for vbl to avoid flicker.
>  */
> if (rdev->pm.dynpm_planned_action != DYNPM_ACTION_NONE &&
> -   jiffies > rdev->pm.dynpm_action_timeout) {
> +   time_after(jiffies, rdev->pm.dynpm_action_timeout)) {
> radeon_pm_get_dynpm_state(rdev);
> radeon_pm_set_clocks(rdev);
> }
> --
> 2.11.0
>
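
Why a plain "a > b" comparison breaks when a tick counter wraps, and why a signed-difference comparison does not, can be shown in userspace. This is a sketch of the technique only: it uses a 32-bit counter where the kernel's jiffies is an unsigned long, `naive_after()` and `wrap_safe_after()` are hypothetical names, and the unsigned-to-signed conversion is assumed to wrap in two's complement fashion, as it does on common compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
static bool naive_after(uint32_t a, uint32_t b)
{
	return a > b;
}

/*
 * Wrap-safe comparison: true if a is later than b, provided the two
 * values are less than 2^31 ticks apart. The unsigned subtraction wraps
 * modulo 2^32 and the sign of the result gives the ordering.
 */
static bool wrap_safe_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}
```

With a timeout of 0xFFFFFFF0 and a current tick of 0x10, which is 32 ticks later after the wrap, the naive form reports that the timeout has not passed, while the signed-difference form reports that it has.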


Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-25 Thread Alex Deucher
On Thu, Aug 25, 2022 at 10:49 AM Joshi, Mukul  wrote:
>
> [AMD Official Use Only - General]
>
>
>
> > -Original Message-
> > From: Alex Deucher 
> > Sent: Thursday, August 25, 2022 9:33 AM
> > To: Joshi, Mukul 
> > Cc: amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> >
> > [CAUTION: External Email]
> >
> > On Mon, Aug 22, 2022 at 11:53 AM Mukul Joshi 
> > wrote:
> > >
> > > When translate_further is enabled, page table depth needs to be
> > > updated. This was missing on Arcturus MMHUB init. This was causing
> > > address translations to fail for SDMA user-mode queues.
> > >
> >
> > Do other mmhub implementations need a similar fix?  It looks like some of
> > them are missing similar changes.
> >
>
> I am not sure if there is a plan to enable translate_further on other ASICs.
> For now, its only enabled for Arcturus, Aldebaran and Raven.
> If we plan to enable it on other ASICs, then yes the other mmhub 
> implementations
> would need similar changes.

It would be nice to fix them up preemptively so that if we ever enable
it on another asic, it will just work.

Alex


>
> Regards,
> Mukul
>
> > Alex
> >
> > > Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend
> > UTCL2 reach"
> > > Signed-off-by: Mukul Joshi 
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > index 6e0145b2b408..445cb06b9d26 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > @@ -295,9 +295,17 @@ static void
> > > mmhub_v9_4_disable_identity_aperture(struct amdgpu_device *adev,
> > > static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device
> > *adev, int hubid)  {
> > > struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
> > > +   unsigned int num_level, block_size;
> > > uint32_t tmp;
> > > int i;
> > >
> > > +   num_level = adev->vm_manager.num_level;
> > > +   block_size = adev->vm_manager.block_size;
> > > +   if (adev->gmc.translate_further)
> > > +   num_level -= 1;
> > > +   else
> > > +   block_size -= 9;
> > > +
> > > for (i = 0; i <= 14; i++) {
> > > tmp = RREG32_SOC15_OFFSET(MMHUB, 0,
> > mmVML2VC0_VM_CONTEXT1_CNTL,
> > > hubid * MMHUB_INSTANCE_REGISTER_OFFSET
> > > + i); @@ -305,7 +313,7 @@ static void
> > mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
> > > ENABLE_CONTEXT, 1);
> > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > PAGE_TABLE_DEPTH,
> > > -   adev->vm_manager.num_level);
> > > +   num_level);
> > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > 
> > > RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, @@
> > > -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct
> > amdgpu_device *adev, int hubid)
> > > 
> > > EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > PAGE_TABLE_BLOCK_SIZE,
> > > -   adev->vm_manager.block_size - 9);
> > > +   block_size);
> > > /* Send no-retry XNACK on fault to suppress VM fault 
> > > storm. */
> > > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > >
> > > RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> > > --
> > > 2.35.1
> > >


Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Felix Kuehling

On 2022-08-25 at 04:58, Lijo Lazar wrote:

HDP flush is used early in the init sequence as part of memory controller
block initialization. Hence remapping of HDP registers needed for flush
needs to happen earlier.

This also fixes the AER error reported as Unsupported Request during
driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

Reported-by: Tom Seewald 
Signed-off-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
  drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
  drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
  drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
  4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ce7d117efdb5..53d753e94a71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
goto init_failed;
}
+
+   /* remap HDP registers to a hole in mmio space,
+* for the purpose of expose those registers
+* to process space. This is needed for any early HDP
+* flush operation during gmc initialization.
+*/
+   if (adev->nbio.funcs->remap_hdp_registers && 
!amdgpu_sriov_vf(adev))


Does this work on GFXv8? You may need a NULL-check for adev->nbio.funcs.

Regards,
  Felix



+   adev->nbio.funcs->remap_hdp_registers(adev);
+
r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
if (r) {
DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
nv_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
nv_enable_doorbell_aperture(adev, true);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c

index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
soc15_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
  
  	/* enable the doorbell aperture */

soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 55284b24f113..16b447055102 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
soc21_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers)
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
soc21_enable_doorbell_aperture(adev, true);
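
Regarding the NULL-check question raised inline above, the guarded call could look roughly like the sketch below. It assumes, as the question implies but the thread does not confirm, that `adev->nbio.funcs` may be NULL on some ASICs. Everything here (`struct mock_adev`, `struct nbio_funcs`, `mock_remap()`, `maybe_remap_hdp()`) is a hypothetical mock for illustration, not driver code.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_adev;

/* Hypothetical mock of the nbio callback table. */
struct nbio_funcs {
	void (*remap_hdp_registers)(struct mock_adev *adev);
};

/* Hypothetical mock of the device; funcs may be NULL (assumption). */
struct mock_adev {
	const struct nbio_funcs *nbio_funcs;
	bool is_sriov_vf;
	int remap_calls; /* counts how often the callback ran */
};

/* Test double standing in for a real remap implementation. */
void mock_remap(struct mock_adev *adev)
{
	adev->remap_calls++;
}

/* Returns true only if the remap callback was actually invoked. */
bool maybe_remap_hdp(struct mock_adev *adev)
{
	if (adev->nbio_funcs &&                      /* the extra NULL check */
	    adev->nbio_funcs->remap_hdp_registers && /* callback is optional */
	    !adev->is_sriov_vf) {                    /* skipped for SR-IOV VFs */
		adev->nbio_funcs->remap_hdp_registers(adev);
		return true;
	}
	return false;
}
```

The only difference from the condition in the patch is the leading check on the table pointer itself, which short-circuits before the callback pointer is dereferenced.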
  


Re: [PATCH 03/11] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP

2022-08-25 Thread Felix Kuehling

On 2022-08-25 at 09:31, Christian König wrote:

Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c  | 3 ++-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cbd593f7d553..85eb68ec692e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -297,7 +297,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
 */
replacement = dma_fence_get_stub();
dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
-   replacement, DMA_RESV_USAGE_READ);
+   replacement, DMA_RESV_USAGE_BOOKKEEP);


This is only for the dummy fence when removing a real eviction fence. 
I'd expect another change where the eviction fence gets added.


Regards,
  Felix



dma_fence_put(replacement);
return 0;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 1fd3cbca20a2..03ec099d64e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -112,7 +112,8 @@ static int amdgpu_vm_sdma_commit(struct 
amdgpu_vm_update_params *p,
swap(p->vm->last_unlocked, tmp);
dma_fence_put(tmp);
} else {
-   amdgpu_bo_fence(p->vm->root.bo, f, true);
+   dma_resv_add_fence(p->vm->root.bo->tbo.base.resv, f,
+  DMA_RESV_USAGE_BOOKKEEP);
}
  
  	if (fence && !p->immediate)


Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-25 Thread Felix Kuehling



On 2022-08-24 at 01:10, Lazar, Lijo wrote:



On 8/23/2022 10:34 PM, Tom Seewald wrote:

On Sat, Aug 20, 2022 at 2:53 AM Lazar, Lijo  wrote:


Missed the remap part, the offset is here -

https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nv.c#L680 




The trace is coming from *_flush_hdp.

You may also check if *_remap_hdp_registers() is getting called. It is
done in nbio_vx_y files, most likely this one for your device -
https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c#L68 



Thanks,
Lijo


Hi Lijo,

I would be happy to test any patches that you think would shed some
light on this.

Unfortunately, I don't have any NV platforms to test. Attached is an 
'untested-patch' based on your trace logs.

Hi Lijo,

I like that the patch also removes some code duplication. Can you check 
that this doesn't break GFXv8 GPUs? You may need to add a NULL-check for 
adev->nbio.funcs to the if-condition.


Regards,
  Felix




Thanks,
Lijo


Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Christian König




On 25.08.22 at 16:26, Alex Deucher wrote:

On Thu, Aug 25, 2022 at 10:22 AM Lazar, Lijo  wrote:



On 8/25/2022 7:37 PM, Alex Deucher wrote:

On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:

HDP flush is used early in the init sequence as part of memory controller
block initialization. Hence remapping of HDP registers needed for flush
needs to happen earlier.

This also fixes the AER error reported as Unsupported Request during
driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

Reported-by: Tom Seewald 
Signed-off-by: Lijo Lazar 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
   drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
   drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
   drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
   4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ce7d117efdb5..53d753e94a71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
  DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
  goto init_failed;
  }
+
+   /* remap HDP registers to a hole in mmio space,
+* for the purpose of expose those registers
+* to process space. This is needed for any early HDP
+* flush operation during gmc initialization.
+*/
+   if (adev->nbio.funcs->remap_hdp_registers && 
!amdgpu_sriov_vf(adev))
+   adev->nbio.funcs->remap_hdp_registers(adev);
+

We probably also need this in ip_resume() as well to handle the
suspend and resume case.  Thinking about this more, maybe it's easier
to just track whether the remap has happened yet and use the old or
new offset based on that.

If we can use the default offset without a remap, does it make sense to
remap? What about calling the same in ip_resume?

The remap is necessary so that userspace drivers can access this to
flush the HDP registers when they need to since normally it's in a
non-accessible region of the MMIO space.  I'm fine with updating it in
ip_resume as well.


Correct me if I'm wrong but I think remap means it is available at an 
additional location, the privileged region still works as well.


So Lijo's suggestion is to use the privileged area in the kernel instead 
of the remapped one.


Right?

Christian.
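
The alternative floated earlier in this exchange, tracking whether the remap has happened and picking the old or new offset accordingly, might be sketched as follows. This is a hedged illustration only: the two offset values are made-up placeholders, and `struct hdp_state` and its helpers are hypothetical names. It also assumes the remap has to be redone after resume, which is how the ip_resume() remark reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration; not real register offsets. */
#define HDP_FLUSH_OFFSET_DEFAULT  0x1000u
#define HDP_FLUSH_OFFSET_REMAPPED 0x2000u

/* Hypothetical per-device state recording whether the remap is in effect. */
struct hdp_state {
	bool remapped;
};

/* Called once the remap has been programmed (init or resume). */
void hdp_mark_remapped(struct hdp_state *s)
{
	s->remapped = true;
}

/* Called when the remap can no longer be relied on (assumed: suspend). */
void hdp_mark_reset(struct hdp_state *s)
{
	s->remapped = false;
}

/* The flush path uses the remapped offset only while it is valid. */
uint32_t hdp_flush_offset(const struct hdp_state *s)
{
	return s->remapped ? HDP_FLUSH_OFFSET_REMAPPED
			   : HDP_FLUSH_OFFSET_DEFAULT;
}
```

If the privileged location keeps working after the remap, as asked above, the kernel could simply always use the default offset and this bookkeeping would be unnecessary.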



Alex



Thanks,
Lijo


Alex



  r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
  if (r) {
  DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
  nv_program_aspm(adev);
  /* setup nbio registers */
  adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
  /* enable the doorbell aperture */
  nv_enable_doorbell_aperture(adev, true);

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
  soc15_program_aspm(adev);
  /* setup nbio registers */
  adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);

  /* enable the doorbell aperture */
  soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 55284b24f113..16b447055102 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ 

RE: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-25 Thread Joshi, Mukul
[AMD Official Use Only - General]



> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, August 25, 2022 9:33 AM
> To: Joshi, Mukul 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> 
> [CAUTION: External Email]
> 
> On Mon, Aug 22, 2022 at 11:53 AM Mukul Joshi 
> wrote:
> >
> > When translate_further is enabled, page table depth needs to be
> > updated. This was missing on Arcturus MMHUB init. This was causing
> > address translations to fail for SDMA user-mode queues.
> >
> 
> Do other mmhub implementations need a similar fix?  It looks like some of
> them are missing similar changes.
> 

I am not sure if there is a plan to enable translate_further on other ASICs.
For now, it's only enabled for Arcturus, Aldebaran and Raven.
If we plan to enable it on other ASICs, then yes the other mmhub implementations
would need similar changes.

Regards,
Mukul

> Alex
> 
> > Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend
> UTCL2 reach"
> > Signed-off-by: Mukul Joshi 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > index 6e0145b2b408..445cb06b9d26 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > @@ -295,9 +295,17 @@ static void
> > mmhub_v9_4_disable_identity_aperture(struct amdgpu_device *adev,
> > static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device
> *adev, int hubid)  {
> > struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
> > +   unsigned int num_level, block_size;
> > uint32_t tmp;
> > int i;
> >
> > +   num_level = adev->vm_manager.num_level;
> > +   block_size = adev->vm_manager.block_size;
> > +   if (adev->gmc.translate_further)
> > +   num_level -= 1;
> > +   else
> > +   block_size -= 9;
> > +
> > for (i = 0; i <= 14; i++) {
> > tmp = RREG32_SOC15_OFFSET(MMHUB, 0,
> mmVML2VC0_VM_CONTEXT1_CNTL,
> > hubid * MMHUB_INSTANCE_REGISTER_OFFSET
> > + i); @@ -305,7 +313,7 @@ static void
> mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
> > ENABLE_CONTEXT, 1);
> > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > PAGE_TABLE_DEPTH,
> > -   adev->vm_manager.num_level);
> > +   num_level);
> > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 
> > 1);
> > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, @@
> > -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct
> amdgpu_device *adev, int hubid)
> > 
> > EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > PAGE_TABLE_BLOCK_SIZE,
> > -   adev->vm_manager.block_size - 9);
> > +   block_size);
> > /* Send no-retry XNACK on fault to suppress VM fault storm. 
> > */
> > tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> >
> > RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> > --
> > 2.35.1
> >


[PATCH v5 24/31] platform/x86: asus-wmi: Move acpi_backlight=vendor quirks to ACPI video_detect.c

2022-08-25 Thread Hans de Goede
Remove the asus-wmi quirk_entry.wmi_backlight_power quirk-flag, which
called acpi_video_set_dmi_backlight_type(acpi_backlight_vendor) and replace
it with acpi/video_detect.c video_detect_dmi_table[] entries using the
video_detect_force_vendor callback.
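
To illustrate the table-driven approach, here is a much simplified, standalone sketch of how a DMI quirk table can be consulted before any backlight driver asks for the type. The `struct quirk_entry` layout, the `QUIRK_*` enum and `lookup_quirk()` are hypothetical; the real table uses struct dmi_system_id with a callback, which this sketch folds into a plain `type` field. Substring matching via strstr() is an assumption about how the match strings are compared.

```c
#include <stddef.h>
#include <string.h>

enum quirk_type { QUIRK_UNDEF, QUIRK_VENDOR };

/* Hypothetical, flattened form of a DMI quirk entry. */
struct quirk_entry {
	const char *sys_vendor;
	const char *product_name;
	enum quirk_type type;
};

/* The five Asus models this patch moves into video_detect.c. */
static const struct quirk_entry quirks[] = {
	{ "ASUSTeK COMPUTER INC.", "X55U",   QUIRK_VENDOR },
	{ "ASUSTeK COMPUTER INC.", "X101CH", QUIRK_VENDOR },
	{ "ASUSTeK COMPUTER INC.", "X401U",  QUIRK_VENDOR },
	{ "ASUSTeK COMPUTER INC.", "X501U",  QUIRK_VENDOR },
	{ "ASUSTeK COMPUTER INC.", "1015CX", QUIRK_VENDOR },
};

/* Returns the forced type for a machine, or QUIRK_UNDEF if none matches. */
enum quirk_type lookup_quirk(const char *vendor, const char *product)
{
	size_t i;

	for (i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		if (strstr(vendor, quirks[i].sys_vendor) &&
		    strstr(product, quirks[i].product_name))
			return quirks[i].type;
	}
	return QUIRK_UNDEF;
}
```

Because the lookup depends only on static DMI strings, its result is the same no matter which driver probes first, which is the property the late setter lacked.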

acpi_video_set_dmi_backlight_type() is troublesome because it may end up
getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.

Note no entries are dropped from the dmi_system_id table in asus-nb-wmi.c.
This is because the entries using the removed wmi_backlight_power flag
also use other model specific quirks from the asus-wmi quirk_entry struct.
So the quirk_asus_x55u struct and the entries pointing to it cannot be
dropped.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c| 40 ++
 drivers/platform/x86/asus-nb-wmi.c |  7 --
 drivers/platform/x86/asus-wmi.c|  3 ---
 drivers/platform/x86/asus-wmi.h|  1 -
 drivers/platform/x86/eeepc-wmi.c   | 25 +--
 5 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 6a2523bc02ba..d893313fe1a0 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -174,6 +174,46 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_PRODUCT_NAME, "UL30A"),
},
},
+   {
+.callback = video_detect_force_vendor,
+/* Asus X55U */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "X55U"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Asus X101CH */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "X101CH"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Asus X401U */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "X401U"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Asus X501U */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "X501U"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Asus 1015CX */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "1015CX"),
+   },
+   },
{
.callback = video_detect_force_vendor,
/* GIGABYTE GB-BXBT-2807 */
diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 478dd300b9c9..810a94557a85 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -79,12 +79,10 @@ static struct quirk_entry quirk_asus_q500a = {
 
 /*
  * For those machines that need software to control bt/wifi status
- * and can't adjust brightness through ACPI interface
  * and have duplicate events(ACPI and WMI) for display toggle
  */
 static struct quirk_entry quirk_asus_x55u = {
.wapf = 4,
-   .wmi_backlight_power = true,
.wmi_backlight_set_devstate = true,
.no_display_toggle = true,
 };
@@ -147,11 +145,6 @@ static const struct dmi_system_id asus_quirks[] = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK Computer Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "U32U"),
},
-   /*
-* Note this machine has a Brazos APU, and most Brazos Asus
-* machines need quirk_asus_x55u / wmi_backlight_power but
-* here acpi-video seems to work fine for backlight control.
-*/
.driver_data = &quirk_asus_wapf4,
},
{
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 301166a5697d..5cf9d9aff164 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -3634,9 +3634,6 @@ static int asus_wmi_add(struct platform_device *pdev)
if (asus->driver->quirks->wmi_force_als_set)
asus_wmi_set_als();
 
-   if (asus->driver->quirks->wmi_backlight_power)
-   acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);
-
if (asus->driver->quirks->wmi_backlight_native)
acpi_video_set_dmi_backlight_type(acpi_backlight_native);
 
diff --git a/drivers/platform/x86/asus-wmi.h b/drivers/platform/x86/asus-wmi.h
index b302415bf1d9..30770e411301 100644
--- a/drivers/platform/x86/asus-wmi.h
+++ b/drivers/platform/x86/asus-wmi.h
@@ -29,7 +29,6 @@ struct

[PATCH v5 17/31] ACPI: video: Add Nvidia WMI EC brightness control detection (v3)

2022-08-25 Thread Hans de Goede
On some new laptop designs a new Nvidia specific WMI interface is present
which gives info about panel brightness control and may allow controlling
the brightness through this interface when the embedded controller is used
for brightness control.

When this WMI interface is present and indicates that the EC is used,
then this interface should be used for brightness control.
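
The resulting precedence can be sketched in isolation as below. Only the ordering of the first three checks is taken from the diff in this patch (DMI quirk first, then the Nvidia WMI EC check, then ACPI video). The `struct detect_inputs` type, the `BL_*` enum, the native-versus-video choice and the final vendor fallback are simplified placeholders, not the kernel's actual heuristics.

```c
#include <stdbool.h>

enum bl_choice {
	BL_UNDEF,
	BL_VIDEO,
	BL_VENDOR,
	BL_NATIVE,
	BL_NVIDIA_WMI_EC,
};

/* Hypothetical snapshot of what detection has learned about the system. */
struct detect_inputs {
	enum bl_choice dmi;         /* quirk-table result, BL_UNDEF if none */
	bool nvidia_wmi_ec_present; /* WMI interface reports EC control */
	bool acpi_video_present;    /* ACPI video backlight support found */
	bool native_available;      /* a native (GPU) backlight exists */
};

enum bl_choice pick_backlight(const struct detect_inputs *in)
{
	/* A DMI quirk wins over everything modelled here. */
	if (in->dmi != BL_UNDEF)
		return in->dmi;

	/* Special case added by this patch. */
	if (in->nvidia_wmi_ec_present)
		return BL_NVIDIA_WMI_EC;

	/* Placeholder for the native-vs-video heuristics. */
	if (in->acpi_video_present)
		return in->native_available ? BL_NATIVE : BL_VIDEO;

	/* Placeholder fallback when nothing else applies. */
	return BL_VENDOR;
}
```

Placing the WMI check after the DMI check means a quirk entry can still override the firmware's answer on a misbehaving machine.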

Changes in v2:
- Use the new shared nvidia-wmi-ec-backlight.h header for the
  WMI firmware API definitions
- ACPI_VIDEO can now be enabled on non X86 too,
  adjust the Kconfig changes to match this.

Changes in v3:
- Use WMI_BRIGHTNESS_GUID define

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/Kconfig   |  1 +
 drivers/acpi/video_detect.c| 37 ++
 drivers/gpu/drm/gma500/Kconfig |  2 ++
 drivers/gpu/drm/i915/Kconfig   |  2 ++
 include/acpi/video.h   |  1 +
 5 files changed, 43 insertions(+)

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 7802d8846a8d..44ad4b6bd234 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -212,6 +212,7 @@ config ACPI_VIDEO
tristate "Video"
depends on BACKLIGHT_CLASS_DEVICE
depends on INPUT
+   depends on ACPI_WMI || !X86
select THERMAL
help
  This driver implements the ACPI Extensions For Display Adapters
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index cc9d0d91e268..4dc7fb865083 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -75,6 +76,36 @@ find_video(acpi_handle handle, u32 lvl, void *context, void 
**rv)
return AE_OK;
 }
 
+/* This depends on ACPI_WMI which is X86 only */
+#ifdef CONFIG_X86
+static bool nvidia_wmi_ec_supported(void)
+{
+   struct wmi_brightness_args args = {
+   .mode = WMI_BRIGHTNESS_MODE_GET,
+   .val = 0,
+   .ret = 0,
+   };
+   struct acpi_buffer buf = { (acpi_size)sizeof(args), &args };
+   acpi_status status;
+
+   status = wmi_evaluate_method(WMI_BRIGHTNESS_GUID, 0,
+WMI_BRIGHTNESS_METHOD_SOURCE, &buf, &buf);
+   if (ACPI_FAILURE(status))
+   return false;
+
+   /*
+* If brightness is handled by the EC then nvidia-wmi-ec-backlight
+* should be used, else the GPU driver(s) should be used.
+*/
+   return args.ret == WMI_BRIGHTNESS_SOURCE_EC;
+}
+#else
+static bool nvidia_wmi_ec_supported(void)
+{
+   return false;
+}
+#endif
+
 /* Force to use vendor driver when the ACPI device is known to be
  * buggy */
 static int video_detect_force_vendor(const struct dmi_system_id *d)
@@ -541,6 +572,7 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
 static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
 {
static DEFINE_MUTEX(init_mutex);
+   static bool nvidia_wmi_ec_present;
static bool native_available;
static bool init_done;
static long video_caps;
@@ -553,6 +585,7 @@ static enum acpi_backlight_type 
__acpi_video_get_backlight_type(bool native)
acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
ACPI_UINT32_MAX, find_video, NULL,
&video_caps, NULL);
+   nvidia_wmi_ec_present = nvidia_wmi_ec_supported();
init_done = true;
}
if (native)
@@ -570,6 +603,10 @@ static enum acpi_backlight_type 
__acpi_video_get_backlight_type(bool native)
if (acpi_backlight_dmi != acpi_backlight_undef)
return acpi_backlight_dmi;
 
+   /* Special cases such as nvidia_wmi_ec and apple gmux. */
+   if (nvidia_wmi_ec_present)
+   return acpi_backlight_nvidia_wmi_ec;
+
/* On systems with ACPI video use either native or ACPI video. */
if (video_caps & ACPI_VIDEO_BACKLIGHT) {
/*
diff --git a/drivers/gpu/drm/gma500/Kconfig b/drivers/gpu/drm/gma500/Kconfig
index 0cff20265f97..807b989e3c77 100644
--- a/drivers/gpu/drm/gma500/Kconfig
+++ b/drivers/gpu/drm/gma500/Kconfig
@@ -7,6 +7,8 @@ config DRM_GMA500
select ACPI_VIDEO if ACPI
select BACKLIGHT_CLASS_DEVICE if ACPI
select INPUT if ACPI
+   select X86_PLATFORM_DEVICES if ACPI
+   select ACPI_WMI if ACPI
help
  Say yes for an experimental 2D KMS framebuffer driver for the
  Intel GMA500 (Poulsbo), Intel GMA600 (Moorestown/Oak Trail) and
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7ae3b7d67fcf..3efce05d7b57 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -23,6 +23,8 @@ config DRM_I915
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_CLASS_DEVICE if ACPI
  

[PATCH v5 31/31] drm/todo: Add entry about dealing with brightness control on devices with > 1 panel

2022-08-25 Thread Hans de Goede
Add an entry summarizing the discussion about dealing with brightness
control on devices with more than 1 internal panel.

The original discussion can be found here:
https://lore.kernel.org/dri-devel/20220517152331.16217-1-hdego...@redhat.com/

Reviewed-by: Lyude Paul 
Signed-off-by: Hans de Goede 
---
 Documentation/gpu/todo.rst | 68 ++
 1 file changed, 68 insertions(+)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 7634c27ac562..393d218e4a0c 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -679,6 +679,74 @@ Contact: Sam Ravnborg
 
 Level: Advanced
 
+Brightness handling on devices with multiple internal panels
+============================================================
+
+On x86/ACPI devices there can be multiple backlight firmware interfaces:
+(ACPI) video, vendor specific and others. As well as direct/native (PWM)
+register programming by the KMS driver.
+
+To deal with this backlight drivers used on x86/ACPI call
+acpi_video_get_backlight_type() which has heuristics (+quirks) to select
+which backlight interface to use; and backlight drivers which do not match
+the returned type will not register themselves, so that only one backlight
+device gets registered (in a single GPU setup, see below).
+
+At the moment this more or less assumes that there will only
+be 1 (internal) panel on a system.
+
+On systems with 2 panels this may be a problem, depending on
+what interface acpi_video_get_backlight_type() selects:
+
+1. native: in this case the KMS driver is expected to know which backlight
+   device belongs to which output so everything should just work.
+2. video: this does support controlling multiple backlights, but some work
+   will need to be done to get the output <-> backlight device mapping
+
+The above assumes both panels will require the same backlight interface type.
+Things will break on systems with multiple panels where the 2 panels need
+a different type of control. E.g. one panel needs ACPI video backlight control,
+whereas the other is using native backlight control. Currently in this case
+only one of the 2 required backlight devices will get registered, based on
+the acpi_video_get_backlight_type() return value.
+
+If this (theoretical) case ever shows up, then supporting this will need some
+work. A possible solution here would be to pass a device and connector-name
+to acpi_video_get_backlight_type() so that it can deal with this.
+
+Note in a way we already have a case where userspace sees 2 panels,
+in dual GPU laptop setups with a mux. On those systems we may see
+either 2 native backlight devices; or 2 native backlight devices.
+
+Userspace already has code to deal with this by detecting if the related
+panel is active (iow which way the mux between the GPU and the panels
+points) and then uses that backlight device. Userspace here very much
+assumes a single panel though. It picks only 1 of the 2 backlight devices
+and then only uses that one.
+
+Note that all userspace code (that I know of) is currently hardcoded
+to assume a single panel.
+
+Before the recent changes to not register multiple (e.g. video + native)
+/sys/class/backlight devices for a single panel (on a single GPU laptop),
+userspace would see multiple backlight devices all controlling the same
+backlight.
+
+To deal with this userspace always had to pick one preferred device under
+/sys/class/backlight and ignore the others. So to support brightness
+control on multiple panels userspace will need to be updated too.
+
+There are plans to allow brightness control through the KMS API by adding
+a "display brightness" property to drm_connector objects for panels. This
+solves a number of issues with the /sys/class/backlight API, including not
+being able to map a sysfs backlight device to a specific connector. Any
+userspace changes to add support for brightness control on devices with
+multiple panels really should build on top of this new KMS property.
+
+Contact: Hans de Goede
+
+Level: Advanced
+
 Outside DRM
 ===
 
-- 
2.37.2



[PATCH v5 27/31] ACPI: video: Remove acpi_video_set_dmi_backlight_type()

2022-08-25 Thread Hans de Goede
acpi_video_set_dmi_backlight_type() is troublesome because it may end
up getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.

In case of the acpi_video backlight, acpi_video_set_dmi_backlight_type()
actually calls acpi_video_unregister_backlight() since that is often
probed earlier, leading to userspace seeing the acpi_video0 class
device being briefly available, leading to races in userspace where
udev probe-rules try to access the device and it is already gone.
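
The ordering problem described above can be reproduced with a small standalone model. All names here (`struct bl_state`, `get_type()`, `probe_video()`, `probe_vendor()`, `late_set_type()`) are hypothetical, and defaulting to the video type when no override is set is a simplification chosen purely to keep the model short.

```c
#include <stdbool.h>

enum bl_type { T_UNDEF, T_VIDEO, T_VENDOR };

/* Hypothetical model of the detection state and two backlight drivers. */
struct bl_state {
	enum bl_type dmi_override;
	bool video_registered;
	bool vendor_registered;
};

/* Simplified getter: the override wins, otherwise assume video. */
enum bl_type get_type(const struct bl_state *s)
{
	return s->dmi_override != T_UNDEF ? s->dmi_override : T_VIDEO;
}

/* Each driver registers only if it matches the type it sees at probe time. */
void probe_video(struct bl_state *s)
{
	if (get_type(s) == T_VIDEO)
		s->video_registered = true;
}

void probe_vendor(struct bl_state *s)
{
	if (get_type(s) == T_VENDOR)
		s->vendor_registered = true;
}

/* Models the removed setter: it changes the answer after the fact. */
void late_set_type(struct bl_state *s, enum bl_type t)
{
	s->dmi_override = t;
}
```

Calling `probe_video()`, then `late_set_type(&s, T_VENDOR)`, then `probe_vendor()` leaves both drivers registered. Starting with the override already in place, as a quirk table in video_detect.c provides, registers only the vendor driver.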

All callers have been fixed to no longer call it, so remove
acpi_video_set_dmi_backlight_type() now.

This means the acpi_video_unregister_backlight() hack, which removed the
acpi_video backlight after it had been wrongly registered, is no longer
needed either, so remove that too.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c   | 10 --
 drivers/acpi/video_detect.c | 16 
 include/acpi/video.h|  4 
 3 files changed, 30 deletions(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index d1e41f30c004..a7c3d11e0dac 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -2296,16 +2296,6 @@ void acpi_video_register_backlight(void)
 }
 EXPORT_SYMBOL(acpi_video_register_backlight);
 
-void acpi_video_unregister_backlight(void)
-{
-   struct acpi_video_bus *video;
-
-   mutex_lock(&video_list_lock);
-   list_for_each_entry(video, &video_bus_head, entry)
-   acpi_video_bus_unregister_backlight(video);
-   mutex_unlock(&video_list_lock);
-}
-
 bool acpi_video_handles_brightness_key_presses(void)
 {
return may_report_brightness_keys &&
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 3861d4121172..67a0211c07b4 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -38,8 +38,6 @@
 #include 
 #include 
 
-void acpi_video_unregister_backlight(void);
-
 static enum acpi_backlight_type acpi_backlight_cmdline = acpi_backlight_undef;
 static enum acpi_backlight_type acpi_backlight_dmi = acpi_backlight_undef;
 
@@ -817,17 +815,3 @@ bool acpi_video_backlight_use_native(void)
return __acpi_video_get_backlight_type(true) == acpi_backlight_native;
 }
 EXPORT_SYMBOL(acpi_video_backlight_use_native);
-
-/*
- * Set the preferred backlight interface type based on DMI info.
- * This function allows DMI blacklists to be implemented by external
- * platform drivers instead of putting a big blacklist in video_detect.c
- */
-void acpi_video_set_dmi_backlight_type(enum acpi_backlight_type type)
-{
-   acpi_backlight_dmi = type;
-   /* Remove acpi-video backlight interface if it is no longer desired */
-   if (acpi_video_get_backlight_type() != acpi_backlight_video)
-   acpi_video_unregister_backlight();
-}
-EXPORT_SYMBOL(acpi_video_set_dmi_backlight_type);
diff --git a/include/acpi/video.h b/include/acpi/video.h
index dbd48cb8bd23..a275c35e5249 100644
--- a/include/acpi/video.h
+++ b/include/acpi/video.h
@@ -60,7 +60,6 @@ extern int acpi_video_get_edid(struct acpi_device *device, int type,
   int device_id, void **edid);
 extern enum acpi_backlight_type acpi_video_get_backlight_type(void);
 extern bool acpi_video_backlight_use_native(void);
-extern void acpi_video_set_dmi_backlight_type(enum acpi_backlight_type type);
 /*
  * Note: The value returned by acpi_video_handles_brightness_key_presses()
  * may change over time and should not be cached.
@@ -86,9 +85,6 @@ static inline bool acpi_video_backlight_use_native(void)
 {
return true;
 }
-static inline void acpi_video_set_dmi_backlight_type(enum acpi_backlight_type type)
-{
-}
 static inline bool acpi_video_handles_brightness_key_presses(void)
 {
return false;
-- 
2.37.2



[PATCH v5 23/31] platform/x86: asus-wmi: Drop DMI chassis-type check from backlight handling

2022-08-25 Thread Hans de Goede
Remove this check from the asus-wmi backlight handling:

/* Some Asus desktop boards export an acpi-video backlight interface,
   stop this from showing up */
chassis_type = dmi_get_system_info(DMI_CHASSIS_TYPE);
if (chassis_type && !strcmp(chassis_type, "3"))
acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);

This acpi_video_set_dmi_backlight_type(acpi_backlight_vendor) call must be
removed because other changes in this series change the native backlight
drivers to no longer unconditionally register their backlight. Instead
these drivers now do this check:

if (acpi_video_get_backlight_type(false) != acpi_backlight_native)
return 0; /* bail */

So leaving this in place can break things on laptops with a broken
DMI chassis-type, which would have had GPU native brightness control before
the addition of the acpi_video_get_backlight_type() != native check.

Removing this should be ok, since the ACPI video code now has improved
heuristics for this itself (which include a chassis-type check).

Signed-off-by: Hans de Goede 
---
 drivers/platform/x86/asus-wmi.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 89b604e04d7f..301166a5697d 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -3553,7 +3553,6 @@ static int asus_wmi_add(struct platform_device *pdev)
struct platform_driver *pdrv = to_platform_driver(pdev->dev.driver);
struct asus_wmi_driver *wdrv = to_asus_wmi_driver(pdrv);
struct asus_wmi *asus;
-   const char *chassis_type;
acpi_status status;
int err;
u32 result;
@@ -3635,12 +3634,6 @@ static int asus_wmi_add(struct platform_device *pdev)
if (asus->driver->quirks->wmi_force_als_set)
asus_wmi_set_als();
 
-   /* Some Asus desktop boards export an acpi-video backlight interface,
-  stop this from showing up */
-   chassis_type = dmi_get_system_info(DMI_CHASSIS_TYPE);
-   if (chassis_type && !strcmp(chassis_type, "3"))
-   acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);
-
if (asus->driver->quirks->wmi_backlight_power)
acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);
 
-- 
2.37.2



[PATCH v5 15/31] platform/x86: nvidia-wmi-ec-backlight: Move fw interface definitions to a header (v2)

2022-08-25 Thread Hans de Goede
Move the WMI interface definitions to a header, so that the definitions
can be shared with drivers/acpi/video_detect.c .

Changes in v2:
- Add missing Nvidia copyright header
- Move WMI_BRIGHTNESS_GUID to nvidia-wmi-ec-backlight.h as well
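The kernel-doc for struct wmi_brightness_args in the diff below notes that
the ACPI method expects a 24 byte params struct. A minimal userspace sketch
of that layout constraint follows; uint32_t stands in for the kernel's u32
and the helper wmi_args_size() is hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel struct; uint32_t replaces u32. */
struct wmi_brightness_args {
	uint32_t mode;       /* an enum wmi_brightness_mode value */
	uint32_t val;        /* input, used only in the SET mode */
	uint32_t ret;        /* output, used only in the GET modes */
	uint32_t ignored[3]; /* padding up to the expected 24 bytes */
};

/* Compilation fails if the layout ever drifts from 24 bytes. */
_Static_assert(sizeof(struct wmi_brightness_args) == 24,
	       "WMI params struct must be 24 bytes");

/* Hypothetical helper so the size can also be checked at run time. */
static size_t wmi_args_size(void)
{
	return sizeof(struct wmi_brightness_args);
}
```

A compile-time check like this is one way for a shared header to guard the
layout once more than one file includes it.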

Suggested-by: Daniel Dadap 
Signed-off-by: Hans de Goede 
---
 MAINTAINERS   |  1 +
 .../platform/x86/nvidia-wmi-ec-backlight.c| 68 +
 .../x86/nvidia-wmi-ec-backlight.h | 76 +++
 3 files changed, 78 insertions(+), 67 deletions(-)
 create mode 100644 include/linux/platform_data/x86/nvidia-wmi-ec-backlight.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 9d7f64dc0efe..d6f6b96f51f7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14527,6 +14527,7 @@ M:  Daniel Dadap 
 L: platform-driver-...@vger.kernel.org
 S: Supported
 F: drivers/platform/x86/nvidia-wmi-ec-backlight.c
+F: include/linux/platform_data/x86/nvidia-wmi-ec-backlight.h
 
 NVM EXPRESS DRIVER
 M: Keith Busch 
diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
index 61e37194df70..be803e47eac0 100644
--- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
+++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
@@ -7,74 +7,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
-/**
- * enum wmi_brightness_method - WMI method IDs
- * @WMI_BRIGHTNESS_METHOD_LEVEL:  Get/Set EC brightness level status
- * @WMI_BRIGHTNESS_METHOD_SOURCE: Get/Set EC Brightness Source
- */
-enum wmi_brightness_method {
-   WMI_BRIGHTNESS_METHOD_LEVEL = 1,
-   WMI_BRIGHTNESS_METHOD_SOURCE = 2,
-   WMI_BRIGHTNESS_METHOD_MAX
-};
-
-/**
- * enum wmi_brightness_mode - Operation mode for WMI-wrapped method
- * @WMI_BRIGHTNESS_MODE_GET:Get the current brightness level/source.
- * @WMI_BRIGHTNESS_MODE_SET:Set the brightness level.
- * @WMI_BRIGHTNESS_MODE_GET_MAX_LEVEL:  Get the maximum brightness level. This
- *  is only valid when the WMI method is
- *  %WMI_BRIGHTNESS_METHOD_LEVEL.
- */
-enum wmi_brightness_mode {
-   WMI_BRIGHTNESS_MODE_GET = 0,
-   WMI_BRIGHTNESS_MODE_SET = 1,
-   WMI_BRIGHTNESS_MODE_GET_MAX_LEVEL = 2,
-   WMI_BRIGHTNESS_MODE_MAX
-};
-
-/**
- * enum wmi_brightness_source - Backlight brightness control source selection
- * @WMI_BRIGHTNESS_SOURCE_GPU: Backlight brightness is controlled by the GPU.
- * @WMI_BRIGHTNESS_SOURCE_EC:  Backlight brightness is controlled by the
- * system's Embedded Controller (EC).
- * @WMI_BRIGHTNESS_SOURCE_AUX: Backlight brightness is controlled over the
- * DisplayPort AUX channel.
- */
-enum wmi_brightness_source {
-   WMI_BRIGHTNESS_SOURCE_GPU = 1,
-   WMI_BRIGHTNESS_SOURCE_EC = 2,
-   WMI_BRIGHTNESS_SOURCE_AUX = 3,
-   WMI_BRIGHTNESS_SOURCE_MAX
-};
-
-/**
- * struct wmi_brightness_args - arguments for the WMI-wrapped ACPI method
- * @mode:Pass in an &enum wmi_brightness_mode value to select between
- *   getting or setting a value.
- * @val: In parameter for value to set when using %WMI_BRIGHTNESS_MODE_SET
- *   mode. Not used in conjunction with %WMI_BRIGHTNESS_MODE_GET or
- *   %WMI_BRIGHTNESS_MODE_GET_MAX_LEVEL mode.
- * @ret: Out parameter returning retrieved value when operating in
- *   %WMI_BRIGHTNESS_MODE_GET or %WMI_BRIGHTNESS_MODE_GET_MAX_LEVEL
- *   mode. Not used in %WMI_BRIGHTNESS_MODE_SET mode.
- * @ignored: Padding; not used. The ACPI method expects a 24 byte params struct.
- *
- * This is the parameters structure for the WmiBrightnessNotify ACPI method as
- * wrapped by WMI. The value passed in to @val or returned by @ret will be a
- * brightness value when the WMI method ID is %WMI_BRIGHTNESS_METHOD_LEVEL, or
- * an &enum wmi_brightness_source value with %WMI_BRIGHTNESS_METHOD_SOURCE.
- */
-struct wmi_brightness_args {
-   u32 mode;
-   u32 val;
-   u32 ret;
-   u32 ignored[3];
-};
-
 /**
  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
  * @w:Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
@@ -191,8 +127,6 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
return PTR_ERR_OR_ZERO(bdev);
 }
 
-#define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
-
 static const struct wmi_device_id nvidia_wmi_ec_backlight_id_table[] = {
{ .guid_string = WMI_BRIGHTNESS_GUID },
{ }
diff --git a/include/linux/platform_data/x86/nvidia-wmi-ec-backlight.h b/include/linux/platform_data/x86/nvidia-wmi-ec-backlight.h
new file mode 100644
index ..23d60130272c
--- /dev/null
+++ b/include/linux/platform_data/x86/nvidia-wmi-ec-backlight.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2020, NVIDIA CORPOR

[PATCH v5 09/31] ACPI: video: Make backlight class device registration a separate step (v2)

2022-08-25 Thread Hans de Goede
On x86/ACPI boards the acpi_video driver will usually initialize before
the kms driver (except i915). This causes /sys/class/backlight/acpi_video0
to show up first. The kms driver then registers its own native backlight
device, after which the drivers/acpi/video_detect.c code unregisters
the acpi_video0 device (when acpi_video_get_backlight_type()==native).

This means that userspace briefly sees 2 devices, and the disappearance of
acpi_video0 shortly afterwards confuses the systemd backlight level
save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920

To fix this, make backlight class device registration a separate step
done by a new acpi_video_register_backlight() function. The intent is for
this to be called by the drm/kms driver *after* it is done setting up its
own native backlight device, so that acpi_video_get_backlight_type() knows
whether a native backlight will be available at acpi_video backlight
registration time, avoiding the add + remove dance.

Note the new acpi_video_register_backlight() function is also called from
a delayed work to ensure that the acpi_video backlight devices do get
registered if necessary even if there is no drm/kms driver or when it is
disabled.

Changes in v2:
- Make register_backlight_delay a module parameter, mainly so that it can
  be disabled by Nvidia binary driver users
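The registration flow described above can be sketched as a small userspace
model. It assumes, as the backlight_registered check in the diff suggests,
that registering twice is harmless, and it treats a delay of 0 as "fallback
disabled". All model_* names are hypothetical and are not kernel APIs; the
real code uses a delayed work item rather than a direct call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state, not a kernel structure. */
struct model_state {
	bool backlight_registered;
	int registrations; /* how often a device was actually created */
};

/* Models the idempotent registration: a second call is a no-op. */
static void model_register_backlight(struct model_state *s)
{
	if (s->backlight_registered)
		return;
	s->backlight_registered = true;
	s->registrations++;
}

/* A delay of 0 disables the fallback, as the module parameter text says. */
static bool model_fallback_enabled(int register_backlight_delay)
{
	return register_backlight_delay != 0;
}

/*
 * Models one boot: the GPU driver (if any) asks for registration once it
 * has finished its own native backlight setup, and the delayed fallback
 * (if enabled) fires some seconds later.
 */
static void model_boot(struct model_state *s, bool gpu_driver_present,
		       int register_backlight_delay)
{
	if (gpu_driver_present)
		model_register_backlight(s);
	if (model_fallback_enabled(register_backlight_delay))
		model_register_backlight(s);
}
```

In this model the device is created at most once per boot, whichever caller
gets there first, and never when there is no GPU driver and the delay is 0.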

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c | 50 ---
 include/acpi/video.h  |  2 ++
 2 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 8545bf94866f..09dd86f86cf3 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -73,6 +73,16 @@ module_param(device_id_scheme, bool, 0444);
 static int only_lcd = -1;
 module_param(only_lcd, int, 0444);
 
+/*
+ * Display probing is known to take up to 5 seconds, so delay the fallback
+ * backlight registration by 5 seconds + 3 seconds for some extra margin.
+ */
+static int register_backlight_delay = 8;
+module_param(register_backlight_delay, int, 0444);
+MODULE_PARM_DESC(register_backlight_delay,
+   "Delay in seconds before doing fallback (non GPU driver triggered) "
+   "backlight registration, set to 0 to disable.");
+
 static bool may_report_brightness_keys;
 static int register_count;
 static DEFINE_MUTEX(register_count_mutex);
@@ -81,6 +91,9 @@ static LIST_HEAD(video_bus_head);
 static int acpi_video_bus_add(struct acpi_device *device);
 static int acpi_video_bus_remove(struct acpi_device *device);
 static void acpi_video_bus_notify(struct acpi_device *device, u32 event);
+static void acpi_video_bus_register_backlight_work(struct work_struct *ignored);
+static DECLARE_DELAYED_WORK(video_bus_register_backlight_work,
+   acpi_video_bus_register_backlight_work);
 void acpi_video_detect_exit(void);
 
 /*
@@ -1859,8 +1872,6 @@ static int acpi_video_bus_register_backlight(struct acpi_video_bus *video)
if (video->backlight_registered)
return 0;
 
-   acpi_video_run_bcl_for_osi(video);
-
if (acpi_video_get_backlight_type() != acpi_backlight_video)
return 0;
 
@@ -2086,7 +2097,11 @@ static int acpi_video_bus_add(struct acpi_device *device)
list_add_tail(&video->entry, &video_bus_head);
mutex_unlock(&video_list_lock);
 
-   acpi_video_bus_register_backlight(video);
+   /*
+* The userspace visible backlight_device gets registered separately
+* from acpi_video_register_backlight().
+*/
+   acpi_video_run_bcl_for_osi(video);
acpi_video_bus_add_notify_handler(video);
 
return 0;
@@ -2125,6 +2140,11 @@ static int acpi_video_bus_remove(struct acpi_device *device)
return 0;
 }
 
+static void acpi_video_bus_register_backlight_work(struct work_struct *ignored)
+{
+   acpi_video_register_backlight();
+}
+
 static int __init is_i740(struct pci_dev *dev)
 {
if (dev->device == 0x00D1)
@@ -2235,6 +2255,18 @@ int acpi_video_register(void)
 */
register_count = 1;
 
+   /*
+* acpi_video_bus_add() skips registering the userspace visible
+* backlight_device. The intend is for this to be registered by the
+* drm/kms driver calling acpi_video_register_backlight() *after* it is
+* done setting up its own native backlight device. The delayed work
+* ensures that acpi_video_register_backlight() always gets called
+* eventually, in case there is no drm/kms driver or it is disabled.
+*/
+   if (register_backlight_delay)
+   schedule_delayed_work(&video_bus_register_backlight_work,
+ register_backlight_delay * HZ);
+
 leave:
mutex_unlock(&register_count_mutex);
return ret;
@@ -2245,6 +2277,7 @@ void acpi_video_unregister(void)
 {
mutex_lock(&register_count_mutex);
if (reg

[PATCH v5 28/31] ACPI: video: Drop "Samsung X360" acpi_backlight=native quirk

2022-08-25 Thread Hans de Goede
acpi_backlight=native is the default for the "Samsung X360", but as
the comment explains the quirk was still necessary because even
briefly registering the acpi_video0 backlight, and then unregistering
it once the native driver showed up, was leading to issues.

After the "ACPI: video: Make backlight class device registration
a separate step" patch from earlier in this patch-series, we no
longer briefly register the acpi_video0 backlight on systems where
the native driver should be used.

So this is no longer an issue and the quirk is no longer needed.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 67a0211c07b4..af2833b57b8b 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -132,21 +132,6 @@ static int video_detect_force_none(const struct dmi_system_id *d)
 }
 
 static const struct dmi_system_id video_detect_dmi_table[] = {
-   /* On Samsung X360, the BIOS will set a flag (VDRV) if generic
-* ACPI backlight device is used. This flag will definitively break
-* the backlight interface (even the vendor interface) until next
-* reboot. It's why we should prevent video.ko from being used here
-* and we can't rely on a later call to acpi_video_unregister().
-*/
-   {
-.callback = video_detect_force_vendor,
-/* X360 */
-.matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
-   DMI_MATCH(DMI_PRODUCT_NAME, "X360"),
-   DMI_MATCH(DMI_BOARD_NAME, "X360"),
-   },
-   },
{
 /* https://bugzilla.redhat.com/show_bug.cgi?id=1128309 */
 .callback = video_detect_force_vendor,
-- 
2.37.2



[PATCH v5 16/31] ACPI: video: Refactor acpi_video_get_backlight_type() a bit

2022-08-25 Thread Hans de Goede
Refactor acpi_video_get_backlight_type() so that the heuristics /
detection steps are strictly in order of descending precedence.

Also move the comments describing the steps to where the various steps are
actually done, to avoid the comments getting out of sync with the code.
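The resulting order of precedence can be sketched as a standalone userspace
model of the refactored function. The enum, struct and function names below
are hypothetical stand-ins; only the ordering of the checks is taken from
the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum acpi_backlight_type. */
enum bl_type { BL_UNDEF, BL_VIDEO, BL_VENDOR, BL_NATIVE };

/* Hypothetical bundle of the inputs the real function consults. */
struct detect_input {
	enum bl_type cmdline;      /* acpi_backlight= on the command line */
	enum bl_type dmi;          /* DMI quirk result */
	bool acpi_video_supported; /* video_caps & ACPI_VIDEO_BACKLIGHT */
	bool osi_is_win8;          /* ACPI tables written for win8 */
	bool native_available;     /* a native backlight has been seen */
};

static enum bl_type pick_backlight(const struct detect_input *in)
{
	/* 1. The command line takes precedence over anything else. */
	if (in->cmdline != BL_UNDEF)
		return in->cmdline;

	/* 2. DMI quirks override any autodetection. */
	if (in->dmi != BL_UNDEF)
		return in->dmi;

	/* 3. With ACPI video support, use either native or ACPI video. */
	if (in->acpi_video_supported) {
		if (in->osi_is_win8 && in->native_available)
			return BL_NATIVE;
		return BL_VIDEO;
	}

	/* 4. No ACPI video (old hardware): vendor specific fw methods. */
	return BL_VENDOR;
}
```

Reading the function top to bottom now matches the precedence, which is the
point of the refactor.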

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 39 ++---
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index fb49b8f4523a..cc9d0d91e268 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -537,16 +537,6 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
 /*
  * Determine which type of backlight interface to use on this system,
  * First check cmdline, then dmi quirks, then do autodetect.
- *
- * The autodetect order is:
- * 1) Is the acpi-video backlight interface supported ->
- *  no, use a vendor interface
- * 2) Is this a win8 "ready" BIOS and do we have a native interface ->
- *  yes, use a native interface
- * 3) Else use the acpi-video interface
- *
- * Arguably the native on win8 check should be done first, but that would
- * be a behavior change, which may causes issues.
  */
 static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
 {
@@ -569,19 +559,36 @@ static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
native_available = true;
mutex_unlock(&init_mutex);
 
+   /*
+* The below heuristics / detection steps are in order of descending
+* presedence. The commandline takes presedence over anything else.
+*/
if (acpi_backlight_cmdline != acpi_backlight_undef)
return acpi_backlight_cmdline;
 
+   /* DMI quirks override any autodetection. */
if (acpi_backlight_dmi != acpi_backlight_undef)
return acpi_backlight_dmi;
 
-   if (!(video_caps & ACPI_VIDEO_BACKLIGHT))
-   return acpi_backlight_vendor;
-
-   if (acpi_osi_is_win8() && native_available)
-   return acpi_backlight_native;
+   /* On systems with ACPI video use either native or ACPI video. */
+   if (video_caps & ACPI_VIDEO_BACKLIGHT) {
+   /*
+* Windows 8 and newer no longer use the ACPI video interface,
+* so it often does not work. If the ACPI tables are written
+* for win8 and native brightness ctl is available, use that.
+*
+* The native check deliberately is inside the if acpi-video
+* block on older devices without acpi-video support native
+* is usually not the best choice.
+*/
+   if (acpi_osi_is_win8() && native_available)
+   return acpi_backlight_native;
+   else
+   return acpi_backlight_video;
+   }
 
-   return acpi_backlight_video;
+   /* No ACPI video (old hw), use vendor specific fw methods. */
+   return acpi_backlight_vendor;
 }
 
 enum acpi_backlight_type acpi_video_get_backlight_type(void)
-- 
2.37.2



[PATCH v5 29/31] ACPI: video: Drop NL5x?U, PF4NU1F and PF5?U?? acpi_backlight=native quirks

2022-08-25 Thread Hans de Goede
acpi_backlight=native is the default for these, but as the comment
explains the quirk was still necessary because even briefly registering
the acpi_video0 backlight, and then unregistering it once the native
driver showed up, was leading to issues.

After the "ACPI: video: Make backlight class device registration
a separate step" patch from earlier in this patch-series, we no
longer briefly register the acpi_video0 backlight on systems where
the native driver should be used.

So this is no longer an issue and the quirks are no longer needed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215683
Tested-by: Werner Sembach 
Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 92 +
 1 file changed, 1 insertion(+), 91 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index af2833b57b8b..789d5913c178 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -609,97 +609,7 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
DMI_MATCH(DMI_BOARD_NAME, "N250P"),
},
},
-   /*
-* Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have both a
-* working native and video interface. However the default detection
-* mechanism first registers the video interface before unregistering
-* it again and switching to the native interface during boot. This
-* results in a dangling SBIOS request for backlight change for some
-* reason, causing the backlight to switch to ~2% once per boot on the
-* first power cord connect or disconnect event. Setting the native
-* interface explicitly circumvents this buggy behaviour, by avoiding
-* the unregistering process.
-*/
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "AURA1501"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "EDUBOOK1502"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xNU",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
-   },
-   },
-   /*
-* The TongFang PF5PU1G, PF4NU1F, PF5NU1G, and PF5LUXG/TUXEDO BA15 Gen10,
-* Pulse 14/15 Gen1, and Pulse 15 Gen2 have the same problem as the Clevo
-* NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2. See the description
-* above.
-*/
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF5PU1G",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "PF5PU1G"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF4NU1F",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "PF4NU1F"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF4NU1F",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "PULSE1401"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF5NU1G",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "PF5NU1G"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF5NU1G",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "PULSE1501"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "TongFang PF5LUXG",
-   .matches = {
-   DMI_MATCH(DMI_BOARD_NAME, "PF5LUXG"),
-   },
-   },
+
/*
 * Desktops which falsely report a backlight and which our heuristics
 * for this do not catch.
-- 
2.37.2



[PATCH v5 26/31] platform/x86: samsung-laptop: Move acpi_backlight=[vendor|native] quirks to ACPI video_detect.c

2022-08-25 Thread Hans de Goede
acpi_video_set_dmi_backlight_type() is troublesome because it may end up
getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.

Move all the acpi_backlight=[vendor|native] quirks from samsung-laptop to
drivers/acpi/video_detect.c .

Note the X360 -> acpi_backlight=native quirk is not moved because that
already was present in drivers/acpi/video_detect.c .
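The ordering problem described in the first paragraph can be illustrated
with a small userspace model. It assumes, purely for illustration, that
autodetection would pick the native interface; every name below is
hypothetical and only mimics the get/set pair discussed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum acpi_backlight_type. */
enum bl_kind { BL_KIND_UNDEF, BL_KIND_VENDOR, BL_KIND_NATIVE };

/* Models the DMI override that the setter writes and the getter reads. */
struct detect_state {
	enum bl_kind dmi_override;
};

/* Assumption: without an override, autodetection picks native. */
static enum bl_kind model_get_type(const struct detect_state *st)
{
	if (st->dmi_override != BL_KIND_UNDEF)
		return st->dmi_override;
	return BL_KIND_NATIVE;
}

/* A native driver registers only if the current answer is native. */
static bool model_native_probe(const struct detect_state *st)
{
	return model_get_type(st) == BL_KIND_NATIVE;
}

/* Quirk applied by a platform driver *after* the native driver probed. */
static bool late_quirk_leaves_stale_native(void)
{
	struct detect_state st = { BL_KIND_UNDEF };
	bool native_registered = model_native_probe(&st);

	st.dmi_override = BL_KIND_VENDOR; /* arrives too late */
	return native_registered && model_get_type(&st) != BL_KIND_NATIVE;
}

/* Quirk known up front, as when it lives in the central DMI table. */
static bool early_quirk_leaves_stale_native(void)
{
	struct detect_state st = { BL_KIND_VENDOR };
	bool native_registered = model_native_probe(&st);

	return native_registered && model_get_type(&st) != BL_KIND_NATIVE;
}
```

In the model only the late quirk leaves a native backlight registered that
should not exist, which is why the quirks move to a table that is consulted
before any driver asks.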

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c   | 54 +
 drivers/platform/x86/samsung-laptop.c | 87 ---
 2 files changed, 54 insertions(+), 87 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index a09089e7fada..3861d4121172 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -222,6 +222,33 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "GB-BXBT-2807"),
},
},
+   {
+.callback = video_detect_force_vendor,
+/* Samsung N150/N210/N220 */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "N150/N210/N220"),
+   DMI_MATCH(DMI_BOARD_NAME, "N150/N210/N220"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Samsung NF110/NF210/NF310 */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "NF110/NF210/NF310"),
+   DMI_MATCH(DMI_BOARD_NAME, "NF110/NF210/NF310"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+/* Samsung NC210 */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "NC210/NC110"),
+   DMI_MATCH(DMI_BOARD_NAME, "NC210/NC110"),
+   },
+   },
{
.callback = video_detect_force_vendor,
/* Sony VPCEH3U1E */
@@ -572,6 +599,33 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "UX303UB"),
},
},
+   {
+.callback = video_detect_force_native,
+/* Samsung N150P */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "N150P"),
+   DMI_MATCH(DMI_BOARD_NAME, "N150P"),
+   },
+   },
+   {
+.callback = video_detect_force_native,
+/* Samsung N145P/N250P/N260P */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "N145P/N250P/N260P"),
+   DMI_MATCH(DMI_BOARD_NAME, "N145P/N250P/N260P"),
+   },
+   },
+   {
+.callback = video_detect_force_native,
+/* Samsung N250P */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "N250P"),
+   DMI_MATCH(DMI_BOARD_NAME, "N250P"),
+   },
+   },
/*
 * Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have both a
 * working native and video interface. However the default detection
diff --git a/drivers/platform/x86/samsung-laptop.c b/drivers/platform/x86/samsung-laptop.c
index c187dcdf82f0..cc30cf08f32d 100644
--- a/drivers/platform/x86/samsung-laptop.c
+++ b/drivers/platform/x86/samsung-laptop.c
@@ -356,23 +356,13 @@ struct samsung_laptop {
 };
 
 struct samsung_quirks {
-   bool broken_acpi_video;
bool four_kbd_backlight_levels;
bool enable_kbd_backlight;
-   bool use_native_backlight;
bool lid_handling;
 };
 
 static struct samsung_quirks samsung_unknown = {};
 
-static struct samsung_quirks samsung_broken_acpi_video = {
-   .broken_acpi_video = true,
-};
-
-static struct samsung_quirks samsung_use_native_backlight = {
-   .use_native_backlight = true,
-};
-
 static struct samsung_quirks samsung_np740u3e = {
.four_kbd_backlight_levels = true,
.enable_kbd_backlight = true,
@@ -1540,76 +1530,6 @@ static const struct dmi_system_id samsung_dmi_table[] __initconst = {
},
},
/* Specific DMI ids for laptop with quirks */
-   {
-.callback = samsung_dmi_matched,
-.ident = "N150P",
-.matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD."),
-   DMI_MATCH(DMI_PRODUCT_NAME, "N150P"),
-   DMI_MATCH(DMI_BOARD_NAME, "N150P"),
-   },
-.driver_data = &samsung_use_native_backlight,
-   },
-   {
-.callback = samsung

[PATCH v5 12/31] drm/nouveau: Register ACPI video backlight when nv_backlight registration fails (v2)

2022-08-25 Thread Hans de Goede
Typically the acpi_video driver will initialize before nouveau, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
nouveau would register its own nv_backlight device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there being 2 backlight devices.

This means that userspace used to briefly see 2 devices, and the
disappearance of acpi_video0 shortly afterwards confuses the systemd
backlight level save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920

To fix this the ACPI video code has been modified to make backlight class
device registration a separate step, relying on the drm/kms driver to
ask for the acpi_video backlight registration after it is done setting up
its native backlight device.

Add a call to the new acpi_video_register_backlight() when native backlight
device registration has failed / was skipped to ensure that there is a
backlight device available before the drm_device gets registered with
userspace.

Changes in v2:
- Add nouveau_acpi_video_register_backlight() wrapper to avoid unresolved
  symbol errors on non X86

Reviewed-by: Lyude Paul 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c  | 5 +
 drivers/gpu/drm/nouveau/nouveau_acpi.h  | 2 ++
 drivers/gpu/drm/nouveau/nouveau_backlight.c | 7 +++
 3 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 1592c9cd7750..8cf096f841a9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -391,3 +391,8 @@ bool nouveau_acpi_video_backlight_use_native(void)
 {
return acpi_video_backlight_use_native();
 }
+
+void nouveau_acpi_video_register_backlight(void)
+{
+   acpi_video_register_backlight();
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.h b/drivers/gpu/drm/nouveau/nouveau_acpi.h
index 3c666c30dfca..e39dd8b94b8b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.h
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.h
@@ -12,6 +12,7 @@ void nouveau_unregister_dsm_handler(void);
 void nouveau_switcheroo_optimus_dsm(void);
 void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
 bool nouveau_acpi_video_backlight_use_native(void);
+void nouveau_acpi_video_register_backlight(void);
 #else
 static inline bool nouveau_is_optimus(void) { return false; };
 static inline bool nouveau_is_v1_dsm(void) { return false; };
@@ -20,6 +21,7 @@ static inline void nouveau_unregister_dsm_handler(void) {}
 static inline void nouveau_switcheroo_optimus_dsm(void) {}
 static inline void *nouveau_acpi_edid(struct drm_device *dev, struct drm_connector *connector) { return NULL; }
 static inline bool nouveau_acpi_video_backlight_use_native(void) { return true; }
+static inline void nouveau_acpi_video_register_backlight(void) {}
 #endif
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c b/drivers/gpu/drm/nouveau/nouveau_backlight.c
index d2b8f8c13db4..a614582779ca 100644
--- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
+++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
@@ -436,6 +436,13 @@ nouveau_backlight_init(struct drm_connector *connector)
 
 fail_alloc:
kfree(bl);
+   /*
+* If we get here we have an internal panel, but no nv_backlight,
+* try registering an ACPI video backlight device instead.
+*/
+   if (ret == 0)
+   nouveau_acpi_video_register_backlight();
+
return ret;
 }
 
-- 
2.37.2



[PATCH v5 21/31] platform/x86: toshiba_acpi: Stop using acpi_video_set_dmi_backlight_type()

2022-08-25 Thread Hans de Goede
acpi_video_set_dmi_backlight_type() is troublesome because it may end up
getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.

In case of the acpi_video backlight, acpi_video_set_dmi_backlight_type()
actually calls acpi_video_unregister_backlight() since that is often
probed earlier. Userspace then sees the acpi_video0 class device being
briefly available, which leads to races in userspace where udev
probe-rules try to access the device and it is already gone.

In case of toshiba_acpi there are no DMI quirks to move to
acpi/video_detect.c, but the driver also (ab)uses
acpi_video_set_dmi_backlight_type() for transflective displays. Adding
transflective display support to video_detect.c would be quite involved.
But luckily there are only 2 known models with a transflective display,
so we can just add DMI quirks for those.
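How the two quirk entries select machines can be sketched in userspace.
The sketch assumes that DMI_MATCH() performs a substring comparison against
the firmware-provided string (to my understanding it does, but treat that
as an assumption); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a struct dmi_system_id entry. */
struct model_quirk {
	const char *sys_vendor;
	const char *product_name;
};

/* Mirrors the two transflective-display entries added by this patch. */
static const struct model_quirk transflective_quirks[] = {
	{ "TOSHIBA", "PORTEGE R500" },
	{ "TOSHIBA", "PORTEGE R600" },
};

/* Returns true if the machine should be forced to the vendor backlight. */
static bool model_force_vendor(const char *sys_vendor,
			       const char *product_name)
{
	size_t i;
	size_t n = sizeof(transflective_quirks) /
		   sizeof(transflective_quirks[0]);

	for (i = 0; i < n; i++) {
		/* Assumed substring semantics: every field must match. */
		if (strstr(sys_vendor, transflective_quirks[i].sys_vendor) &&
		    strstr(product_name, transflective_quirks[i].product_name))
			return true;
	}
	return false;
}
```

With only two known models, a static table like this is far simpler than
teaching the detection code about transflective displays in general.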

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 19 +++
 drivers/platform/x86/toshiba_acpi.c | 16 
 2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index be2fc43418af..74e2087c8ff0 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -190,6 +190,25 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
},
},
 
+   /*
+* Toshiba models with Transflective display, these need to use
+* the toshiba_acpi vendor driver for proper Transflective handling.
+*/
+   {
+.callback = video_detect_force_vendor,
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "TOSHIBA"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "PORTEGE R500"),
+   },
+   },
+   {
+.callback = video_detect_force_vendor,
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "TOSHIBA"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "PORTEGE R600"),
+   },
+   },
+
/*
 * These models have a working acpi_video backlight control, and using
 * native backlight causes a regression where backlight does not work
diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 0fc9e8b8827b..030dc37d50b8 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -271,14 +271,6 @@ static const struct key_entry toshiba_acpi_alt_keymap[] = {
{ KE_END, 0 },
 };
 
-/*
- * List of models which have a broken acpi-video backlight interface and thus
- * need to use the toshiba (vendor) interface instead.
- */
-static const struct dmi_system_id toshiba_vendor_backlight_dmi[] = {
-   {}
-};
-
 /*
  * Utility
  */
@@ -2881,14 +2873,6 @@ static int toshiba_acpi_setup_backlight(struct toshiba_acpi_dev *dev)
return 0;
}
 
-   /*
-* Tell acpi-video-detect code to prefer vendor backlight on all
-* systems with transflective backlight and on dmi matched systems.
-*/
-   if (dev->tr_backlight_supported ||
-   dmi_check_system(toshiba_vendor_backlight_dmi))
-   acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);
-
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
return 0;
 
-- 
2.37.2



[PATCH v5 08/31] ACPI: video: Simplify acpi_video_unregister_backlight()

2022-08-25 Thread Hans de Goede
When acpi_video_register() has not run yet the video_bus_head will be
empty, so there is no need to check the register_count flag first.
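
A minimal userspace sketch of why the guard is redundant, assuming a simplified singly linked list in place of the kernel's list_head; struct video_bus, struct video_bus_list and unregister_all_backlights() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct acpi_video_bus; not the kernel type. */
struct video_bus {
    struct video_bus *next;
    int backlight_registered;
};

/* The list head always exists; it is merely empty before registration. */
struct video_bus_list {
    struct video_bus *first;
};

/* Walk the list and drop every backlight; returns how many were dropped. */
static int unregister_all_backlights(struct video_bus_list *list)
{
    int dropped = 0;

    assert(list != NULL);
    for (struct video_bus *bus = list->first; bus != NULL; bus = bus->next) {
        if (bus->backlight_registered) {
            bus->backlight_registered = 0;
            dropped++;
        }
    }
    return dropped;
}
```

With an empty list the loop body never runs, so a separate "has anything been registered" check before the walk does not change the result.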

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index cde8ffa9f0b8..8545bf94866f 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -2257,14 +2257,10 @@ void acpi_video_unregister_backlight(void)
 {
struct acpi_video_bus *video;
 
-   mutex_lock(&register_count_mutex);
-   if (register_count) {
-   mutex_lock(&video_list_lock);
-   list_for_each_entry(video, &video_bus_head, entry)
-   acpi_video_bus_unregister_backlight(video);
-   mutex_unlock(&video_list_lock);
-   }
-   mutex_unlock(&register_count_mutex);
+   mutex_lock(&video_list_lock);
+   list_for_each_entry(video, &video_bus_head, entry)
+   acpi_video_bus_unregister_backlight(video);
+   mutex_unlock(&video_list_lock);
 }
 
 bool acpi_video_handles_brightness_key_presses(void)
-- 
2.37.2



[PATCH v5 11/31] drm/i915: Call acpi_video_register_backlight() (v3)

2022-08-25 Thread Hans de Goede
On machines without an i915 opregion the acpi_video driver immediately
probes the ACPI video bus and used to also immediately register
acpi_video# backlight devices when supported.

Once the drm/kms driver loaded later and possibly registered a native
backlight device, the drivers/acpi/video_detect.c code unregistered the
acpi_video0 device to avoid there being 2 backlight devices
(when acpi_video_get_backlight_type()==native).

This means that userspace used to briefly see 2 devices and the
disappearing of acpi_video0 after a brief time confuses the systemd
backlight level save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920

To fix this the ACPI video code has been modified to make backlight class
device registration a separate step, relying on the drm/kms driver to
ask for the acpi_video backlight registration after it is done setting up
its native backlight device.

Add a call to the new acpi_video_register_backlight() after the i915 calls
acpi_video_register() (after setting up the i915 opregion) so that the
acpi_video backlight devices get registered on systems where the i915
native backlight device is not registered.
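
The decision described above can be sketched in plain userspace C. This is only a model under stated assumptions: struct panel_state and want_acpi_video_backlight() are hypothetical names, and the two booleans stand in for the panel->backlight.funcs and panel->backlight.device checks made by the new helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one connector's panel state. */
struct panel_state {
    bool has_backlight_funcs; /* the panel has backlight control hooks */
    bool native_registered;   /* a native backlight device was registered */
};

/*
 * Returns true when at least one internal panel is driven without a
 * native backlight device, i.e. when the acpi_video backlight should
 * be registered as the fallback.
 */
static bool want_acpi_video_backlight(const struct panel_state *panels, size_t n)
{
    assert(panels != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (panels[i].has_backlight_funcs && !panels[i].native_registered)
            return true;
    }
    return false;
}
```

A system with no internal panel yields false, which leaves the decision to whichever other GPU driver ends up driving the panel.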

Changes in v2:
-Only call acpi_video_register_backlight() when a panel is detected

Changes in v3:
-Add a new intel_acpi_video_register() helper which checks if a panel
 is present and then calls acpi_video_register_backlight()

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/intel_acpi.c| 27 
 drivers/gpu/drm/i915/display/intel_acpi.h|  3 +++
 drivers/gpu/drm/i915/display/intel_display.c |  2 +-
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_acpi.c 
b/drivers/gpu/drm/i915/display/intel_acpi.c
index e78430001f07..9df78e7caa2b 100644
--- a/drivers/gpu/drm/i915/display/intel_acpi.c
+++ b/drivers/gpu/drm/i915/display/intel_acpi.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 #include "i915_drv.h"
 #include "intel_acpi.h"
@@ -331,3 +332,29 @@ void intel_acpi_assign_connector_fwnodes(struct 
drm_i915_private *i915)
 */
fwnode_handle_put(fwnode);
 }
+
+void intel_acpi_video_register(struct drm_i915_private *i915)
+{
+   struct drm_connector_list_iter conn_iter;
+   struct drm_connector *connector;
+
+   acpi_video_register();
+
+   /*
+* If i915 is driving an internal panel without registering its native
+* backlight handler try to register the acpi_video backlight.
+* For panels not driven by i915 another GPU driver may still register
+* a native backlight later and acpi_video_register_backlight() should
+* only be called after any native backlights have been registered.
+*/
+   drm_connector_list_iter_begin(&i915->drm, &conn_iter);
+   drm_for_each_connector_iter(connector, &conn_iter) {
+   struct intel_panel *panel = 
&to_intel_connector(connector)->panel;
+
+   if (panel->backlight.funcs && !panel->backlight.device) {
+   acpi_video_register_backlight();
+   break;
+   }
+   }
+   drm_connector_list_iter_end(&conn_iter);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_acpi.h 
b/drivers/gpu/drm/i915/display/intel_acpi.h
index 4a760a2baed9..6a0007452f95 100644
--- a/drivers/gpu/drm/i915/display/intel_acpi.h
+++ b/drivers/gpu/drm/i915/display/intel_acpi.h
@@ -14,6 +14,7 @@ void intel_unregister_dsm_handler(void);
 void intel_dsm_get_bios_data_funcs_supported(struct drm_i915_private *i915);
 void intel_acpi_device_id_update(struct drm_i915_private *i915);
 void intel_acpi_assign_connector_fwnodes(struct drm_i915_private *i915);
+void intel_acpi_video_register(struct drm_i915_private *i915);
 #else
 static inline void intel_register_dsm_handler(void) { return; }
 static inline void intel_unregister_dsm_handler(void) { return; }
@@ -23,6 +24,8 @@ static inline
 void intel_acpi_device_id_update(struct drm_i915_private *i915) { return; }
 static inline
 void intel_acpi_assign_connector_fwnodes(struct drm_i915_private *i915) { 
return; }
+static inline
+void intel_acpi_video_register(struct drm_i915_private *i915) { return; }
 #endif /* CONFIG_ACPI */
 
 #endif /* __INTEL_ACPI_H__ */
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 6103b02c081f..129a13375101 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9087,7 +9087,7 @@ void intel_display_driver_register(struct 
drm_i915_private *i915)
 
/* Must be done after probing outputs */
intel_opregion_register(i915);
-   acpi_video_register();
+   intel_acpi_video_register(i915);
 
intel_audio_init(i915);
 
-- 
2.37.2



[PATCH v5 22/31] platform/x86: acer-wmi: Move backlight DMI quirks to acpi/video_detect.c

2022-08-25 Thread Hans de Goede
Move the backlight DMI quirks to acpi/video_detect.c, so that
the driver no longer needs to call acpi_video_set_dmi_backlight_type().

acpi_video_set_dmi_backlight_type() is troublesome because it may end up
getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.
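
The ordering problem can be illustrated with a small userspace model. Everything here is a simplified assumption: the enum, the two static variables and the function names only mimic the shape of the real get/set pair and are not the kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>

enum backlight_type { BL_UNDEF, BL_VIDEO, BL_VENDOR, BL_NATIVE };

/* Model state: a late override on top of an auto-detected default. */
static enum backlight_type dmi_override = BL_UNDEF;
static enum backlight_type autodetected = BL_VIDEO;

static enum backlight_type get_backlight_type(void)
{
    return dmi_override != BL_UNDEF ? dmi_override : autodetected;
}

/* Models a driver setting the override at its own probe time. */
static void set_dmi_backlight_type(enum backlight_type type)
{
    assert(type != BL_UNDEF);
    dmi_override = type;
}

/* A driver registers only if it is the selected type when it asks. */
static bool video_driver_would_register(void)
{
    return get_backlight_type() == BL_VIDEO;
}
```

If video_driver_would_register() runs before set_dmi_backlight_type(BL_NATIVE), it returns true and the device is already registered; the same question asked afterwards returns false. Keeping the quirks in one table that is consulted on the first lookup avoids this dependence on probe order.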

Note that even though the DMI quirk table name was video_vendor_dmi_table,
5/6 quirks were actually quirks to use the GPU native backlight.

These 5 quirks also had a callback in their dmi_system_id entry which
disabled the acer-wmi vendor driver; and any DMI match resulted in:

acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);

which disabled the acpi_video driver, so only the native driver was left.
The new entries for these 5/6 devices correctly mark these as needing
the native backlight driver.

Also note that other changes in this series change the native backlight
drivers to no longer unconditionally register their backlight. Instead
these drivers now do this check:

if (acpi_video_get_backlight_type(false) != acpi_backlight_native)
return 0; /* bail */

which without this patch would have broken these 5/6 "special" quirks.

Since I had to look at all the commits adding the quirks anyways, to make
sure that I understood the code correctly, I've also added links to
the various original bugzillas for these quirks to the new entries.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 53 ++
 drivers/platform/x86/acer-wmi.c | 66 -
 2 files changed, 53 insertions(+), 66 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 74e2087c8ff0..6a2523bc02ba 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -149,6 +149,15 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_BOARD_NAME, "X360"),
},
},
+   {
+/* https://bugzilla.redhat.com/show_bug.cgi?id=1128309 */
+.callback = video_detect_force_vendor,
+/* Acer KAV80 */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "KAV80"),
+   },
+   },
{
.callback = video_detect_force_vendor,
/* Asus UL30VT */
@@ -437,6 +446,41 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_BOARD_NAME, "JV50"),
},
},
+   {
+/* https://bugzilla.redhat.com/show_bug.cgi?id=1012674 */
+.callback = video_detect_force_native,
+/* Acer Aspire 5741 */
+.matches = {
+   DMI_MATCH(DMI_BOARD_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "Aspire 5741"),
+   },
+   },
+   {
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42993 */
+.callback = video_detect_force_native,
+/* Acer Aspire 5750 */
+.matches = {
+   DMI_MATCH(DMI_BOARD_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "Aspire 5750"),
+   },
+   },
+   {
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42833 */
+.callback = video_detect_force_native,
+/* Acer Extensa 5235 */
+.matches = {
+   DMI_MATCH(DMI_BOARD_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "Extensa 5235"),
+   },
+   },
+   {
+.callback = video_detect_force_native,
+/* Acer TravelMate 4750 */
+.matches = {
+   DMI_MATCH(DMI_BOARD_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 4750"),
+   },
+   },
{
 /* https://bugzilla.kernel.org/show_bug.cgi?id=207835 */
 .callback = video_detect_force_native,
@@ -447,6 +491,15 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_BOARD_NAME, "BA51_MV"),
},
},
+   {
+/* https://bugzilla.kernel.org/show_bug.cgi?id=36322 */
+.callback = video_detect_force_native,
+/* Acer TravelMate 5760 */
+.matches = {
+   DMI_MATCH(DMI_BOARD_VENDOR, "Acer"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 5760"),
+   },
+   },
{
.callback = video_detect_force_native,
/* ASUSTeK COMPUTER INC. GA401 */
diff --git a/drivers/platform/x86/acer-wmi.c b/drivers/platform/x86/acer-wmi.c
index e0230ea0cb7e..b933a5165edb 100644
--- a/drivers/platform/x86/acer-wmi.c
+++ b/drivers/platform/x86/acer-wmi.c
@@ -643,69 +643,6 @@ static const struct dmi_system_id non_acer_quirks[] 
__initconst = {
{}
 };
 
-static int __init
-video_set_backlight_video_vendor(const struct dmi_system_id *d)
-{
-   interface->cap

[PATCH v5 20/31] platform/x86: apple-gmux: Stop calling acpi/video.h functions

2022-08-25 Thread Hans de Goede
Now that acpi_video_get_backlight_type() has apple-gmux detection (using
apple_gmux_present()), it is no longer necessary for the apple-gmux code
to manually remove possibly conflicting drivers.

So remove the handling for this from the apple-gmux driver.

Signed-off-by: Hans de Goede 
---
 drivers/platform/x86/apple-gmux.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/platform/x86/apple-gmux.c 
b/drivers/platform/x86/apple-gmux.c
index ffe98a18440b..ca33df7ea550 100644
--- a/drivers/platform/x86/apple-gmux.c
+++ b/drivers/platform/x86/apple-gmux.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /**
@@ -694,7 +693,6 @@ static int gmux_probe(struct pnp_dev *pnp, const struct 
pnp_device_id *id)
 * backlight control and supports more levels than other options.
 * Disable the other backlight choices.
 */
-   acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);
apple_bl_unregister();
 
gmux_data->power_state = VGA_SWITCHEROO_ON;
@@ -804,7 +802,6 @@ static void gmux_remove(struct pnp_dev *pnp)
apple_gmux_data = NULL;
kfree(gmux_data);
 
-   acpi_video_register();
apple_bl_register();
 }
 
-- 
2.37.2



[PATCH v5 02/31] drm/i915: Don't register backlight when another backlight should be used (v2)

2022-08-25 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.

Registering 2 backlight devices for a single display really is
undesirable, so don't register the GPU's native backlight device when
another backlight device should be used.
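
A hedged sketch of the guard pattern, in userspace C. The use_native argument stands in for the result of acpi_video_backlight_use_native(), and native_backlight_register() with its counter is a hypothetical name for illustration only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Register the native backlight only when it is the selected type.
 * Skipping registration is a normal outcome, so it returns 0 rather
 * than an error code.
 */
static int native_backlight_register(bool use_native, int *registered)
{
    assert(registered != NULL);

    if (!use_native)
        return 0; /* another backlight device should be used */

    (*registered)++;
    return 0;
}
```

Returning success on the skip path matters because the caller treats a non-zero return as a registration failure, not as "intentionally not registered".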

Changes in v2:
- Use drm_info(drm_dev,  ...) for log messages

Reviewed-by: Jani Nikula 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/intel_backlight.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_backlight.c 
b/drivers/gpu/drm/i915/display/intel_backlight.c
index 681ebcda97ad..03c7966f68d6 100644
--- a/drivers/gpu/drm/i915/display/intel_backlight.c
+++ b/drivers/gpu/drm/i915/display/intel_backlight.c
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include 
+
 #include "intel_backlight.h"
 #include "intel_backlight_regs.h"
 #include "intel_connector.h"
@@ -952,6 +954,11 @@ int intel_backlight_device_register(struct intel_connector 
*connector)
 
WARN_ON(panel->backlight.max == 0);
 
+   if (!acpi_video_backlight_use_native()) {
+   drm_info(&i915->drm, "Skipping intel_backlight registration\n");
+   return 0;
+   }
+
memset(&props, 0, sizeof(props));
props.type = BACKLIGHT_RAW;
 
-- 
2.37.2



[PATCH v5 25/31] platform/x86: asus-wmi: Move acpi_backlight=native quirks to ACPI video_detect.c

2022-08-25 Thread Hans de Goede
Remove the asus-wmi quirk_entry.wmi_backlight_native quirk-flag, which
called acpi_video_set_dmi_backlight_type(acpi_backlight_native) and replace
it with acpi/video_detect.c video_detect_dmi_table[] entries using the
video_detect_force_native callback.
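
For readers unfamiliar with DMI quirk tables, here is a rough userspace sketch of the lookup. It is a simplification: struct quirk and quirk_lookup() are hypothetical, the callback is replaced by a plain enum value, and substring matching via strstr() is assumed to approximate DMI_MATCH. The two table entries are taken from quirks that appear in this series:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum backlight_type { BL_UNDEF, BL_VIDEO, BL_VENDOR, BL_NATIVE };

/* Hypothetical, flattened form of a dmi_system_id entry. */
struct quirk {
    const char *sys_vendor;
    const char *product_name;
    enum backlight_type forced;
};

static const struct quirk quirks[] = {
    { "TOSHIBA", "PORTEGE R500", BL_VENDOR },
    { "ASUSTeK COMPUTER INC.", "UX303UB", BL_NATIVE },
    { NULL, NULL, BL_UNDEF }, /* terminator, like the empty {} entry */
};

/* Return the forced backlight type, or BL_UNDEF when no quirk matches. */
static enum backlight_type quirk_lookup(const char *vendor, const char *product)
{
    assert(vendor != NULL && product != NULL);

    for (const struct quirk *q = quirks; q->sys_vendor != NULL; q++) {
        if (strstr(vendor, q->sys_vendor) && strstr(product, q->product_name))
            return q->forced;
    }
    return BL_UNDEF;
}
```

A BL_UNDEF result means no quirk applies and the caller falls through to auto-detection.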

acpi_video_set_dmi_backlight_type() is troublesome because it may end up
getting called after other backlight drivers have already called
acpi_video_get_backlight_type() resulting in the other drivers
already being registered even though they should not.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c|  8 
 drivers/platform/x86/asus-nb-wmi.c | 14 --
 drivers/platform/x86/asus-wmi.c|  3 ---
 drivers/platform/x86/asus-wmi.h|  1 -
 4 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index d893313fe1a0..a09089e7fada 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -564,6 +564,14 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_PRODUCT_NAME, "GA503"),
},
},
+   {
+.callback = video_detect_force_native,
+/* Asus UX303UB */
+.matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "UX303UB"),
+   },
+   },
/*
 * Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have both a
 * working native and video interface. However the default detection
diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 810a94557a85..bbfed85051ee 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -97,11 +97,6 @@ static struct quirk_entry quirk_asus_x200ca = {
.wmi_backlight_set_devstate = true,
 };
 
-static struct quirk_entry quirk_asus_ux303ub = {
-   .wmi_backlight_native = true,
-   .wmi_backlight_set_devstate = true,
-};
-
 static struct quirk_entry quirk_asus_x550lb = {
.wmi_backlight_set_devstate = true,
.xusb2pr = 0x01D9,
@@ -372,15 +367,6 @@ static const struct dmi_system_id asus_quirks[] = {
},
.driver_data = &quirk_asus_x200ca,
},
-   {
-   .callback = dmi_matched,
-   .ident = "ASUSTeK COMPUTER INC. UX303UB",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
-   DMI_MATCH(DMI_PRODUCT_NAME, "UX303UB"),
-   },
-   .driver_data = &quirk_asus_ux303ub,
-   },
{
.callback = dmi_matched,
.ident = "ASUSTeK COMPUTER INC. UX330UAK",
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 5cf9d9aff164..434249ac47a5 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -3634,9 +3634,6 @@ static int asus_wmi_add(struct platform_device *pdev)
if (asus->driver->quirks->wmi_force_als_set)
asus_wmi_set_als();
 
-   if (asus->driver->quirks->wmi_backlight_native)
-   acpi_video_set_dmi_backlight_type(acpi_backlight_native);
-
if (asus->driver->quirks->xusb2pr)
asus_wmi_set_xusb2pr(asus);
 
diff --git a/drivers/platform/x86/asus-wmi.h b/drivers/platform/x86/asus-wmi.h
index 30770e411301..f30252efe1db 100644
--- a/drivers/platform/x86/asus-wmi.h
+++ b/drivers/platform/x86/asus-wmi.h
@@ -29,7 +29,6 @@ struct quirk_entry {
bool hotplug_wireless;
bool scalar_panel_brightness;
bool store_backlight_power;
-   bool wmi_backlight_native;
bool wmi_backlight_set_devstate;
bool wmi_force_als_set;
bool use_kbd_dock_devid;
-- 
2.37.2



[PATCH v5 19/31] platform/x86: nvidia-wmi-ec-backlight: Use acpi_video_get_backlight_type()

2022-08-25 Thread Hans de Goede
Add an acpi_video_get_backlight_type() == acpi_backlight_nvidia_wmi_ec
check. This will make nvidia-wmi-ec-backlight properly honor the user
selecting a different backlight driver through the acpi_backlight=...
kernel commandline option.

Since the auto-detect code check for nvidia-wmi-ec-backlight in
drivers/acpi/video_detect.c already checks that the WMI advertised
brightness-source is the embedded controller, this new check makes it
unnecessary for nvidia_wmi_ec_backlight_probe() to check this itself.
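
A small model of the resulting probe logic, assuming a much-reduced detection order (command line first, then the EC source check). struct system_state, get_type() and nvidia_probe() are hypothetical names; the real auto-detect code has more branches than the two shown here:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum backlight_type {
    BL_UNDEF, BL_VIDEO, BL_VENDOR, BL_NATIVE, BL_NVIDIA_WMI_EC
};

/* Hypothetical inputs to the decision. */
struct system_state {
    enum backlight_type cmdline; /* acpi_backlight=... or BL_UNDEF */
    bool ec_is_source;           /* WMI reports the EC as brightness source */
};

/* The command line wins; otherwise use the (simplified) auto-detect result. */
static enum backlight_type get_type(const struct system_state *sys)
{
    if (sys->cmdline != BL_UNDEF)
        return sys->cmdline;
    return sys->ec_is_source ? BL_NVIDIA_WMI_EC : BL_NATIVE;
}

/* The driver binds only when it is the selected backlight type. */
static int nvidia_probe(const struct system_state *sys)
{
    assert(sys != NULL);

    if (get_type(sys) != BL_NVIDIA_WMI_EC)
        return -ENODEV;
    return 0;
}
```

Because the single type check already covers the EC source test, a user override such as acpi_backlight=vendor makes the probe return -ENODEV even on hardware where the EC is the brightness source.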

Suggested-by: Daniel Dadap 
Reviewed-by: Daniel Dadap 
Signed-off-by: Hans de Goede 
---
 drivers/platform/x86/Kconfig   |  1 +
 drivers/platform/x86/nvidia-wmi-ec-backlight.c | 14 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index f2f98e942cf2..0cc5ac35fc57 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -93,6 +93,7 @@ config PEAQ_WMI
 
 config NVIDIA_WMI_EC_BACKLIGHT
tristate "EC Backlight Driver for Hybrid Graphics Notebook Systems"
+   depends on ACPI_VIDEO
depends on ACPI_WMI
depends on BACKLIGHT_CLASS_DEVICE
help
diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c 
b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
index be803e47eac0..baccdf658538 100644
--- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
+++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI 
method
@@ -87,19 +88,10 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device 
*wdev, const void *ct
 {
struct backlight_properties props = {};
struct backlight_device *bdev;
-   u32 source;
int ret;
 
-   ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
-  WMI_BRIGHTNESS_MODE_GET, &source);
-   if (ret)
-   return ret;
-
-   /*
-* This driver is only to be used when brightness control is handled
-* by the EC; otherwise, the GPU driver(s) should control brightness.
-*/
-   if (source != WMI_BRIGHTNESS_SOURCE_EC)
+   /* drivers/acpi/video_detect.c also checks that SOURCE == EC */
+   if (acpi_video_get_backlight_type() != acpi_backlight_nvidia_wmi_ec)
return -ENODEV;
 
/*
-- 
2.37.2



[PATCH v5 04/31] drm/radeon: Don't register backlight when another backlight should be used (v3)

2022-08-25 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.

Registering 2 backlight devices for a single display really is
undesirable, so don't register the GPU's native backlight device when
another backlight device should be used.

Changes in v2:
- To avoid linker errors when amdgpu is builtin and video_detect.c is in
  a module, select ACPI_VIDEO and its deps if ACPI is enabled.
  When ACPI is disabled, ACPI_VIDEO is also always disabled, ensuring
  the stubs from acpi/video.h will be used.

Changes in v3:
- Use drm_info(drm_dev, "...") to log messages
- ACPI_VIDEO can now be enabled on non X86 too,
  adjust the Kconfig changes to match this.

Acked-by: Alex Deucher 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/Kconfig | 7 +++
 drivers/gpu/drm/radeon/atombios_encoders.c  | 7 +++
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c | 7 +++
 3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 95ca33938b4a..0471505e951d 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -234,6 +234,13 @@ config DRM_RADEON
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   # radeon depends on ACPI_VIDEO when ACPI is enabled, for select to work
+   # ACPI_VIDEO's dependencies must also be selected.
+   select INPUT if ACPI
+   select ACPI_VIDEO if ACPI
+   # On x86 ACPI_VIDEO also needs ACPI_WMI
+   select X86_PLATFORM_DEVICES if ACPI && X86
+   select ACPI_WMI if ACPI && X86
help
  Choose this option if you have an ATI Radeon graphics card.  There
  are both PCI and AGP versions.  You don't need to choose this to
diff --git a/drivers/gpu/drm/radeon/atombios_encoders.c 
b/drivers/gpu/drm/radeon/atombios_encoders.c
index 0eae05dfb385..c841c273222e 100644
--- a/drivers/gpu/drm/radeon/atombios_encoders.c
+++ b/drivers/gpu/drm/radeon/atombios_encoders.c
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#include 
+
 #include "atom.h"
 #include "radeon_atombios.h"
 #include "radeon.h"
@@ -209,6 +211,11 @@ void radeon_atom_backlight_init(struct radeon_encoder 
*radeon_encoder,
if (!(rdev->mode_info.firmware_flags & 
ATOM_BIOS_INFO_BL_CONTROLLED_BY_GPU))
return;
 
+   if (!acpi_video_backlight_use_native()) {
+   drm_info(dev, "Skipping radeon atom DIG backlight 
registration\n");
+   return;
+   }
+
pdata = kmalloc(sizeof(struct radeon_backlight_privdata), GFP_KERNEL);
if (!pdata) {
DRM_ERROR("Memory allocation failed\n");
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c 
b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
index 1a66fb969ee7..0cd32c65456c 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
@@ -33,6 +33,8 @@
 #include 
 #include 
 
+#include 
+
 #include "radeon.h"
 #include "radeon_asic.h"
 #include "radeon_legacy_encoders.h"
@@ -387,6 +389,11 @@ void radeon_legacy_backlight_init(struct radeon_encoder 
*radeon_encoder,
return;
 #endif
 
+   if (!acpi_video_backlight_use_native()) {
+   drm_info(dev, "Skipping radeon legacy LVDS backlight 
registration\n");
+   return;
+   }
+
pdata = kmalloc(sizeof(struct radeon_backlight_privdata), GFP_KERNEL);
if (!pdata) {
DRM_ERROR("Memory allocation failed\n");
-- 
2.37.2



[PATCH v5 10/31] ACPI: video: Remove code to unregister acpi_video backlight when a native backlight registers

2022-08-25 Thread Hans de Goede
Remove the code to unregister acpi_video backlight devices when
a native backlight device gets registered later.

Now that the acpi_video backlight device registration is a separate step
which runs later, after the drm/kms driver is done setting up its own
native backlight device, it is no longer necessary to monitor for a
native (BACKLIGHT_RAW) device showing up later and to then unregister
the acpi_video backlight device(s).

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c   |  2 --
 drivers/acpi/video_detect.c | 36 
 2 files changed, 38 deletions(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 09dd86f86cf3..d1e41f30c004 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -94,7 +94,6 @@ static void acpi_video_bus_notify(struct acpi_device *device, 
u32 event);
 static void acpi_video_bus_register_backlight_work(struct work_struct 
*ignored);
 static DECLARE_DELAYED_WORK(video_bus_register_backlight_work,
acpi_video_bus_register_backlight_work);
-void acpi_video_detect_exit(void);
 
 /*
  * Indices in the _BCL method response: the first two items are special,
@@ -2342,7 +2341,6 @@ static int __init acpi_video_init(void)
 
 static void __exit acpi_video_exit(void)
 {
-   acpi_video_detect_exit();
acpi_video_unregister();
 }
 
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 385eb49c763f..fb49b8f4523a 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -38,10 +38,6 @@
 
 void acpi_video_unregister_backlight(void);
 
-static bool backlight_notifier_registered;
-static struct notifier_block backlight_nb;
-static struct work_struct backlight_notify_work;
-
 static enum acpi_backlight_type acpi_backlight_cmdline = acpi_backlight_undef;
 static enum acpi_backlight_type acpi_backlight_dmi = acpi_backlight_undef;
 
@@ -538,26 +534,6 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
{ },
 };
 
-/* This uses a workqueue to avoid various locking ordering issues */
-static void acpi_video_backlight_notify_work(struct work_struct *work)
-{
-   if (acpi_video_get_backlight_type() != acpi_backlight_video)
-   acpi_video_unregister_backlight();
-}
-
-static int acpi_video_backlight_notify(struct notifier_block *nb,
-  unsigned long val, void *bd)
-{
-   struct backlight_device *backlight = bd;
-
-   /* A raw bl registering may change video -> native */
-   if (backlight->props.type == BACKLIGHT_RAW &&
-   val == BACKLIGHT_REGISTERED)
-   schedule_work(&backlight_notify_work);
-
-   return NOTIFY_OK;
-}
-
 /*
  * Determine which type of backlight interface to use on this system,
  * First check cmdline, then dmi quirks, then do autodetect.
@@ -587,12 +563,6 @@ static enum acpi_backlight_type 
__acpi_video_get_backlight_type(bool native)
acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
ACPI_UINT32_MAX, find_video, NULL,
&video_caps, NULL);
-   INIT_WORK(&backlight_notify_work,
- acpi_video_backlight_notify_work);
-   backlight_nb.notifier_call = acpi_video_backlight_notify;
-   backlight_nb.priority = 0;
-   if (backlight_register_notifier(&backlight_nb) == 0)
-   backlight_notifier_registered = true;
init_done = true;
}
if (native)
@@ -639,9 +609,3 @@ void acpi_video_set_dmi_backlight_type(enum 
acpi_backlight_type type)
acpi_video_unregister_backlight();
 }
 EXPORT_SYMBOL(acpi_video_set_dmi_backlight_type);
-
-void __exit acpi_video_detect_exit(void)
-{
-   if (backlight_notifier_registered)
-   backlight_unregister_notifier(&backlight_nb);
-}
-- 
2.37.2



[PATCH v5 00/31] drm/kms: Stop registering multiple /sys/class/backlight devs for a single display

2022-08-25 Thread Hans de Goede
Hi All,

As mentioned in my RFC titled "drm/kms: control display brightness through
drm_connector properties":
https://lore.kernel.org/dri-devel/0d188965-d809-81b5-74ce-7d30c49fe...@redhat.com/

The first step towards this is to deal with some existing technical debt
in backlight handling on x86/ACPI boards, specifically we need to stop
registering multiple /sys/class/backlight devs for a single display.

This series implements my RFC describing my plan for these cleanups:
https://lore.kernel.org/dri-devel/98519ba0-7f18-201a-ea34-652f50343...@redhat.com/

Changes in version 5:
- Use drm_info(drm_dev, ...) in patch 2/31
- Modify "drm/i915: Call acpi_video_register_backlight()", dropping
  the global has_panel flag, replacing it with a new
  intel_acpi_video_register() helper

Changes in version 4:
- Minor tweaks to nvidia-wmi-ec-backlight changes
- Add nouveau_acpi_* wrappers around used include/acpi/video.h functions to
  fix unresolved symbol errors on non X86

Changes in version 3:
- ACPI_VIDEO can now be enabled on non X86 too, adjust various Kconfig changes
- Make the delay before doing fallback acpi_video backlight registration
  a module option (patch 9)
- Move the nvidia-wmi-ec-backlight fw API definitions to a shared header
- Add a "acpi_video_get_backlight_type() == acpi_backlight_nvidia_wmi_ec"
  check to the nvidia-wmi-ec-backlight driver (patch 19)

Changes in version 2:
- Introduce acpi_video_backlight_use_native() helper
- Finishes the refactoring, addressing all the bits from the "Other issues"
  section of the refactor RFC

This series as submitted is based on drm-tip for CI purposes.

Assuming the last i915 patch also passes review now, I hope to push
out an immutable branch with this series on top of v6.0-rc1 and
send out a pull-request to all involved subsystems based on
this branch soon.

Regards,

Hans


Hans de Goede (31):
  ACPI: video: Add acpi_video_backlight_use_native() helper
  drm/i915: Don't register backlight when another backlight should be
used (v2)
  drm/amdgpu: Don't register backlight when another backlight should be
used (v3)
  drm/radeon: Don't register backlight when another backlight should be
used (v3)
  drm/nouveau: Don't register backlight when another backlight should be
used (v2)
  ACPI: video: Drop backlight_device_get_by_type() call from
acpi_video_get_backlight_type()
  ACPI: video: Remove acpi_video_bus from list before tearing it down
  ACPI: video: Simplify acpi_video_unregister_backlight()
  ACPI: video: Make backlight class device registration a separate step
(v2)
  ACPI: video: Remove code to unregister acpi_video backlight when a
native backlight registers
  drm/i915: Call acpi_video_register_backlight() (v3)
  drm/nouveau: Register ACPI video backlight when nv_backlight
registration fails (v2)
  drm/amdgpu: Register ACPI video backlight when skipping amdgpu
backlight registration
  drm/radeon: Register ACPI video backlight when skipping radeon
backlight registration
  platform/x86: nvidia-wmi-ec-backlight: Move fw interface definitions
to a header (v2)
  ACPI: video: Refactor acpi_video_get_backlight_type() a bit
  ACPI: video: Add Nvidia WMI EC brightness control detection (v3)
  ACPI: video: Add Apple GMUX brightness control detection
  platform/x86: nvidia-wmi-ec-backlight: Use
acpi_video_get_backlight_type()
  platform/x86: apple-gmux: Stop calling acpi/video.h functions
  platform/x86: toshiba_acpi: Stop using
acpi_video_set_dmi_backlight_type()
  platform/x86: acer-wmi: Move backlight DMI quirks to
acpi/video_detect.c
  platform/x86: asus-wmi: Drop DMI chassis-type check from backlight
handling
  platform/x86: asus-wmi: Move acpi_backlight=vendor quirks to ACPI
video_detect.c
  platform/x86: asus-wmi: Move acpi_backlight=native quirks to ACPI
video_detect.c
  platform/x86: samsung-laptop: Move acpi_backlight=[vendor|native]
quirks to ACPI video_detect.c
  ACPI: video: Remove acpi_video_set_dmi_backlight_type()
  ACPI: video: Drop "Samsung X360" acpi_backlight=native quirk
  ACPI: video: Drop NL5x?U, PF4NU1F and PF5?U?? acpi_backlight=native
quirks
  ACPI: video: Fix indentation of video_detect_dmi_table[] entries
  drm/todo: Add entry about dealing with brightness control on devices
with > 1 panel

 Documentation/gpu/todo.rst|  68 +++
 MAINTAINERS   |   1 +
 drivers/acpi/Kconfig  |   1 +
 drivers/acpi/acpi_video.c |  64 ++-
 drivers/acpi/video_detect.c   | 428 +++---
 drivers/gpu/drm/Kconfig   |  14 +
 .../gpu/drm/amd/amdgpu/atombios_encoders.c|  14 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   9 +
 drivers/gpu/drm/gma500/Kconfig|   2 +
 drivers/gpu/drm/i915/Kconfig  |   2 +
 drivers/gpu/drm/i915/display/intel_acpi.c |  27 ++
 drivers/gpu/drm/i915/display/intel_acpi.h |   3 +

[PATCH v5 14/31] drm/radeon: Register ACPI video backlight when skipping radeon backlight registration

2022-08-25 Thread Hans de Goede
Typically the acpi_video driver will initialize before radeon, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
radeon would register its own radeon_bl# device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there being 2 backlight devices.

This means that userspace used to briefly see 2 devices and the
disappearing of acpi_video0 after a brief time confuses the systemd
backlight level save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920

To fix this the ACPI video code has been modified to make backlight class
device registration a separate step, relying on the drm/kms driver to
ask for the acpi_video backlight registration after it is done setting up
its native backlight device.

Add a call to the new acpi_video_register_backlight() when radeon skips
registering its own backlight device because of e.g. the firmware_flags
or the acpi_video_get_backlight_type() return value. This ensures that
if the acpi_video backlight device should be used, it will be available
before the radeon drm_device gets registered with userspace.

Acked-by: Alex Deucher 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/radeon/radeon_encoders.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c 
b/drivers/gpu/drm/radeon/radeon_encoders.c
index 35c535e48b8d..fbc0a2182318 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -30,6 +30,8 @@
 #include 
 #include 
 
+#include 
+
 #include "radeon.h"
 #include "radeon_atombios.h"
 #include "radeon_legacy_encoders.h"
@@ -167,7 +169,7 @@ static void radeon_encoder_add_backlight(struct 
radeon_encoder *radeon_encoder,
return;
 
if (radeon_backlight == 0) {
-   return;
+   use_bl = false;
} else if (radeon_backlight == 1) {
use_bl = true;
} else if (radeon_backlight == -1) {
@@ -193,6 +195,13 @@ static void radeon_encoder_add_backlight(struct 
radeon_encoder *radeon_encoder,
else
radeon_legacy_backlight_init(radeon_encoder, connector);
}
+
+   /*
+* If there is no native backlight device (which may happen even when
+* use_bl==true) try registering an ACPI video backlight device instead.
+*/
+   if (!rdev->mode_info.bl_encoder)
+   acpi_video_register_backlight();
 }
 
 void
-- 
2.37.2
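
For readers following the control flow, the radeon change above can be
approximated by the userspace sketch below. It is a simplified model under
stated assumptions, not the driver code: model_add_backlight(), its
auto_choice and native_init_ok parameters, and the enum are hypothetical
names, and the real function calls into radeon and ACPI instead of
returning a value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome type; the real function returns void. */
enum model_outcome {
	MODEL_NATIVE_REGISTERED,
	MODEL_TRY_ACPI_VIDEO,
};

/*
 * module_param mirrors radeon_backlight: 0 = off, 1 = on, -1 = auto.
 * auto_choice and native_init_ok are assumed inputs standing in for the
 * driver's auto-detection and for whether native setup produced a device.
 */
enum model_outcome model_add_backlight(int module_param, bool auto_choice,
				       bool native_init_ok)
{
	bool use_bl;
	bool have_native = false;

	assert(module_param >= -1 && module_param <= 1);

	if (module_param == 0)
		use_bl = false;	/* was an early return before the patch */
	else if (module_param == 1)
		use_bl = true;
	else
		use_bl = auto_choice;

	if (use_bl)
		have_native = native_init_ok;

	/* Fallback added by the patch: no native device, try ACPI video. */
	if (!have_native)
		return MODEL_TRY_ACPI_VIDEO;

	return MODEL_NATIVE_REGISTERED;
}
```

In this model, radeon_backlight=0 now reaches the ACPI video fallback
instead of leaving the function early, and the same fallback is taken when
native setup was attempted but produced no device.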



[PATCH v5 06/31] ACPI: video: Drop backlight_device_get_by_type() call from acpi_video_get_backlight_type()

2022-08-25 Thread Hans de Goede
All x86/ACPI kms drivers which register native/BACKLIGHT_RAW type
backlight devices call acpi_video_backlight_use_native() now. This sets
__acpi_video_get_backlight_type()'s internal static native_available flag.

This makes the backlight_device_get_by_type(BACKLIGHT_RAW) check
unnecessary.

Relying on the cached native_available value is not only simpler, it will
also work correctly in cases where the native backlight registration was
skipped because of acpi_video_backlight_use_native() returning false.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 5f105eaa7d30..385eb49c763f 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -608,8 +608,7 @@ static enum acpi_backlight_type 
__acpi_video_get_backlight_type(bool native)
if (!(video_caps & ACPI_VIDEO_BACKLIGHT))
return acpi_backlight_vendor;
 
-   if (acpi_osi_is_win8() &&
-   (native_available || backlight_device_get_by_type(BACKLIGHT_RAW)))
+   if (acpi_osi_is_win8() && native_available)
return acpi_backlight_native;
 
return acpi_backlight_video;
-- 
2.37.2
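
The claim that the extra lookup is unnecessary can be checked with a small
boolean model. This is only a sketch: the function names are hypothetical,
and the three booleans stand in for acpi_osi_is_win8(), the cached
native_available flag and the backlight_device_get_by_type(BACKLIGHT_RAW)
result.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before this patch (hypothetical boolean model). */
bool old_prefers_native(bool win8, bool native_available, bool raw_registered)
{
	return win8 && (native_available || raw_registered);
}

/* Condition after this patch. */
bool new_prefers_native(bool win8, bool native_available)
{
	return win8 && native_available;
}

/*
 * If every raw backlight driver calls the helper before registering,
 * raw_registered implies native_available, and both conditions agree.
 */
void check_equivalence(void)
{
	for (int w = 0; w < 2; w++)
		for (int n = 0; n < 2; n++)
			for (int r = 0; r <= n; r++)
				assert(old_prefers_native(w, n, r) ==
				       new_prefers_native(w, n));
}

/* Counts all input combinations where the two conditions disagree. */
int count_disagreements(void)
{
	int count = 0;

	for (int w = 0; w < 2; w++)
		for (int n = 0; n < 2; n++)
			for (int r = 0; r < 2; r++)
				if (old_prefers_native(w, n, r) !=
				    new_prefers_native(w, n))
					count++;
	return count;
}
```

The only disagreement in the model is a raw device that was registered
without the helper having been called, which is the case the commit message
says no longer occurs.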



[PATCH v5 01/31] ACPI: video: Add acpi_video_backlight_use_native() helper

2022-08-25 Thread Hans de Goede
ATM on x86 laptops where we want userspace to use the acpi_video backlight
device we often register both the GPU's native backlight device and
acpi_video's firmware acpi_video# backlight device. This relies on
userspace preferring firmware type backlight devices over native ones, but
registering 2 backlight devices for a single display really is undesirable.

On x86 laptops where the native GPU backlight device should be used,
the registering of other backlight devices is avoided by their drivers
using acpi_video_get_backlight_type() and only registering their backlight
if the return value matches their type.

acpi_video_get_backlight_type() uses
backlight_device_get_by_type(BACKLIGHT_RAW) to determine if a native
driver is available and will never return native if this returns
false. This means that the GPU's native backlight registering code
cannot just call acpi_video_get_backlight_type() to determine if it
should register its backlight, since acpi_video_get_backlight_type() will
never return native until the native backlight has already registered.

To fix this add a new internal native function parameter to
acpi_video_get_backlight_type(), which when set to true will make
acpi_video_get_backlight_type() behave as if a native backlight has
already been registered.

And add a new acpi_video_backlight_use_native() helper, which sets this
to true, for use in native GPU backlight code.

Changes in v2:
- Replace adding a native parameter to acpi_video_get_backlight_type() with
  adding a new acpi_video_backlight_use_native() helper.
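
The ordering problem and the sticky flag can be illustrated with a
userspace model. This is a sketch under assumptions, not the kernel code:
the model_* and stub_* names are hypothetical, the stubs hard-code a
firmware that reports Windows 8 support with no raw device registered, and
the mutex, command line and DMI handling are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's enum acpi_backlight_type. */
enum model_backlight_type { MODEL_BL_VIDEO, MODEL_BL_NATIVE };

/* Hypothetical stubs for the two kernel-side checks. */
static bool stub_osi_is_win8(void) { return true; }
static bool stub_raw_device_registered(void) { return false; }

/* Model of __acpi_video_get_backlight_type(): the flag is sticky. */
static enum model_backlight_type model_get_type(bool native)
{
	static bool native_available;

	if (native)
		native_available = true;

	if (stub_osi_is_win8() &&
	    (native_available || stub_raw_device_registered()))
		return MODEL_BL_NATIVE;

	return MODEL_BL_VIDEO;
}

/* Counterpart of acpi_video_get_backlight_type(). */
enum model_backlight_type model_get_backlight_type(void)
{
	return model_get_type(false);
}

/* Counterpart of acpi_video_backlight_use_native(). */
bool model_backlight_use_native(void)
{
	return model_get_type(true) == MODEL_BL_NATIVE;
}

/* Walks through the ordering problem described above. */
void model_demo(void)
{
	/* Before any GPU driver asks, "native" is never reported. */
	assert(model_get_backlight_type() == MODEL_BL_VIDEO);
	/* The GPU driver asks through the helper and may register. */
	assert(model_backlight_use_native());
	/* From now on every caller sees "native". */
	assert(model_get_backlight_type() == MODEL_BL_NATIVE);
}
```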

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 24 
 include/acpi/video.h|  5 +
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 5d7f38016a24..5f105eaa7d30 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -17,8 +17,9 @@
  * Otherwise vendor specific drivers like thinkpad_acpi, asus-laptop,
  * sony_acpi,... can take care about backlight brightness.
  *
- * Backlight drivers can use acpi_video_get_backlight_type() to determine
- * which driver should handle the backlight.
+ * Backlight drivers can use acpi_video_get_backlight_type() to determine which
+ * driver should handle the backlight. RAW/GPU-driver backlight drivers must
+ * use the acpi_video_backlight_use_native() helper for this.
  *
  * If CONFIG_ACPI_VIDEO is neither set as "compiled in" (y) nor as a module (m)
  * this file will not be compiled and acpi_video_get_backlight_type() will
@@ -571,9 +572,10 @@ static int acpi_video_backlight_notify(struct 
notifier_block *nb,
  * Arguably the native on win8 check should be done first, but that would
  * be a behavior change, which may causes issues.
  */
-enum acpi_backlight_type acpi_video_get_backlight_type(void)
+static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
 {
static DEFINE_MUTEX(init_mutex);
+   static bool native_available;
static bool init_done;
static long video_caps;
 
@@ -593,6 +595,8 @@ enum acpi_backlight_type acpi_video_get_backlight_type(void)
backlight_notifier_registered = true;
init_done = true;
}
+   if (native)
+   native_available = true;
mutex_unlock(&init_mutex);
 
if (acpi_backlight_cmdline != acpi_backlight_undef)
@@ -604,13 +608,25 @@ enum acpi_backlight_type 
acpi_video_get_backlight_type(void)
if (!(video_caps & ACPI_VIDEO_BACKLIGHT))
return acpi_backlight_vendor;
 
-   if (acpi_osi_is_win8() && backlight_device_get_by_type(BACKLIGHT_RAW))
+   if (acpi_osi_is_win8() &&
+   (native_available || backlight_device_get_by_type(BACKLIGHT_RAW)))
return acpi_backlight_native;
 
return acpi_backlight_video;
 }
+
+enum acpi_backlight_type acpi_video_get_backlight_type(void)
+{
+   return __acpi_video_get_backlight_type(false);
+}
 EXPORT_SYMBOL(acpi_video_get_backlight_type);
 
+bool acpi_video_backlight_use_native(void)
+{
+   return __acpi_video_get_backlight_type(true) == acpi_backlight_native;
+}
+EXPORT_SYMBOL(acpi_video_backlight_use_native);
+
 /*
  * Set the preferred backlight interface type based on DMI info.
  * This function allows DMI blacklists to be implemented by external
diff --git a/include/acpi/video.h b/include/acpi/video.h
index db8548ff03ce..4705e339c252 100644
--- a/include/acpi/video.h
+++ b/include/acpi/video.h
@@ -56,6 +56,7 @@ extern void acpi_video_unregister(void);
 extern int acpi_video_get_edid(struct acpi_device *device, int type,
   int device_id, void **edid);
 extern enum acpi_backlight_type acpi_video_get_backlight_type(void);
+extern bool acpi_video_backlight_use_native(void);
 extern void acpi_video_set_dmi_backlight_type(enum acpi_backlight_type type);
 /*
  * Note: The value returned by acpi_v

[PATCH v5 18/31] ACPI: video: Add Apple GMUX brightness control detection

2022-08-25 Thread Hans de Goede
On Apple laptops with an Apple GMUX, using it for brightness control
should take precedence over any other brightness control methods.

Add apple-gmux detection to acpi_video_get_backlight_type() using
the already existing apple_gmux_present() helper function.

This will allow removing the (ab)use of:

acpi_video_set_dmi_backlight_type(acpi_backlight_vendor);

Inside the apple-gmux driver.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 4 
 include/acpi/video.h| 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 4dc7fb865083..be2fc43418af 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -607,6 +608,9 @@ static enum acpi_backlight_type 
__acpi_video_get_backlight_type(bool native)
if (nvidia_wmi_ec_present)
return acpi_backlight_nvidia_wmi_ec;
 
+   if (apple_gmux_present())
+   return acpi_backlight_apple_gmux;
+
/* On systems with ACPI video use either native or ACPI video. */
if (video_caps & ACPI_VIDEO_BACKLIGHT) {
/*
diff --git a/include/acpi/video.h b/include/acpi/video.h
index 91578e77ac4e..dbd48cb8bd23 100644
--- a/include/acpi/video.h
+++ b/include/acpi/video.h
@@ -49,6 +49,7 @@ enum acpi_backlight_type {
acpi_backlight_vendor,
acpi_backlight_native,
acpi_backlight_nvidia_wmi_ec,
+   acpi_backlight_apple_gmux,
 };
 
 #if IS_ENABLED(CONFIG_ACPI_VIDEO)
-- 
2.37.2
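
The resulting precedence order can be sketched as a plain decision
function. This is an illustrative model only: struct model_platform and
model_detect() are hypothetical, the command line and DMI overrides that
run earlier in the real function are omitted, and the vendor fallback for
machines without an ACPI video backlight is an assumption based on the
older code quoted elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>

enum model_backlight_type {
	MODEL_BL_VIDEO,
	MODEL_BL_VENDOR,
	MODEL_BL_NATIVE,
	MODEL_BL_NVIDIA_WMI_EC,
	MODEL_BL_APPLE_GMUX,
};

/* Hypothetical snapshot of what detection found on a machine. */
struct model_platform {
	bool nvidia_wmi_ec_present;
	bool apple_gmux_present;
	bool acpi_video_backlight;	/* video_caps & ACPI_VIDEO_BACKLIGHT */
	bool prefer_native;		/* outcome of the native-vs-video choice */
};

enum model_backlight_type model_detect(const struct model_platform *p)
{
	assert(p);

	if (p->nvidia_wmi_ec_present)
		return MODEL_BL_NVIDIA_WMI_EC;

	/* New in this patch: GMUX wins over ACPI video and native. */
	if (p->apple_gmux_present)
		return MODEL_BL_APPLE_GMUX;

	if (p->acpi_video_backlight)
		return p->prefer_native ? MODEL_BL_NATIVE : MODEL_BL_VIDEO;

	/* Assumed fallback when there is no ACPI video backlight. */
	return MODEL_BL_VENDOR;
}
```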



[PATCH v5 13/31] drm/amdgpu: Register ACPI video backlight when skipping amdgpu backlight registration

2022-08-25 Thread Hans de Goede
Typically the acpi_video driver will initialize before amdgpu, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
amdgpu would register its own amdgpu_bl# device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there being 2 backlight devices.

This means that userspace used to briefly see 2 devices and the
disappearing of acpi_video0 after a brief time confuses the systemd
backlight level save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920

To fix this the ACPI video code has been modified to make backlight class
device registration a separate step, relying on the drm/kms driver to
ask for the acpi_video backlight registration after it is done setting up
its native backlight device.

Add a call to the new acpi_video_register_backlight() when amdgpu skips
registering its own backlight device because of either the firmware_flags
or the acpi_video_get_backlight_type() return value. This ensures that
if the acpi_video backlight device should be used, it will be available
before the amdgpu drm_device gets registered with userspace.

Acked-by: Alex Deucher 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/amd/amdgpu/atombios_encoders.c| 9 +++--
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c 
b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
index b4e3cedceaf8..6be9ac2b9c5b 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
@@ -184,11 +184,11 @@ void amdgpu_atombios_encoder_init_backlight(struct 
amdgpu_encoder *amdgpu_encode
return;
 
if (!(adev->mode_info.firmware_flags & 
ATOM_BIOS_INFO_BL_CONTROLLED_BY_GPU))
-   return;
+   goto register_acpi_backlight;
 
if (!acpi_video_backlight_use_native()) {
drm_info(dev, "Skipping amdgpu atom DIG backlight 
registration\n");
-   return;
+   goto register_acpi_backlight;
}
 
pdata = kmalloc(sizeof(struct amdgpu_backlight_privdata), GFP_KERNEL);
@@ -225,6 +225,11 @@ void amdgpu_atombios_encoder_init_backlight(struct 
amdgpu_encoder *amdgpu_encode
 error:
kfree(pdata);
return;
+
+register_acpi_backlight:
+   /* Try registering an ACPI video backlight device instead. */
+   acpi_video_register_backlight();
+   return;
 }
 
 void
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 706c67f4bda8..c450964f84d4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4037,6 +4037,8 @@ amdgpu_dm_register_backlight_device(struct 
amdgpu_display_manager *dm)
 
if (!acpi_video_backlight_use_native()) {
drm_info(adev_to_drm(dm->adev), "Skipping amdgpu DM backlight 
registration\n");
+   /* Try registering an ACPI video backlight device instead. */
+   acpi_video_register_backlight();
return;
}
 
-- 
2.37.2
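
The goto-based fallback used in the atombios path above can be modelled in
a few lines. This is a sketch, not the driver function: the model_* names
and struct model_result are hypothetical, the two booleans stand in for
the firmware_flags test and acpi_video_backlight_use_native(), and the
allocation error path is omitted.

```c
#include <assert.h>
#include <stdbool.h>

struct model_result {
	bool native_registered;
	bool acpi_video_requested;
};

struct model_result model_init_backlight(bool gpu_controls_backlight,
					 bool use_native)
{
	struct model_result res = { false, false };

	if (!gpu_controls_backlight)
		goto register_acpi_backlight;

	if (!use_native)
		goto register_acpi_backlight;

	/* Native path: the real driver allocates and registers here. */
	res.native_registered = true;
	return res;

register_acpi_backlight:
	/* Stands in for the acpi_video_register_backlight() call. */
	res.acpi_video_requested = true;
	return res;
}

/* Exactly one of the two outcomes is expected for every input. */
void model_demo(void)
{
	for (int g = 0; g < 2; g++)
		for (int n = 0; n < 2; n++) {
			struct model_result r = model_init_backlight(g, n);

			assert(r.native_registered != r.acpi_video_requested);
		}
}
```

Routing both skip conditions to one label keeps the fallback call in a
single place, which is the same effect the patch achieves by replacing the
two early returns.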



[PATCH v5 03/31] drm/amdgpu: Don't register backlight when another backlight should be used (v3)

2022-08-25 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.

Registering 2 backlight devices for a single display really is
undesirable, don't register the GPU's native backlight device when
another backlight device should be used.

Changes in v2:
- To avoid linker errors when amdgpu is builtin and video_detect.c is in
  a module, select ACPI_VIDEO and its deps if ACPI is enabled.
  When ACPI is disabled, ACPI_VIDEO is also always disabled, ensuring
  the stubs from acpi/video.h will be used.

Changes in v3:
- Use drm_info(drm_dev, "...") to log messages
- ACPI_VIDEO can now be enabled on non X86 too,
  adjust the Kconfig changes to match this.

Acked-by: Alex Deucher 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/Kconfig   | 7 +++
 drivers/gpu/drm/amd/amdgpu/atombios_encoders.c| 7 +++
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 +++
 3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0b2ad7212ee6..95ca33938b4a 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -259,6 +259,13 @@ config DRM_AMDGPU
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
select DRM_BUDDY
+   # amdgpu depends on ACPI_VIDEO when ACPI is enabled, for select to work
+   # ACPI_VIDEO's dependencies must also be selected.
+   select INPUT if ACPI
+   select ACPI_VIDEO if ACPI
+   # On x86 ACPI_VIDEO also needs ACPI_WMI
+   select X86_PLATFORM_DEVICES if ACPI && X86
+   select ACPI_WMI if ACPI && X86
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c 
b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
index fa7421afb9a6..b4e3cedceaf8 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
@@ -26,6 +26,8 @@
 
 #include 
 
+#include 
+
 #include 
 #include 
 #include "amdgpu.h"
@@ -184,6 +186,11 @@ void amdgpu_atombios_encoder_init_backlight(struct 
amdgpu_encoder *amdgpu_encode
if (!(adev->mode_info.firmware_flags & 
ATOM_BIOS_INFO_BL_CONTROLLED_BY_GPU))
return;
 
+   if (!acpi_video_backlight_use_native()) {
+   drm_info(dev, "Skipping amdgpu atom DIG backlight 
registration\n");
+   return;
+   }
+
pdata = kmalloc(sizeof(struct amdgpu_backlight_privdata), GFP_KERNEL);
if (!pdata) {
DRM_ERROR("Memory allocation failed\n");
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e702f0d72d53..706c67f4bda8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -90,6 +90,8 @@
 #include 
 #include 
 
+#include 
+
 #include "ivsrcid/dcn/irqsrcs_dcn_1_0.h"
 
 #include "dcn/dcn_1_0_offset.h"
@@ -4033,6 +4035,11 @@ amdgpu_dm_register_backlight_device(struct 
amdgpu_display_manager *dm)
amdgpu_dm_update_backlight_caps(dm, dm->num_of_edps);
dm->brightness[dm->num_of_edps] = AMDGPU_MAX_BL_LEVEL;
 
+   if (!acpi_video_backlight_use_native()) {
+   drm_info(adev_to_drm(dm->adev), "Skipping amdgpu DM backlight 
registration\n");
+   return;
+   }
+
props.max_brightness = AMDGPU_MAX_BL_LEVEL;
props.brightness = AMDGPU_MAX_BL_LEVEL;
props.type = BACKLIGHT_RAW;
-- 
2.37.2



[PATCH v5 30/31] ACPI: video: Fix indentation of video_detect_dmi_table[] entries

2022-08-25 Thread Hans de Goede
The video_detect_dmi_table[] uses an unusual indentation
before the ".name = ..." named struct initializers.

Instead of being indented with an extra tab compared to
the previous line's '{', these are indented with only
a single space to allow for long DMI_MATCH() lines without
wrapping.

But over time some entries ended up without even the single space
indent in front of the ".name = ..." lines.

Make things consistent by using a single space indent for these
lines everywhere.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/video_detect.c | 48 ++---
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 789d5913c178..db2474fe58ac 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -142,17 +142,17 @@ static const struct dmi_system_id 
video_detect_dmi_table[] = {
},
},
{
-   .callback = video_detect_force_vendor,
-   /* Asus UL30VT */
-   .matches = {
+.callback = video_detect_force_vendor,
+/* Asus UL30VT */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK Computer Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "UL30VT"),
},
},
{
-   .callback = video_detect_force_vendor,
-   /* Asus UL30A */
-   .matches = {
+.callback = video_detect_force_vendor,
+/* Asus UL30A */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK Computer Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "UL30A"),
},
@@ -198,9 +198,9 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
},
},
{
-   .callback = video_detect_force_vendor,
-   /* GIGABYTE GB-BXBT-2807 */
-   .matches = {
+.callback = video_detect_force_vendor,
+/* GIGABYTE GB-BXBT-2807 */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "GIGABYTE"),
DMI_MATCH(DMI_PRODUCT_NAME, "GB-BXBT-2807"),
},
@@ -233,17 +233,17 @@ static const struct dmi_system_id 
video_detect_dmi_table[] = {
},
},
{
-   .callback = video_detect_force_vendor,
-   /* Sony VPCEH3U1E */
-   .matches = {
+.callback = video_detect_force_vendor,
+/* Sony VPCEH3U1E */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
DMI_MATCH(DMI_PRODUCT_NAME, "VPCEH3U1E"),
},
},
{
-   .callback = video_detect_force_vendor,
-   /* Xiaomi Mi Pad 2 */
-   .matches = {
+.callback = video_detect_force_vendor,
+/* Xiaomi Mi Pad 2 */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Xiaomi Inc"),
DMI_MATCH(DMI_PRODUCT_NAME, "Mipad2"),
},
@@ -551,25 +551,25 @@ static const struct dmi_system_id 
video_detect_dmi_table[] = {
},
},
{
-   .callback = video_detect_force_native,
-   /* ASUSTeK COMPUTER INC. GA401 */
-   .matches = {
+.callback = video_detect_force_native,
+/* ASUSTeK COMPUTER INC. GA401 */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_PRODUCT_NAME, "GA401"),
},
},
{
-   .callback = video_detect_force_native,
-   /* ASUSTeK COMPUTER INC. GA502 */
-   .matches = {
+.callback = video_detect_force_native,
+/* ASUSTeK COMPUTER INC. GA502 */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_PRODUCT_NAME, "GA502"),
},
},
{
-   .callback = video_detect_force_native,
-   /* ASUSTeK COMPUTER INC. GA503 */
-   .matches = {
+.callback = video_detect_force_native,
+/* ASUSTeK COMPUTER INC. GA503 */
+.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_PRODUCT_NAME, "GA503"),
},
-- 
2.37.2



[PATCH v5 05/31] drm/nouveau: Don't register backlight when another backlight should be used (v2)

2022-08-25 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.

Registering 2 backlight devices for a single display really is
undesirable, don't register the GPU's native backlight device when
another backlight device should be used.

Changes in v2:
- Add nouveau_acpi_video_backlight_use_native() wrapper to avoid unresolved
  symbol errors on non X86

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c  | 5 +
 drivers/gpu/drm/nouveau/nouveau_acpi.h  | 2 ++
 drivers/gpu/drm/nouveau/nouveau_backlight.c | 6 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 6140db756d06..1592c9cd7750 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -386,3 +386,8 @@ nouveau_acpi_edid(struct drm_device *dev, struct 
drm_connector *connector)
 
return kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
 }
+
+bool nouveau_acpi_video_backlight_use_native(void)
+{
+   return acpi_video_backlight_use_native();
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.h 
b/drivers/gpu/drm/nouveau/nouveau_acpi.h
index 330f9b837066..3c666c30dfca 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.h
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.h
@@ -11,6 +11,7 @@ void nouveau_register_dsm_handler(void);
 void nouveau_unregister_dsm_handler(void);
 void nouveau_switcheroo_optimus_dsm(void);
 void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
+bool nouveau_acpi_video_backlight_use_native(void);
 #else
 static inline bool nouveau_is_optimus(void) { return false; };
 static inline bool nouveau_is_v1_dsm(void) { return false; };
@@ -18,6 +19,7 @@ static inline void nouveau_register_dsm_handler(void) {}
 static inline void nouveau_unregister_dsm_handler(void) {}
 static inline void nouveau_switcheroo_optimus_dsm(void) {}
 static inline void *nouveau_acpi_edid(struct drm_device *dev, struct 
drm_connector *connector) { return NULL; }
+static inline bool nouveau_acpi_video_backlight_use_native(void) { return 
true; }
 #endif
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
b/drivers/gpu/drm/nouveau/nouveau_backlight.c
index a2141d3d9b1d..d2b8f8c13db4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
+++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
@@ -38,6 +38,7 @@
 #include "nouveau_reg.h"
 #include "nouveau_encoder.h"
 #include "nouveau_connector.h"
+#include "nouveau_acpi.h"
 
 static struct ida bl_ida;
 #define BL_NAME_SIZE 15 // 12 for name + 2 for digits + 1 for '\0'
@@ -405,6 +406,11 @@ nouveau_backlight_init(struct drm_connector *connector)
goto fail_alloc;
}
 
+   if (!nouveau_acpi_video_backlight_use_native()) {
+   NV_INFO(drm, "Skipping nv_backlight registration\n");
+   goto fail_alloc;
+   }
+
if (!nouveau_get_backlight_name(backlight_name, bl)) {
NV_ERROR(drm, "Failed to retrieve a unique name for the 
backlight interface\n");
goto fail_alloc;
-- 
2.37.2
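
The wrapper-plus-stub pattern from the v2 change can be shown in isolation.
This is a sketch under assumptions: MODEL_HAVE_ACPI is a hypothetical
switch standing in for whatever config check guards nouveau_acpi.h, and the
model_* functions are illustrative names, not nouveau symbols.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef MODEL_HAVE_ACPI
/* Real implementation would live in another file and ask ACPI. */
bool model_video_backlight_use_native(void);
#else
/* No ACPI support built in: nothing else can own the backlight. */
static inline bool model_video_backlight_use_native(void)
{
	return true;
}
#endif

/*
 * The caller stays free of #ifdefs, as nouveau_backlight_init() does.
 * Returns 0 when the native backlight would be registered, -1 when it
 * is skipped in favour of another backlight device.
 */
int model_backlight_init(const char *connector_name)
{
	assert(connector_name);

	if (!model_video_backlight_use_native())
		return -1;

	return 0;
}
```

Built without MODEL_HAVE_ACPI, the stub makes the check a constant true, so
the native backlight is always registered and no ACPI symbol is referenced.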



[PATCH v5 07/31] ACPI: video: Remove acpi_video_bus from list before tearing it down

2022-08-25 Thread Hans de Goede
Move the list_del removing an acpi_video_bus from video_bus_head
on teardown to before the teardown is done, to avoid code iterating
over the video_bus_head list seeing acpi_video_bus objects on there
which are (partly) torn down already.

Acked-by: Rafael J. Wysocki 
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 5cbe2196176d..cde8ffa9f0b8 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -2111,14 +2111,14 @@ static int acpi_video_bus_remove(struct acpi_device 
*device)
 
video = acpi_driver_data(device);
 
-   acpi_video_bus_remove_notify_handler(video);
-   acpi_video_bus_unregister_backlight(video);
-   acpi_video_bus_put_devices(video);
-
mutex_lock(&video_list_lock);
list_del(&video->entry);
mutex_unlock(&video_list_lock);
 
+   acpi_video_bus_remove_notify_handler(video);
+   acpi_video_bus_unregister_backlight(video);
+   acpi_video_bus_put_devices(video);
+
kfree(video->attached_array);
kfree(video);
 
-- 
2.37.2
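
The reordering can be illustrated with a single-threaded model of the list.
This is only a sketch: struct model_bus and the model_* functions are
hypothetical, the walker call inside model_bus_remove() simulates another
thread iterating the list, and the locking the kernel does with
video_list_lock is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_bus {
	struct model_bus *next;
	bool torn_down;
};

static struct model_bus *model_bus_head;

/* A list walker must never observe a torn-down bus. */
int model_count_live_buses(void)
{
	int count = 0;

	for (struct model_bus *b = model_bus_head; b; b = b->next) {
		assert(!b->torn_down);
		count++;
	}
	return count;
}

void model_bus_add(struct model_bus *bus)
{
	bus->torn_down = false;
	bus->next = model_bus_head;
	model_bus_head = bus;
}

void model_bus_remove(struct model_bus *bus)
{
	struct model_bus **link = &model_bus_head;

	/* Step 1: unlink first (the kernel holds video_list_lock here). */
	while (*link && *link != bus)
		link = &(*link)->next;
	if (*link)
		*link = bus->next;

	/* A walker running at this point no longer sees the bus. */
	(void)model_count_live_buses();

	/* Step 2: only now tear the object down. */
	bus->torn_down = true;
}
```

With the old order (tear down, then unlink) the simulated walker would trip
the assertion, because the bus would still be reachable from the list while
already marked as torn down.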



Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Alex Deucher
On Thu, Aug 25, 2022 at 10:22 AM Lazar, Lijo  wrote:
>
>
>
> On 8/25/2022 7:37 PM, Alex Deucher wrote:
> > On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:
> >>
> >> HDP flush is used early in the init sequence as part of memory controller
> >> block initialization. Hence remapping of HDP registers needed for flush
> >> needs to happen earlier.
> >>
> >> This also fixes the AER error reported as Unsupported Request during
> >> driver load.
> >>
> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> >>
> >> Reported-by: Tom Seewald 
> >> Signed-off-by: Lijo Lazar 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
> >>   drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
> >>   drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
> >>   drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
> >>   4 files changed, 9 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index ce7d117efdb5..53d753e94a71 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct 
> >> amdgpu_device *adev)
> >>  DRM_ERROR("amdgpu_vram_scratch_init 
> >> failed %d\n", r);
> >>  goto init_failed;
> >>  }
> >> +
> >> +   /* remap HDP registers to a hole in mmio space,
> >> +* for the purpose of expose those registers
> >> +* to process space. This is needed for any early 
> >> HDP
> >> +* flush operation during gmc initialization.
> >> +*/
> >> +   if (adev->nbio.funcs->remap_hdp_registers && 
> >> !amdgpu_sriov_vf(adev))
> >> +   
> >> adev->nbio.funcs->remap_hdp_registers(adev);
> >> +
> >
> > We probably also need this in ip_resume() as well to handle the
> > suspend and resume case.  Thinking about this more, maybe it's easier
> > to just track whether the remap has happened yet and use the old or
> > new offset based on that.
>
> If we can use the default offset without a remap, does it make sense to
> remap? What about calling the same in ip_resume?

The remap is necessary so that userspace drivers can access this to
flush the HDP registers when they need to since normally it's in a
non-accessible region of the MMIO space.  I'm fine with updating it in
ip_resume as well.
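
For illustration, the alternative mentioned in the quoted text (tracking
whether the remap has happened and picking the offset accordingly) might
look roughly like the sketch below. Everything in it is hypothetical:
struct model_device, the model_* functions and both offset values are
placeholders, and it assumes the remap does not survive suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Made-up placeholder values, not real register offsets. */
#define MODEL_HDP_FLUSH_DEFAULT_OFFSET	0x1000u
#define MODEL_HDP_FLUSH_REMAPPED_OFFSET	0x2000u

struct model_device {
	bool hdp_remapped;
};

/* Called from init and, per the discussion, again on resume. */
void model_remap_hdp(struct model_device *dev)
{
	assert(dev);
	dev->hdp_remapped = true;
}

/* Assumption: suspend loses the remap, so the flag is cleared. */
void model_suspend(struct model_device *dev)
{
	assert(dev);
	dev->hdp_remapped = false;
}

/* Early flushes use the default offset until the remap is done. */
uint32_t model_hdp_flush_offset(const struct model_device *dev)
{
	assert(dev);
	return dev->hdp_remapped ? MODEL_HDP_FLUSH_REMAPPED_OFFSET
				 : MODEL_HDP_FLUSH_DEFAULT_OFFSET;
}
```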

Alex


>
> Thanks,
> Lijo
>
> >
> > Alex
> >
> >
> >>  r = 
> >> adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
> >>  if (r) {
> >>  DRM_ERROR("hw_init %d failed %d\n", i, r);
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c 
> >> b/drivers/gpu/drm/amd/amdgpu/nv.c
> >> index b3fba8dea63c..3ac7fef74277 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> >> @@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
> >>  nv_program_aspm(adev);
> >>  /* setup nbio registers */
> >>  adev->nbio.funcs->init_registers(adev);
> >> -   /* remap HDP registers to a hole in mmio space,
> >> -* for the purpose of expose those registers
> >> -* to process space
> >> -*/
> >> -   if (adev->nbio.funcs->remap_hdp_registers && 
> >> !amdgpu_sriov_vf(adev))
> >> -   adev->nbio.funcs->remap_hdp_registers(adev);
> >>  /* enable the doorbell aperture */
> >>  nv_enable_doorbell_aperture(adev, true);
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> >> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> >> index fde6154f2009..a0481e37d7cf 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> >> @@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
> >>  soc15_program_aspm(adev);
> >>  /* setup nbio registers */
> >>  adev->nbio.funcs->init_registers(adev);
> >> -   /* remap HDP registers to a hole in mmio space,
> >> -* for the purpose of expose those registers
> >> -* to process space
> >> -*/
> >> -   if (adev->nbio.funcs->remap_hdp_registers && 
> >> !amdgpu_sriov_vf(adev))
> >> -   adev->nbio.funcs->remap_hdp_registers(adev);
> >>
> >>  /* enable the doorbell aperture */
> >>  soc15_enable_doorbell_aperture(adev, true);
> >> diff --git a/driv

Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Lazar, Lijo




On 8/25/2022 7:37 PM, Alex Deucher wrote:

On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:


HDP flush is used early in the init sequence as part of memory controller
block initialization. Hence remapping of HDP registers needed for flush
needs to happen earlier.

This also fixes the AER error reported as Unsupported Request during
driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

Reported-by: Tom Seewald 
Signed-off-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
  drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
  drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
  drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
  4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ce7d117efdb5..53d753e94a71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
 DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
 goto init_failed;
 }
+
+   /* remap HDP registers to a hole in mmio space,
+* for the purpose of expose those registers
+* to process space. This is needed for any early HDP
+* flush operation during gmc initialization.
+*/
+   if (adev->nbio.funcs->remap_hdp_registers && 
!amdgpu_sriov_vf(adev))
+   adev->nbio.funcs->remap_hdp_registers(adev);
+


We probably also need this in ip_resume() as well to handle the
suspend and resume case.  Thinking about this more, maybe it's easier
to just track whether the remap has happened yet and use the old or
new offset based on that.


If we can use the default offset without a remap, does it make sense to 
remap? What about calling the same in ip_resume?


Thanks,
Lijo



Alex



 r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
 if (r) {
 DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
 nv_program_aspm(adev);
 /* setup nbio registers */
 adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
 /* enable the doorbell aperture */
 nv_enable_doorbell_aperture(adev, true);

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
 soc15_program_aspm(adev);
 /* setup nbio registers */
 adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);

 /* enable the doorbell aperture */
 soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 55284b24f113..16b447055102 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
 soc21_program_aspm(adev);
 /* setup nbio registers */
 adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers)
-   adev->nbio.funcs->remap_hdp_registers(adev);
 /* enable the doorbell aperture */
 soc21_enable_doorbell_aperture(adev, true);

--
2.25.1



Re: [PATCH 2/2] drm/amdgpu: Init VF's HDP flush reg offset early

2022-08-25 Thread Alex Deucher
On Thu, Aug 25, 2022 at 9:22 AM Alex Deucher  wrote:
>
> Series is:
> Reviewed-by: Alex Deucher 
>

Actually, hold off on that, I have a comment on patch 1.

Alex

> On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:
> >
> > Make sure the register offsets used for HDP flush in VF are
> > initialized early so that they work during any early HDP flush
> > sequence. For that, move the offset initialization to *_remap_hdp.
> >
> > Signed-off-by: Lijo Lazar 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 23 +
> >  drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c | 12 +++
> >  drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 23 +
> >  drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 21 ---
> >  drivers/gpu/drm/amd/amdgpu/nbio_v7_2.c | 24 ++
> >  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 23 +
> >  7 files changed, 84 insertions(+), 44 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 53d753e94a71..c0bb2e9616c5 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2382,7 +2382,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
> > *adev)
> >  * to process space. This is needed for any early 
> > HDP
> >  * flush operation during gmc initialization.
> >  */
> > -   if (adev->nbio.funcs->remap_hdp_registers && 
> > !amdgpu_sriov_vf(adev))
> > +   if (adev->nbio.funcs->remap_hdp_registers)
> > adev->nbio.funcs->remap_hdp_registers(adev);
> >
> > r = 
> > adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c 
> > b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > index b465baa26762..20fa2c5ad510 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> > @@ -65,10 +65,21 @@
> >
> >  static void nbio_v2_3_remap_hdp_registers(struct amdgpu_device *adev)
> >  {
> > -   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
> > -   adev->rmmio_remap.reg_offset + 
> > KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> > -   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
> > -   adev->rmmio_remap.reg_offset + 
> > KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> > +   if (amdgpu_sriov_vf(adev))
> > +   adev->rmmio_remap.reg_offset =
> > +   SOC15_REG_OFFSET(
> > +   NBIO, 0,
> > +   
> > mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL)
> > +   << 2;
> > +
> > +   if (!amdgpu_sriov_vf(adev)) {
> > +   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
> > +adev->rmmio_remap.reg_offset +
> > +KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> > +   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
> > +adev->rmmio_remap.reg_offset +
> > +KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> > +   }
> >  }
> >
> >  static u32 nbio_v2_3_get_rev_id(struct amdgpu_device *adev)
> > @@ -338,10 +349,6 @@ static void nbio_v2_3_init_registers(struct 
> > amdgpu_device *adev)
> >
> > if (def != data)
> > WREG32_PCIE(smnPCIE_CONFIG_CNTL, data);
> > -
> > -   if (amdgpu_sriov_vf(adev))
> > -   adev->rmmio_remap.reg_offset = SOC15_REG_OFFSET(NBIO, 0,
> > -   
> > mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL) << 2;
> >  }
> >
> >  #define NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT 0x // off 
> > by default, no gains over L1
> > diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c 
> > b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> > index 982a89f841d5..e011d9856794 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> > @@ -30,10 +30,14 @@
> >
> >  static void nbio_v4_3_remap_hdp_registers(struct amdgpu_device *adev)
> >  {
> > -   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
> > -   adev->rmmio_remap.reg_offset + 
> > KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> > -   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
> > -   adev->rmmio_remap.reg_offset + 
> > KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> > +   if (!amdgpu_sriov_vf(adev)) {
> > +   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
> > +adev->rmmio_remap.reg_offset +
> > +KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> > +   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,

Re: [PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Alex Deucher
On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:
>
> HDP flush is used early in the init sequence as part of memory controller
> block initialization. Hence remapping of HDP registers needed for flush
> needs to happen earlier.
>
> This also fixes the AER error reported as Unsupported Request during
> driver load.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
>
> Reported-by: Tom Seewald 
> Signed-off-by: Lijo Lazar 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
>  drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
>  drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
>  drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
>  4 files changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index ce7d117efdb5..53d753e94a71 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
> *adev)
> DRM_ERROR("amdgpu_vram_scratch_init failed 
> %d\n", r);
> goto init_failed;
> }
> +
> +   /* remap HDP registers to a hole in mmio space,
> +* for the purpose of expose those registers
> +* to process space. This is needed for any early HDP
> +* flush operation during gmc initialization.
> +*/
> +   if (adev->nbio.funcs->remap_hdp_registers && 
> !amdgpu_sriov_vf(adev))
> +   adev->nbio.funcs->remap_hdp_registers(adev);
> +

We probably also need this in ip_resume() to handle the
suspend and resume case.  Thinking about this more, maybe it's easier
to just track whether the remap has happened yet and use the old or
new offset based on that.

Alex
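
A minimal user-space sketch of the "track whether the remap has happened"
idea from the comment above. The struct, its fields and the helper names are
hypothetical stand-ins, not amdgpu API, and the offsets are placeholders. It
only illustrates choosing the old or new offset from a flag that init and
resume would set and suspend would clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; not the real amdgpu structs. */
struct hdp_flush_state {
	bool remapped;            /* true once the remap registers are programmed */
	uint32_t default_offset;  /* offset that is valid before the remap */
	uint32_t remap_offset;    /* offset that is valid after the remap */
};

/* Record that the remap has been programmed (for example at init or resume). */
static void hdp_mark_remapped(struct hdp_flush_state *s)
{
	assert(s);
	s->remapped = true;
}

/* Forget the remap, for example when register state is lost across suspend. */
static void hdp_mark_unmapped(struct hdp_flush_state *s)
{
	assert(s);
	s->remapped = false;
}

/* Pick the offset an HDP flush should use at this point in time. */
static uint32_t hdp_flush_offset(const struct hdp_flush_state *s)
{
	assert(s);
	return s->remapped ? s->remap_offset : s->default_offset;
}
```

With this shape an early flush during gmc init would simply use the default
offset, and later flushes would switch over once the flag is set.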


> r = adev->ip_blocks[i].version->funcs->hw_init((void 
> *)adev);
> if (r) {
> DRM_ERROR("hw_init %d failed %d\n", i, r);
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> index b3fba8dea63c..3ac7fef74277 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
> nv_program_aspm(adev);
> /* setup nbio registers */
> adev->nbio.funcs->init_registers(adev);
> -   /* remap HDP registers to a hole in mmio space,
> -* for the purpose of expose those registers
> -* to process space
> -*/
> -   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> -   adev->nbio.funcs->remap_hdp_registers(adev);
> /* enable the doorbell aperture */
> nv_enable_doorbell_aperture(adev, true);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index fde6154f2009..a0481e37d7cf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
> soc15_program_aspm(adev);
> /* setup nbio registers */
> adev->nbio.funcs->init_registers(adev);
> -   /* remap HDP registers to a hole in mmio space,
> -* for the purpose of expose those registers
> -* to process space
> -*/
> -   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> -   adev->nbio.funcs->remap_hdp_registers(adev);
>
> /* enable the doorbell aperture */
> soc15_enable_doorbell_aperture(adev, true);
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
> b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 55284b24f113..16b447055102 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
> soc21_program_aspm(adev);
> /* setup nbio registers */
> adev->nbio.funcs->init_registers(adev);
> -   /* remap HDP registers to a hole in mmio space,
> -* for the purpose of expose those registers
> -* to process space
> -*/
> -   if (adev->nbio.funcs->remap_hdp_registers)
> -   adev->nbio.funcs->remap_hdp_registers(adev);
> /* enable the doorbell aperture */
> soc21_enable_doorbell_aperture(adev, true);
>
> --
> 2.25.1
>


Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-25 Thread Andrey Grodzovsky



On 2022-08-24 22:29, Luben Tuikov wrote:

Inlined:

On 2022-08-24 12:21, Andrey Grodzovsky wrote:

On 2022-08-23 17:37, Luben Tuikov wrote:

On 2022-08-23 14:57, Andrey Grodzovsky wrote:

On 2022-08-23 14:30, Luben Tuikov wrote:


On 2022-08-23 14:13, Andrey Grodzovsky wrote:

On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stuck for a very long time in their entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


than the job in the long queue.

Fix:
Add FIFO selection policy to entities in RQ, choose next entity
on rq in such order that if job on one entity arrived
earlier than job on another entity the first job will start
executing earlier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
 drivers/gpu/drm/scheduler/sched_entity.c |  2 +
 drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
 include/drm/gpu_scheduler.h  |  8 +++
 3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
 
 	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
 #define CREATE_TRACE_POINTS
 #include "gpu_scheduler_trace.h"
 
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as
the default, say the RR.

I do think it is beneficial to have a module parameter control the scheduling
policy, as shown above.

Christian is not against it, just against adding 'auto' here - like the
default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben

The only solution I see for this now is keeping a global per-rq job
list in parallel to the per-entity SPSC queue. We would use this list when
we switch to FIFO scheduling; we could even start building it only when we
switch to FIFO, filling it gradually as more jobs come in. Do you have
another solution in mind?

The idea is to "sort" on insertion, not on picking the next one to run.
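
A small user-space sketch of "sort on insertion" as described above. This is
not the scheduler code: toy_job, toy_rq and the helpers are hypothetical
names, and a plain singly linked list stands in for whatever structure the
real patch would use. Insertion keeps the list ordered by submit timestamp,
so picking the next job is always a constant-time pop of the head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical job record; the real drm_sched_job looks different. */
struct toy_job {
	uint64_t submit_ts;     /* submission time in any monotonic unit */
	struct toy_job *next;
};

/* Run-queue-wide list kept ordered by submit_ts, oldest first. */
struct toy_rq {
	struct toy_job *head;
};

/* Insert in timestamp order; equal timestamps keep their arrival order. */
static void toy_rq_insert(struct toy_rq *rq, struct toy_job *job)
{
	struct toy_job **pos;

	assert(rq && job);
	pos = &rq->head;
	while (*pos && (*pos)->submit_ts <= job->submit_ts)
		pos = &(*pos)->next;
	job->next = *pos;
	*pos = job;
}

/* Picking the next job is O(1): it is always the head of the list. */
static struct toy_job *toy_rq_pop(struct toy_rq *rq)
{
	struct toy_job *job;

	assert(rq);
	job = rq->head;
	if (job) {
		rq->head = job->next;
		job->next = NULL;
	}
	return job;
}
```

The linear walk on insert is only for brevity. If timestamps arrive in
increasing order, appending at the tail would make insertion constant time
as well; an ordered tree is another option when they do not.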

cont'd below:


Andrey


+
+
 #define to_drm_sched_job(sched_job)\
container_of((sched_job), struct drm_sched_job, queue_node)
 
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

 }
 
 /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
  *
  * @rq: scheduler run queue to check.
  *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
  */
 static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
 {
struct drm_sched_entity *entity;
 
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
 }
 
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on

Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-25 Thread Alex Deucher
On Mon, Aug 22, 2022 at 11:53 AM Mukul Joshi  wrote:
>
> When translate_further is enabled, page table depth needs to
> be updated. This was missing on Arcturus MMHUB init. This was
> causing address translations to fail for SDMA user-mode queues.
>

Do other mmhub implementations need a similar fix?  It looks like some
of them are missing similar changes.

Alex
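
For reference when comparing against other mmhub versions, the adjustment
made by the quoted patch can be written as a small stand-alone helper. This
is only a sketch that mirrors the arithmetic in the diff; toy_pt_cfg and
toy_pt_config are hypothetical names, and the values passed in the usage
note are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Values that would be written to PAGE_TABLE_DEPTH and PAGE_TABLE_BLOCK_SIZE. */
struct toy_pt_cfg {
	unsigned int depth;
	unsigned int block_size;
};

/*
 * Mirror of the logic in the quoted diff: with translate_further the
 * depth is reduced by one and the block size is used unchanged;
 * without it the depth is unchanged and 9 is subtracted from the
 * block size.
 */
static struct toy_pt_cfg toy_pt_config(unsigned int num_level,
				       unsigned int block_size,
				       bool translate_further)
{
	struct toy_pt_cfg cfg = { num_level, block_size };

	if (translate_further) {
		assert(num_level >= 1);   /* avoid unsigned wrap-around */
		cfg.depth -= 1;
	} else {
		assert(block_size >= 9);  /* avoid unsigned wrap-around */
		cfg.block_size -= 9;
	}
	return cfg;
}
```

For example, toy_pt_config(3, 9, false) gives depth 3 and block size 0,
while toy_pt_config(3, 18, true) gives depth 2 and block size 18.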

> Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
> Signed-off-by: Mukul Joshi 
> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> index 6e0145b2b408..445cb06b9d26 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> @@ -295,9 +295,17 @@ static void mmhub_v9_4_disable_identity_aperture(struct 
> amdgpu_device *adev,
>  static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int 
> hubid)
>  {
> struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
> +   unsigned int num_level, block_size;
> uint32_t tmp;
> int i;
>
> +   num_level = adev->vm_manager.num_level;
> +   block_size = adev->vm_manager.block_size;
> +   if (adev->gmc.translate_further)
> +   num_level -= 1;
> +   else
> +   block_size -= 9;
> +
> for (i = 0; i <= 14; i++) {
> tmp = RREG32_SOC15_OFFSET(MMHUB, 0, 
> mmVML2VC0_VM_CONTEXT1_CNTL,
> hubid * MMHUB_INSTANCE_REGISTER_OFFSET + i);
> @@ -305,7 +313,7 @@ static void mmhub_v9_4_setup_vmid_config(struct 
> amdgpu_device *adev, int hubid)
> ENABLE_CONTEXT, 1);
> tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> PAGE_TABLE_DEPTH,
> -   adev->vm_manager.num_level);
> +   num_level);
> tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> @@ -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct 
> amdgpu_device *adev, int hubid)
> EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 
> 1);
> tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> PAGE_TABLE_BLOCK_SIZE,
> -   adev->vm_manager.block_size - 9);
> +   block_size);
> /* Send no-retry XNACK on fault to suppress VM fault storm. */
> tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> --
> 2.35.1
>


[PATCH 07/11] drm/amdgpu: revert "fix limiting AV1 to the first instance on VCN3"

2022-08-25 Thread Christian König
This reverts commit 250195ff744f260c169f5427422b6f39c58cb883.

The job should now be initialized when we reach the parser functions.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 39405f0db824..3cabceee5f57 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1761,21 +1761,23 @@ static const struct amdgpu_ring_funcs 
vcn_v3_0_dec_sw_ring_vm_funcs = {
.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
 };
 
-static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p)
+static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job)
 {
struct drm_gpu_scheduler **scheds;
 
/* The create msg must be in the first IB submitted */
-   if (atomic_read(&p->entity->fence_seq))
+   if (atomic_read(&job->base.entity->fence_seq))
return -EINVAL;
 
scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC]
[AMDGPU_RING_PRIO_DEFAULT].sched;
-   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
return 0;
 }
 
-static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+   uint64_t addr)
 {
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo_va_mapping *map;
@@ -1846,7 +1848,7 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
continue;
 
-   r = vcn_v3_0_limit_sched(p);
+   r = vcn_v3_0_limit_sched(p, job);
if (r)
goto out;
}
@@ -1860,7 +1862,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
+   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
uint32_t msg_lo = 0, msg_hi = 0;
unsigned i;
int r;
@@ -1879,7 +1881,8 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
msg_hi = val;
} else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0) &&
   val == 0) {
-   r = vcn_v3_0_dec_msg(p, ((u64)msg_hi) << 32 | msg_lo);
+   r = vcn_v3_0_dec_msg(p, job,
+((u64)msg_hi) << 32 | msg_lo);
if (r)
return r;
}
-- 
2.25.1



[PATCH 01/11] drm/sched: move calling drm_sched_entity_select_rq

2022-08-25 Thread Christian König
We already discussed that the call to drm_sched_entity_select_rq() needs
to move to drm_sched_job_arm() to be able to set a new scheduler list
between _init() and _arm(). This was just not applied for some reason.

Signed-off-by: Christian König 
Reviewed-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..e0ab14e0fb6b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -592,7 +592,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
   struct drm_sched_entity *entity,
   void *owner)
 {
-   drm_sched_entity_select_rq(entity);
if (!entity->rq)
return -ENOENT;
 
@@ -628,7 +627,7 @@ void drm_sched_job_arm(struct drm_sched_job *job)
struct drm_sched_entity *entity = job->entity;
 
BUG_ON(!entity);
-
+   drm_sched_entity_select_rq(entity);
sched = entity->rq->sched;
 
job->sched = sched;
-- 
2.25.1
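
A toy model of why the patch above moves the run queue selection from
_init() to _arm(). All types and helpers here are hypothetical and heavily
simplified (the "selection" just takes the first list entry, and the -ENOENT
check in _init() is left out); the sketch only shows that a scheduler list
set between init and arm is honoured when selection happens at arm time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler types. */
struct toy_sched {
	int id;
};

struct toy_entity {
	struct toy_sched **sched_list;
	size_t num_sched_list;
	struct toy_sched *picked;   /* plays the role of entity->rq->sched */
};

struct toy_job {
	struct toy_entity *entity;
	struct toy_sched *sched;
};

/* Re-pick a scheduler from the entity's current list (first entry here). */
static void toy_entity_select(struct toy_entity *e)
{
	e->picked = e->num_sched_list ? e->sched_list[0] : NULL;
}

/* Replace the list of schedulers the entity may run on. */
static void toy_entity_modify_sched(struct toy_entity *e,
				    struct toy_sched **list, size_t n)
{
	e->sched_list = list;
	e->num_sched_list = n;
}

/* init only binds the job to its entity; no scheduler is chosen yet. */
static void toy_job_init(struct toy_job *job, struct toy_entity *e)
{
	job->entity = e;
	job->sched = NULL;
}

/* arm selects late, so list changes made after init take effect. */
static void toy_job_arm(struct toy_job *job)
{
	assert(job->entity);
	toy_entity_select(job->entity);
	job->sched = job->entity->picked;
}
```

If the selection happened in toy_job_init() instead, a later call to
toy_entity_modify_sched() would not be reflected in job->sched.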



[PATCH 04/11] drm/amdgpu: cleanup and reorder amdgpu_cs.c

2022-08-25 Thread Christian König
Sort the functions in the order they are called and clean up the coding
style and function names to represent the data they process.

Also check the size of the IB chunk and initialize the resulting entity and
scheduler job much earlier.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1635 
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h |2 +-
 2 files changed, 816 insertions(+), 821 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 75fab438ba4d..b9de631a66a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -39,9 +39,61 @@
 #include "amdgpu_gem.h"
 #include "amdgpu_ras.h"
 
-static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
- struct drm_amdgpu_cs_chunk_fence *data,
- uint32_t *offset)
+static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
+struct amdgpu_device *adev,
+struct drm_file *filp,
+union drm_amdgpu_cs *cs)
+{
+   struct amdgpu_fpriv *fpriv = filp->driver_priv;
+
+   if (cs->in.num_chunks == 0)
+   return -EINVAL;
+
+   memset(p, 0, sizeof(*p));
+   p->adev = adev;
+   p->filp = filp;
+
+   p->ctx = amdgpu_ctx_get(fpriv, cs->in.ctx_id);
+   if (!p->ctx)
+   return -EINVAL;
+
+   if (atomic_read(&p->ctx->guilty)) {
+   amdgpu_ctx_put(p->ctx);
+   return -ECANCELED;
+   }
+   return 0;
+}
+
+static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_cs_chunk_ib *chunk_ib,
+  unsigned int *num_ibs)
+{
+   struct drm_sched_entity *entity;
+   int r;
+
+   r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
+ chunk_ib->ip_instance,
+ chunk_ib->ring, &entity);
+   if (r)
+   return r;
+
+   /* Abort if there is no run queue associated with this entity.
+* Possibly because of disabled HW IP*/
+   if (entity->rq == NULL)
+   return -EINVAL;
+
+   /* Currently we don't support submitting to multiple entities */
+   if (p->entity && p->entity != entity)
+   return -EINVAL;
+
+   p->entity = entity;
+   ++(*num_ibs);
+   return 0;
+}
+
+static int amdgpu_cs_p1_user_fence(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_cs_chunk_fence *data,
+  uint32_t *offset)
 {
struct drm_gem_object *gobj;
struct amdgpu_bo *bo;
@@ -80,11 +132,11 @@ static int amdgpu_cs_user_fence_chunk(struct 
amdgpu_cs_parser *p,
return r;
 }
 
-static int amdgpu_cs_bo_handles_chunk(struct amdgpu_cs_parser *p,
- struct drm_amdgpu_bo_list_in *data)
+static int amdgpu_cs_p1_bo_handles(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_bo_list_in *data)
 {
+   struct drm_amdgpu_bo_list_entry *info;
int r;
-   struct drm_amdgpu_bo_list_entry *info = NULL;
 
r = amdgpu_bo_create_list_entry_array(data, &info);
if (r)
@@ -104,7 +156,9 @@ static int amdgpu_cs_bo_handles_chunk(struct 
amdgpu_cs_parser *p,
return r;
 }
 
-static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union 
drm_amdgpu_cs *cs)
+/* Copy the data from userspace and go over it the first time */
+static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
+  union drm_amdgpu_cs *cs)
 {
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
@@ -112,28 +166,17 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
uint64_t *chunk_array;
unsigned size, num_ibs = 0;
uint32_t uf_offset = 0;
-   int i;
int ret;
+   int i;
 
if (cs->in.num_chunks == 0)
return -EINVAL;
 
-   chunk_array = kvmalloc_array(cs->in.num_chunks, sizeof(uint64_t), 
GFP_KERNEL);
+   chunk_array = kvmalloc_array(cs->in.num_chunks, sizeof(uint64_t),
+GFP_KERNEL);
if (!chunk_array)
return -ENOMEM;
 
-   p->ctx = amdgpu_ctx_get(fpriv, cs->in.ctx_id);
-   if (!p->ctx) {
-   ret = -EINVAL;
-   goto free_chunk;
-   }
-
-   /* skip guilty context job */
-   if (atomic_read(&p->ctx->guilty) == 1) {
-   ret = -ECANCELED;
-   goto free_chunk;
-   }
-
/* get chunks */
chunk_array_user = u64_to_user_ptr(cs->in.chunks);
if (copy_from_user(chunk_array, chunk_array_user,
@@ -168,7 +211,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgp

[PATCH 11/11] drm/amdgpu: fix VCN3 and VCN4 instance limiting

2022-08-25 Thread Christian König
Check if the entity is already limited, not if it's assigned to the
first instance.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 5 ++---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3cabceee5f57..5e64c3426728 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1862,13 +1862,12 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
uint32_t msg_lo = 0, msg_hi = 0;
unsigned i;
int r;
 
-   /* The first instance can decode anything */
-   if (!ring->me)
+   /* Abort if it's already limited */
+   if (job->base.entity->num_sched_list <= 1)
return 0;
 
for (i = 0; i < ib->length_dw; i += 2) {
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 9338172eec8b..a8264fe2201d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1430,13 +1430,12 @@ static int vcn_v4_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.entity->rq->sched);
struct amdgpu_vcn_decode_buffer *decode_buffer;
uint64_t addr;
uint32_t val;
 
-   /* The first instance can decode anything */
-   if (!ring->me)
+   /* Abort if it's already limited */
+   if (job->base.entity->num_sched_list <= 1)
return 0;
 
/* unified queue ib header has 8 double words. */
-- 
2.25.1



[PATCH 08/11] drm/amdgpu: cleanup instance limit on VCN4

2022-08-25 Thread Christian König
Similar to what we did for VCN3, use the job instead of the parser
entity. Clean up the coding style quite a bit as well.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 46 +++
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index fb2d74f30448..9338172eec8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1327,21 +1327,23 @@ static void vcn_v4_0_unified_ring_set_wptr(struct 
amdgpu_ring *ring)
}
 }
 
-static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p)
+static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job)
 {
struct drm_gpu_scheduler **scheds;
 
/* The create msg must be in the first IB submitted */
-   if (atomic_read(&p->entity->fence_seq))
+   if (atomic_read(&job->base.entity->fence_seq))
return -EINVAL;
 
-   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_ENC]
-   [AMDGPU_RING_PRIO_0].sched;
-   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC]
+   [AMDGPU_RING_PRIO_DEFAULT].sched;
+   drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
return 0;
 }
 
-static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+   uint64_t addr)
 {
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo_va_mapping *map;
@@ -1412,7 +1414,7 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
continue;
 
-   r = vcn_v4_0_limit_sched(p);
+   r = vcn_v4_0_limit_sched(p, job);
if (r)
goto out;
}
@@ -1425,32 +1427,34 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
 #define RADEON_VCN_ENGINE_TYPE_DECODE 
(0x0003)
 
 static int vcn_v4_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
-   struct amdgpu_job *job,
-   struct amdgpu_ib *ib)
+  struct amdgpu_job *job,
+  struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
-   struct amdgpu_vcn_decode_buffer *decode_buffer = NULL;
+   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.entity->rq->sched);
+   struct amdgpu_vcn_decode_buffer *decode_buffer;
+   uint64_t addr;
uint32_t val;
-   int r = 0;
 
/* The first instance can decode anything */
if (!ring->me)
-   return r;
+   return 0;
 
/* unified queue ib header has 8 double words. */
if (ib->length_dw < 8)
-   return r;
+   return 0;
 
val = amdgpu_ib_get_value(ib, 6); //RADEON_VCN_ENGINE_TYPE
+   if (val != RADEON_VCN_ENGINE_TYPE_DECODE)
+   return 0;
 
-   if (val == RADEON_VCN_ENGINE_TYPE_DECODE) {
-   decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[10];
+   decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[10];
 
-   if (decode_buffer->valid_buf_flag  & 0x1)
-   r = vcn_v4_0_dec_msg(p, 
((u64)decode_buffer->msg_buffer_address_hi) << 32 |
-   
decode_buffer->msg_buffer_address_lo);
-   }
-   return r;
+   if (!(decode_buffer->valid_buf_flag  & 0x1))
+   return 0;
+
+   addr = ((u64)decode_buffer->msg_buffer_address_hi) << 32 |
+   decode_buffer->msg_buffer_address_lo;
+   return vcn_v4_0_dec_msg(p, job, addr);
 }
 
 static const struct amdgpu_ring_funcs vcn_v4_0_unified_ring_vm_funcs = {
-- 
2.25.1



[PATCH 10/11] drm/amdgpu: add gang submit frontend v4

2022-08-25 Thread Christian König
Allows submitting jobs as a gang which needs to run on multiple engines at the
same time.

All members of the gang get the same implicit, explicit and VM dependencies. So
no gang member will start running until everything else is ready.

The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.

Each job is remembered individually as a user of a buffer object, so there is no
joining of work at the end.

v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.

Signed-off-by: Christian König 
---
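Note: the entity bookkeeping described above (one job per entity, IBs for the
same entity collected into the same job, last entity added becomes the gang
leader) can be sketched in plain userspace C roughly as follows. This is only
a model of the logic in the amdgpu_cs_p1_ib() hunk below: struct parser_model,
gang_add_ib(), gang_leader_index() and the value of GANG_SIZE_MAX are
assumptions made for this sketch, not driver definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AMDGPU_CS_GANG_SIZE; the real value is not shown here. */
#define GANG_SIZE_MAX 4

struct parser_model {
	const void *entities[GANG_SIZE_MAX];	/* one slot per gang member */
	unsigned int gang_size;			/* number of slots in use */
};

/*
 * Account one IB for @entity. Reuses the slot of an already known entity,
 * otherwise appends a new gang member. Returns 0 on success, -1 if the gang
 * is already full.
 */
static int gang_add_ib(struct parser_model *p, const void *entity,
		       unsigned int *num_ibs)
{
	unsigned int i;

	assert(p && entity && num_ibs);

	/* Check if this IB belongs to a job we already track. */
	for (i = 0; i < p->gang_size; ++i) {
		if (p->entities[i] == entity)
			goto found;
	}

	/* Not found: grow the gang if there is still room. */
	if (i == GANG_SIZE_MAX)
		return -1;

	p->entities[i] = entity;
	p->gang_size = i + 1;

found:
	++num_ibs[i];
	return 0;
}

/* Index of the gang leader: the last member that was added. */
static unsigned int gang_leader_index(const struct parser_model *p)
{
	assert(p->gang_size > 0);
	return p->gang_size - 1;
}
```

With two distinct entities A and B submitted in the order A, B, A, the model
ends up with a gang of two, IB counts of 2 and 1, and B as the leader.
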
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 259 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h|  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  12 +-
 3 files changed, 184 insertions(+), 97 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 88f491dc7ca2..a0eb80eaa136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -69,6 +69,7 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
   unsigned int *num_ibs)
 {
struct drm_sched_entity *entity;
+   unsigned int i;
int r;
 
r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
@@ -77,17 +78,28 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
if (r)
return r;
 
-   /* Abort if there is no run queue associated with this entity.
-* Possibly because of disabled HW IP*/
+   /*
+* Abort if there is no run queue associated with this entity.
+* Possibly because of disabled HW IP.
+*/
if (entity->rq == NULL)
return -EINVAL;
 
-   /* Currently we don't support submitting to multiple entities */
-   if (p->entity && p->entity != entity)
+   /* Check if we can add this IB to some existing job */
+   for (i = 0; i < p->gang_size; ++i) {
+   if (p->entities[i] == entity)
+   goto found;
+   }
+
+   /* If not increase the gang size if possible */
+   if (i == AMDGPU_CS_GANG_SIZE)
return -EINVAL;
 
-   p->entity = entity;
-   ++(*num_ibs);
+   p->entities[i] = entity;
+   p->gang_size = i + 1;
+
+found:
+   ++(num_ibs[i]);
return 0;
 }
 
@@ -161,11 +173,12 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
   union drm_amdgpu_cs *cs)
 {
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   unsigned int num_ibs[AMDGPU_CS_GANG_SIZE] = { };
struct amdgpu_vm *vm = &fpriv->vm;
uint64_t *chunk_array_user;
uint64_t *chunk_array;
-   unsigned size, num_ibs = 0;
uint32_t uf_offset = 0;
+   unsigned int size;
int ret;
int i;
 
@@ -231,7 +244,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
if (size < sizeof(struct drm_amdgpu_cs_chunk_ib))
goto free_partial_kdata;
 
-   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, &num_ibs);
+   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, num_ibs);
if (ret)
goto free_partial_kdata;
break;
@@ -268,21 +281,28 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
}
}
 
-   ret = amdgpu_job_alloc(p->adev, num_ibs, &p->job, vm);
-   if (ret)
-   goto free_all_kdata;
+   if (!p->gang_size)
+   return -EINVAL;
 
-   ret = drm_sched_job_init(&p->job->base, p->entity, &fpriv->vm);
-   if (ret)
-   goto free_all_kdata;
+   for (i = 0; i < p->gang_size; ++i) {
+   ret = amdgpu_job_alloc(p->adev, num_ibs[i], &p->jobs[i], vm);
+   if (ret)
+   goto free_all_kdata;
+
+   ret = drm_sched_job_init(&p->jobs[i]->base, p->entities[i],
+&fpriv->vm);
+   if (ret)
+   goto free_all_kdata;
+   }
+   p->gang_leader = p->jobs[p->gang_size - 1];
 
-   if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
+   if (p->ctx->vram_lost_counter != p->gang_leader->vram_lost_counter) {
ret = -ECANCELED;
goto free_all_kdata;
}
 
if (p->uf_entry.tv.bo)
-   p->job->uf_addr = uf_offset;
+   p->gang_leader->uf_addr = uf_offset;
kvfree(chunk_array);
 
/* Use this opportunity to fill in task info for the vm */
@@ -304,22 +324,18 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
return ret;
 }
 
-static int amdgpu_cs_p2_ib(struct amdgpu_cs_parser *p,
-

[PATCH 09/11] drm/amdgpu: add gang submit backend v2

2022-08-25 Thread Christian König
Allows submitting jobs as a gang which needs to run on multiple
engines at the same time.

The basic idea is that we have a global gang submit fence representing when the
gang leader is finally pushed to run on the hardware last.

Jobs submitted as a gang are never re-submitted in case of a GPU reset since this
won't work and will just deadlock the hardware immediately again.

v2: fix logic inversion, improve documentation, fix rcu

Signed-off-by: Christian König 
---
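Note: the "global gang submit fence" idea can be modelled in userspace with
C11 atomics, as sketched below. This is a simplified illustration of the
compare-and-swap loop in amdgpu_device_switch_gang(), not the kernel code:
struct fence_model, struct device_model and switch_gang() are hypothetical,
and reference counting and RCU protection are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified fence: no refcounting, no RCU. */
struct fence_model {
	atomic_bool signaled;
};

/* Models adev->gang_submit: the fence of the gang that currently owns the HW. */
struct device_model {
	_Atomic(struct fence_model *) gang_submit;
};

/*
 * Try to make @gang the current gang. Returns NULL on success, or the
 * still-running current gang that the caller has to wait for first.
 */
static struct fence_model *switch_gang(struct device_model *dev,
				       struct fence_model *gang)
{
	struct fence_model *old;

	do {
		old = atomic_load(&dev->gang_submit);

		if (old == gang)
			break;		/* already the current gang */

		if (!atomic_load(&old->signaled))
			return old;	/* previous gang is still running */

		/* Retry if another thread replaced the pointer in the meantime. */
	} while (!atomic_compare_exchange_strong(&dev->gang_submit, &old, gang));

	return NULL;
}
```

In this model the device starts out pointing at an already signaled fence,
which plays the role of the stub fence installed in amdgpu_device_init(). A
second gang gets the first gang's fence back until that fence has signaled.
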
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 35 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 28 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h|  3 ++
 4 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 79bb6fd83094..ae9371b172e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -885,6 +885,7 @@ struct amdgpu_device {
u64 fence_context;
unsignednum_rings;
struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
+   struct dma_fence __rcu  *gang_submit;
boolib_pool_ready;
struct amdgpu_sa_managerib_pools[AMDGPU_IB_POOL_MAX];
struct amdgpu_sched 
gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
@@ -1294,6 +1295,8 @@ u32 amdgpu_device_pcie_port_rreg(struct amdgpu_device 
*adev,
u32 reg);
 void amdgpu_device_pcie_port_wreg(struct amdgpu_device *adev,
u32 reg, u32 v);
+struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
+   struct dma_fence *gang);
 
 /* atpx handler */
 #if defined(CONFIG_VGA_SWITCHEROO)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d7eb23b8d692..172095122cc1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3501,6 +3501,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
adev->gmc.gart_size = 512 * 1024 * 1024;
adev->accel_working = false;
adev->num_rings = 0;
+   RCU_INIT_POINTER(adev->gang_submit, dma_fence_get_stub());
adev->mman.buffer_funcs = NULL;
adev->mman.buffer_funcs_ring = NULL;
adev->vm_manager.vm_pte_funcs = NULL;
@@ -3983,6 +3984,7 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
release_firmware(adev->firmware.gpu_info_fw);
adev->firmware.gpu_info_fw = NULL;
adev->accel_working = false;
+   dma_fence_put(rcu_dereference_protected(adev->gang_submit, true));
 
amdgpu_reset_fini(adev);
 
@@ -5916,3 +5918,36 @@ void amdgpu_device_pcie_port_wreg(struct amdgpu_device 
*adev,
(void)RREG32(data);
spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
 }
+
+/**
+ * amdgpu_device_switch_gang - switch to a new gang
+ * @adev: amdgpu_device pointer
+ * @gang: the gang to switch to
+ *
+ * Try to switch to a new gang.
+ * Returns: NULL if we switched to the new gang or a reference to the current
+ * gang leader.
+ */
+struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
+   struct dma_fence *gang)
+{
+   struct dma_fence *old = NULL;
+
+   do {
+   dma_fence_put(old);
+   rcu_read_lock();
+   old = dma_fence_get_rcu_safe(&adev->gang_submit);
+   rcu_read_unlock();
+
+   if (old == gang)
+   break;
+
+   if (!dma_fence_is_signaled(old))
+   return old;
+
+   } while (cmpxchg((struct dma_fence __force **)&adev->gang_submit,
+old, gang) != old);
+
+   dma_fence_put(old);
+   return NULL;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 37dc5ee4153d..6f6708caf0e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -173,11 +173,29 @@ static void amdgpu_job_free_cb(struct drm_sched_job 
*s_job)
dma_fence_put(&job->hw_fence);
 }
 
+void amdgpu_job_set_gang_leader(struct amdgpu_job *job,
+   struct amdgpu_job *leader)
+{
+   struct dma_fence *fence = &leader->base.s_fence->scheduled;
+
+   WARN_ON(job->gang_submit);
+
+   /*
+* Don't add a reference when we are the gang leader to avoid circle
+* dependency.
+*/
+   if (job != leader)
+   dma_fence_get(fence);
+   job->gang_submit = fence;
+}
+
 void amdgpu_job_free(struct amdgpu_job *job)
 {
amdgpu_job_free_resources(job);
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->sched_sync);
+   if (job->gang_submit != &job->

[PATCH 05/11] drm/amdgpu: remove SRIOV and MCBP dependencies from the CS

2022-08-25 Thread Christian König
We should not have any different CS constraints based
on the execution environment.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b9de631a66a3..dfb7b4f46bc3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -323,8 +323,7 @@ static int amdgpu_cs_p2_ib(struct amdgpu_cs_parser *p,
return -EINVAL;
 
if (chunk_ib->ip_type == AMDGPU_HW_IP_GFX &&
-   chunk_ib->flags & AMDGPU_IB_FLAG_PREEMPT &&
-   (amdgpu_mcbp || amdgpu_sriov_vf(p->adev))) {
+   chunk_ib->flags & AMDGPU_IB_FLAG_PREEMPT) {
if (chunk_ib->flags & AMDGPU_IB_FLAG_CE)
(*ce_preempt)++;
else
@@ -1084,7 +1083,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser 
*p)
if (r)
return r;
 
-   if (amdgpu_mcbp || amdgpu_sriov_vf(adev)) {
+   if (fpriv->csa_va) {
bo_va = fpriv->csa_va;
r = amdgpu_vm_bo_update(adev, bo_va, false);
if (r)
-- 
2.25.1



[PATCH 06/11] drm/amdgpu: move setting the job resources

2022-08-25 Thread Christian König
Move setting the job resources into amdgpu_job.c

Signed-off-by: Christian König 
Reviewed-by: Andrey Grodzovsky 
Reviewed-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 21 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 17 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  2 ++
 3 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index dfb7b4f46bc3..88f491dc7ca2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -828,9 +828,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
struct amdgpu_vm *vm = &fpriv->vm;
struct amdgpu_bo_list_entry *e;
struct list_head duplicates;
-   struct amdgpu_bo *gds;
-   struct amdgpu_bo *gws;
-   struct amdgpu_bo *oa;
int r;
 
INIT_LIST_HEAD(&p->validated);
@@ -947,22 +944,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
 p->bytes_moved_vis);
 
-   gds = p->bo_list->gds_obj;
-   gws = p->bo_list->gws_obj;
-   oa = p->bo_list->oa_obj;
-
-   if (gds) {
-   p->job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT;
-   p->job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT;
-   }
-   if (gws) {
-   p->job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT;
-   p->job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT;
-   }
-   if (oa) {
-   p->job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT;
-   p->job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT;
-   }
+   amdgpu_job_set_resources(p->job, p->bo_list->gds_obj,
+p->bo_list->gws_obj, p->bo_list->oa_obj);
 
if (p->uf_entry.tv.bo) {
struct amdgpu_bo *uf = ttm_to_amdgpu_bo(p->uf_entry.tv.bo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 8f51adf3b329..37dc5ee4153d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -132,6 +132,23 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, 
unsigned size,
return r;
 }
 
+void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
+ struct amdgpu_bo *gws, struct amdgpu_bo *oa)
+{
+   if (gds) {
+   job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT;
+   job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT;
+   }
+   if (gws) {
+   job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT;
+   job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT;
+   }
+   if (oa) {
+   job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT;
+   job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT;
+   }
+}
+
 void amdgpu_job_free_resources(struct amdgpu_job *job)
 {
struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
index babc0af751c2..2a1961bf1194 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -76,6 +76,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned 
num_ibs,
 struct amdgpu_job **job, struct amdgpu_vm *vm);
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size,
enum amdgpu_ib_pool_type pool, struct amdgpu_job **job);
+void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
+ struct amdgpu_bo *gws, struct amdgpu_bo *oa);
 void amdgpu_job_free_resources(struct amdgpu_job *job);
 void amdgpu_job_free(struct amdgpu_job *job);
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
-- 
2.25.1



[PATCH 02/11] drm/amdgpu: revert "partial revert "remove ctx->lock" v2"

2022-08-25 Thread Christian König
This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.

We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 21 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c |  2 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 -
 3 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b7bae833c804..75fab438ba4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -128,8 +128,6 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
goto free_chunk;
}
 
-   mutex_lock(&p->ctx->lock);
-
/* skip guilty context job */
if (atomic_read(&p->ctx->guilty) == 1) {
ret = -ECANCELED;
@@ -708,7 +706,6 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser 
*parser, int error,
dma_fence_put(parser->fence);
 
if (parser->ctx) {
-   mutex_unlock(&parser->ctx->lock);
amdgpu_ctx_put(parser->ctx);
}
if (parser->bo_list)
@@ -1157,9 +1154,6 @@ static int amdgpu_cs_dependencies(struct amdgpu_device 
*adev,
 {
int i, r;
 
-   /* TODO: Investigate why we still need the context lock */
-   mutex_unlock(&p->ctx->lock);
-
for (i = 0; i < p->nchunks; ++i) {
struct amdgpu_cs_chunk *chunk;
 
@@ -1170,34 +1164,32 @@ static int amdgpu_cs_dependencies(struct amdgpu_device 
*adev,
case AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES:
r = amdgpu_cs_process_fence_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_IN:
r = amdgpu_cs_process_syncobj_in_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_OUT:
r = amdgpu_cs_process_syncobj_out_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT:
r = amdgpu_cs_process_syncobj_timeline_in_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
r = amdgpu_cs_process_syncobj_timeline_out_dep(p, 
chunk);
if (r)
-   goto out;
+   return r;
break;
}
}
 
-out:
-   mutex_lock(&p->ctx->lock);
-   return r;
+   return 0;
 }
 
 static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p)
@@ -1359,7 +1351,6 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
goto out;
 
r = amdgpu_cs_submit(&parser, cs);
-
 out:
amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 8ee4e8491f39..168337d8d4cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -315,7 +315,6 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, 
int32_t priority,
kref_init(&ctx->refcount);
ctx->mgr = mgr;
spin_lock_init(&ctx->ring_lock);
-   mutex_init(&ctx->lock);
 
ctx->reset_counter = atomic_read(&mgr->adev->gpu_reset_counter);
ctx->reset_counter_query = ctx->reset_counter;
@@ -407,7 +406,6 @@ static void amdgpu_ctx_fini(struct kref *ref)
drm_dev_exit(idx);
}
 
-   mutex_destroy(&ctx->lock);
kfree(ctx);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index cc7c8afff414..0fa0e56daf67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -53,7 +53,6 @@ struct amdgpu_ctx {
boolpreamble_presented;
int32_t init_priority;
int32_t override_priority;
-   struct mutexlock;
atomic_tguilty;
unsigned long   ras_counter_ce;
unsigned long   ras_counter_ue;
-- 
2.25.1



[PATCH 03/11] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP

2022-08-25 Thread Christian König
Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.

Signed-off-by: Christian König 
---
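Note: as far as I understand the dma_resv usage levels, they are ordered
KERNEL < WRITE < READ < BOOKKEEP, and a query for a given usage also returns
all fences with a lower usage. A fence added as BOOKKEEP is therefore only
seen by callers that explicitly ask for BOOKKEEP. The sketch below models just
that ordering rule; enum usage_model and fence_visible() are names local to
this illustration and not part of the dma-resv API.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the assumed ordering of enum dma_resv_usage; names are local to this sketch. */
enum usage_model { USAGE_KERNEL, USAGE_WRITE, USAGE_READ, USAGE_BOOKKEEP };

/* A query for @queried sees every fence whose usage is at or below it. */
static bool fence_visible(enum usage_model fence_usage, enum usage_model queried)
{
	assert(queried <= USAGE_BOOKKEEP);
	return fence_usage <= queried;
}
```

So a waiter that syncs to USAGE_READ would not pick up a page table update
fence added as USAGE_BOOKKEEP, while it would with the previous READ usage.
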
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cbd593f7d553..85eb68ec692e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -297,7 +297,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
 */
replacement = dma_fence_get_stub();
dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
-   replacement, DMA_RESV_USAGE_READ);
+   replacement, DMA_RESV_USAGE_BOOKKEEP);
dma_fence_put(replacement);
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 1fd3cbca20a2..03ec099d64e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -112,7 +112,8 @@ static int amdgpu_vm_sdma_commit(struct 
amdgpu_vm_update_params *p,
swap(p->vm->last_unlocked, tmp);
dma_fence_put(tmp);
} else {
-   amdgpu_bo_fence(p->vm->root.bo, f, true);
+   dma_resv_add_fence(p->vm->root.bo->tbo.base.resv, f,
+  DMA_RESV_USAGE_BOOKKEEP);
}
 
if (fence && !p->immediate)
-- 
2.25.1



Latest gang submit patches

2022-08-25 Thread Christian König
Hi guys,

Trigger found another issue with this series, which is now fixed.

Patch #11 is just a shot in the dark so far; it could be that the VCN3/4
problems are now solved, but it could just as well be that this needs more
investigation.

Bas still gets a backtrace with this and I can't figure out why that is
happening, but I'm going to investigate further.

Please review and comment,
Christian.




Re: [PATCH 2/2] drm/amdgpu: Init VF's HDP flush reg offset early

2022-08-25 Thread Alex Deucher
Series is:
Reviewed-by: Alex Deucher 

On Thu, Aug 25, 2022 at 4:58 AM Lijo Lazar  wrote:
>
> Make sure the register offsets used for HDP flush in VF are
> initialized early so that it works fine during any early HDP flush
> sequence. For that, move the offset initialization to *_remap_hdp.
>
> Signed-off-by: Lijo Lazar 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 23 +
>  drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c | 12 +++
>  drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 23 +
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 21 ---
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_2.c | 24 ++
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 23 +
>  7 files changed, 84 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 53d753e94a71..c0bb2e9616c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2382,7 +2382,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
> *adev)
>  * to process space. This is needed for any early HDP
>  * flush operation during gmc initialization.
>  */
> -   if (adev->nbio.funcs->remap_hdp_registers && 
> !amdgpu_sriov_vf(adev))
> +   if (adev->nbio.funcs->remap_hdp_registers)
> adev->nbio.funcs->remap_hdp_registers(adev);
>
> r = adev->ip_blocks[i].version->funcs->hw_init((void 
> *)adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c 
> b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> index b465baa26762..20fa2c5ad510 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> @@ -65,10 +65,21 @@
>
>  static void nbio_v2_3_remap_hdp_registers(struct amdgpu_device *adev)
>  {
> -   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
> -   adev->rmmio_remap.reg_offset + 
> KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> -   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
> -   adev->rmmio_remap.reg_offset + 
> KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> +   if (amdgpu_sriov_vf(adev))
> +   adev->rmmio_remap.reg_offset =
> +   SOC15_REG_OFFSET(
> +   NBIO, 0,
> +   
> mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL)
> +   << 2;
> +
> +   if (!amdgpu_sriov_vf(adev)) {
> +   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
> +adev->rmmio_remap.reg_offset +
> +KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> +   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
> +adev->rmmio_remap.reg_offset +
> +KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> +   }
>  }
>
>  static u32 nbio_v2_3_get_rev_id(struct amdgpu_device *adev)
> @@ -338,10 +349,6 @@ static void nbio_v2_3_init_registers(struct 
> amdgpu_device *adev)
>
> if (def != data)
> WREG32_PCIE(smnPCIE_CONFIG_CNTL, data);
> -
> -   if (amdgpu_sriov_vf(adev))
> -   adev->rmmio_remap.reg_offset = SOC15_REG_OFFSET(NBIO, 0,
> -   mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL) 
> << 2;
>  }
>
>  #define NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT 0x // off by 
> default, no gains over L1
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c 
> b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> index 982a89f841d5..e011d9856794 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
> @@ -30,10 +30,14 @@
>
>  static void nbio_v4_3_remap_hdp_registers(struct amdgpu_device *adev)
>  {
> -   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
> -   adev->rmmio_remap.reg_offset + 
> KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> -   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
> -   adev->rmmio_remap.reg_offset + 
> KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> +   if (!amdgpu_sriov_vf(adev)) {
> +   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
> +adev->rmmio_remap.reg_offset +
> +KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
> +   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
> +adev->rmmio_remap.reg_offset +
> +KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
> +   }
>  }
>
>  static u32 nbio_v4_3_get_rev_id(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c 
> b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
> index f7f6ddebd

Re: [PATCH 0/4] Add support for atomic async page-flips

2022-08-25 Thread Melissa Wen
On 08/24, Simon Ser wrote:
> This series adds support for DRM_MODE_PAGE_FLIP_ASYNC for atomic
> commits, aka. "immediate flip" (which might result in tearing).
> The feature was only available via the legacy uAPI, however for
> gaming use-cases it may be desirable to enable it via the atomic
> uAPI too.

Hi Simon,

I'm cc'ing André as he has been actively working on it lately and must
be quite familiar with the async flip machinery.

> 
> User-space patch:
> https://github.com/Plagman/gamescope/pull/595
> 
> IGT patch:
> https://patchwork.freedesktop.org/series/107681/

Also, André recently generalized the kms_async_flip to test drivers
other than i915, so I think he can provide some thoughts about the IGT
test too.

Thanks,

Melissa

> 
> Tested on an AMD Picasso iGPU.
> 
> Simon Ser (4):
>   drm: introduce drm_mode_config.atomic_async_page_flip_not_supported
>   drm: allow DRM_MODE_PAGE_FLIP_ASYNC for atomic commits
>   drm: introduce DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP
>   amd/display: indicate support for atomic async page-flips on DCN
> 
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c   |  1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c   |  1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c|  1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c|  1 +
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c |  1 +
>  drivers/gpu/drm/drm_atomic_uapi.c| 28 +---
>  drivers/gpu/drm/drm_ioctl.c  |  5 
>  drivers/gpu/drm/i915/display/intel_display.c |  1 +
>  drivers/gpu/drm/nouveau/nouveau_display.c|  1 +
>  drivers/gpu/drm/radeon/radeon_display.c  |  1 +
>  drivers/gpu/drm/vc4/vc4_kms.c|  1 +
>  include/drm/drm_mode_config.h| 11 
>  include/uapi/drm/drm.h   | 10 ++-
>  13 files changed, 59 insertions(+), 4 deletions(-)
> 
> -- 
> 2.37.2
> 
> 




[PATCH] drm: amd: amdgpu: ACPI: Add comment about ACPI_FADT_LOW_POWER_S0

2022-08-25 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

According to the ACPI specification [1], the ACPI_FADT_LOW_POWER_S0
flag merely means that it is better to use low-power S0 idle on the
given platform than S3 (provided that the latter is supported) and it
doesn't preclude using either of them (which of them will be used
depends on the choices made by user space).

However, on some systems that flag is used to indicate whether or not
to enable special firmware mechanics allowing the system to save more
energy when suspended to idle.  If that flag is unset, doing so is
generally risky.

Accordingly, add a comment to explain the ACPI_FADT_LOW_POWER_S0 check
in amdgpu_acpi_is_s0ix_active(), the purpose of which is otherwise
somewhat unclear.

Link: https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt # [1]
Signed-off-by: Rafael J. Wysocki 
---
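Note: the decision that the new comment documents boils down to two
conditions, sketched below as a standalone model. This is not the driver
function: the real amdgpu_acpi_is_s0ix_active() has further checks that are
not modelled here, and struct fadt_model, enum suspend_target and
s0ix_active() are hypothetical names. The bit position (21) is my assumption
of what ACPI_FADT_LOW_POWER_S0 expands to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 21 of the FADT flags (low-power S0 idle capable); assumed to match ACPICA. */
#define FADT_LOW_POWER_S0 (1u << 21)

enum suspend_target { SUSPEND_TO_IDLE, SUSPEND_TO_MEM };

/* Hypothetical stand-in for the firmware table; only the flags word is modelled. */
struct fadt_model {
	uint32_t flags;
};

static bool s0ix_active(enum suspend_target target, const struct fadt_model *fadt)
{
	assert(fadt);

	/* Only suspend-to-idle can end up in S0ix at all. */
	if (target != SUSPEND_TO_IDLE)
		return false;

	/*
	 * Flag not set: firmware is probably not prepared for S0ix, so skip
	 * the special preparations even though we are suspending to idle.
	 */
	if (!(fadt->flags & FADT_LOW_POWER_S0))
		return false;

	return true;
}
```
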
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
===
--- linux-pm.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -1066,6 +1066,12 @@ bool amdgpu_acpi_is_s0ix_active(struct a
(pm_suspend_target_state != PM_SUSPEND_TO_IDLE))
return false;
 
+   /*
+* If ACPI_FADT_LOW_POWER_S0 is not set in the FADT, it is generally
+* risky to do any special firmware-related preparations for entering
+* S0ix even though the system is suspending to idle, so return false
+* in that case.
+*/
if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) {
dev_warn_once(adev->dev,
  "Power consumption will be higher as BIOS has not 
been configured for suspend-to-idle.\n"





Re: [Linaro-mm-sig] [PATCH v3 6/9] dma-buf: Move dma-buf attachment to dynamic locking specification

2022-08-25 Thread Dmitry Osipenko
On 8/24/22 20:45, Christian König wrote:
> On 24.08.22 at 17:49, Dmitry Osipenko wrote:
>> On 8/24/22 18:24, Christian König wrote:
>>> On 24.08.22 at 12:22, Dmitry Osipenko wrote:
>>>> Move dma-buf attachment API functions to the dynamic locking
>>>> specification.
>>>> The strict locking convention prevents deadlock situations for dma-buf
>>>> importers and exporters.
>>>>
>>>> Previously, the "unlocked" versions of the attachment API functions
>>>> weren't taking the reservation lock and this patch makes them to take
>>>> the lock.
>>>>
>>>> Intel and AMD GPU drivers already were mapping the attached dma-bufs
>>>> under
>>>> the held lock during attachment, hence these drivers are updated to use
>>>> the locked functions.
>>>>
>>>> Signed-off-by: Dmitry Osipenko
>>>> ---
>>>>    drivers/dma-buf/dma-buf.c  | 115
>>>> ++---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    |   4 +-
>>>>    drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c |   8 +-
>>>>    drivers/gpu/drm/i915/gem/i915_gem_object.c |  12 +++
>>>>    include/linux/dma-buf.h    |  20 ++--
>>>>    5 files changed, 110 insertions(+), 49 deletions(-)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>> index 4556a12bd741..f2a5a122da4a 100644
>>>> --- a/drivers/dma-buf/dma-buf.c
>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>> @@ -559,7 +559,7 @@ static struct file *dma_buf_getfile(struct dma_buf
>>>> *dmabuf, int flags)
>>>>     * 2. Userspace passes this file-descriptors to all drivers it wants
>>>> this buffer
>>>>     *    to share with: First the file descriptor is converted to a
>>>> &dma_buf using
>>>>     *    dma_buf_get(). Then the buffer is attached to the device using
>>>> - *    dma_buf_attach().
>>>> + *    dma_buf_attach_unlocked().
>>> Now I get why this is confusing me so much.
>>>
>>> The _unlocked postfix implies that there is another function which
>>> should be called with the locks already held, but this is not the case
>>> for attach/detach (because they always need to grab the lock
>>> themselves).
>> That's correct. The attach/detach ops of exporter can take the lock
>> (like i915 exporter does it), hence importer must not grab the lock
>> around dma_buf_attach() invocation.
>>
>>> So I suggest to drop the _unlocked postfix for the attach/detach
>>> functions. Another step would then be to unify attach/detach with
>>> dynamic_attach/dynamic_detach when both have the same locking convention
>>> anyway.
>> It's not a problem to change the name, but it's unclear to me why we
>> should do it. The _unlocked postfix tells importer that reservation must
>> be unlocked and it must be unlocked in case of dma_buf_attach().
>>
>> Dropping the postfix will make dma_buf_attach() inconsistent with the
>> rest of the _unlocked functions(?). Are you sure we need to rename it?
> 
> The idea of the postfix was to distinguish between two different
> versions of the same function, e.g. dma_buf_vmap_unlocked() vs normal
> dma_buf_vmap().
> 
> When we don't have those two types of the same function I don't think it
> makes too much sense to keep that. We should just properly document which
> functions expect what and that's what your documentation patch does.

Thank you for the clarification. I'll change the names in v4 as you
suggest; we can always improve the naming later on if necessary.

-- 
Best regards,
Dmitry
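
To make the naming convention from the discussion above concrete, here is a
minimal userspace sketch, not the dma-buf code itself. The struct fake_buf
type, the resv_lock()/resv_unlock() helpers and the buf_*() functions are all
hypothetical stand-ins. It only illustrates the idea that an _unlocked suffix
is useful when a locked twin exists (as with vmap), while attach has a single
variant that always takes the lock itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer with a reservation lock. */
struct fake_buf {
	bool resv_locked;   /* models the reservation lock state */
	int vmap_count;
	int attach_count;
};

static void resv_lock(struct fake_buf *b)
{
	assert(!b->resv_locked);   /* taking it twice would be a deadlock */
	b->resv_locked = true;
}

static void resv_unlock(struct fake_buf *b)
{
	assert(b->resv_locked);
	b->resv_locked = false;
}

/* Locked variant: the caller must already hold the reservation lock. */
static void buf_vmap(struct fake_buf *b)
{
	assert(b->resv_locked);
	b->vmap_count++;
}

/* Unlocked twin: takes the lock on behalf of the caller. */
static void buf_vmap_unlocked(struct fake_buf *b)
{
	resv_lock(b);
	buf_vmap(b);
	resv_unlock(b);
}

/*
 * Attach has only one variant, so no suffix is needed. It always grabs
 * the lock itself, hence the caller must not hold it.
 */
static void buf_attach(struct fake_buf *b)
{
	resv_lock(b);
	b->attach_count++;
	resv_unlock(b);
}
```

Calling buf_attach() with the lock already held trips the assertion in
resv_lock(), which is the userspace analogue of the deadlock the locking
specification is meant to rule out.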


Re: [PATCH v4 05/31] drm/nouveau: Don't register backlight when another backlight should be used (v2)

2022-08-25 Thread Hans de Goede
Hi Lyude,

Thank you for the review.

On 8/24/22 19:41, Lyude Paul wrote:
> Just one tiny nitpick below:
> 
> On Wed, 2022-08-24 at 14:14 +0200, Hans de Goede wrote:
>> Before this commit when we want userspace to use the acpi_video backlight
>> device we register both the GPU's native backlight device and acpi_video's
>> firmware acpi_video# backlight device. This relies on userspace preferring
>> firmware type backlight devices over native ones.
>>
>> Registering 2 backlight devices for a single display really is
>> undesirable, don't register the GPU's native backlight device when
>> another backlight device should be used.
>>
>> Changes in v2:
>> - Add nouveau_acpi_video_backlight_use_native() wrapper to avoid unresolved
>>   symbol errors on non X86
>>
>> Signed-off-by: Hans de Goede 
>> ---
>>  drivers/gpu/drm/nouveau/nouveau_acpi.c  | 5 +
>>  drivers/gpu/drm/nouveau/nouveau_acpi.h  | 2 ++
>>  drivers/gpu/drm/nouveau/nouveau_backlight.c | 6 ++
>>  3 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
>> b/drivers/gpu/drm/nouveau/nouveau_acpi.c
>> index 6140db756d06..1592c9cd7750 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
>> @@ -386,3 +386,8 @@ nouveau_acpi_edid(struct drm_device *dev, struct 
>> drm_connector *connector)
>>  
>>  return kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
>>  }
>> +
>> +bool nouveau_acpi_video_backlight_use_native(void)
>> +{
>> +return acpi_video_backlight_use_native();
>> +}
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.h 
>> b/drivers/gpu/drm/nouveau/nouveau_acpi.h
>> index 330f9b837066..3c666c30dfca 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.h
>> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.h
>> @@ -11,6 +11,7 @@ void nouveau_register_dsm_handler(void);
>>  void nouveau_unregister_dsm_handler(void);
>>  void nouveau_switcheroo_optimus_dsm(void);
>>  void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
>> +bool nouveau_acpi_video_backlight_use_native(void);
>>  #else
>>  static inline bool nouveau_is_optimus(void) { return false; };
>>  static inline bool nouveau_is_v1_dsm(void) { return false; };
>> @@ -18,6 +19,7 @@ static inline void nouveau_register_dsm_handler(void) {}
>>  static inline void nouveau_unregister_dsm_handler(void) {}
>>  static inline void nouveau_switcheroo_optimus_dsm(void) {}
>>  static inline void *nouveau_acpi_edid(struct drm_device *dev, struct 
>> drm_connector *connector) { return NULL; }
>> +static inline bool nouveau_acpi_video_backlight_use_native(void) { return 
>> true; }
>>  #endif
>>  
>>  #endif
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
>> b/drivers/gpu/drm/nouveau/nouveau_backlight.c
>> index a2141d3d9b1d..d2b8f8c13db4 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
>> @@ -38,6 +38,7 @@
>>  #include "nouveau_reg.h"
>>  #include "nouveau_encoder.h"
>>  #include "nouveau_connector.h"
>> +#include "nouveau_acpi.h"
>>  
>>  static struct ida bl_ida;
>>  #define BL_NAME_SIZE 15 // 12 for name + 2 for digits + 1 for '\0'
>> @@ -405,6 +406,11 @@ nouveau_backlight_init(struct drm_connector *connector)
>>  goto fail_alloc;
>>  }
>>  
>> +if (!nouveau_acpi_video_backlight_use_native()) {
>> +NV_INFO(drm, "Skipping nv_backlight registration\n");
>> +goto fail_alloc;
>> +}
> 
> We should probably make this say something like "No native backlight
> interface, using ACPI instead" instead. With that fixed

But that would not be correct. If we get to this point then before
the change we would continue with registering the native backlight
interface.

In other words, the native backlight interface is known to
be available at this point so saying "No native backlight interface"
would not be correct.

The reason the registration is being skipped is because the
drivers/acpi/video_detect.c heuristics (or DMI quirk or cmdline
override) say that another method to control the backlight is
preferred and we want to stop registering the native backlight
altogether in that case so that there is only
1 /sys/class/backlight entry (on a 1 GPU 1 panel system).

Also "using ACPI instead" is not correct, on older systems
it might e.g. be a vendor-specific control method such as
the one from dell-laptop. And on newer systems it might
e.g. be the new nvidia-wmi-ec-backlight driver instead.

So you could say the log message is a bit vague on purpose
because making it detailed would make it very long.

The idea behind the log message is to have something to
check for in dmesg if users start complaining about
/sys/class/backlight/nouveau_bl disappearing.

Normally users should not notice this, because indeed typically
they will then also have a /sys/class/backlight/acpi_video0
which is already preferred over the native one by userspace,
so nothing should change for them.  But they could e.g.
have

Re: [PATCH v4 02/31] drm/i915: Don't register backlight when another backlight should be used

2022-08-25 Thread Hans de Goede
Hi All,

On 8/24/22 14:50, Jani Nikula wrote:
> On Wed, 24 Aug 2022, Hans de Goede  wrote:
>> Before this commit when we want userspace to use the acpi_video backlight
>> device we register both the GPU's native backlight device and acpi_video's
>> firmware acpi_video# backlight device. This relies on userspace preferring
>> firmware type backlight devices over native ones.
>>
>> Registering 2 backlight devices for a single display really is
>> undesirable, don't register the GPU's native backlight device when
>> another backlight device should be used.
>>
>> Signed-off-by: Hans de Goede 
>> ---
>>  drivers/gpu/drm/i915/display/intel_backlight.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_backlight.c 
>> b/drivers/gpu/drm/i915/display/intel_backlight.c
>> index 681ebcda97ad..a4dd7924e0c1 100644
>> --- a/drivers/gpu/drm/i915/display/intel_backlight.c
>> +++ b/drivers/gpu/drm/i915/display/intel_backlight.c
>> @@ -8,6 +8,8 @@
>>  #include 
>>  #include 
>>  
>> +#include 
>> +
>>  #include "intel_backlight.h"
>>  #include "intel_backlight_regs.h"
>>  #include "intel_connector.h"
>> @@ -952,6 +954,11 @@ int intel_backlight_device_register(struct 
>> intel_connector *connector)
>>  
>>  WARN_ON(panel->backlight.max == 0);
>>  
>> +if (!acpi_video_backlight_use_native()) {
>> +DRM_INFO("Skipping intel_backlight registration\n");
> 
> Could use drm_info with drm_device.

Ack, fixed for v5.

> Either way,
> 
> Reviewed-by: Jani Nikula 

Thank you.

> and ack for merging via whichever tree suits you best.

My plan is to create a branch with the series on top
of 6.0-rc1 and then send a pull-req to all involved trees.

So far there are no conflicts between this branch and drm-tip...

Regards,

Hans



>> +return 0;
>> +}
>> +
>>  memset(&props, 0, sizeof(props));
>>  props.type = BACKLIGHT_RAW;
> 



RE: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

2022-08-25 Thread Russell, Kent
[AMD Official Use Only - General]

It does indeed short-circuit on || (If the left side of an || statement is not 
0, it doesn't evaluate the right side and returns true). So we can ignore this 
patch, since checking for each individual field on the 2nd term is probably 
overkill. We were still investigating what got passed in and why it wasn't 
valid, so I'll drop this for now. Thanks Lijo!

 Kent
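
For reference, a minimal userspace sketch of the short-circuit behaviour
discussed here. The struct fake_mce type, the FAKE_UMC_V2 value and the helper
names are hypothetical stand-ins, not the kernel's struct mce or
smca_get_bank_type(); the extended error code check assumes the usual
"(status >> 16) & mask" form. The point is only that with "!m || ...", the
right-hand side is never evaluated when m is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields the notifier looks at. */
struct fake_mce {
	unsigned int bank_type;      /* illustrative, not the real layout */
	unsigned long long status;
};

#define FAKE_UMC_V2 1u   /* placeholder, not the kernel's SMCA_UMC_V2 */

static int deref_count;  /* counts how often the right-hand side ran */

/* Dereferences m, so it must never be reached with m == NULL. */
static bool is_umc_dram_ecc(const struct fake_mce *m)
{
	assert(m != NULL);
	deref_count++;
	return m->bank_type == FAKE_UMC_V2 &&
	       ((m->status >> 16) & 0x3f) == 0x0;
}

/*
 * Same shape as the original condition: !m || !(...).
 * Returns true when the notifier would return NOTIFY_DONE early.
 */
static bool should_bail_out(const struct fake_mce *m)
{
	return !m || !is_umc_dram_ecc(m);
}
```

With m == NULL, should_bail_out() returns true and deref_count stays at 0,
since C guarantees left-to-right evaluation of || and skips the right operand
once the left one is non-zero.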

-Original Message-
From: amd-gfx  On Behalf Of Russell, Kent
Sent: Thursday, August 25, 2022 8:52 AM
To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
Cc: Ghannam, Yazen 
Subject: RE: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

[AMD Official Use Only - General]

Good point, as if (1 || 0) should short-circuit at the if (1) part. Thus we 
should go down to NOTIFY_DONE if m is NULL. Yazen can confirm here, as he was 
the one who was with me when we found the original issue. It's possible that one of 
the 3 message fields was NULL instead of the actual message in our repro 
situation, but I'll double-check the short-circuit side of C to confirm. I know 
it short circuits for &&, I don't know if it does for ||.

 Kent

-Original Message-
From: Lazar, Lijo  
Sent: Thursday, August 25, 2022 8:34 AM
To: Russell, Kent ; amd-gfx@lists.freedesktop.org
Cc: Ghannam, Yazen 
Subject: Re: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference



On 8/25/2022 5:16 PM, Russell, Kent wrote:
> [AMD Official Use Only - General]
> 
> Friendly ping
> 

Wonder how it goes any further when m is NULL. It should do shortcut evaluation 
and return NOTIFY_DONE, right? Or is this for better readability?

Thanks,
Lijo

>   Kent
> 
> -Original Message-
> From: Russell, Kent 
> Sent: Monday, August 15, 2022 11:31 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Ghannam, Yazen ; Russell, Kent 
> 
> Subject: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference
> 
> If m is NULL, we will end up referencing a NULL pointer in the subsequent m 
> elements like extcpu, bank and status. Pull the NULL check out and do it 
> first before referencing m's elements.
> 
> Signed-off-by: Kent Russell 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index ab9ba5a9c33d..028495fdfa62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2838,12 +2838,15 @@ static int amdgpu_bad_page_notifier(struct 
> notifier_block *nb,
>   struct eeprom_table_record err_rec;
>   uint64_t retired_page;
>   
> + if (!m)
> + return NOTIFY_DONE;
> +
>   /*
>* If the error was generated in UMC_V2, which belongs to GPU UMCs,
>* and error occurred in DramECC (Extended error code = 0) then only
>* process the error, else bail out.
>*/
> - if (!m || !((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
> + if (!((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
>   (XEC(m->status, 0x3f) == 0x0)))
>   return NOTIFY_DONE;
>   
> --
> 2.25.1
> 


RE: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

2022-08-25 Thread Russell, Kent
[AMD Official Use Only - General]

Good point, as if (1 || 0) should short-circuit at the if (1) part. Thus we 
should go down to NOTIFY_DONE if m is NULL. Yazen can confirm here, as he was 
the one who was with me when we found the original issue. It's possible that one of 
the 3 message fields was NULL instead of the actual message in our repro 
situation, but I'll double-check the short-circuit side of C to confirm. I know 
it short circuits for &&, I don't know if it does for ||.

 Kent

-Original Message-
From: Lazar, Lijo  
Sent: Thursday, August 25, 2022 8:34 AM
To: Russell, Kent ; amd-gfx@lists.freedesktop.org
Cc: Ghannam, Yazen 
Subject: Re: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference



On 8/25/2022 5:16 PM, Russell, Kent wrote:
> [AMD Official Use Only - General]
> 
> Friendly ping
> 

Wonder how it goes any further when m is NULL. It should do shortcut evaluation 
and return NOTIFY_DONE, right? Or is this for better readability?

Thanks,
Lijo

>   Kent
> 
> -Original Message-
> From: Russell, Kent 
> Sent: Monday, August 15, 2022 11:31 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Ghannam, Yazen ; Russell, Kent 
> 
> Subject: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference
> 
> If m is NULL, we will end up referencing a NULL pointer in the subsequent m 
> elements like extcpu, bank and status. Pull the NULL check out and do it 
> first before referencing m's elements.
> 
> Signed-off-by: Kent Russell 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index ab9ba5a9c33d..028495fdfa62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2838,12 +2838,15 @@ static int amdgpu_bad_page_notifier(struct 
> notifier_block *nb,
>   struct eeprom_table_record err_rec;
>   uint64_t retired_page;
>   
> + if (!m)
> + return NOTIFY_DONE;
> +
>   /*
>* If the error was generated in UMC_V2, which belongs to GPU UMCs,
>* and error occurred in DramECC (Extended error code = 0) then only
>* process the error, else bail out.
>*/
> - if (!m || !((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
> + if (!((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
>   (XEC(m->status, 0x3f) == 0x0)))
>   return NOTIFY_DONE;
>   
> --
> 2.25.1
> 


Re: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

2022-08-25 Thread Lazar, Lijo




On 8/25/2022 5:16 PM, Russell, Kent wrote:
> [AMD Official Use Only - General]
> 
> Friendly ping
> 

Wonder how it goes any further when m is NULL. It should do shortcut 
evaluation and return NOTIFY_DONE, right? Or is this for better readability?

Thanks,
Lijo

>   Kent
> 
> -Original Message-
> From: Russell, Kent 
> Sent: Monday, August 15, 2022 11:31 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Ghannam, Yazen ; Russell, Kent 
> Subject: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference
> 
> If m is NULL, we will end up referencing a NULL pointer in the subsequent m 
> elements like extcpu, bank and status. Pull the NULL check out and do it first 
> before referencing m's elements.
> 
> Signed-off-by: Kent Russell 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index ab9ba5a9c33d..028495fdfa62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2838,12 +2838,15 @@ static int amdgpu_bad_page_notifier(struct 
> notifier_block *nb,
>   struct eeprom_table_record err_rec;
>   uint64_t retired_page;
>   
> + if (!m)
> + return NOTIFY_DONE;
> +
>   /*
>    * If the error was generated in UMC_V2, which belongs to GPU UMCs,
>    * and error occurred in DramECC (Extended error code = 0) then only
>    * process the error, else bail out.
>    */
> - if (!m || !((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
> + if (!((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
>   (XEC(m->status, 0x3f) == 0x0)))
>   return NOTIFY_DONE;
>   
> --
> 2.25.1
> 


RE: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

2022-08-25 Thread Russell, Kent
[AMD Official Use Only - General]

Friendly ping

 Kent

-Original Message-
From: Russell, Kent  
Sent: Monday, August 15, 2022 11:31 AM
To: amd-gfx@lists.freedesktop.org
Cc: Ghannam, Yazen ; Russell, Kent 
Subject: [PATCH] drm/amdgpu: Handle potential NULL pointer dereference

If m is NULL, we will end up referencing a NULL pointer in the subsequent m 
elements like extcpu, bank and status. Pull the NULL check out and do it first 
before referencing m's elements.

Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ab9ba5a9c33d..028495fdfa62 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2838,12 +2838,15 @@ static int amdgpu_bad_page_notifier(struct 
notifier_block *nb,
struct eeprom_table_record err_rec;
uint64_t retired_page;
 
+   if (!m)
+   return NOTIFY_DONE;
+
/*
 * If the error was generated in UMC_V2, which belongs to GPU UMCs,
 * and error occurred in DramECC (Extended error code = 0) then only
 * process the error, else bail out.
 */
-   if (!m || !((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
+   if (!((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
(XEC(m->status, 0x3f) == 0x0)))
return NOTIFY_DONE;
 
--
2.25.1


RE: [PATCH] drm/amdgpu: disable FRU access on special SIENNA CICHLID card

2022-08-25 Thread Russell, Kent
[AMD Official Use Only - General]


Reviewed-by: Kent Russell 



-Original Message-
From: Chen, Guchun  
Sent: Wednesday, August 24, 2022 11:04 AM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhang, Hawking ; Russell, 
Kent 
Cc: Chen, Guchun 
Subject: [PATCH] drm/amdgpu: disable FRU access on special SIENNA CICHLID card

The driver load error below is printed, which is not friendly to end users.

amdgpu: ATOM BIOS: 113-D603GLXE-077
[drm] FRU: Failed to get size field
[drm:amdgpu_fru_get_product_info [amdgpu]] *ERROR* Failed to read FRU 
Manufacturer, ret:-5

Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index ecada5eadfe3..9d612b8745aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -66,10 +66,15 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
return true;
case CHIP_SIENNA_CICHLID:
if (strnstr(atom_ctx->vbios_version, "D603",
-   sizeof(atom_ctx->vbios_version)))
-   return true;
-   else
+   sizeof(atom_ctx->vbios_version))) {
+   if (strnstr(atom_ctx->vbios_version, "D603GLXE",
+sizeof(atom_ctx->vbios_version)))
+   return false;
+   else
+   return true;
+   } else {
return false;
+   }
default:
return false;
}
-- 
2.25.1


Re: [PATCH v4 31/31] drm/todo: Add entry about dealing with brightness control on devices with > 1 panel

2022-08-25 Thread Lyude Paul
Reviewed-by: Lyude Paul 

On Wed, 2022-08-24 at 14:15 +0200, Hans de Goede wrote:
> Add an entry summarizing the discussion about dealing with brightness
> control on devices with more than 1 internal panel.
> 
> The original discussion can be found here:
> https://lore.kernel.org/dri-devel/20220517152331.16217-1-hdego...@redhat.com/
> 
> Signed-off-by: Hans de Goede 
> ---
>  Documentation/gpu/todo.rst | 68 ++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
> index 7634c27ac562..393d218e4a0c 100644
> --- a/Documentation/gpu/todo.rst
> +++ b/Documentation/gpu/todo.rst
> @@ -679,6 +679,74 @@ Contact: Sam Ravnborg
>  
>  Level: Advanced
>  
> +Brightness handling on devices with multiple internal panels
> +
> +
> +On x86/ACPI devices there can be multiple backlight firmware interfaces:
> +(ACPI) video, vendor specific and others. As well as direct/native (PWM)
> +register programming by the KMS driver.
> +
> +To deal with this backlight drivers used on x86/ACPI call
> +acpi_video_get_backlight_type() which has heuristics (+quirks) to select
> +which backlight interface to use; and backlight drivers which do not match
> +the returned type will not register themselves, so that only one backlight
> +device gets registered (in a single GPU setup, see below).
> +
> +At the moment this more or less assumes that there will only
> +be 1 (internal) panel on a system.
> +
> +On systems with 2 panels this may be a problem, depending on
> +what interface acpi_video_get_backlight_type() selects:
> +
> +1. native: in this case the KMS driver is expected to know which backlight
> +   device belongs to which output so everything should just work.
> +2. video: this does support controlling multiple backlights, but some work
> +   will need to be done to get the output <-> backlight device mapping
> +
> +The above assumes both panels will require the same backlight interface type.
> +Things will break on systems with multiple panels where the 2 panels need
> +a different type of control. E.g. one panel needs ACPI video backlight 
> control,
> +whereas the other is using native backlight control. Currently in this case
> +only one of the 2 required backlight devices will get registered, based on
> +the acpi_video_get_backlight_type() return value.
> +
> +If this (theoretical) case ever shows up, then supporting this will need some
> +work. A possible solution here would be to pass a device and connector-name
> +to acpi_video_get_backlight_type() so that it can deal with this.
> +
> +Note in a way we already have a case where userspace sees 2 panels,
> +in dual GPU laptop setups with a mux. On those systems we may see
> +either 2 native backlight devices; or 2 native backlight devices.
> +
> +Userspace already has code to deal with this by detecting if the related
> +panel is active (iow which way the mux between the GPU and the panels
> +points) and then uses that backlight device. Userspace here very much
> +assumes a single panel though. It picks only 1 of the 2 backlight devices
> +and then only uses that one.
> +
> +Note that all userspace code (that I know of) is currently hardcoded
> +to assume a single panel.
> +
> +Before the recent changes to not register multiple (e.g. video + native)
> +/sys/class/backlight devices for a single panel (on a single GPU laptop),
> +userspace would see multiple backlight devices all controlling the same
> +backlight.
> +
> +To deal with this userspace had to always pick one preferred device under
> +/sys/class/backlight and will ignore the others. So to support brightness
> +control on multiple panels userspace will need to be updated too.
> +
> +There are plans to allow brightness control through the KMS API by adding
> +a "display brightness" property to drm_connector objects for panels. This
> +solves a number of issues with the /sys/class/backlight API, including not
> +being able to map a sysfs backlight device to a specific connector. Any
> +userspace changes to add support for brightness control on devices with
> +multiple panels really should build on top of this new KMS property.
> +
> +Contact: Hans de Goede
> +
> +Level: Advanced
> +
>  Outside DRM
>  ===
>  

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: [PATCH v4 12/31] drm/nouveau: Register ACPI video backlight when nv_backlight registration fails (v2)

2022-08-25 Thread Lyude Paul
Reviewed-by: Lyude Paul 

On Wed, 2022-08-24 at 14:15 +0200, Hans de Goede wrote:
> Typically the acpi_video driver will initialize before nouveau, which
> used to cause /sys/class/backlight/acpi_video0 to get registered and then
> nouveau would register its own nv_backlight device later. After which
> the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
> to avoid there being 2 backlight devices.
> 
> This means that userspace used to briefly see 2 devices and the
> disappearing of acpi_video0 after a brief time confuses the systemd
> backlight level save/restore code, see e.g.:
> https://bbs.archlinux.org/viewtopic.php?id=269920
> 
> To fix this the ACPI video code has been modified to make backlight class
> device registration a separate step, relying on the drm/kms driver to
> ask for the acpi_video backlight registration after it is done setting up
> its native backlight device.
> 
> Add a call to the new acpi_video_register_backlight() when native backlight
> device registration has failed / was skipped to ensure that there is a
> backlight device available before the drm_device gets registered with
> userspace.
> 
> Changes in v2:
> - Add nouveau_acpi_video_register_backlight() wrapper to avoid unresolved
>   symbol errors on non X86
> 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/gpu/drm/nouveau/nouveau_acpi.c  | 5 +
>  drivers/gpu/drm/nouveau/nouveau_acpi.h  | 2 ++
>  drivers/gpu/drm/nouveau/nouveau_backlight.c | 7 +++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> index 1592c9cd7750..8cf096f841a9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> @@ -391,3 +391,8 @@ bool nouveau_acpi_video_backlight_use_native(void)
>  {
>   return acpi_video_backlight_use_native();
>  }
> +
> +void nouveau_acpi_video_register_backlight(void)
> +{
> + acpi_video_register_backlight();
> +}
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.h 
> b/drivers/gpu/drm/nouveau/nouveau_acpi.h
> index 3c666c30dfca..e39dd8b94b8b 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.h
> @@ -12,6 +12,7 @@ void nouveau_unregister_dsm_handler(void);
>  void nouveau_switcheroo_optimus_dsm(void);
>  void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
>  bool nouveau_acpi_video_backlight_use_native(void);
> +void nouveau_acpi_video_register_backlight(void);
>  #else
>  static inline bool nouveau_is_optimus(void) { return false; };
>  static inline bool nouveau_is_v1_dsm(void) { return false; };
> @@ -20,6 +21,7 @@ static inline void nouveau_unregister_dsm_handler(void) {}
>  static inline void nouveau_switcheroo_optimus_dsm(void) {}
>  static inline void *nouveau_acpi_edid(struct drm_device *dev, struct 
> drm_connector *connector) { return NULL; }
>  static inline bool nouveau_acpi_video_backlight_use_native(void) { return 
> true; }
> +static inline void nouveau_acpi_video_register_backlight(void) {}
>  #endif
>  
>  #endif
> diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
> b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> index d2b8f8c13db4..a614582779ca 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> @@ -436,6 +436,13 @@ nouveau_backlight_init(struct drm_connector *connector)
>  
>  fail_alloc:
>   kfree(bl);
> + /*
> +  * If we get here we have an internal panel, but no nv_backlight,
> +  * try registering an ACPI video backlight device instead.
> +  */
> + if (ret == 0)
> + nouveau_acpi_video_register_backlight();
> +
>   return ret;
>  }
>  

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat
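
A rough userspace model of the control flow that patches 05/31 and 12/31 give
nouveau_backlight_init(), as far as it can be read from the hunks quoted
above. The struct bl_result type and backlight_init_sketch() are hypothetical.
The sketch assumes that "skipped" leaves ret at 0 and that a real registration
error sets ret to a negative errno before jumping to fail_alloc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the init path ended up doing. */
struct bl_result {
	int ret;                       /* 0 or a negative errno */
	bool native_registered;        /* nv_backlight was registered */
	bool acpi_register_requested;  /* fallback registration was asked for */
};

/*
 * use_native: models acpi_video_backlight_use_native()
 * native_err: 0 on success, negative errno if native setup fails
 */
static struct bl_result backlight_init_sketch(bool use_native, int native_err)
{
	struct bl_result r = { 0, false, false };

	assert(native_err <= 0);

	/* Another backlight method is preferred: skip, but this is no error. */
	if (!use_native)
		goto fail_alloc;

	if (native_err) {
		r.ret = native_err;
		goto fail_alloc;
	}

	r.native_registered = true;
	return r;

fail_alloc:
	/* Internal panel but no nv_backlight: ask for the ACPI video one. */
	if (r.ret == 0)
		r.acpi_register_requested = true;
	return r;
}
```

In this model a skipped native backlight (use_native == false) requests the
ACPI video registration, while a hard error such as -12 is returned to the
caller without requesting the fallback.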



Re: [PATCH v4 05/31] drm/nouveau: Don't register backlight when another backlight should be used (v2)

2022-08-25 Thread Lyude Paul
Just one tiny nitpick below:

On Wed, 2022-08-24 at 14:14 +0200, Hans de Goede wrote:
> Before this commit when we want userspace to use the acpi_video backlight
> device we register both the GPU's native backlight device and acpi_video's
> firmware acpi_video# backlight device. This relies on userspace preferring
> firmware type backlight devices over native ones.
> 
> Registering 2 backlight devices for a single display really is
> undesirable, don't register the GPU's native backlight device when
> another backlight device should be used.
> 
> Changes in v2:
> - Add nouveau_acpi_video_backlight_use_native() wrapper to avoid unresolved
>   symbol errors on non X86
> 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/gpu/drm/nouveau/nouveau_acpi.c  | 5 +
>  drivers/gpu/drm/nouveau/nouveau_acpi.h  | 2 ++
>  drivers/gpu/drm/nouveau/nouveau_backlight.c | 6 ++
>  3 files changed, 13 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> index 6140db756d06..1592c9cd7750 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> @@ -386,3 +386,8 @@ nouveau_acpi_edid(struct drm_device *dev, struct 
> drm_connector *connector)
>  
>   return kmemdup(edid, EDID_LENGTH, GFP_KERNEL);
>  }
> +
> +bool nouveau_acpi_video_backlight_use_native(void)
> +{
> + return acpi_video_backlight_use_native();
> +}
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.h 
> b/drivers/gpu/drm/nouveau/nouveau_acpi.h
> index 330f9b837066..3c666c30dfca 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.h
> @@ -11,6 +11,7 @@ void nouveau_register_dsm_handler(void);
>  void nouveau_unregister_dsm_handler(void);
>  void nouveau_switcheroo_optimus_dsm(void);
>  void *nouveau_acpi_edid(struct drm_device *, struct drm_connector *);
> +bool nouveau_acpi_video_backlight_use_native(void);
>  #else
>  static inline bool nouveau_is_optimus(void) { return false; };
>  static inline bool nouveau_is_v1_dsm(void) { return false; };
> @@ -18,6 +19,7 @@ static inline void nouveau_register_dsm_handler(void) {}
>  static inline void nouveau_unregister_dsm_handler(void) {}
>  static inline void nouveau_switcheroo_optimus_dsm(void) {}
>  static inline void *nouveau_acpi_edid(struct drm_device *dev, struct 
> drm_connector *connector) { return NULL; }
> +static inline bool nouveau_acpi_video_backlight_use_native(void) { return 
> true; }
>  #endif
>  
>  #endif
> diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
> b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> index a2141d3d9b1d..d2b8f8c13db4 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> @@ -38,6 +38,7 @@
>  #include "nouveau_reg.h"
>  #include "nouveau_encoder.h"
>  #include "nouveau_connector.h"
> +#include "nouveau_acpi.h"
>  
>  static struct ida bl_ida;
>  #define BL_NAME_SIZE 15 // 12 for name + 2 for digits + 1 for '\0'
> @@ -405,6 +406,11 @@ nouveau_backlight_init(struct drm_connector *connector)
>   goto fail_alloc;
>   }
>  
> + if (!nouveau_acpi_video_backlight_use_native()) {
> + NV_INFO(drm, "Skipping nv_backlight registration\n");
> + goto fail_alloc;
> + }

We should probably make this say something like "No native backlight
interface, using ACPI instead" instead. With that fixed

Reviewed-by: Lyude Paul 

> +
>   if (!nouveau_get_backlight_name(backlight_name, bl)) {
>   NV_ERROR(drm, "Failed to retrieve a unique name for the 
> backlight interface\n");
>   goto fail_alloc;

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



[PATCH 2/2] drm/amdgpu: Init VF's HDP flush reg offset early

2022-08-25 Thread Lijo Lazar
Make sure the register offsets used for HDP flush in VF are
initialized early so that they work during any early HDP flush
sequence. For that, move the offset initialization to *_remap_hdp.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 23 +
 drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c | 12 +++
 drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 23 +
 drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 21 ---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_2.c | 24 ++
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 23 +
 7 files changed, 84 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 53d753e94a71..c0bb2e9616c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2382,7 +2382,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
 * to process space. This is needed for any early HDP
 * flush operation during gmc initialization.
 */
-   if (adev->nbio.funcs->remap_hdp_registers && 
!amdgpu_sriov_vf(adev))
+   if (adev->nbio.funcs->remap_hdp_registers)
adev->nbio.funcs->remap_hdp_registers(adev);
 
r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
index b465baa26762..20fa2c5ad510 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
@@ -65,10 +65,21 @@
 
 static void nbio_v2_3_remap_hdp_registers(struct amdgpu_device *adev)
 {
-   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
-   adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
-   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
-   adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+   if (amdgpu_sriov_vf(adev))
+   adev->rmmio_remap.reg_offset =
+   SOC15_REG_OFFSET(
+   NBIO, 0,
+   
mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL)
+   << 2;
+
+   if (!amdgpu_sriov_vf(adev)) {
+   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset +
+KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
+   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset +
+KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+   }
 }
 
 static u32 nbio_v2_3_get_rev_id(struct amdgpu_device *adev)
@@ -338,10 +349,6 @@ static void nbio_v2_3_init_registers(struct amdgpu_device 
*adev)
 
if (def != data)
WREG32_PCIE(smnPCIE_CONFIG_CNTL, data);
-
-   if (amdgpu_sriov_vf(adev))
-   adev->rmmio_remap.reg_offset = SOC15_REG_OFFSET(NBIO, 0,
-   mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL) << 
2;
 }
 
 #define NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT 0x // off by 
default, no gains over L1
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
index 982a89f841d5..e011d9856794 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c
@@ -30,10 +30,14 @@
 
 static void nbio_v4_3_remap_hdp_registers(struct amdgpu_device *adev)
 {
-   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
-   adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
-   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
-   adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+   if (!amdgpu_sriov_vf(adev)) {
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset +
+KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset +
+KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+   }
 }
 
 static u32 nbio_v4_3_get_rev_id(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
index f7f6ddebd3e4..7536ca3fcd69 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
@@ -55,10 +55,21 @@
 
 static void nbio_v6_1_remap_hdp_registers(struct amdgpu_device *adev)
 {
-   WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
-   adev->rmm

[PATCH 1/2] drm/amdgpu: Move HDP remapping earlier during init

2022-08-25 Thread Lijo Lazar
HDP flush is used early in the init sequence as part of memory controller
block initialization. Hence remapping of HDP registers needed for flush
needs to happen earlier.

This also fixes the AER error reported as Unsupported Request during
driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

Reported-by: Tom Seewald 
Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
 drivers/gpu/drm/amd/amdgpu/nv.c| 6 --
 drivers/gpu/drm/amd/amdgpu/soc15.c | 6 --
 drivers/gpu/drm/amd/amdgpu/soc21.c | 6 --
 4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ce7d117efdb5..53d753e94a71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
goto init_failed;
}
+
+   /* remap HDP registers to a hole in mmio space,
+* for the purpose of expose those registers
+* to process space. This is needed for any early HDP
+* flush operation during gmc initialization.
+*/
+   if (adev->nbio.funcs->remap_hdp_registers && 
!amdgpu_sriov_vf(adev))
+   adev->nbio.funcs->remap_hdp_registers(adev);
+
r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
if (r) {
DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
nv_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
nv_enable_doorbell_aperture(adev, true);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
soc15_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
 
/* enable the doorbell aperture */
soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 55284b24f113..16b447055102 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle)
soc21_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers)
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
soc21_enable_doorbell_aperture(adev, true);
 
-- 
2.25.1



Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-25 Thread Christian König

On 25.08.22 at 09:54, Lazar, Lijo wrote:



On 8/25/2022 1:04 PM, Christian König wrote:

On 25.08.22 at 08:40, Stefan Roese wrote:

On 24.08.22 16:45, Tom Seewald wrote:
On Wed, Aug 24, 2022 at 12:11 AM Lazar, Lijo  
wrote:

Unfortunately, I don't have any NV platforms to test. Attached is an
'untested-patch' based on your trace logs.

Thanks,
Lijo


Thank you for the patch. It applied cleanly to v6.0-rc2 and after
booting that kernel I no longer see any messages about PCI errors. I
have uploaded a dmesg log to the bug report:
https://bugzilla.kernel.org/attachment.cgi?id=301642



I did not follow this thread in depth, but FWICT the bug is solved now
with this patch. So is it correct, that the now fully enabled AER
support in the PCI subsystem in v6.0 helped detecting a bug in the AMD
GPU driver?


It looks like it, but I'm not 100% sure about the rationale behind it.

Lijo can you explain more on this?



From the trace, during gmc hw_init it takes this route -

gart_enable -> amdgpu_gtt_mgr_recover -> amdgpu_gart_invalidate_tlb -> 
amdgpu_device_flush_hdp -> amdgpu_asic_flush_hdp (non-ring based HDP 
flush)


HDP flush is done using remapped offset which is MMIO_REG_HOLE_OFFSET 
(0x8 - PAGE_SIZE)


WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0);


However, the remapping is not yet done at this point. It's done at a 
later point during common block initialization. Access to the unmapped 
offset '(0x8 - PAGE_SIZE)' seems to come back as unsupported 
request and reported through AER.


That's interesting behavior. So far, AER has always indicated some kind of 
transmission error.


If that also happens on accesses to unmapped areas of the MMIO BAR, then we 
need to keep that in mind.


Thanks,
Christian.



In the patch, I just moved the remapping before gmc block initialization.

Thanks,
Lijo


Thanks,
Christian.



Thanks,
Stefan






Re: [PATCH] drm/amdgpu: use adev_to_drm to get drm device

2022-08-25 Thread Christian König

On 25.08.22 at 09:48, Guchun Chen wrote:

adev_to_drm is used everywhere else in the amdgpu code, so use it
in amdgpu_ctx_fini as well for consistency.

Signed-off-by: Guchun Chen 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 8ee4e8491f39..6ea8980c8ad7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -402,7 +402,7 @@ static void amdgpu_ctx_fini(struct kref *ref)
}
}
  
-   if (drm_dev_enter(&adev->ddev, &idx)) {
+   if (drm_dev_enter(adev_to_drm(adev), &idx)) {
amdgpu_ctx_set_stable_pstate(ctx, ctx->stable_pstate);
drm_dev_exit(idx);
}




Re: [PATCH] drm: amd: amdgpu: ACPI: Add comment about ACPI_FADT_LOW_POWER_S0

2022-08-25 Thread Limonciello, Mario

On 8/24/2022 12:32, Rafael J. Wysocki wrote:

From: Rafael J. Wysocki 

According to the ACPI specification [1], the ACPI_FADT_LOW_POWER_S0
flag merely means that it is better to use low-power S0 idle on the
given platform than S3 (provided that the latter is supported) and it
doesn't preclude using either of them (which of them will be used
depends on the choices made by user space).

However, on some systems that flag is used to indicate whether or not
to enable special firmware mechanics allowing the system to save more
energy when suspended to idle.  If that flag is unset, doing so is
generally risky.

Accordingly, add a comment to explain the ACPI_FADT_LOW_POWER_S0 check
in amdgpu_acpi_is_s0ix_active(), the purpose of which is otherwise
somewhat unclear.
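
To make the check easier to follow, here is a minimal sketch of the
decision in plain C. It is not the amdgpu code: it assumes the flag is
bit 21 of the FADT flags field (as in ACPICA's actbl.h), and the function
name and its boolean parameters are hypothetical stand-ins for the driver
state that amdgpu_acpi_is_s0ix_active() actually consults.

```c
#include <assert.h> /* for callers that want to assert on the result */
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the "low power S0 idle capable" FADT flag. */
#define SKETCH_FADT_LOW_POWER_S0 (UINT32_C(1) << 21)

/*
 * Hypothetical model of the S0ix decision:
 *  - only APUs suspending to idle are candidates at all;
 *  - if the firmware did not set the low-power-S0 flag, treat any
 *    firmware-related S0ix preparation as risky and report "not active".
 */
static bool sketch_s0ix_active(uint32_t fadt_flags, bool is_apu,
			       bool suspending_to_idle)
{
	if (!is_apu || !suspending_to_idle)
		return false;

	if (!(fadt_flags & SKETCH_FADT_LOW_POWER_S0))
		return false;

	return true;
}
```

With the flag clear, the sketch returns false even when the system is
suspending to idle, which is the case the new comment documents.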

Link: https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt # [1]
Signed-off-by: Rafael J. Wysocki 


Reviewed-by: Mario Limonciello 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c |6 ++
  1 file changed, 6 insertions(+)

Index: linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
===
--- linux-pm.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ linux-pm/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -1066,6 +1066,12 @@ bool amdgpu_acpi_is_s0ix_active(struct a
(pm_suspend_target_state != PM_SUSPEND_TO_IDLE))
return false;
  
+   /*
+* If ACPI_FADT_LOW_POWER_S0 is not set in the FADT, it is generally
+* risky to do any special firmware-related preparations for entering
+* S0ix even though the system is suspending to idle, so return false
+* in that case.
+*/
if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) {
dev_warn_once(adev->dev,
  "Power consumption will be higher as BIOS has not been 
configured for suspend-to-idle.\n"







Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-25 Thread Lazar, Lijo




On 8/25/2022 1:04 PM, Christian König wrote:

On 25.08.22 at 08:40, Stefan Roese wrote:

On 24.08.22 16:45, Tom Seewald wrote:

On Wed, Aug 24, 2022 at 12:11 AM Lazar, Lijo  wrote:

Unfortunately, I don't have any NV platforms to test. Attached is an
'untested-patch' based on your trace logs.

Thanks,
Lijo


Thank you for the patch. It applied cleanly to v6.0-rc2 and after
booting that kernel I no longer see any messages about PCI errors. I
have uploaded a dmesg log to the bug report:
https://bugzilla.kernel.org/attachment.cgi?id=301642



I did not follow this thread in depth, but FWICT the bug is solved now
with this patch. So is it correct, that the now fully enabled AER
support in the PCI subsystem in v6.0 helped detecting a bug in the AMD
GPU driver?


It looks like it, but I'm not 100% sure about the rationale behind it.

Lijo can you explain more on this?



From the trace, during gmc hw_init it takes this route -

gart_enable -> amdgpu_gtt_mgr_recover -> amdgpu_gart_invalidate_tlb -> 
amdgpu_device_flush_hdp -> amdgpu_asic_flush_hdp (non-ring based HDP flush)


HDP flush is done using remapped offset which is MMIO_REG_HOLE_OFFSET 
(0x8 - PAGE_SIZE)


WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0);


However, the remapping is not yet done at this point. It's done at a 
later point during common block initialization. Access to the unmapped 
offset '(0x8 - PAGE_SIZE)' seems to come back as unsupported request 
and reported through AER.


In the patch, I just moved the remapping before gmc block initialization.
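
As a rough illustration of the ordering issue described above, here is a
small user-space model in C. It is only a sketch under stated assumptions:
fake_dev, remap_hdp, flush_hdp and the unsupported_requests counter are
hypothetical names, not amdgpu code, and the "hole" is modeled as a flag
rather than a real MMIO window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the amdgpu structures. */
struct fake_dev {
	bool hdp_remapped;         /* has the remap step run yet? */
	int flushes_done;          /* flushes that hit a mapped register */
	int unsupported_requests;  /* accesses to the still-unmapped hole */
};

/* Models the remap step that the patch moves ahead of gmc hw_init. */
static void remap_hdp(struct fake_dev *dev)
{
	assert(dev);
	dev->hdp_remapped = true;
}

/* Models a non-ring HDP flush that goes through the remapped offset. */
static void flush_hdp(struct fake_dev *dev)
{
	assert(dev);
	if (dev->hdp_remapped)
		dev->flushes_done++;
	else
		dev->unsupported_requests++; /* the kind of access AER flagged */
}

/* Old order: gmc init (which flushes) runs before the remap is done. */
static struct fake_dev init_old_order(void)
{
	struct fake_dev dev = { 0 };

	flush_hdp(&dev); /* during gmc hw_init */
	remap_hdp(&dev); /* later, during common block init */
	return dev;
}

/* New order: remap first, then gmc init. */
static struct fake_dev init_new_order(void)
{
	struct fake_dev dev = { 0 };

	remap_hdp(&dev);
	flush_hdp(&dev);
	return dev;
}
```

In this model the old order records one access to the unmapped hole and
the new order records none, which mirrors the effect of moving the remap.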

Thanks,
Lijo


Thanks,
Christian.



Thanks,
Stefan




RE: [PATCH] drm/amdkfd: Fix isa version for the GC 10.3.7

2022-08-25 Thread Liu, Aaron
[Public]

Reviewed-by: Aaron Liu 

> -Original Message-
> From: Liang, Prike 
> Sent: Wednesday, August 24, 2022 8:40 PM
> To: Liang, Prike ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Huang, Ray
> ; Zhang, Yifan ; Liu, Aaron
> ; Limonciello, Mario 
> Subject: RE: [PATCH] drm/amdkfd: Fix isa version for the GC 10.3.7
> 
> [Public]
> 
> Add more for the review and awareness.
> 
> Regards,
> --Prike
> 
> -Original Message-
> From: Prike Liang 
> Sent: Wednesday, August 24, 2022 2:41 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Huang, Ray
> ; Zhang, Yifan ; Liang,
> Prike 
> Subject: [PATCH] drm/amdkfd: Fix isa version for the GC 10.3.7
> 
> Correct the isa version for handling KFD test.
> 
> Fixes: 7c4f4f197e0c ("drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD
> definitions")
> Signed-off-by: Prike Liang 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index fdad1415f8bd..5ebbeac61379 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -388,7 +388,7 @@ struct kfd_dev *kgd2kfd_probe(struct
> amdgpu_device *adev, bool vf)
> f2g = &gfx_v10_3_kfd2kgd;
> break;
> case IP_VERSION(10, 3, 7):
> -   gfx_target_version = 100307;
> +   gfx_target_version = 100306;
> if (!vf)
> f2g = &gfx_v10_3_kfd2kgd;
> break;
> --
> 2.25.1
> 


RE: [PATCH] drm/amdgpu: use dev_info to benifit mGPU case

2022-08-25 Thread Chen, Guchun
I think so, we indeed need a new message indicating suspend, e.g. runtime 
suspend has completed. I will provide a new patch for this.

Regards,
Guchun

-Original Message-
From: Quan, Evan  
Sent: Thursday, August 25, 2022 3:17 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Deucher, 
Alexander ; Zhang, Hawking ; 
Koenig, Christian 
Subject: RE: [PATCH] drm/amdgpu: use dev_info to benifit mGPU case

[AMD Official Use Only - General]

Here "free PSP TMR buffer" seems to carry a special meaning (it acts as a 
marker that suspend is in progress).
It would be better to redesign the suspend messages.
Anyway, the patch is Reviewed-by: Evan Quan 

Evan
> -Original Message-
> From: Chen, Guchun 
> Sent: Thursday, August 25, 2022 2:26 PM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
> ; Zhang, Hawking ; 
> Quan, Evan ; Koenig, Christian 
> 
> Cc: Chen, Guchun 
> Subject: [PATCH] drm/amdgpu: use dev_info to benifit mGPU case
> 
> 'free PSP TMR buffer' happens in suspend, but sometimes in mGPU 
> config, it mixes with PSP resume log printing from another GPU, which 
> is confusing. So use dev_info instead of DRM_INFO for printing.
> 
> [drm] PSP is resuming...
> [drm] reserve 0xa0 from 0x877e00 for PSP TMR
> amdgpu :e3:00.0: amdgpu: GECC is enabled
> amdgpu :e3:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
> amdgpu :e3:00.0: amdgpu: SMU is resuming...
> amdgpu :e3:00.0: amdgpu: smu driver if version = 0x0040, smu fw if version = 0x0041, smu fw program = 0, version = 0x003a5400 (58.84.0)
> amdgpu :e3:00.0: amdgpu: SMU driver if version not matched
> amdgpu :e3:00.0: amdgpu: dpm has been enabled
> amdgpu :e3:00.0: amdgpu: SMU is resumed successfully!
> [drm] DMUB hardware initialized: version=0x02020014
> [drm] free PSP TMR buffer
> [drm] kiq ring mec 2 pipe 1 q 0
> 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 1036446abc30..c932bc148554 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -812,7 +812,7 @@ static int psp_tmr_unload(struct psp_context *psp)
>   struct psp_gfx_cmd_resp *cmd = acquire_psp_cmd_buf(psp);
> 
>   psp_prep_tmr_unload_cmd_buf(psp, cmd);
> - DRM_INFO("free PSP TMR buffer\n");
> + dev_info(psp->adev->dev, "free PSP TMR buffer\n");
> 
>   ret = psp_cmd_submit_buf(psp, NULL, cmd,
>psp->fence_buf_mc_addr);
> --
> 2.25.1


[PATCH] drm/amdgpu: use adev_to_drm to get drm device

2022-08-25 Thread Guchun Chen
adev_to_drm is used everywhere else in the amdgpu code, so use it
in amdgpu_ctx_fini as well for consistency.

Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 8ee4e8491f39..6ea8980c8ad7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -402,7 +402,7 @@ static void amdgpu_ctx_fini(struct kref *ref)
}
}
 
-   if (drm_dev_enter(&adev->ddev, &idx)) {
+   if (drm_dev_enter(adev_to_drm(adev), &idx)) {
amdgpu_ctx_set_stable_pstate(ctx, ctx->stable_pstate);
drm_dev_exit(idx);
}
-- 
2.25.1



[PATCH 0/2] Use kfd_lock/unlock_pdd helpers

2022-08-25 Thread Daniel Phillips
Patch 1 adds kfd_lock_pdd_by_id and patch 2 adds kfd_unlock_pdd helpers,
broken out this way to reduce noise in the first patch, which is a bit
tricky because of modifying a lot of error paths. Patch 2 is trivial.

Daniel Phillips (2):
  drm/amdgpu: Use kfd_lock_pdd_by_id helper in more places
  drm/amdgpu: Use kfd_unlock_pdd helper

 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 130 +--
 1 file changed, 48 insertions(+), 82 deletions(-)

-- 
2.35.1



[PATCH 1/2] drm/amdgpu: Use kfd_lock_pdd_by_id helper in more places

2022-08-25 Thread Daniel Phillips
Convert most of the "mutex_lock; kfd_process_device_data_by_id" occurrences
in kfd_chardev to use the kfd_lock_pdd_by_id helper. These will now
consistently log debug output if the lookup fails. Sites where
kfd_process_device_data_by_id is used without locking are not converted
for now.
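
The lock-and-lookup pattern this relies on can be sketched in user-space
C as follows. This is a simplified model, not the kfd code: toy_mutex,
toy_pdd, toy_process and the toy_* functions are hypothetical, and the
lock is a plain flag so that its state can be checked. The contract shown
is the one visible in the helper's hunk below: on success the entry is
returned with the lock still held, on failure the lock is dropped before
returning NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock so the sketch can check lock state; stands in for a mutex. */
struct toy_mutex {
	bool held;
};

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct toy_pdd {
	unsigned int gpu_id;
};

struct toy_process {
	struct toy_mutex mutex;
	struct toy_pdd pdds[2];
	size_t n_pdds;
};

/* Plain lookup; the caller is expected to hold p->mutex. */
static struct toy_pdd *toy_pdd_by_id(struct toy_process *p,
				     unsigned int gpu_id)
{
	for (size_t i = 0; i < p->n_pdds; i++)
		if (p->pdds[i].gpu_id == gpu_id)
			return &p->pdds[i];
	return NULL;
}

/* Success: return with the lock held. Failure: unlock, return NULL. */
static struct toy_pdd *toy_lock_pdd_by_id(struct toy_process *p,
					  unsigned int gpu_id)
{
	struct toy_pdd *pdd;

	toy_lock(&p->mutex);
	pdd = toy_pdd_by_id(p, gpu_id);
	if (pdd)
		return pdd;

	toy_unlock(&p->mutex);
	return NULL;
}

/* Example caller: the failure path needs no unlock label of its own. */
static int toy_ioctl(struct toy_process *p, unsigned int gpu_id)
{
	struct toy_pdd *pdd = toy_lock_pdd_by_id(p, gpu_id);

	if (!pdd)
		return -EINVAL; /* lock already released by the helper */

	/* ... work on pdd under the lock would go here ... */
	toy_unlock(&p->mutex);
	return 0;
}
```

Because the helper releases the lock itself when the lookup fails, each
caller can return early, which is why the err_pdd labels disappear.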

Signed-off-by: Daniel Phillips 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 96 
 1 file changed, 32 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2b3d8bc8f0aa..bb5528c55b73 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -75,6 +75,7 @@ static inline struct kfd_process_device 
*kfd_lock_pdd_by_id(struct kfd_process *
if (pdd)
return pdd;
 
+   pr_debug("Could not find gpu id 0x%x\n", gpu_id);
mutex_unlock(&p->mutex);
return NULL;
 }
@@ -311,14 +312,9 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
 
pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
 
-   mutex_lock(&p->mutex);
-
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   if (!pdd) {
-   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
-   err = -EINVAL;
-   goto err_pdd;
-   }
+   pdd = kfd_lock_pdd_by_id(p, args->gpu_id);
+   if (!pdd)
+   return -EINVAL;
dev = pdd->dev;
 
pdd = kfd_bind_process_to_device(dev, p);
@@ -405,7 +401,6 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
amdgpu_amdkfd_free_gtt_mem(dev->adev, wptr_bo);
 err_wptr_map_gart:
 err_bind_process:
-err_pdd:
mutex_unlock(&p->mutex);
return err;
 }
@@ -566,13 +561,9 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
return -EINVAL;
}
 
-   mutex_lock(&p->mutex);
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   if (!pdd) {
-   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
-   err = -EINVAL;
-   goto err_pdd;
-   }
+   pdd = kfd_lock_pdd_by_id(p, args->gpu_id);
+   if (!pdd)
+   return -EINVAL;
 
pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
@@ -596,7 +587,6 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
err = -EINVAL;
 
 out:
-err_pdd:
mutex_unlock(&p->mutex);
 
return err;
@@ -609,13 +599,9 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
int err = 0;
struct kfd_process_device *pdd;
 
-   mutex_lock(&p->mutex);
-
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   if (!pdd) {
-   err = -EINVAL;
-   goto err_pdd;
-   }
+   pdd = kfd_lock_pdd_by_id(p, args->gpu_id);
+   if (!pdd)
+   return -EINVAL;
 
pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
@@ -626,7 +612,6 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
kfd_process_set_trap_handler(&pdd->qpd, args->tba_addr, args->tma_addr);
 
 out:
-err_pdd:
mutex_unlock(&p->mutex);
 
return err;
@@ -663,13 +648,12 @@ static int kfd_ioctl_get_clock_counters(struct file 
*filep,
struct kfd_ioctl_get_clock_counters_args *args = data;
struct kfd_process_device *pdd;
 
-   mutex_lock(&p->mutex);
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   mutex_unlock(&p->mutex);
-   if (pdd)
+   pdd = kfd_lock_pdd_by_id(p, args->gpu_id);
+   if (pdd) {
+   mutex_unlock(&p->mutex);
/* Reading GPU clock counter from KGD */
args->gpu_clock_counter = 
amdgpu_amdkfd_get_gpu_clock_counter(pdd->dev->adev);
-   else
+   } else
/* Node without GPU resource */
args->gpu_clock_counter = 0;
 
@@ -886,12 +870,9 @@ static int kfd_ioctl_set_scratch_backing_va(struct file 
*filep,
struct kfd_dev *dev;
long err;
 
-   mutex_lock(&p->mutex);
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   if (!pdd) {
-   err = -EINVAL;
-   goto err_pdd;
-   }
+   pdd = kfd_lock_pdd_by_id(p, args->gpu_id);
+   if (!pdd)
+   return -EINVAL;
dev = pdd->dev;
 
pdd = kfd_bind_process_to_device(dev, p);
@@ -912,7 +893,6 @@ static int kfd_ioctl_set_scratch_backing_va(struct file 
*filep,
return 0;
 
 bind_process_to_device_fail:
-err_pdd:
mutex_unlock(&p->mutex);
return err;
 }
@@ -973,12 +953,9 @@ static int kfd_ioctl_acquire_vm(struct file *filep, struct 
kfd_process *p,
if (!drm_file)
return -EINVAL;
 
-   mutex_lock(&p->mutex);
-   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
-   if (!

Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-25 Thread Christian König

On 25.08.22 at 08:40, Stefan Roese wrote:

On 24.08.22 16:45, Tom Seewald wrote:

On Wed, Aug 24, 2022 at 12:11 AM Lazar, Lijo  wrote:

Unfortunately, I don't have any NV platforms to test. Attached is an
'untested-patch' based on your trace logs.

Thanks,
Lijo


Thank you for the patch. It applied cleanly to v6.0-rc2 and after
booting that kernel I no longer see any messages about PCI errors. I
have uploaded a dmesg log to the bug report:
https://bugzilla.kernel.org/attachment.cgi?id=301642



I did not follow this thread in depth, but FWICT the bug is solved now
with this patch. So is it correct, that the now fully enabled AER
support in the PCI subsystem in v6.0 helped detecting a bug in the AMD
GPU driver?


It looks like it, but I'm not 100% sure about the rationale behind it.

Lijo can you explain more on this?

Thanks,
Christian.



Thanks,
Stefan




Re: [PATCH 1/2] drm/amdgpu: The call to amdgpu_xgmi_remove_device needs to be earlier than psp_hw_fini

2022-08-25 Thread Zhang, Hawking
I thought I reviewed this one together with another one from you that fixed 
hive refcount leak. You sent them in series.

Anyway, go ahead to submit with my RB.

Thanks.

Regards,
Hawking

From: amd-gfx  on behalf of Chai, Thomas 

Date: Thursday, August 25, 2022 at 00:37
To: amd-gfx@lists.freedesktop.org 
Cc: Wang, Yang(Kevin) , Zhang, Hawking 

Subject: RE: [PATCH 1/2] drm/amdgpu: The call to amdgpu_xgmi_remove_device 
needs to be earlier than psp_hw_fini
[AMD Official Use Only - General]

Ping on this series.

-Original Message-
From: Chai, Thomas 
Sent: Friday, August 12, 2022 5:13 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Wang, Yang(Kevin) ; Chai, Thomas 
Subject: [PATCH 1/2] drm/amdgpu: The call to amdgpu_xgmi_remove_device needs to 
be earlier than psp_hw_fini

The amdgpu_xgmi_remove_device function sends an unload command to the PSP 
through the PSP ring to terminate XGMI, but by the time it is called the PSP 
ring has already been destroyed in psp_hw_fini. Call it earlier, before 
psp_hw_fini.

Signed-off-by: YiPeng Chai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c84fdef0ac45..2445255bbf01 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2787,6 +2787,9 @@ static int amdgpu_device_ip_fini_early(struct 
amdgpu_device *adev)

 amdgpu_amdkfd_suspend(adev, false);

+   if (adev->gmc.xgmi.num_physical_nodes > 1)
+   amdgpu_xgmi_remove_device(adev);
+
 /* Workaroud for ASICs need to disable SMC first */
 amdgpu_device_smu_fini_early(adev);

@@ -2830,9 +2833,6 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
 if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
 amdgpu_virt_release_ras_err_handler_data(adev);

-   if (adev->gmc.xgmi.num_physical_nodes > 1)
-   amdgpu_xgmi_remove_device(adev);
-
 amdgpu_amdkfd_device_fini_sw(adev);

 for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
--
2.25.1

