date:20230906

Re: [PATCHv3] drm/amdkfd: Fix unaligned 64-bit doorbell warning

2023-09-06 Thread Lazar, Lijo





On 9/6/2023 9:09 PM, Mukul Joshi wrote:

This patch fixes the following unaligned 64-bit doorbell
warning seen when submitting packets on HIQ on GFX v9.4.3
by making the HIQ doorbell 64-bit aligned.
The warning is seen when GPU is loaded in any mode other
than SPX mode.

[  +0.000301] [ cut here ]
[  +0.03] Unaligned 64-bit doorbell
[  +0.30] WARNING: /amdkfd/kfd_doorbell.c:339 
write_kernel_doorbell64+0x72/0x80
[  +0.03] RIP: 0010:write_kernel_doorbell64+0x72/0x80
[  +0.04] RSP: 0018:c90004287730 EFLAGS: 00010246
[  +0.05] RAX:  RBX:  RCX: 
[  +0.03] RDX: 0001 RSI: 82837c71 RDI: 
[  +0.03] RBP: c90004287748 R08: 0003 R09: 0001
[  +0.02] R10: 001a R11: 88a034008198 R12: c900013bd004
[  +0.03] R13: 0008 R14: c900042877b0 R15: 007f
[  +0.03] FS:  7fa8c7b62000() GS:889f8840() 
knlGS:
[  +0.04] CS:  0010 DS:  ES:  CR0: 80050033
[  +0.03] CR2: 56111c45aaf0 CR3: 0001414f2002 CR4: 00770ee0
[  +0.03] PKRU: 5554
[  +0.02] Call Trace:
[  +0.04]  
[  +0.06]  kq_submit_packet+0x45/0x50 [amdgpu]
[  +0.000524]  pm_send_set_resources+0x7f/0xc0 [amdgpu]
[  +0.000500]  set_sched_resources+0xe4/0x160 [amdgpu]
[  +0.000503]  start_cpsch+0x1c5/0x2a0 [amdgpu]
[  +0.000497]  kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu]
[  +0.000743]  amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu]
[  +0.000602]  amdgpu_device_init.cold+0x1813/0x2176 [amdgpu]
[  +0.000684]  ? pci_bus_read_config_word+0x4a/0x80
[  +0.12]  ? do_pci_enable_device+0xdc/0x110
[  +0.08]  amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
[  +0.000545]  amdgpu_pci_probe+0x197/0x400 [amdgpu]

Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
Signed-off-by: Mukul Joshi 
---
v1->v2:
- Update the logic to make it work with both 32-bit and
   64-bit doorbells.
- Add the Fixes tag.
v2->v3:
- Revert to the original change to align it with what's done in
   amdgpu_doorbell_index_on_bar.

  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index c2e0b79dcc6d..7b38537c7c99 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -162,6 +162,7 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
return NULL;
  
  	*doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, kfd->doorbells, inx);

+   inx *= 2;


To make this clearer, I suggest using a macro here and in
amdgpu_doorbell_index_on_bar():

DOORBELL_INDX2ABS()/DOORBELL_ABS2INDX()

Thanks,
Lijo

  
  	pr_debug("Get kernel queue doorbell\n"

" doorbell offset   == 0x%08X\n"
@@ -176,6 +177,7 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 
__iomem *db_addr)
unsigned int inx;
  
  	inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);

+   inx /= 2;
  
  	mutex_lock(&kfd->doorbell_mutex);

__clear_bit(inx, kfd->doorbell_bitmap);
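
The index conversion discussed in this thread can be sketched in user-space C as follows. This is only an illustration under stated assumptions: the macro names come from the suggestion above, but their bodies, DOORBELL_DWORDS_PER_SLOT and the two helper functions are hypothetical and not taken from the driver. The assumption is that the bitmap counts 64-bit slots while the mapped doorbell pointer is a 32-bit pointer, so one slot spans two dwords.

```c
#include <stdint.h>

/*
 * Hypothetical helpers named after the review suggestion.
 * Assumption: one bitmap index covers two 32-bit dwords, so scaling the
 * index by two keeps every kernel doorbell on an 8-byte boundary.
 */
#define DOORBELL_DWORDS_PER_SLOT 2u
#define DOORBELL_INDX2ABS(inx)   ((inx) * DOORBELL_DWORDS_PER_SLOT)
#define DOORBELL_ABS2INDX(abs)   ((abs) / DOORBELL_DWORDS_PER_SLOT)

/* Byte offset of a slot from the start of the doorbell mapping. */
static inline uint32_t doorbell_byte_offset(uint32_t inx)
{
	return DOORBELL_INDX2ABS(inx) * (uint32_t)sizeof(uint32_t);
}

/* A 64-bit doorbell write needs an 8-byte-aligned offset. */
static inline int doorbell_is_64bit_aligned(uint32_t byte_off)
{
	return (byte_off % 8u) == 0;
}
```

With an unscaled index, slot 3 would land at byte offset 12, which is not 8-byte aligned; with the scaling it lands at offset 24.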

Re: [PATCH] drm/amdgpu: add type conversion for gc info

2023-09-06 Thread Alex Deucher

Acked-by: Alex Deucher 

On Thu, Sep 7, 2023 at 12:20 AM Yifan Zhang  wrote:
>
> gc info usage misses type conversion.
>
> Signed-off-by: Yifan Zhang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 5d179edcc8a8..9ab33b0bbbad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -1439,12 +1439,12 @@ static int amdgpu_discovery_get_gfx_info(struct 
> amdgpu_device *adev)
> adev->gfx.config.num_sc_per_sh = 
> le32_to_cpu(gc_info->v1.gc_num_sc_per_se) /
> le32_to_cpu(gc_info->v1.gc_num_sa_per_se);
> adev->gfx.config.num_packer_per_sc = 
> le32_to_cpu(gc_info->v1.gc_num_packer_per_sc);
> -   if (gc_info->v1.header.version_minor >= 1) {
> +   if (le16_to_cpu(gc_info->v1.header.version_minor) >= 1) {
> adev->gfx.config.gc_num_tcp_per_sa = 
> le32_to_cpu(gc_info->v1_1.gc_num_tcp_per_sa);
> adev->gfx.config.gc_num_sdp_interface = 
> le32_to_cpu(gc_info->v1_1.gc_num_sdp_interface);
> adev->gfx.config.gc_num_tcps = 
> le32_to_cpu(gc_info->v1_1.gc_num_tcps);
> }
> -   if (gc_info->v1.header.version_minor >= 2) {
> +   if (le16_to_cpu(gc_info->v1.header.version_minor) >= 2) {
> adev->gfx.config.gc_num_tcp_per_wpg = 
> le32_to_cpu(gc_info->v1_2.gc_num_tcp_per_wpg);
> adev->gfx.config.gc_tcp_l1_size = 
> le32_to_cpu(gc_info->v1_2.gc_tcp_l1_size);
> adev->gfx.config.gc_num_sqc_per_wgp = 
> le32_to_cpu(gc_info->v1_2.gc_num_sqc_per_wgp);
> @@ -1473,7 +1473,7 @@ static int amdgpu_discovery_get_gfx_info(struct 
> amdgpu_device *adev)
> adev->gfx.config.num_sc_per_sh = 
> le32_to_cpu(gc_info->v2.gc_num_sc_per_se) /
> le32_to_cpu(gc_info->v2.gc_num_sh_per_se);
> adev->gfx.config.num_packer_per_sc = 
> le32_to_cpu(gc_info->v2.gc_num_packer_per_sc);
> -   if (gc_info->v2.header.version_minor == 1) {
> +   if (le16_to_cpu(gc_info->v2.header.version_minor) == 1) {
> adev->gfx.config.gc_num_tcp_per_sa = 
> le32_to_cpu(gc_info->v2_1.gc_num_tcp_per_sh);
> adev->gfx.config.gc_tcp_size_per_cu = 
> le32_to_cpu(gc_info->v2_1.gc_tcp_size_per_cu);
> adev->gfx.config.gc_num_sdp_interface = 
> le32_to_cpu(gc_info->v2_1.gc_num_sdp_interface); /* per XCD */
> --
> 2.37.3
>

Re: [PATCH] drm/amdgpu: Fix refclk reporting for SMU v13.0.6

2023-09-06 Thread Lazar, Lijo





On 9/6/2023 8:53 PM, Alex Deucher wrote:

On Wed, Sep 6, 2023 at 12:05 AM Lijo Lazar  wrote:


SMU v13.0.6 SOCs have 100MHz reference clock.



Do we want to use the vbios value on boards that have a vbios?  If
it's the same on all variants, then this is probably fine as is.



Yes, it's the same for all variants.

Thanks,
Lijo


Alex


Signed-off-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index f5be40d7ba36..28094cd7d9c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -325,7 +325,8 @@ static u32 soc15_get_xclk(struct amdgpu_device *adev)
 u32 reference_clock = adev->clock.spll.reference_freq;

 if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 0) ||
-   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 1))
+   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 1) ||
+   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 6))
 return 1;
 if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(10, 0, 0) ||
 adev->ip_versions[MP1_HWIP][0] == IP_VERSION(10, 0, 1))
--
2.25.1

RE: [PATCH 3/3] drm/mst: adjust the function drm_dp_remove_payload_part2()

2023-09-06 Thread Lin, Wayne

[AMD Official Use Only - General]

> -Original Message-
> From: Imre Deak 
> Sent: Friday, August 25, 2023 9:56 PM
> To: Lin, Wayne 
> Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> ly...@redhat.com; jani.nik...@intel.com; ville.syrj...@linux.intel.com;
> Wentland, Harry ; Zuo, Jerry
> 
> Subject: Re: [PATCH 3/3] drm/mst: adjust the function
> drm_dp_remove_payload_part2()
>
> On Wed, Aug 23, 2023 at 03:16:44AM +, Lin, Wayne wrote:
> > [AMD Official Use Only - General]
> >
> > > -Original Message-
> > > From: Imre Deak 
> > > Sent: Saturday, August 19, 2023 1:46 AM
> > > To: Lin, Wayne 
> > > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> > > ly...@redhat.com; jani.nik...@intel.com;
> > > ville.syrj...@linux.intel.com; Wentland, Harry
> > > ; Zuo, Jerry 
> > > Subject: Re: [PATCH 3/3] drm/mst: adjust the function
> > > drm_dp_remove_payload_part2()
> > >
> > > On Tue, Aug 08, 2023 at 03:47:47AM +, Lin, Wayne wrote:
> > > > [AMD Official Use Only - General]
> > > >
> > > > > -Original Message-
> > > > > From: Imre Deak 
> > > > > Sent: Tuesday, August 8, 2023 12:00 AM
> > > > > To: Lin, Wayne 
> > > > > Cc: dri-de...@lists.freedesktop.org;
> > > > > amd-gfx@lists.freedesktop.org; ly...@redhat.com;
> > > > > jani.nik...@intel.com; ville.syrj...@linux.intel.com; Wentland,
> > > > > Harry ; Zuo, Jerry 
> > > > > Subject: Re: [PATCH 3/3] drm/mst: adjust the function
> > > > > drm_dp_remove_payload_part2()
> > > > >
> > > > > On Mon, Aug 07, 2023 at 02:43:02AM +, Lin, Wayne wrote:
> > > > > > [AMD Official Use Only - General]
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Imre Deak 
> > > > > > > Sent: Friday, August 4, 2023 11:32 PM
> > > > > > > To: Lin, Wayne 
> > > > > > > Cc: dri-de...@lists.freedesktop.org;
> > > > > > > amd-gfx@lists.freedesktop.org; ly...@redhat.com;
> > > > > > > jani.nik...@intel.com; ville.syrj...@linux.intel.com;
> > > > > > > Wentland, Harry ; Zuo, Jerry
> > > > > > > 
> > > > > > > Subject: Re: [PATCH 3/3] drm/mst: adjust the function
> > > > > > > drm_dp_remove_payload_part2()
> > > > > > >
> > > > > > > On Fri, Aug 04, 2023 at 02:20:29PM +0800, Wayne Lin wrote:
> > > > > > > > [...]
> > > > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > > > > > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > > > > > > > index e04f87ff755a..4270178f95f6 100644
> > > > > > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > > > > > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > > > > > > > @@ -3382,8 +3382,7 @@
> > > > > > > EXPORT_SYMBOL(drm_dp_remove_payload_part1);
> > > > > > > >   * drm_dp_remove_payload_part2() - Remove an MST payload
> > > locally
> > > > > > > >   * @mgr: Manager to use.
> > > > > > > >   * @mst_state: The MST atomic state
> > > > > > > > - * @old_payload: The payload with its old state
> > > > > > > > - * @new_payload: The payload with its latest state
> > > > > > > > + * @payload: The payload with its latest state
> > > > > > > >   *
> > > > > > > >   * Updates the starting time slots of all other payloads
> > > > > > > > which would have
> > > > > > > been shifted towards
> > > > > > > >   * the start of the payload ID table as a result of
> > > > > > > > removing a payload. Driver should call this @@ -3392,25
> > > > > > > > +3391,36 @@
> > > > > > > EXPORT_SYMBOL(drm_dp_remove_payload_part1);
> > > > > > > >   */
> > > > > > > >  void drm_dp_remove_payload_part2(struct
> > > > > drm_dp_mst_topology_mgr
> > > > > > > *mgr,
> > > > > > > >  struct
> > > > > > > > drm_dp_mst_topology_state
> > > > > > > *mst_state,
> > > > > > > > -const struct 
> > > > > > > > drm_dp_mst_atomic_payload
> > > > > > > *old_payload,
> > > > > > > > -struct drm_dp_mst_atomic_payload
> > > > > > > *new_payload)
> > > > > > > > +struct
> > > > > > > > + drm_dp_mst_atomic_payload
> > > > > > > *payload)
> > > > > > > >  {
> > > > > > > > struct drm_dp_mst_atomic_payload *pos;
> > > > > > > > +   u8 time_slots_to_remove;
> > > > > > > > +   u8 next_payload_vc_start = mgr->next_start_slot;
> > > > > > > > +
> > > > > > > > +   /* Find the current allocated time slot number of the 
> > > > > > > > payload */
> > > > > > > > > +   list_for_each_entry(pos, &mst_state->payloads, next) {
> > > > > > > > +   if (pos != payload &&
> > > > > > > > +   pos->vc_start_slot > payload->vc_start_slot &&
> > > > > > > > +   pos->vc_start_slot < next_payload_vc_start)
> > > > > > > > +   next_payload_vc_start = pos->vc_start_slot;
> > > > > > > > +   }
> > > > > > > > +
> > > > > > > > +   time_slots_to_remove = next_payload_vc_start -
> > > > > > > > +payload->vc_start_slot;
> > > > > > >
> > > > > > > Imo, the intuitive way would be to pass the old payload
> > > > > > > state to this function -

[pull] amdgpu, amdkfd drm-fixes-6.6

2023-09-06 Thread Alex Deucher

Hi Dave, Daniel,

Fixes for 6.6.  Bigger than usual since this is ~3 weeks of fixes.

The following changes since commit 3698a75f5a98d0a6599e2878ab25d30a82dd836a:

  Merge tag 'drm-intel-next-fixes-2023-08-24' of 
git://anongit.freedesktop.org/drm/drm-intel into drm-next (2023-08-25 12:55:55 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-6.6-2023-09-06

for you to fetch changes up to fbe1a9e0c78134db7e7f48322ab7d6a0530f2ee2:

  drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 (2023-09-06 22:11:51 -0400)


amd-drm-fixes-6.6-2023-09-06:

amdgpu:
- Display replay fixes
- Fixes for headless boards
- Fix documentation breakage
- RAS fixes
- Handle newer IP discovery tables
- SMU 13.0.6 fixes
- SR-IOV fixes
- Display vstartup fixes
- NBIO 7.9 fixes
- Display scaling mode fixes
- Debugfs power reporting fix
- GC 9.4.3 fixes
- Dirty framebuffer fixes for fbcon
- eDP fixes
- DCN 3.1.5 fix
- Display ODM fixes
- GPU core dump fix
- Re-enable zops property now that IGT test is fixed
- Fix possible UAF in CS code
- Cursor degamma fix

amdkfd:
- HMM fixes
- Interrupt masking fix
- GFX11 MQD fixes


Alex Deucher (1):
  drm/amd/pm: fix debugfs pm_info output

Alex Sierra (2):
  drm/amdkfd: retry after EBUSY is returned from hmm_ranges_get_pages
  drm/amdkfd: use mask to get v9 interrupt sq data bits correctly

André Almeida (1):
  drm/amdgpu: Allocate coredump memory in a nonblocking way

Asad Kamal (3):
  drm/amd/pm: Update SMUv13.0.6 PMFW headers
  drm/amd/pm: Add critical temp for GC v9.4.3
  drm/amd/pm: Fix critical temp unit of SMU v13.0.6

Bhawanpreet Lakha (1):
  drm/amd/display: Enable Replay for static screen use cases

Bokun Zhang (1):
  drm/amdgpu/pm: Add notification for no DC support

Candice Li (1):
  drm/amdgpu: Only support RAS EEPROM on dGPU platform

Christian König (1):
  drm/amdgpu: fix amdgpu_cs_p1_user_fence

ChunTao Tso (1):
  drm/amd/display: set minimum of VBlank_nom

Fudong Wang (1):
  drm/amd/display: Add smu write msg id fail retry process

Gabe Teeger (1):
  drm/amd/display: Remove wait while locked

Hamza Mahfooz (7):
  drm/amd/display: fix mode scaling (RMX_.*)
  drm/amdgpu: register a dirty framebuffer callback for fbcon
  drm/amd/display: register edp_backlight_control() for DCN301
  Revert "Revert "drm/amd/display: Implement zpos property""
  Revert "drm/amd/display: Remove v_startup workaround for dcn3+"
  drm/amd/display: limit the v_startup workaround to ASICs older than DCN3.1
  drm/amd/display: prevent potential division by zero errors

Hawking Zhang (4):
  drm/amdgpu: Fix the return for gpu mode1_reset
  drm/amdgpu: Add umc_info v4_0 structure
  drm/amdgpu: Support query ecc cap for aqua_vanjaram
  drm/amdgpu: Free ras cmd input buffer properly

Horace Chen (1):
  drm/amdkfd: use correct method to get clock under SRIOV

Jay Cornwall (1):
  drm/amdkfd: Add missing gfx11 MQD manager callbacks

Le Ma (2):
  drm/amdgpu: update mall info v2 from discovery
  drm/amdgpu: update gc_info v2_1 from discovery

Lijo Lazar (6):
  Documentation/gpu: Update amdgpu documentation
  drm/amdgpu: Unset baco dummy mode on nbio v7.9
  drm/amdgpu: Add bootloader status check
  drm/amdgpu: Add bootloader wait for PSP v13
  drm/amdgpu: Add SMU v13.0.6 default reset methods
  drm/amdgpu: Restrict bootloader wait to SMUv13.0.6

Mangesh Gadre (2):
  drm/amdgpu: Remove SRAM clock gater override by driver
  drm/amdgpu: Updated TCP/UTCL1 programming

Melissa Wen (1):
  drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma

Ovidiu Bunea (1):
  drm/amd/display: Roll back unit correction

Rajneesh Bhardwaj (1):
  drm/amdgpu: Hide xcp partition sysfs under SRIOV

Reza Amini (1):
  drm/amd/display: Correct unit conversion for vstartup

Samir Dhume (1):
  drm/amdgpu/jpeg - skip change of power-gating state for sriov

SungHuai Wang (1):
  drm/amd/display: fix static screen detection setting

Tao Zhou (1):
  drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting

Wenjing Liu (3):
  Partially revert "drm/amd/display: update add plane to context logic with 
a new algorithm"
  drm/amd/display: update blank state on ODM changes
  drm/amd/display: always switch off ODM before committing more streams

YiPeng Chai (1):
  drm/amdgpu: Enable ras for mp0 v13_0_6 sriov

 Documentation/gpu/amdgpu/driver-misc.rst   |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c   |  18 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  18 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  30 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c

[PATCH] drm/amdgpu: change harvest unit to WGP for gfx10 and later

2023-09-06 Thread Yifan Zhang

From gfx10 onwards, the driver has two bitmaps, a CU bitmap and a
WGP bitmap. The current log message for harvesting WGPs is misleading:
the unit being disabled is the WGP, not the CU, for gfx10 and later.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 17 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c |  2 +-
 11 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 83a9607a87b8..81191005854d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -184,7 +184,7 @@ extern uint amdgpu_pcie_lane_cap;
 extern u64 amdgpu_cg_mask;
 extern uint amdgpu_pg_mask;
 extern uint amdgpu_sdma_phase_quantum;
-extern char *amdgpu_disable_cu;
+extern char *amdgpu_disable_wgp_cu;
 extern char *amdgpu_virtual_display;
 extern uint amdgpu_pp_feature_mask;
 extern uint amdgpu_force_long_training;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index ef713806dd60..1eff18649963 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -151,7 +151,7 @@ uint amdgpu_pcie_lane_cap;
 u64 amdgpu_cg_mask = 0x;
 uint amdgpu_pg_mask = 0x;
 uint amdgpu_sdma_phase_quantum = 32;
-char *amdgpu_disable_cu;
+char *amdgpu_disable_wgp_cu;
 char *amdgpu_virtual_display;
 bool enforce_isolation;
 /*
@@ -505,11 +505,11 @@ MODULE_PARM_DESC(sdma_phase_quantum, "SDMA context switch 
phase quantum (x 1K GP
 module_param_named(sdma_phase_quantum, amdgpu_sdma_phase_quantum, uint, 0444);
 
 /**
- * DOC: disable_cu (charp)
- * Set to disable CUs (It's set like se.sh.cu,...). The default is NULL.
+ * DOC: disable_wgp_cu (charp)
+ * Set to disable WGP (gfx10 and later) or CUs (gfx9 and earlier) (It's set 
like se.sh.cu,...). The default is NULL.
  */
-MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
-module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
+MODULE_PARM_DESC(disable_wgp_cu, "Disable WGP or CUs (se.sh.cu,...)");
+module_param_named(disable_wgp_cu, amdgpu_disable_wgp_cu, charp, 0444);
 
 /**
  * DOC: virtual_display (charp)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 2382921710ec..13a24efe2352 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -107,20 +107,23 @@ bool amdgpu_gfx_is_me_queue_enabled(struct amdgpu_device 
*adev,
  * @max_se: number of SEs
  * @max_sh: number of SHs
  *
- * The bitmask of CUs to be disabled in the shader array determined by se and
+ * The bitmask of WGP or CUs to be disabled in the shader array determined by 
se and
  * sh is stored in mask[se * max_sh + sh].
  */
-void amdgpu_gfx_parse_disable_cu(unsigned int *mask, unsigned int max_se, 
unsigned int max_sh)
+void amdgpu_gfx_parse_disable_cu(unsigned int *mask, unsigned int max_se, 
unsigned int max_sh, bool wgp_mode)
 {
unsigned int se, sh, cu;
const char *p;
+   const char *disable_unit;
+
+   disable_unit = wgp_mode ? "WGP" : "CU";
 
memset(mask, 0, sizeof(*mask) * max_se * max_sh);
 
-   if (!amdgpu_disable_cu || !*amdgpu_disable_cu)
+   if (!amdgpu_disable_wgp_cu || !*amdgpu_disable_wgp_cu)
return;
 
-   p = amdgpu_disable_cu;
+   p = amdgpu_disable_wgp_cu;
for (;;) {
char *next;
int ret = sscanf(p, "%u.%u.%u", &se, &sh, &cu);
@@ -131,11 +134,11 @@ void amdgpu_gfx_parse_disable_cu(unsigned int *mask, 
unsigned int max_se, unsign
}
 
if (se < max_se && sh < max_sh && cu < 16) {
-   DRM_INFO("amdgpu: disabling CU %u.%u.%u\n", se, sh, cu);
+   DRM_INFO("amdgpu: disabling %s %u.%u.%u\n", 
disable_unit, se, sh, cu);
mask[se * max_sh + sh] |= 1u << cu;
} else {
-   DRM_ERROR("amdgpu: disable_cu %u.%u.%u is out of 
range\n",
- se, sh, cu);
+   DRM_ERROR("amdgpu: disable_%s %u.%u.%u is out of 
range\n",
+ disable_unit, se, sh, cu);
}
 
next = strchr(p, ',');
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 395c1768b9fc..c13af19c9b82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -473,7 +473,7 @@ static
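
For readers unfamiliar with the "se.sh.cu,..." parameter format, the parsing loop touched by this patch can be approximated in user space as below. This is a sketch, not the driver function: parse_disable_list() is a hypothetical name, the logging is dropped, and the string is passed in as an argument instead of being read from the module parameter.

```c
#include <stdio.h>
#include <string.h>

/*
 * User-space sketch of the "se.sh.cu,..." parser. Returns the number of
 * accepted entries; mask must hold max_se * max_sh elements.
 */
static int parse_disable_list(const char *p, unsigned int *mask,
			      unsigned int max_se, unsigned int max_sh)
{
	unsigned int se, sh, cu;
	int accepted = 0;

	memset(mask, 0, sizeof(*mask) * max_se * max_sh);
	if (!p || !*p)
		return 0;

	for (;;) {
		const char *next;

		/* Each entry is three dot-separated unsigned numbers. */
		if (sscanf(p, "%u.%u.%u", &se, &sh, &cu) < 3)
			break;	/* malformed entry: stop parsing */

		/* Out-of-range entries are skipped, not fatal. */
		if (se < max_se && sh < max_sh && cu < 16) {
			mask[se * max_sh + sh] |= 1u << cu;
			accepted++;
		}

		next = strchr(p, ',');
		if (!next)
			break;
		p = next + 1;
	}
	return accepted;
}
```

The same bit position means a CU on gfx9 and earlier and a WGP on gfx10 and later, which is why the patch only changes the name and the log text rather than the mask layout.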

[PATCH] drm/amdgpu: add type conversion for gc info

2023-09-06 Thread Yifan Zhang

gc info usage misses type conversion.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 5d179edcc8a8..9ab33b0bbbad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1439,12 +1439,12 @@ static int amdgpu_discovery_get_gfx_info(struct 
amdgpu_device *adev)
adev->gfx.config.num_sc_per_sh = 
le32_to_cpu(gc_info->v1.gc_num_sc_per_se) /
le32_to_cpu(gc_info->v1.gc_num_sa_per_se);
adev->gfx.config.num_packer_per_sc = 
le32_to_cpu(gc_info->v1.gc_num_packer_per_sc);
-   if (gc_info->v1.header.version_minor >= 1) {
+   if (le16_to_cpu(gc_info->v1.header.version_minor) >= 1) {
adev->gfx.config.gc_num_tcp_per_sa = 
le32_to_cpu(gc_info->v1_1.gc_num_tcp_per_sa);
adev->gfx.config.gc_num_sdp_interface = 
le32_to_cpu(gc_info->v1_1.gc_num_sdp_interface);
adev->gfx.config.gc_num_tcps = 
le32_to_cpu(gc_info->v1_1.gc_num_tcps);
}
-   if (gc_info->v1.header.version_minor >= 2) {
+   if (le16_to_cpu(gc_info->v1.header.version_minor) >= 2) {
adev->gfx.config.gc_num_tcp_per_wpg = 
le32_to_cpu(gc_info->v1_2.gc_num_tcp_per_wpg);
adev->gfx.config.gc_tcp_l1_size = 
le32_to_cpu(gc_info->v1_2.gc_tcp_l1_size);
adev->gfx.config.gc_num_sqc_per_wgp = 
le32_to_cpu(gc_info->v1_2.gc_num_sqc_per_wgp);
@@ -1473,7 +1473,7 @@ static int amdgpu_discovery_get_gfx_info(struct 
amdgpu_device *adev)
adev->gfx.config.num_sc_per_sh = 
le32_to_cpu(gc_info->v2.gc_num_sc_per_se) /
le32_to_cpu(gc_info->v2.gc_num_sh_per_se);
adev->gfx.config.num_packer_per_sc = 
le32_to_cpu(gc_info->v2.gc_num_packer_per_sc);
-   if (gc_info->v2.header.version_minor == 1) {
+   if (le16_to_cpu(gc_info->v2.header.version_minor) == 1) {
adev->gfx.config.gc_num_tcp_per_sa = 
le32_to_cpu(gc_info->v2_1.gc_num_tcp_per_sh);
adev->gfx.config.gc_tcp_size_per_cu = 
le32_to_cpu(gc_info->v2_1.gc_tcp_size_per_cu);
adev->gfx.config.gc_num_sdp_interface = 
le32_to_cpu(gc_info->v2_1.gc_num_sdp_interface); /* per XCD */
-- 
2.37.3
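
A detail that is easy to get wrong with these conversions is where the closing parenthesis lands: the conversion has to wrap the field, not the comparison. The sketch below models this in user space. demo_swab16() is a hypothetical stand-in for what le16_to_cpu() must do on a big-endian host; it is not the kernel helper, and both function names are made up for illustration.

```c
#include <stdint.h>

/* Stand-in for le16_to_cpu() on a big-endian host: swap the two bytes. */
static uint16_t demo_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Conversion wraps the field; the comparison sees the decoded value. */
static int minor_is_one(uint16_t raw)
{
	return demo_swab16(raw) == 1;
}

/* Conversion wraps the comparison; the raw field is compared unconverted. */
static int minor_is_one_misplaced(uint16_t raw)
{
	return demo_swab16(raw == 1) != 0;
}
```

A little-endian 1 read natively on a big-endian host appears as 0x0100. The first form decodes it and matches; the second compares the raw 0x0100 against 1 and never matches. On little-endian hosts the conversion is a no-op, so both forms behave the same there and the mistake goes unnoticed.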

[PATCH] drm/amdkfd: update struct pm4_mes_runlist

2023-09-06 Thread Lin . Cao

struct pm4_mes_runlist in amdgpu conflicts with the spec. Add the last
dword from the spec to struct pm4_mes_runlist.

Signed-off-by: Lin.Cao 
---
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
index 8b6b2bd5c148..ed937f70895c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
@@ -129,6 +129,16 @@ struct pm4_mes_runlist {
uint32_t ordinal4;
};
 
+   union {
+   struct {
+   uint32_t level_1_static_queue_cnt:4;
+   uint32_t level_2_static_queue_cnt:4;
+   uint32_t level_3_static_queue_cnt:4;
+   uint32_t reserved4:20;
+   } bitfields5;
+   uint32_t ordinal5;
+   };
+
 };
 #endif
 
-- 
2.25.1
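
The added ordinal follows the same pattern as the other dwords in the packet: a bitfield view and a raw 32-bit view share storage through a union. Below is a stand-alone sketch of just that one dword. The union tag and the helper are hypothetical; only the field names and widths come from the patch. Bit ordering inside a bitfield is implementation-defined in C, so the sketch does not rely on any particular raw value of ordinal5.

```c
#include <stdint.h>

/* Sketch of the added fifth ordinal only, not the full packet. */
union demo_runlist_ordinal5 {
	struct {
		uint32_t level_1_static_queue_cnt:4;
		uint32_t level_2_static_queue_cnt:4;
		uint32_t level_3_static_queue_cnt:4;
		uint32_t reserved4:20;
	} bitfields5;
	uint32_t ordinal5;
};

/* Zero the whole dword first so the reserved bits are always clear. */
static union demo_runlist_ordinal5
demo_make_ordinal5(unsigned int l1, unsigned int l2, unsigned int l3)
{
	union demo_runlist_ordinal5 o;

	o.ordinal5 = 0;
	o.bitfields5.level_1_static_queue_cnt = l1 & 0xfu;
	o.bitfields5.level_2_static_queue_cnt = l2 & 0xfu;
	o.bitfields5.level_3_static_queue_cnt = l3 & 0xfu;
	return o;
}
```

The field widths add up to 4 + 4 + 4 + 20 = 32 bits, so the union occupies exactly one dword and extends the packet by one ordinal.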

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng


Hi,


On 2023/9/6 17:40, Christian König wrote:

Am 06.09.23 um 11:08 schrieb suijingfeng:

Well, you are welcome to correct me if I'm wrong.


You seem to have some very basic misunderstandings here.

The term framebuffer describes some VRAM memory used for scanout.

This framebuffer is exposed to userspace through some framebuffer 
driver, on UEFI platforms that is usually efifb but can be quite a 
bunch of different drivers.


When the DRM drivers load they remove the previous drivers using 
drm_aperture_remove_conflicting_pci_framebuffers() (or similar 
function), but this does not mean that the framebuffer or scanout 
parameters are modified in any way. It just means that the framebuffer 
is just no longer exposed through this driver.


Take over is the perfectly right description here because that's 
exactly what's happening. The framebuffer configuration including the 
VRAM memory as well as the parameters for scanout are exposed by the 
newly loaded DRM driver.


In other words, userspace can query through the DRM interfaces which 
monitors are already driven by the hardware and so, in your terminology, 
figure out which is the primary one.



I'm not fully convinced by this idea, although you might be correct.
But there are cases where there are multiple monitors and each video card
is connected to one.

It is also quite common that no monitor is connected: you let the machine
boot first, then find a monitor, connect it to a random display output, and
see which one displays. I don't expect the primary to change because of that.
The primary one has to be determined as early as possible, because
the VGA console and the framebuffer console may output directly to the primary.
Getting DDC and/or HPD involved may unnecessarily complicate the problem.

There are ASpeed BMCs that add a virtual connector in order to be able to
display remotely.
There are also commands to force a connector into the connected status.


It's just that, as Thomas explained as well, this is completely 
irrelevant to any modern desktop. Both X and Wayland iterate the 
available devices and start rendering to them; which one was used 
during boot doesn't really matter to them.



You may be correct, but I'm still not sure.
I probably need more time to investigate.
My colleagues and I mainly use the X server,
with versions ranging from 1.20.4 to 1.21.1.4.
Even if this is true, the problems still exist for non-modern desktops.

Apart from that ranting like this and trying to explain stuff to 
people who obviously have much better background in the topic is not 
going to help your patches getting upstream.




Thanks for sharing so much knowledge;
I now realize where the problems are.
I will try to resolve the concerns in the next version.



Regards,
Christian.

Re: [RFC,drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

2023-09-06 Thread Sui Jingfeng




On 2023/9/7 00:00, Alex Deucher wrote:

On Tue, Sep 5, 2023 at 1:25 PM suijingfeng  wrote:

Hi,


On 2023/9/5 13:50, Christian König wrote:

Am 04.09.23 um 21:57 schrieb Sui Jingfeng:

From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over
which one
is primary at boot time.

Question is why is that useful? Should we give users the ability to
control that?

I don't see a use case for this.


On a specific machine with multiple GPUs mounted, only the
primary graphics device gets POST-ed (initialized) by the firmware.
Therefore the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware. That
setup is what is called POST.

I think that should be regarded as a bug in the driver that should be
fixed and this would not help with that case.  If a driver can't
initialize a device without aid from the pre-OS environment, that
should be fixed in the driver.  This solution also doesn't fix which
device is selected as the primary by the pre-OS environment.  That can
only be fixed in the pre-OS environment code.


One of the use cases is to test whether a specific DRM driver
works properly when it has not been POST-ed. The ast DRM driver
is the first one which refused to work if not POST-ed by the
firmware.

Before applying this series, I could not easily make drm/ast the
primary video card. The problem is that in a configuration with
multiple video cards, the monitor connected to my AST2400 card
does not light up. This is confusing; a naive programmer
may suspect that PRIME is not working.

After applying this series and passing ast.modeset=10 on the
kernel command line, I found that the monitor connected to my
AST2400 video card was still black: it did not display any
image.

The problem with adding modeset=10 is that it only helps when you have
one GPU driven by that driver in the system.  If you have multiple
GPUs driven by that driver, which one would that apply to?  E.g., what
if you have 2 AMD GPUs in the system.


While studying drm/ast, I learned that the drm/ast
driver ships its own POST code; see the ast_post_gpu() function.
I then wondered why this function did not work.

After a short (hasty) debugging session, I found that the ast_post_gpu()
function did not get run, because that depends on
ast->config_mode. Without thinking too much, I hardcoded
ast->config_mode to ast_use_p2a; the key point is to force the
ast_post_gpu() function to run.


```

--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device
*ast)
  }
  }

+   ast->config_mode = ast_use_p2a;
+
  switch (ast->config_mode) {
  case ast_use_defaults:
  drm_info(dev, "Using default configuration\n");

```

Then the monitor lit up and displayed the Ubuntu greeter. Therefore
my patch is useful, at least for Linux DRM driver testers and developers.
It allows programmers to test a specific part of a specific driver without
changing a line of source code and without needing sudo privileges.

It improves the efficiency of testing and patch verification. I know about
the PrimaryGPU option in the Xorg configuration, but that approach persists
the setting, so you need to modify it with root privileges each time
you want to switch the primary. When rapidly developing and/or testing
multiple video drivers with only one computer available, what we really
want is a one-shot command, as provided by this series. So, this is the
first use case.


The second use case is that the firmware is sometimes not reliable.
There are thousands of ARM64, PowerPC and MIPS server machines, and
most of them do not have good UEFI firmware support. I haven't tested
drm/amdgpu and drm/radeon on my ARM64 server yet, because this ARM64
server always uses the platform (BMC) integrated display controller as primary.
Its UEFI firmware does not provide an options menu to change this.
So, at first, the discrete card becomes useless, despite being more 
powerful.
I will take time to carry on the testing, so I will be able to tell more
in the future.


Even on x86, when selecting PEG as primary in the UEFI BIOS menu,
there is no way to tell the BIOS which one of my three
discrete video cards should be the primary. Not to mention some old UEFI
firmware, which doesn't provide a setting at all.
The benefit of my approach is the flexibility.
Yes, i915, amdgpu and radeon are of good quality,
but there may be programmers who want to try nouveau.


The third use case is that VGAARB is also not reliable: it may
select the wrong device as primary, especially on Arm64, LoongArch,
MIPS and similar architectures. The X server will then use this wrong
device as primary and crash completely, either because a driver is
missing or because the driver has a bug and cannot bring the graphics
environment up. VGAARB

RE: [PATCH] drm/amdgpu: fix retry loop test

2023-09-06 Thread Quan, Evan

[AMD Official Use Only - General]

Yeah, nice catch. But personally I would prefer to change the check as "if 
(retry <= 0)".
Either way, the patch is reviewed-by: Evan Quan 

Evan
> -----Original Message-----
> From: Dan Carpenter 
> Sent: Wednesday, September 6, 2023 6:55 PM
> To: Quan, Evan ; Wang, Yang(Kevin)
> 
> Cc: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; Lazar, Lijo
> ; Kamal, Asad ; Zhang,
> Hawking ; Limonciello, Mario
> ; amd-gfx@lists.freedesktop.org; kernel-
> janit...@vger.kernel.org
> Subject: [PATCH] drm/amdgpu: fix retry loop test
>
> This loop will exit with "retry" set to -1 if it fails but the code
> checks for if "retry" is zero.  Fix this by changing post-op to a
> pre-op.  --retry vs retry--.
>
> Fixes: e01eeffc3f86 ("drm/amd/pm: avoid driver getting empty metrics table
> for the first time")
> Signed-off-by: Dan Carpenter 
> ---
> Obviously this only loops 99 times now instead of a hundred but that's
> fine, this is an approximation.
>
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> index ff58ee14a68f..20163a9b2a66 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> @@ -336,7 +336,7 @@ static int smu_v13_0_6_setup_driver_pptable(struct
> smu_context *smu)
>
>   /* Store one-time values in driver PPTable */
>   if (!pptable->Init) {
> - while (retry--) {
> + while (--retry) {
>   ret = smu_v13_0_6_get_metrics_table(smu, NULL,
> true);
>   if (ret)
>   return ret;
> --
> 2.39.2
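
The difference between the two loop forms discussed above can be checked in
isolation. The following is a minimal userspace sketch, not the driver code:
table_ready() is a hypothetical stand-in for the polled condition and is
assumed to never succeed, so both loops run to exhaustion.

```c
#include <assert.h>

/* Hypothetical stand-in for the polled condition; it never succeeds here. */
static int table_ready(void)
{
	return 0;
}

/*
 * Post-decrement form: the final test sees 0 and exits, but the
 * decrement still happens, so the counter is left at -1.
 */
static int poll_post_decrement(int retry)
{
	while (retry--) {
		if (table_ready())
			break;
	}
	return retry;
}

/*
 * Pre-decrement form: the counter is decremented before the test,
 * so the loop exits with the counter at exactly 0.
 */
static int poll_pre_decrement(int retry)
{
	while (--retry) {
		if (table_ready())
			break;
	}
	return retry;
}
```

With a starting value of 100, the first helper returns -1 and the second
returns 0, which is why a later "if (!retry)" check only catches the failure
in the pre-decrement form. A "retry <= 0" check, as suggested in the reply,
would catch both.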

RE: [PATCH v3 2/5] drm/kmb: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Chrisanthus, Anitha

Acked-by: Anitha Chrisanthus 

> -----Original Message-----
> From: Jim Cromie 
> Sent: Wednesday, September 6, 2023 12:02 PM
> To: linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org; amd-
> g...@lists.freedesktop.org; intel-gvt-...@lists.freedesktop.org; intel-
> g...@lists.freedesktop.org
> Cc: daniel.vet...@ffwll.ch; dan...@ffwll.ch; Nikula, Jani
> ; ville.syrj...@linux.intel.com;
> seanp...@chromium.org; robdcl...@gmail.com; Jim Cromie
> ; Chrisanthus, Anitha
> ; Edmund Dea ;
> David Airlie 
> Subject: [PATCH v3 2/5] drm/kmb: add trailing newlines to drm_dbg msgs
> 
> By at least strong convention, a print-buffer's trailing newline says
> "message complete, send it".  The exception (no TNL, followed by a call
> to pr_cont) proves the general rule.
> 
> Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
> 1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.
> 
> No functional changes.
> 
> Signed-off-by: Jim Cromie 
> ---
>  drivers/gpu/drm/kmb/kmb_crtc.c  | 10 +-
>  drivers/gpu/drm/kmb/kmb_plane.c |  6 +++---
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_crtc.c
> b/drivers/gpu/drm/kmb/kmb_crtc.c
> index 647872f65bff..a58baf25322d 100644
> --- a/drivers/gpu/drm/kmb/kmb_crtc.c
> +++ b/drivers/gpu/drm/kmb/kmb_crtc.c
> @@ -94,7 +94,7 @@ static void kmb_crtc_set_mode(struct drm_crtc *crtc,
>   vm.hback_porch = 0;
>   vm.hsync_len = 28;
> 
> - drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d
> h-active=%d h-bp=%d h-fp=%d hsync-l=%d",
> + drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d
> h-active=%d h-bp=%d h-fp=%d hsync-l=%d\n",
>   __func__, __LINE__,
>   m->crtc_vdisplay, vm.vback_porch, vm.vfront_porch,
>   vm.vsync_len, m->crtc_hdisplay, vm.hback_porch,
> @@ -194,24 +194,24 @@ static enum drm_mode_status
>   int vfp = mode->vsync_start - mode->vdisplay;
> 
>   if (mode->vdisplay < KMB_CRTC_MAX_HEIGHT) {
> - drm_dbg(dev, "height = %d less than %d",
> + drm_dbg(dev, "height = %d less than %d\n",
>   mode->vdisplay, KMB_CRTC_MAX_HEIGHT);
>   return MODE_BAD_VVALUE;
>   }
>   if (mode->hdisplay < KMB_CRTC_MAX_WIDTH) {
> - drm_dbg(dev, "width = %d less than %d",
> + drm_dbg(dev, "width = %d less than %d\n",
>   mode->hdisplay, KMB_CRTC_MAX_WIDTH);
>   return MODE_BAD_HVALUE;
>   }
>   refresh = drm_mode_vrefresh(mode);
>   if (refresh < KMB_MIN_VREFRESH || refresh > KMB_MAX_VREFRESH) {
> - drm_dbg(dev, "refresh = %d less than %d or greater than %d",
> + drm_dbg(dev, "refresh = %d less than %d or greater than
> %d\n",
>   refresh, KMB_MIN_VREFRESH, KMB_MAX_VREFRESH);
>   return MODE_BAD;
>   }
> 
>   if (vfp < KMB_CRTC_MIN_VFP) {
> - drm_dbg(dev, "vfp = %d less than %d", vfp,
> KMB_CRTC_MIN_VFP);
> + drm_dbg(dev, "vfp = %d less than %d\n", vfp,
> KMB_CRTC_MIN_VFP);
>   return MODE_BAD;
>   }
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_plane.c
> b/drivers/gpu/drm/kmb/kmb_plane.c
> index 9e0562aa2bcb..308bd1cb50c8 100644
> --- a/drivers/gpu/drm/kmb/kmb_plane.c
> +++ b/drivers/gpu/drm/kmb/kmb_plane.c
> @@ -78,7 +78,7 @@ static unsigned int check_pixel_format(struct drm_plane
> *plane, u32 format)
>* plane configuration is not supported.
>*/
>   if (init_disp_cfg.format && init_disp_cfg.format != format) {
> - drm_dbg(&kmb->drm, "Cannot change format after initial
> plane configuration");
> + drm_dbg(&kmb->drm, "Cannot change format after initial
> plane configuration\n");
>   return -EINVAL;
>   }
>   for (i = 0; i < plane->format_count; i++) {
> @@ -124,7 +124,7 @@ static int kmb_plane_atomic_check(struct drm_plane
> *plane,
>   if ((init_disp_cfg.width && init_disp_cfg.height) &&
>   (init_disp_cfg.width != fb->width ||
>   init_disp_cfg.height != fb->height)) {
> - drm_dbg(&kmb->drm, "Cannot change plane height or width
> after initial configuration");
> + drm_dbg(&kmb->drm, "Cannot change plane height or width
> after initial configuration\n");
>   return -EINVAL;
>   }
>   can_position = (plane->type == DRM_PLANE_TYPE_OVERLAY);
> @@ -375,7 +375,7 @@ static void kmb_plane_atomic_update(struct
> drm_plane *plane,
>   spin_lock_irq(&kmb->irq_lock);
>   if (kmb->kmb_under_flow || kmb->kmb_flush_done) {
>   spin_unlock_irq(&kmb->irq_lock);
> - drm_dbg(&kmb->drm, "plane_update:underflow
> returning");
> + drm_dbg(&kmb->drm, "plane_update:underflow
> returning\n");
>   return;
>   }
>   spin_unlock_irq(&kmb->irq_lock);
> --
> 2.41.0
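
The convention described in the commit message (a trailing newline marks a
message as complete) can be illustrated with a toy userspace logger. This is
only a sketch of the convention, not how printk or drm_dbg are implemented;
log_fragment(), log_buf and records_emitted are hypothetical names invented
for the illustration.

```c
#include <stdio.h>
#include <string.h>

#define LOG_BUF_SIZE 256

static char log_buf[LOG_BUF_SIZE];
static size_t log_len;
static int records_emitted;

/*
 * Append a fragment to the pending record. The record is only emitted
 * once the accumulated text ends in '\n'; until then, later fragments
 * are glued onto it.
 */
static void log_fragment(const char *msg)
{
	size_t n = strlen(msg);

	/* Truncate rather than overflow the pending buffer. */
	if (n > LOG_BUF_SIZE - 1 - log_len)
		n = LOG_BUF_SIZE - 1 - log_len;
	memcpy(log_buf + log_len, msg, n);
	log_len += n;
	log_buf[log_len] = '\0';

	if (log_len > 0 && log_buf[log_len - 1] == '\n') {
		fputs(log_buf, stdout);
		records_emitted++;
		log_len = 0;
	}
}
```

In this model a message without a trailing newline stays pending and runs
into whatever is logged next, which is the kind of ambiguity the newline
convention is meant to avoid.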

Re: [PATCH v3 3/5] drm/msm: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Abhinav Kumar


Hi Jim

On 9/6/2023 12:02 PM, Jim Cromie wrote:

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.


May I know what 207, 1288 mean here? Is it the number of callers already 
having \n?


If so, this might be a bit confusing as it's subject to the code-base 
you are referring to. So I would just stop at "Most DRM.debug calls 
already comport with this".




No functional changes.

Signed-off-by: Jim Cromie 
---
  drivers/gpu/drm/msm/msm_fb.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)



The change itself LGTM, hence

Reviewed-by: Abhinav Kumar

Re: [PATCHv3] drm/amdkfd: Fix unaligned 64-bit doorbell warning

2023-09-06 Thread Felix Kuehling


On 2023-09-06 11:39, Mukul Joshi wrote:

This patch fixes the following unaligned 64-bit doorbell
warning seen when submitting packets on HIQ on GFX v9.4.3
by making the HIQ doorbell 64-bit aligned.
The warning is seen when GPU is loaded in any mode other
than SPX mode.

[  +0.000301] [ cut here ]
[  +0.03] Unaligned 64-bit doorbell
[  +0.30] WARNING: /amdkfd/kfd_doorbell.c:339 
write_kernel_doorbell64+0x72/0x80
[  +0.03] RIP: 0010:write_kernel_doorbell64+0x72/0x80
[  +0.04] RSP: 0018:c90004287730 EFLAGS: 00010246
[  +0.05] RAX:  RBX:  RCX: 
[  +0.03] RDX: 0001 RSI: 82837c71 RDI: 
[  +0.03] RBP: c90004287748 R08: 0003 R09: 0001
[  +0.02] R10: 001a R11: 88a034008198 R12: c900013bd004
[  +0.03] R13: 0008 R14: c900042877b0 R15: 007f
[  +0.03] FS:  7fa8c7b62000() GS:889f8840() 
knlGS:
[  +0.04] CS:  0010 DS:  ES:  CR0: 80050033
[  +0.03] CR2: 56111c45aaf0 CR3: 0001414f2002 CR4: 00770ee0
[  +0.03] PKRU: 5554
[  +0.02] Call Trace:
[  +0.04]  
[  +0.06]  kq_submit_packet+0x45/0x50 [amdgpu]
[  +0.000524]  pm_send_set_resources+0x7f/0xc0 [amdgpu]
[  +0.000500]  set_sched_resources+0xe4/0x160 [amdgpu]
[  +0.000503]  start_cpsch+0x1c5/0x2a0 [amdgpu]
[  +0.000497]  kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu]
[  +0.000743]  amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu]
[  +0.000602]  amdgpu_device_init.cold+0x1813/0x2176 [amdgpu]
[  +0.000684]  ? pci_bus_read_config_word+0x4a/0x80
[  +0.12]  ? do_pci_enable_device+0xdc/0x110
[  +0.08]  amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
[  +0.000545]  amdgpu_pci_probe+0x197/0x400 [amdgpu]

Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
Signed-off-by: Mukul Joshi 


Reviewed-by: Felix Kuehling 



---
v1->v2:
- Update the logic to make it work with both 32 bit and
   64 bit doorbells.
- Add the Fixes tag
v2->v3:
- Revert to the original change to align it with what's done in
   amdgpu_doorbell_index_on_bar.

  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index c2e0b79dcc6d..7b38537c7c99 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -162,6 +162,7 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
return NULL;
  
  	*doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, kfd->doorbells, inx);

+   inx *= 2;
  
  	pr_debug("Get kernel queue doorbell\n"

" doorbell offset   == 0x%08X\n"
@@ -176,6 +177,7 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 
__iomem *db_addr)
unsigned int inx;
  
  	inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);

+   inx /= 2;
  
  	mutex_lock(&kfd->doorbell_mutex);

__clear_bit(inx, kfd->doorbell_bitmap);
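
The index arithmetic in the two hunks above can be sketched outside the
driver. The example assumes that doorbell slots are 32 bits wide (as the
"u32 __iomem *" pointer in the diff suggests) and that the doorbell base is
itself 8-byte aligned; the helper names are hypothetical and are not part of
amdkfd.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one doorbell slot is 32 bits wide. */
#define DOORBELL_SLOT_BYTES sizeof(uint32_t)

/* Allocation side: double the bitmap index to get a 32-bit slot index. */
static unsigned int slot_from_bitmap_index(unsigned int inx)
{
	return inx * 2;
}

/* Release side: halve the slot index to recover the bitmap index. */
static unsigned int bitmap_index_from_slot(unsigned int slot)
{
	return slot / 2;
}

/* A slot is usable for a 64-bit write if its byte offset is 8-byte aligned. */
static int slot_is_64bit_aligned(unsigned int slot)
{
	return (slot * DOORBELL_SLOT_BYTES) % sizeof(uint64_t) == 0;
}
```

Under these assumptions every even slot index gives a byte offset that is a
multiple of 8, so doubling the index on allocation and halving it on release
keeps the two paths consistent while making every kernel doorbell 64-bit
aligned.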

Re: [PATCH v2 07/34] drm/amd/display: explicitly define EOTF and inverse EOTF

2023-09-06 Thread Harry Wentland




On 2023-08-25 10:18, Melissa Wen wrote:
> On 08/22, Pekka Paalanen wrote:
>> On Thu, 10 Aug 2023 15:02:47 -0100
>> Melissa Wen  wrote:
>>
>>> Instead of relying on color block names to get the transfer function
>>> intention regarding encoding pixel's luminance, define supported
>>> Electro-Optical Transfer Functions (EOTFs) and inverse EOTFs, that
>>> includes pure gamma or standardized transfer functions.
>>>
>>> Suggested-by: Harry Wentland 
>>> Signed-off-by: Melissa Wen 
>>> ---
>>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 19 +++--
>>>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 69 +++
>>>  2 files changed, 67 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
>>> index c749c9cb3d94..f6251ed89684 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
>>> @@ -718,14 +718,21 @@ extern const struct amdgpu_ip_block_version 
>>> dm_ip_block;
>>>  
>>>  enum amdgpu_transfer_function {
>>> AMDGPU_TRANSFER_FUNCTION_DEFAULT,
>>> -   AMDGPU_TRANSFER_FUNCTION_SRGB,
>>> -   AMDGPU_TRANSFER_FUNCTION_BT709,
>>> -   AMDGPU_TRANSFER_FUNCTION_PQ,
>>> +   AMDGPU_TRANSFER_FUNCTION_SRGB_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_BT709_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_PQ_EOTF,
>>> AMDGPU_TRANSFER_FUNCTION_LINEAR,
>>> AMDGPU_TRANSFER_FUNCTION_UNITY,
>>> -   AMDGPU_TRANSFER_FUNCTION_GAMMA22,
>>> -   AMDGPU_TRANSFER_FUNCTION_GAMMA24,
>>> -   AMDGPU_TRANSFER_FUNCTION_GAMMA26,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA22_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA24_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA26_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_SRGB_INV_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_BT709_INV_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_PQ_INV_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA22_INV_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA24_INV_EOTF,
>>> +   AMDGPU_TRANSFER_FUNCTION_GAMMA26_INV_EOTF,
>>> +AMDGPU_TRANSFER_FUNCTION_COUNT
>>>  };
>>>  
>>>  struct dm_plane_state {
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
>>> index 56ce008b9095..cc2187c0879a 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
>>> @@ -85,18 +85,59 @@ void amdgpu_dm_init_color_mod(void)
>>>  }
>>>  
>>>  #ifdef AMD_PRIVATE_COLOR
>>> -static const struct drm_prop_enum_list 
>>> amdgpu_transfer_function_enum_list[] = {
>>> -   { AMDGPU_TRANSFER_FUNCTION_DEFAULT, "Default" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_SRGB, "sRGB" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_BT709, "BT.709" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_PQ, "PQ (Perceptual Quantizer)" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_LINEAR, "Linear" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_UNITY, "Unity" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_GAMMA22, "Gamma 2.2" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_GAMMA24, "Gamma 2.4" },
>>> -   { AMDGPU_TRANSFER_FUNCTION_GAMMA26, "Gamma 2.6" },
>>> +static const char * const
>>> +amdgpu_transfer_function_names[] = {
>>> +   [AMDGPU_TRANSFER_FUNCTION_DEFAULT]  = "Default",
>>> +   [AMDGPU_TRANSFER_FUNCTION_LINEAR]   = "Linear",
>>
>> Hi,
>>
>> if the below is identity, then what is linear? Is there a coefficient
>> (multiplier) somewhere? Offset?
>>
>>> +   [AMDGPU_TRANSFER_FUNCTION_UNITY]= "Unity",
>>
>> Should "Unity" be called "Identity"?
> 
> AFAIU, AMD treats Linear and Unity as the same: Identity. So, IIUC,
> indeed merging both as identity sounds the best approach.   

Agreed.

>>
>> Doesn't unity mean that the output is always 1.0 regardless of input?
>>
>>> +   [AMDGPU_TRANSFER_FUNCTION_SRGB_EOTF]= "sRGB EOTF",
>>> +   [AMDGPU_TRANSFER_FUNCTION_BT709_EOTF]   = "BT.709 EOTF",
>>
>> BT.709 says about "Overall opto-electronic transfer characteristics at
>> source":
>>
>>  In typical production practice the encoding function of image
>>  sources is adjusted so that the final picture has the desired
>>  look, as viewed on a reference monitor having the reference
>>  decoding function of Recommendation ITU-R BT.1886, in the
>>  reference viewing environment defined in Recommendation ITU-R
>>  BT.2035.
>>
>> IOW, typically people tweak the encoding function instead of using
>> BT.709 OETF as is, which means that inverting the BT.709 OETF produces
>> something slightly unknown. The note about BT.1886 means that that
>> something is also not quite how it's supposed to be turned into light.
>>
>> Should this enum item be "BT.709 inverse OETF" and respectively below a
>> "BT.709 OETF"?
>>
>> What curve does the hardware actually implement?
> 
> H.. I think I got confused in using OETF here since it's done within
> a camera. Looking at the coefficients used by AMD color

Re: [V11 3/8] wifi: mac80211: Add support for WBRF features

2023-09-06 Thread Mario Limonciello


On 9/1/2023 09:32, Jeff Johnson wrote:

On 8/30/2023 11:20 PM, Evan Quan wrote:

To support the WBRF mechanism, Wifi adapters utilized in the system must


Since this is the first mention of WBRF in the core wireless code IMO 
you should indicate what this is an acronym for and briefly describe it

(or add a lore link).


A lot of information is captured in the cover letter and earlier 
commits.  I think you raise a good point that 10 years from now someone 
looking at random commits will have a hard time understanding what 
exactly WBRF stands for.


How about if we introduce a wbrf.rst somewhere in Documentation/ that 
explains the basic principles of how/why for it.  This Documentation 
patch could be the first in the series and then the commit message for 
wireless subsystem can tell people to look at that path for more 
information.




I'm wondering if WBRF is just a special case of frequency avoidance, and 
that more generic naming/terminology should be used in core wireless.
For example, I know there are vendor-specific solutions which allow 
Wi-Fi to avoid using channels which may conflict with cellular or 
BlueTooth, and those may benefit from a more generic




It seems to me that most vendor solutions that exist don't operate in 
the kernel code but usually in firmware based solutions, right?


I think to come up with a generic solution we need to first have a 
vendor that "wants" to participate in a generic solution to design it 
properly.



register the frequencies in use(or unregister those frequencies no longer
used) via the dedicated calls. So that, other drivers responding to the
frequencies can take proper actions to mitigate possible interference.

Co-developed-by: Mario Limonciello 
Signed-off-by: Mario Limonciello 
Co-developed-by: Evan Quan 
Signed-off-by: Evan Quan 
--
v1->v2:
   - place the new added member(`wbrf_supported`) in
 ieee80211_local(Johannes)
   - handle chandefs change scenario properly(Johannes)
   - some minor fixes around code sharing and possible invalid input
 checks(Johannes)
v2->v3:
   - drop unnecessary input checks and intermediate APIs(Mario)
   - Separate some mac80211 common code(Mario, Johannes)
v3->v4:
   - some minor fixes around return values(Johannes)
v9->v10:
   - get ranges_in->num_of_ranges set and passed in(Johannes)
---
  include/linux/ieee80211.h  |   1 +
  net/mac80211/Makefile  |   2 +
  net/mac80211/chan.c    |   9 
  net/mac80211/ieee80211_i.h |   9 
  net/mac80211/main.c    |   2 +
  net/mac80211/wbrf.c    | 105 +
  6 files changed, 128 insertions(+)
  create mode 100644 net/mac80211/wbrf.c

diff --git a/include/linux/ieee80211.h b/include/linux/ieee80211.h
index 4b998090898e..f995d06da87f 100644
--- a/include/linux/ieee80211.h
+++ b/include/linux/ieee80211.h
@@ -4335,6 +4335,7 @@ static inline int 
ieee80211_get_tdls_action(struct sk_buff *skb, u32 hdr_size)

  /* convert frequencies */
  #define MHZ_TO_KHZ(freq) ((freq) * 1000)
  #define KHZ_TO_MHZ(freq) ((freq) / 1000)
+#define KHZ_TO_HZ(freq)  ((freq) * 1000)
  #define PR_KHZ(f) KHZ_TO_MHZ(f), f % 1000
  #define KHZ_F "%d.%03d"
diff --git a/net/mac80211/Makefile b/net/mac80211/Makefile
index b8de44da1fb8..d46c36f55fd3 100644
--- a/net/mac80211/Makefile
+++ b/net/mac80211/Makefile
@@ -65,4 +65,6 @@ rc80211_minstrel-$(CONFIG_MAC80211_DEBUGFS) += \
  mac80211-$(CONFIG_MAC80211_RC_MINSTREL) += $(rc80211_minstrel-y)
+mac80211-y += wbrf.o
+
  ccflags-y += -DDEBUG
diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c
index 68952752b599..458469c224ae 100644
--- a/net/mac80211/chan.c
+++ b/net/mac80211/chan.c
@@ -506,11 +506,16 @@ static void _ieee80211_change_chanctx(struct 
ieee80211_local *local,

  WARN_ON(!cfg80211_chandef_compatible(&ctx->conf.def, chandef));
+    ieee80211_remove_wbrf(local, &ctx->conf.def);
+
  ctx->conf.def = *chandef;
  /* check if min chanctx also changed */
  changed = IEEE80211_CHANCTX_CHANGE_WIDTH |
    _ieee80211_recalc_chanctx_min_def(local, ctx, rsvd_for);
+
+    ieee80211_add_wbrf(local, &ctx->conf.def);
+
  drv_change_chanctx(local, ctx, changed);
  if (!local->use_chanctx) {
@@ -668,6 +673,8 @@ static int ieee80211_add_chanctx(struct 
ieee80211_local *local,

  lockdep_assert_held(&local->mtx);
  lockdep_assert_held(&local->chanctx_mtx);
+    ieee80211_add_wbrf(local, &ctx->conf.def);
+
  if (!local->use_chanctx)
  local->hw.conf.radar_enabled = ctx->conf.radar_enabled;
@@ -748,6 +755,8 @@ static void ieee80211_del_chanctx(struct 
ieee80211_local *local,

  }
  ieee80211_recalc_idle(local);
+
+    ieee80211_remove_wbrf(local, &ctx->conf.def);
  }
  static void ieee80211_free_chanctx(struct ieee80211_local *local,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 91633a0b723e..719f2c892132 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1600,6 +1600,8 @@ struct ieee80211_local {
  /* extended capabilities

[PATCH] drm/radeon: make fence wait in suballocator uninterruptible

2023-09-06 Thread Alex Deucher

Commit 254986e324ad ("drm/radeon: Use the drm suballocation manager 
implementation.")
made the fence wait in radeon_sa_bo_new() interruptible but there is no
code to handle an interrupt. This caused the kernel to randomly explode
in high-VRAM-pressure situations so make it uninterruptible again.

Fixes: 254986e324ad ("drm/radeon: Use the drm suballocation manager 
implementation.")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2769
Signed-off-by: Alex Deucher 
CC: sta...@vger.kernel.org # 6.4+
CC: Simon Pilkington 
---
 drivers/gpu/drm/radeon/radeon_sa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sa.c 
b/drivers/gpu/drm/radeon/radeon_sa.c
index c87a57c9c592..22dd8b445685 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -123,7 +123,7 @@ int radeon_sa_bo_new(struct radeon_sa_manager *sa_manager,
 unsigned int size, unsigned int align)
 {
 struct drm_suballoc *sa = drm_suballoc_new(&sa_manager->base, size,
-  GFP_KERNEL, true, align);
+  GFP_KERNEL, false, align);
 
if (IS_ERR(sa)) {
*sa_bo = NULL;
-- 
2.41.0

Re: [PATCH v2 00/34] drm/amd/display: add AMD driver-specific properties for color mgmt

2023-09-06 Thread Harry Wentland

On 2023-08-10 12:02, Melissa Wen wrote:
> Hi all,
> 
> Here is the next version of our work to enable AMD driver-specific color
> management properties [1][2]. This series is a collection of
> contributions from Joshua, Harry, and me to enhance the AMD KMS color
> pipeline for Steam Deck/SteamOS by exposing additional pre-blending and
> post-blending color capabilities from those available in the current DRM
> KMS API[3].
> 
> The userspace case here is Gamescope which is the compositor for
> SteamOS. Gamescope is already using these features to implement its
> color management pipeline [4].
> 
> In this version, I try to address all concerns shared in the previous
> one, i.e.:
> - Replace DRM_ by AMDGPU_ prefix for transfer function enumeration; 
> - Explicitly define EOTFs and inverse EOTFs and set props accordingly;
> - Document pre-defined transfer functions;
> - Remove misleading comments;
> - Remove post-blending/MPC shaper and 3D LUT support;
> - Move driver-specific property operations from amdgpu_display.c to
>   amdgpu_dm_color.c;
> - Reset planes if any color props change;
> - Nits/small fixes;
> 
> Bearing in mind the complexity of color concepts, I believe there is a
> high chance of some misunderstanding from my side when defining EOTFs
> and documenting pre-defined TFs. So, reviews are very important and
> welcome (thanks in advance). FWIW, I added Harry as a co-developer of
> this TF documentation since I based on his description of EOTF/inv_EOTF
> and previous documentation work [5]. Let me know if there is a better
> way for credits.
> 
> Two DC patches were already applied and, therefore, removed from the
> series. I added r-b according to previous feedback. We also add plane
> CTM to driver-specific properties. As a result, this is the updated list
> of all driver-specific color properties exposed by this series:
> 
> - plane degamma LUT and pre-defined TF;
> - plane HDR multiplier;
> - plane CTM 3x4;
> - plane shaper LUT and pre-defined TF;
> - plane 3D LUT;
> - plane blend LUT and pre-defined TF;
> - CRTC gamma pre-defined TF;
> 
> Remember you can find the AMD HW color capabilities documented here:
> https://dri.freedesktop.org/docs/drm/gpu/amdgpu/display/display-manager.html#color-management-properties
> 
> Worth mentioning that the pre-blending degamma block can use ROM curves
> for some pre-defined TFs, but the other blocks use the AMD color module
> to calculate this curve considering pre-defined coefficients.
> 
> We need changes on DC gamut remap matrix to support the plane and CRTC
> CTM on drivers that support both. I've sent a previous patch to apply
> these changes to all DCN3+ families [6]. Here I use the same changes but
> limited to DCN301. Just let me know if you prefer the previous/expanded
> version.
> 
> Finally, this is the Linux/AMD color management API before and after
> blending with the driver-specific properties:
> 
> +----------------------+
> |   PLANE              |
> |                      |
> |  +----------------+  |
> |  | AMD Degamma    |  |
> |  |                |  |
> |  | EOTF | 1D LUT  |  |
> |  +--------+-------+  |
> |           |          |
> |  +--------v-------+  |
> |  |    AMD HDR     |  |
> |  |    Multiply    |  |
> |  +--------+-------+  |
> |           |          |
> |  +--------v-------+  |
> |  |  AMD CTM (3x4) |  |
> |  +--------+-------+  |
> |           |          |
> |  +--------v-------+  |
> |  | AMD Shaper     |  |
> |  |                |  |
> |  | inv_EOTF |     |  |
> |  | Custom 1D LUT  |  |
> |  +--------+-------+  |
> |           |          |
> |  +--------v-------+  |
> |  |   AMD 3D LUT   |  |
> |  |   17^3/12-bit  |  |
> |  +--------+-------+  |
> |           |          |
> |  +--------v-------+  |
> |  | AMD Blend      |  |
> |  |                |  |
> |  | EOTF | 1D LUT  |  |
> |  +--------+-------+  |
> |           |          |
> ++----------v---------++
> ||      Blending      ||
> ++----------+---------++
> |    CRTC   |          |
> |           |          |
> |   +-------v-------+  |
> |   | DRM Degamma   |  |
> |   |               |  |
> |   | Custom 1D LUT |  |
> |   +-------+-------+  |
> |           |          |
> |   +-------v-------+  |
> |   | DRM CTM (3x3) |  |
> |   +-------+-------+  |
> |           |          |
> |   +-------v-------+  |
> |   | DRM Gamma     |  |
> |   |               |  |
> |   | Custom 1D LUT |  |
> |   +---------------+  |
> |   | *AMD Gamma    |  |
> |   |   inv_EOTF    |  |
> |   +---------------+  |
> |                      |
> +----------------------+
> 
> Let me know your thoughts.
> 

Thanks again for your amazing work on this.

Patches 5, 6, 14, 16, and 24 are
Reviewed-by: Harry Wentland 

I left comments on the remaining unreviewed patches.

Harry

> Best Regards,
> 
> Melissa Wen
> 
> [1] https://lore.kernel.org/dri-devel/20230423141051.702990-1-m...@igalia.com
> [2] https://lore.kernel.org/dri-devel/20230523221520.3115570-1-m...@igalia.com
> [3] 
>

Re: [PATCH v2 11/34] drm/amd/display: add plane shaper LUT and TF driver-specific properties

2023-09-06 Thread Harry Wentland

On 2023-08-10 12:02, Melissa Wen wrote:
> On AMD HW, 3D LUT always assumes a preceding shaper 1D LUT used for
> delinearizing and/or normalizing the color space before applying a 3D
> LUT. Add pre-defined transfer function to enable delinearizing content
> with or without shaper LUT, where AMD color module calculates the
> resulted shaper curve. We apply an inverse EOTF to go from linear values
> to encoded values. If we are already in a non-linear space and/or don't
> need to normalize values, we can bypass shaper LUT with a linear
> transfer function that is also the default TF value.
> 

I think the color module will combine the TF and the custom 1D LUT
into the LUT that's actually programmed. We should spell out this
behavior in the comments below and in the patch description as it's
important for a userspace application to know.

The same applies to all other TF+LUT blocks.

Harry

> v2:
> - squash commits for shaper LUT and shaper TF
> - define inverse EOTF as supported shaper TFs
> 
> Signed-off-by: Melissa Wen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 16 ++
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 11 +++
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 29 +
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 32 +++
>  4 files changed, 88 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index 730a88236501..4fb164204ee6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -363,6 +363,22 @@ struct amdgpu_mode_info {
>* @plane_hdr_mult_property:
>*/
>   struct drm_property *plane_hdr_mult_property;
> + /**
> +  * @shaper_lut_property: Plane property to set pre-blending shaper LUT
> +  * that converts color content before 3D LUT.
> +  */
> + struct drm_property *plane_shaper_lut_property;
> + /**
> +  * @shaper_lut_size_property: Plane property for the size of
> +  * pre-blending shaper LUT as supported by the driver (read-only).
> +  */
> + struct drm_property *plane_shaper_lut_size_property;
> + /**
> +  * @plane_shaper_tf_property: Plane property to set a predefined
> +  * transfer function for pre-blending shaper (before applying 3D LUT)
> +  * with or without LUT.
> +  */
> + struct drm_property *plane_shaper_tf_property;
>   /**
>* @plane_lut3d_property: Plane property for gamma correction using a
>* 3D LUT (pre-blending).
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index deea90212e31..6b6c2980f0af 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -769,6 +769,17 @@ struct dm_plane_state {
>* S31.32 sign-magnitude.
>*/
>   __u64 hdr_mult;
> + /**
> +  * @shaper_lut: shaper lookup table blob. The blob (if not NULL) is an
> +  * array of  drm_color_lut.
> +  */
> + struct drm_property_blob *shaper_lut;
> + /**
> +  * @shaper_tf:
> +  *
> +  * Predefined transfer function to delinearize color space.
> +  */
> + enum amdgpu_transfer_function shaper_tf;
>   /**
>* @lut3d: 3D lookup table blob. The blob (if not NULL) is an array of
>*  drm_color_lut.
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index 7e6d4df99a0c..fbcee717bf0a 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -151,6 +151,14 @@ static const u32 amdgpu_eotf =
>   BIT(AMDGPU_TRANSFER_FUNCTION_GAMMA24_EOTF) |
>   BIT(AMDGPU_TRANSFER_FUNCTION_GAMMA26_EOTF);
>  
> +static const u32 amdgpu_inv_eotf =
> + BIT(AMDGPU_TRANSFER_FUNCTION_SRGB_INV_EOTF) |
> + BIT(AMDGPU_TRANSFER_FUNCTION_BT709_INV_EOTF) |
> + BIT(AMDGPU_TRANSFER_FUNCTION_PQ_INV_EOTF) |
> + BIT(AMDGPU_TRANSFER_FUNCTION_GAMMA22_INV_EOTF) |
> + BIT(AMDGPU_TRANSFER_FUNCTION_GAMMA24_INV_EOTF) |
> + BIT(AMDGPU_TRANSFER_FUNCTION_GAMMA26_INV_EOTF);
> +
>  static struct drm_property *
>  amdgpu_create_tf_property(struct drm_device *dev,
> const char *name,
> @@ -209,6 +217,27 @@ amdgpu_dm_create_color_properties(struct amdgpu_device 
> *adev)
>   return -ENOMEM;
>   adev->mode_info.plane_hdr_mult_property = prop;
>  
> + prop = drm_property_create(adev_to_drm(adev),
> +DRM_MODE_PROP_BLOB,
> +"AMD_PLANE_SHAPER_LUT", 0);
> + if (!prop)
> + return -ENOMEM;
> + adev->mode_info.plane_shaper_lut_property = prop;
> +
> + prop = drm_property_create_range(adev_to_drm(adev),
> +
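
The amdgpu_inv_eotf mask in the hunk above is a bitmask over enum values.
One way such a mask can be used to select which transfer functions a given
property exposes is sketched below. This is an illustration only: the enum,
the name table and sketch_collect_tfs() are hypothetical and much shorter
than the driver's real lists, and the sketch does not claim to match what
amdgpu_create_tf_property() does internally.

```c
#include <stddef.h>

#define BIT(n) (1U << (n))

/* Hypothetical, shortened transfer function enumeration. */
enum sketch_tf {
	SKETCH_TF_DEFAULT,
	SKETCH_TF_SRGB_EOTF,
	SKETCH_TF_PQ_EOTF,
	SKETCH_TF_SRGB_INV_EOTF,
	SKETCH_TF_PQ_INV_EOTF,
	SKETCH_TF_COUNT
};

/* Name table indexed by enum value, using designated initializers. */
static const char * const sketch_tf_names[SKETCH_TF_COUNT] = {
	[SKETCH_TF_DEFAULT]       = "Default",
	[SKETCH_TF_SRGB_EOTF]     = "sRGB EOTF",
	[SKETCH_TF_PQ_EOTF]       = "PQ EOTF",
	[SKETCH_TF_SRGB_INV_EOTF] = "sRGB inv_EOTF",
	[SKETCH_TF_PQ_INV_EOTF]   = "PQ inv_EOTF",
};

/* One bit per inverse EOTF, mirroring the shape of the mask in the diff. */
static const unsigned int sketch_inv_eotf =
	BIT(SKETCH_TF_SRGB_INV_EOTF) | BIT(SKETCH_TF_PQ_INV_EOTF);

/*
 * Copy the names of all transfer functions selected by mask into out[],
 * in enum order, and return how many were stored (at most max).
 */
static size_t sketch_collect_tfs(unsigned int mask, const char **out,
				 size_t max)
{
	size_t n = 0;

	for (int i = 0; i < SKETCH_TF_COUNT && n < max; i++) {
		if (mask & BIT(i))
			out[n++] = sketch_tf_names[i];
	}
	return n;
}
```

Keeping one shared enum and selecting subsets by mask means a block such as
the shaper can expose only inverse EOTFs without needing a separate
enumeration per block.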

Re: [PATCH v2 10/34] drm/amd/display: add plane 3D LUT driver-specific properties

2023-09-06 Thread Harry Wentland




On 2023-08-10 12:02, Melissa Wen wrote:
> Add 3D LUT property for plane gamma correction using a 3D lookup table.
> Since a 3D LUT has a limited number of entries in each dimension we want
> to use them in an optimal fashion. This means using the 3D LUT in a
> colorspace that is optimized for human vision, such as sRGB, PQ, or
> another non-linear space. Therefore, userspace may need one 1D LUT
> (shaper) before it to delinearize content and another 1D LUT after 3D
> LUT (blend) to linearize content again for blending. The next patches
> add these 1D LUTs to the plane color mgmt pipeline.
> 
> Signed-off-by: Melissa Wen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 10 
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  9 
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 14 +++
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 23 +++
>  4 files changed, 56 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index 66bae0eed80c..730a88236501 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -363,6 +363,16 @@ struct amdgpu_mode_info {
>* @plane_hdr_mult_property:
>*/
>   struct drm_property *plane_hdr_mult_property;
> + /**
> +  * @plane_lut3d_property: Plane property for gamma correction using a
> +  * 3D LUT (pre-blending).
> +  */

I think we'll want to describe how the 3DLUT entries are laid out.
Something that describes how userspace should fill it, like
gamescope does for example:
https://github.com/ValveSoftware/gamescope/blob/7108880ed80b68c21750369e2ac9b7315fecf264/src/color_helpers.cpp#L302

Something like: a three-dimensional array, with each dimension
having a size of the cubed root of lut3d_size, blue being the
outermost dimension, red the innermost.
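
As a rough illustration of that layout (a sketch only, not the driver's code): assuming the blob is a flat array with `dim` entries per axis, the flat index for a given (red, green, blue) coordinate could be computed as below. The helper name `lut3d_index` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: flat index into a 3D LUT stored as a
 * three-dimensional array with blue as the outermost dimension
 * and red as the innermost. 'dim' is the number of entries per axis.
 */
static size_t lut3d_index(size_t dim, size_t r, size_t g, size_t b)
{
	/* Every coordinate must lie inside the cube. */
	assert(r < dim && g < dim && b < dim);

	/* Blue varies slowest, red varies fastest. */
	return (b * dim + g) * dim + r;
}
```

With this ordering, stepping red by one moves to the next array element, while stepping blue by one skips a whole dim x dim plane.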


> + struct drm_property *plane_lut3d_property;
> + /**
> +  * @plane_degamma_lut_size_property: Plane property to define the max
> +  * size of 3D LUT as supported by the driver (read-only).
> +  */

We should probably document that the size of the 3DLUT should
be the size of one dimension cubed, or that the cubed root of
the LUT size gives the size per dimension.
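
To make the size relationship concrete, here is a minimal sketch (the function names are hypothetical, not existing driver helpers) of how a consumer of the size property might recover the per-dimension size and reject values that are not perfect cubes:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the per-dimension size of a 3D LUT
 * with 'size' total entries, or 0 if 'size' is not a perfect cube.
 */
static uint32_t lut3d_dim_from_size(uint32_t size)
{
	uint32_t dim;

	/* 64-bit arithmetic keeps dim^3 from overflowing. */
	for (dim = 1; (uint64_t)dim * dim * dim <= size; dim++) {
		if ((uint64_t)dim * dim * dim == size)
			return dim;
	}
	return 0;
}

/* The 17x17x17 LUT mentioned in the patch has 4913 entries. */
static void lut3d_dim_example(void)
{
	assert(lut3d_dim_from_size(4913) == 17);
}
```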

Harry

> + struct drm_property *plane_lut3d_size_property;
>  };
>  
>  #define AMDGPU_MAX_BL_LEVEL 0xFF
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index 44f17ac11a5f..deea90212e31 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -769,6 +769,11 @@ struct dm_plane_state {
>* S31.32 sign-magnitude.
>*/
>   __u64 hdr_mult;
> + /**
> +  * @lut3d: 3D lookup table blob. The blob (if not NULL) is an array of
> +  *  drm_color_lut.
> +  */
> + struct drm_property_blob *lut3d;
>  };
>  
>  struct dm_crtc_state {
> @@ -854,6 +859,10 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector 
> *connector,
>  
>  void amdgpu_dm_trigger_timing_sync(struct drm_device *dev);
>  
> +/* 3D LUT max size is 17x17x17 */
> +#define MAX_COLOR_3DLUT_ENTRIES 4913
> +#define MAX_COLOR_3DLUT_BITDEPTH 12
> +/* 1D LUT size */
>  #define MAX_COLOR_LUT_ENTRIES 4096
>  /* Legacy gamm LUT users such as X doesn't like large LUT sizes */
>  #define MAX_COLOR_LEGACY_LUT_ENTRIES 256
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index b891aaf5f7c1..7e6d4df99a0c 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -209,6 +209,20 @@ amdgpu_dm_create_color_properties(struct amdgpu_device 
> *adev)
>   return -ENOMEM;
>   adev->mode_info.plane_hdr_mult_property = prop;
>  
> + prop = drm_property_create(adev_to_drm(adev),
> +DRM_MODE_PROP_BLOB,
> +"AMD_PLANE_LUT3D", 0);
> + if (!prop)
> + return -ENOMEM;
> + adev->mode_info.plane_lut3d_property = prop;
> +
> + prop = drm_property_create_range(adev_to_drm(adev),
> +  DRM_MODE_PROP_IMMUTABLE,
> +  "AMD_PLANE_LUT3D_SIZE", 0, UINT_MAX);
> + if (!prop)
> + return -ENOMEM;
> + adev->mode_info.plane_lut3d_size_property = prop;
> +
>   return 0;
>  }
>  #endif
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> index ab7f0332c431..882391f7add6 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> @@ -1353,6 +1353,8 @@ dm_drm_plane_duplicate_state(struct drm_plane *plane)
>  
>   if

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Alex Williamson

On Wed, 6 Sep 2023 11:51:59 +0800
Sui Jingfeng  wrote:

> Hi,
> 
> 
> On 2023/9/5 22:52, Alex Williamson wrote:
> > On Tue,  5 Sep 2023 03:57:15 +0800
> > Sui Jingfeng  wrote:
> >  
> >> From: Sui Jingfeng 
> >>
> >> On a machine with multiple GPUs, a Linux user has no control over which
> >> one is primary at boot time. This series tries to solve that problem
> >> by introducing the ->be_primary() function stub. Device drivers can
> >> provide an implementation and hook it up to this stub by calling the
> >> vga_client_register() function.
> >>
> >> Once the driver has bound the device successfully, VGAARB calls back
> >> into the device driver to query whether it wants to be primary.
> >> Device drivers can just pass NULL if they have no such need.
> >>
> >> Please note that:
> >>
> >> 1) ARM64, LoongArch, and MIPS servers have a lot of PCIe slots, and I
> >> would like to mount at least three video cards.
> >>
> >> 2) Typically, those non-x86 machines don't have good UEFI firmware
> >> support and cannot select the primary GPU at the firmware stage.
> >> Even on x86, there are old UEFI firmwares which have already made an
> >> undesired decision for you.
> >>
> >> 3) This series is an attempt to solve the remaining problems at the
> >> driver level, while another series[1] of mine targets the majority
> >> of the problems at the device level.
> >>
> >> Tested (limited) on x86 with four video cards mounted. Intel UHD Graphics
> >> 630 is the default boot VGA and is successfully overridden by ast2400
> >> with ast.modeset=10 appended to the kernel command line.
> >>
> >> $ lspci | grep VGA
> >>
> >>   00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 
> >> [UHD Graphics 630]  
> > In all my previous experiments with VGA routing and IGD I found that
> > IGD can't actually release VGA routing and Intel confirmed the hardware
> > doesn't have the ability to do so.  
> 
> Which model of the IGD are you using? Even for the IGD in the Atom D2550,
> the legacy 128KB VGA memory range can be configured to map to the IGD
> or to the DMI Interface. See section 1.7.3.2 of the N2000 datasheet[1].

I believe it's the VGA I/O that can't be disabled; there's no means to
do so other than the I/O enable bit in the command register, and IIRC
the driver depends on this for other features.  The history of this is
pretty old, but here are some links:

https://lore.kernel.org/all/1376486637.31494.19.ca...@ul30vt.home/
https://bbs.archlinux.org/viewtopic.php?pid=1400212#p1400212
https://lore.kernel.org/all/20130815223917.27890.28003.st...@bling.home/
https://lore.kernel.org/all/20130824144701.23370.42110.st...@bling.home/
https://lore.kernel.org/all/20140509201655.2849.97478.st...@bling.home/

I think the issue was that i915 doesn't claim to the VGA arbiter to be
controlling legacy VGA ranges, but in fact the hardware does claim
those ranges.  We can "fix" i915 to report that VGA MMIO space is
owned and can be controlled, but then Xorg likely sees multiple VGA
arbiter clients and disables DRI because it wants to mmap VGA MMIO
space.

Therefore, unless something has changed in the past 10 years, i915 owns
but does not advertise ownership of the VGA address spaces, so the
arbiter can't and doesn't know to change VGA routing to enable a
"be_primary" path to another device.
 
> If a specific model of Intel has a bug in the VGA routing hardware logic unit,
> I would like to ignore it. Or switch to the UEFI firmware on such hardware.

That's a convenient but impractical approach.  I expect all Intel HD
graphics have this issue.  It's unknown for Xe.

> It is the hardware engineer's responsibility, I will not worry about it.

We often need to deal with broken hardware in the kernel.

> Thanks for telling me this.
> 
> [1] 
> https://www.intel.com/content/dam/doc/datasheet/atom-d2000-n2000-vol-2-datasheet.pdf
> 
> 
> >   It will always be primary from a
> > VGA routing perspective.  Was this actually tested with non-UEFI?  
> 
> 
> As you already said, the generous Intel has already confirmed the
> hardware defect.
> So this is probably a good chance to switch to UEFI to solve the problem.
> Then no testing for legacy is needed.

Then why are we hacking on VGA arbitration in this series at all?

> > I suspect it might only work in UEFI mode where we probably don't
> > actually have a dependency on VGA routing.  This is essentially why
> > vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
> > broken to use on Intel systems with IGD.  Thanks,  
> 
> Thanks for telling me this.
> 
> To be honest, I have only tested my patch on machines with UEFI firmware,
> since UEFI has become the mainstream. But if this patch is really useful for
> the majority of machines, I'm satisfied. The results are not too bad.

This looks like a pretty significant scoping issue if you're proposing
changes to the VGA arbiter which specifically handles the routing of
legacy VGA address spaces

RE: [PATCH v3] drm/amdgpu: Add EXT_COHERENT memory allocation flags

2023-09-06 Thread Yat Sin, David

[AMD Official Use Only - General]

Reviewed-by: David Yat Sin 

> -Original Message-
> From: Kuehling, Felix 
> Sent: Friday, July 28, 2023 4:00 PM
> To: Francis, David ; amd-gfx@lists.freedesktop.org;
> Yat Sin, David 
> Subject: Re: [PATCH v3] drm/amdgpu: Add EXT_COHERENT memory allocation
> flags
>
> On 2023-07-28 15:39, David Francis wrote:
> > These flags (for GEM and SVM allocations) allocate memory that allows
> > for system-scope atomic semantics.
> >
> > On GFX943 these flags cause caches to be avoided on non-local memory.
> >
> > On all other ASICs they are identical in functionality to the
> > equivalent COHERENT flags.
> >
> > Corresponding Thunk patch is at
> > https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
> >
> > v3: changed name of flag
> >
> > Signed-off-by: David Francis 
>
> I made one comment on the user mode patch regarding the explicit handling
> of invalid combinations of Uncached, Coherent, ExtCoherent flags. I'm not
> sure what we agreed on any more. But I don't think we want to just leave it up
> to chance. Other than that, this patch looks good to me.
>
> Regards,
>Felix
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  2 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c  |  1 +
> >   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |  1 +
> >   drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c   |  1 +
> >   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  5 -
> >   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +-
> >   include/uapi/drm/amdgpu_drm.h| 10 +-
> >   include/uapi/linux/kfd_ioctl.h   |  3 +++
> >   8 files changed, 30 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index d34c3ef8f3ed..a1ce261f2d06 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -1738,6 +1738,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
> >
> > if (flags & KFD_IOC_ALLOC_MEM_FLAGS_COHERENT)
> > alloc_flags |= AMDGPU_GEM_CREATE_COHERENT;
> > +   if (flags & KFD_IOC_ALLOC_MEM_FLAGS_EXT_COHERENT)
> > +   alloc_flags |= AMDGPU_GEM_CREATE_EXT_COHERENT;
> > if (flags & KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED)
> > alloc_flags |= AMDGPU_GEM_CREATE_UNCACHED;
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > index 12210598e5b8..76b618735dc0 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> > @@ -331,6 +331,7 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
> >
> > flags |= other->flags & (AMDGPU_GEM_CREATE_CPU_GTT_USWC |
> >  AMDGPU_GEM_CREATE_COHERENT |
> > +  AMDGPU_GEM_CREATE_EXT_COHERENT |
> >  AMDGPU_GEM_CREATE_UNCACHED);
> > }
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > index 6b430e10d38e..301ffe30824f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > @@ -632,6 +632,7 @@ static void gmc_v10_0_get_vm_pte(struct amdgpu_device *adev,
> > }
> >
> > if (bo && bo->flags & (AMDGPU_GEM_CREATE_COHERENT |
> > +  AMDGPU_GEM_CREATE_EXT_COHERENT |
> >AMDGPU_GEM_CREATE_UNCACHED))
> > *flags = (*flags & ~AMDGPU_PTE_MTYPE_NV10_MASK) |
> >  AMDGPU_PTE_MTYPE_NV10(MTYPE_UC);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > index a6ee0220db56..846894e212e7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > @@ -540,6 +540,7 @@ static void gmc_v11_0_get_vm_pte(struct amdgpu_device *adev,
> > }
> >
> > if (bo && bo->flags & (AMDGPU_GEM_CREATE_COHERENT |
> > +  AMDGPU_GEM_CREATE_EXT_COHERENT |
> >AMDGPU_GEM_CREATE_UNCACHED))
> > *flags = (*flags & ~AMDGPU_PTE_MTYPE_NV10_MASK) |
> >  AMDGPU_PTE_MTYPE_NV10(MTYPE_UC);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > index 880460cd3239..92a623e130d9 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > @@ -1183,7 +1183,8 @@ static void gmc_v9_0_get_coherence_flags(struct amdgpu_device *adev,
> >   {
> > struct amdgpu_device *bo_adev = amdgpu_ttm_adev(bo->tbo.bdev);
> > bool is_vram = bo->tbo.resource->mem_type == TTM_PL_VRAM;
> > -   bool coherent = bo->flags & AMDGPU_GEM_CREATE_COHERENT;
> > +   bool coherent = bo->flags & (AMDGPU_GEM_CREATE_COHERENT |
>

Re: [PATCH v2 01/34] drm/amd/display: fix segment distribution for linear LUTs

2023-09-06 Thread Harry Wentland

On 2023-08-10 12:02, Melissa Wen wrote:
> From: Harry Wentland 
> 
> The region and segment calculation was incapable of dealing
> with regions of more than 16 segments. We first fix this.
> 
> Now that we can support regions up to 256 elements we can
> define a better segment distribution for near-linear LUTs
> for our maximum of 256 HW-supported points.
> 
> With these changes an "identity" LUT looks visually
> indistinguishable from bypass and allows us to use
> our 3DLUT.
> 

Have you had a chance to test whether this patch makes a
difference? I haven't had the time yet.
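
For reference, the point budget of the two segment distributions in the patch can be tallied with a small sketch. It assumes that region k contributes (1 << seg_distr[k]) hardware points, which is what the increment computation in the patch suggests; the function and array names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sum the number of hardware points described by a segment
 * distribution, assuming region k holds (1 << seg_distr[k]) points.
 */
static unsigned int count_hw_points(const int *seg_distr, size_t n)
{
	unsigned int total = 0;
	size_t k;

	for (k = 0; k < n; k++) {
		/* Exponents must be valid shift counts. */
		assert(seg_distr[k] >= 0 && seg_distr[k] < 31);
		total += 1u << seg_distr[k];
	}
	return total;
}

/* Exponents as listed in the patch for TRANSFER_FUNCTION_LINEAR. */
static const int linear_distr[] = { 0, 1, 2, 3, 4, 5, 6, 7 };

/* Exponents as listed in the patch for the other transfer functions. */
static const int default_distr[] = { 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1 };
```

Under that assumption the linear distribution sums to 1 + 2 + 4 + ... + 128 = 255 points and the other one to 8 + 9 * 16 + 2 = 154 points, so both stay within the 256 HW-supported points mentioned in the commit message.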

Harry

> Signed-off-by: Harry Wentland 
> Signed-off-by: Melissa Wen 
> ---
>  .../amd/display/dc/dcn10/dcn10_cm_common.c| 93 +++
>  1 file changed, 75 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c 
> b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> index 3538973bd0c6..04b2e04b68f3 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> @@ -349,20 +349,37 @@ bool cm_helper_translate_curve_to_hw_format(struct 
> dc_context *ctx,
>* segment is from 2^-10 to 2^1
>* There are less than 256 points, for optimization
>*/
> - seg_distr[0] = 3;
> - seg_distr[1] = 4;
> - seg_distr[2] = 4;
> - seg_distr[3] = 4;
> - seg_distr[4] = 4;
> - seg_distr[5] = 4;
> - seg_distr[6] = 4;
> - seg_distr[7] = 4;
> - seg_distr[8] = 4;
> - seg_distr[9] = 4;
> - seg_distr[10] = 1;
> + if (output_tf->tf == TRANSFER_FUNCTION_LINEAR) {
> + seg_distr[0] = 0; /* 2 */
> + seg_distr[1] = 1; /* 4 */
> + seg_distr[2] = 2; /* 4 */
> + seg_distr[3] = 3; /* 8 */
> + seg_distr[4] = 4; /* 16 */
> + seg_distr[5] = 5; /* 32 */
> + seg_distr[6] = 6; /* 64 */
> + seg_distr[7] = 7; /* 128 */
> +
> + region_start = -8;
> + region_end = 1;
> + } else {
> + seg_distr[0] = 3; /* 8 */
> + seg_distr[1] = 4; /* 16 */
> + seg_distr[2] = 4;
> + seg_distr[3] = 4;
> + seg_distr[4] = 4;
> + seg_distr[5] = 4;
> + seg_distr[6] = 4;
> + seg_distr[7] = 4;
> + seg_distr[8] = 4;
> + seg_distr[9] = 4;
> + seg_distr[10] = 1; /* 2 */
> + /* total = 8*16 + 8 + 64 + 2 = */
> +
> + region_start = -10;
> + region_end = 1;
> + }
> +
>  
> - region_start = -10;
> - region_end = 1;
>   }
>  
>   for (i = region_end - region_start; i < MAX_REGIONS_NUMBER ; i++)
> @@ -375,16 +392,56 @@ bool cm_helper_translate_curve_to_hw_format(struct 
> dc_context *ctx,
>  
>   j = 0;
>   for (k = 0; k < (region_end - region_start); k++) {
> - increment = NUMBER_SW_SEGMENTS / (1 << seg_distr[k]);
> + /*
> +  * We're using an ugly-ish hack here. Our HW allows for
> +  * 256 segments per region but SW_SEGMENTS is 16.
> +  * SW_SEGMENTS has some undocumented relationship to
> +  * the number of points in the tf_pts struct, which
> +  * is 512, unlike what's suggested TRANSFER_FUNC_POINTS.
> +  *
> +  * In order to work past this dilemma we'll scale our
> +  * increment by (1 << 4) and then do the inverse (1 >> 4)
> +  * when accessing the elements in tf_pts.
> +  *
> +  * TODO: find a better way using SW_SEGMENTS and
> +  *   TRANSFER_FUNC_POINTS definitions
> +  */
> + increment = (NUMBER_SW_SEGMENTS << 4) / (1 << seg_distr[k]);
>   start_index = (region_start + k + MAX_LOW_POINT) *
>   NUMBER_SW_SEGMENTS;
> - for (i = start_index; i < start_index + NUMBER_SW_SEGMENTS;
> + for (i = (start_index << 4); i < (start_index << 4) + 
> (NUMBER_SW_SEGMENTS << 4);
>   i += increment) {
> + struct fixed31_32 in_plus_one, in;
> + struct fixed31_32 value, red_value, green_value, 
> blue_value;
> + uint32_t t = i & 0xf;
> +
>   if (j == hw_points - 1)
>   break;
> - rgb_resulted[j].red = output_tf->tf_pts.red[i];
> - rgb_resulted[j].green = output_tf->tf_pts.green[i];
> - rgb_resulted[j].blue =

[PATCH v3 5/5] drm/Makefile: use correct ccflags-y syntax

2023-09-06 Thread Jim Cromie

Incorrect CFLAGS- usage failed to add -DDYNAMIC_DEBUG_MODULE when needed,
which broke builds with:

CONFIG_DRM_USE_DYNAMIC_DEBUG=Y
CONFIG_DYNAMIC_DEBUG_CORE=Y
CONFIG_DYNAMIC_DEBUG=N

Also add subdir-ccflags so that all drivers pick up the addition.

Fixes: 84ec67288c10 ("drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro")
Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 7a09a89b493b..013cde886326 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
-CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
+ccflags-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
+subdir-ccflags-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
 
 drm-y := \
drm_aperture.o \
-- 
2.41.0

[PATCH v3 4/5] drm/vc4: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Jim Cromie

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.

No functional changes.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/vc4/vc4_crtc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index bef9d45ef1df..959123759711 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -592,7 +592,7 @@ static void vc4_crtc_atomic_disable(struct drm_crtc *crtc,
struct drm_encoder *encoder = vc4_get_crtc_encoder(crtc, old_state);
struct drm_device *dev = crtc->dev;
 
-   drm_dbg(dev, "Disabling CRTC %s (%u) connected to Encoder %s (%u)",
+   drm_dbg(dev, "Disabling CRTC %s (%u) connected to Encoder %s (%u)\n",
crtc->name, crtc->base.id, encoder->name, encoder->base.id);
 
require_hvs_enabled(dev);
@@ -620,7 +620,7 @@ static void vc4_crtc_atomic_enable(struct drm_crtc *crtc,
struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
int idx;
 
-   drm_dbg(dev, "Enabling CRTC %s (%u) connected to Encoder %s (%u)",
+   drm_dbg(dev, "Enabling CRTC %s (%u) connected to Encoder %s (%u)\n",
crtc->name, crtc->base.id, encoder->name, encoder->base.id);
 
if (!drm_dev_enter(dev, ))
-- 
2.41.0

[PATCH v3 1/5] drm/connector: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Jim Cromie

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.

No functional changes.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_connector.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index f28725736237..14020585bdc0 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2925,7 +2925,9 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
 dev->mode_config.max_width,
 dev->mode_config.max_height);
else
-   drm_dbg_kms(dev, "User-space requested a forced probe on [CONNECTOR:%d:%s] but is not the DRM master, demoting to read-only probe",
+   drm_dbg_kms(dev,
+   "User-space requested a forced probe on [CONNECTOR:%d:%s] "
+   "but is not the DRM master, demoting to read-only probe\n",
connector->base.id, connector->name);
}
 
-- 
2.41.0

[PATCH v3 3/5] drm/msm: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Jim Cromie

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.

No functional changes.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/msm/msm_fb.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fb.c b/drivers/gpu/drm/msm/msm_fb.c
index e3f61c39df69..88bb5fa23bb1 100644
--- a/drivers/gpu/drm/msm/msm_fb.c
+++ b/drivers/gpu/drm/msm/msm_fb.c
@@ -89,7 +89,7 @@ int msm_framebuffer_prepare(struct drm_framebuffer *fb,
 
for (i = 0; i < n; i++) {
ret = msm_gem_get_and_pin_iova(fb->obj[i], aspace, 
_fb->iova[i]);
-   drm_dbg_state(fb->dev, "FB[%u]: iova[%d]: %08llx (%d)",
+   drm_dbg_state(fb->dev, "FB[%u]: iova[%d]: %08llx (%d)\n",
  fb->base.id, i, msm_fb->iova[i], ret);
if (ret)
return ret;
@@ -176,9 +176,9 @@ static struct drm_framebuffer *msm_framebuffer_init(struct 
drm_device *dev,
const struct msm_format *format;
int ret, i, n;
 
-   drm_dbg_state(dev, "create framebuffer: mode_cmd=%p (%dx%d@%4.4s)",
-   mode_cmd, mode_cmd->width, mode_cmd->height,
-   (char *)_cmd->pixel_format);
+   drm_dbg_state(dev, "create framebuffer: mode_cmd=%p (%dx%d@%4.4s)\n",
+ mode_cmd, mode_cmd->width, mode_cmd->height,
+ (char *)_cmd->pixel_format);
 
n = info->num_planes;
format = kms->funcs->get_format(kms, mode_cmd->pixel_format,
@@ -232,7 +232,7 @@ static struct drm_framebuffer *msm_framebuffer_init(struct 
drm_device *dev,
 
refcount_set(_fb->dirtyfb, 1);
 
-   drm_dbg_state(dev, "create: FB ID: %d (%p)", fb->base.id, fb);
+   drm_dbg_state(dev, "create: FB ID: %d (%p)\n", fb->base.id, fb);
 
return fb;
 
-- 
2.41.0

[PATCH v3 2/5] drm/kmb: add trailing newlines to drm_dbg msgs

2023-09-06 Thread Jim Cromie

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.

No functional changes.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/kmb/kmb_crtc.c  | 10 +-
 drivers/gpu/drm/kmb/kmb_plane.c |  6 +++---
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/kmb/kmb_crtc.c b/drivers/gpu/drm/kmb/kmb_crtc.c
index 647872f65bff..a58baf25322d 100644
--- a/drivers/gpu/drm/kmb/kmb_crtc.c
+++ b/drivers/gpu/drm/kmb/kmb_crtc.c
@@ -94,7 +94,7 @@ static void kmb_crtc_set_mode(struct drm_crtc *crtc,
vm.hback_porch = 0;
vm.hsync_len = 28;
 
-   drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d h-active=%d h-bp=%d h-fp=%d hsync-l=%d",
+   drm_dbg(dev, "%s : %dactive height= %d vbp=%d vfp=%d vsync-w=%d h-active=%d h-bp=%d h-fp=%d hsync-l=%d\n",
__func__, __LINE__,
m->crtc_vdisplay, vm.vback_porch, vm.vfront_porch,
vm.vsync_len, m->crtc_hdisplay, vm.hback_porch,
@@ -194,24 +194,24 @@ static enum drm_mode_status
int vfp = mode->vsync_start - mode->vdisplay;
 
if (mode->vdisplay < KMB_CRTC_MAX_HEIGHT) {
-   drm_dbg(dev, "height = %d less than %d",
+   drm_dbg(dev, "height = %d less than %d\n",
mode->vdisplay, KMB_CRTC_MAX_HEIGHT);
return MODE_BAD_VVALUE;
}
if (mode->hdisplay < KMB_CRTC_MAX_WIDTH) {
-   drm_dbg(dev, "width = %d less than %d",
+   drm_dbg(dev, "width = %d less than %d\n",
mode->hdisplay, KMB_CRTC_MAX_WIDTH);
return MODE_BAD_HVALUE;
}
refresh = drm_mode_vrefresh(mode);
if (refresh < KMB_MIN_VREFRESH || refresh > KMB_MAX_VREFRESH) {
-   drm_dbg(dev, "refresh = %d less than %d or greater than %d",
+   drm_dbg(dev, "refresh = %d less than %d or greater than %d\n",
refresh, KMB_MIN_VREFRESH, KMB_MAX_VREFRESH);
return MODE_BAD;
}
 
if (vfp < KMB_CRTC_MIN_VFP) {
-   drm_dbg(dev, "vfp = %d less than %d", vfp, KMB_CRTC_MIN_VFP);
+   drm_dbg(dev, "vfp = %d less than %d\n", vfp, KMB_CRTC_MIN_VFP);
return MODE_BAD;
}
 
diff --git a/drivers/gpu/drm/kmb/kmb_plane.c b/drivers/gpu/drm/kmb/kmb_plane.c
index 9e0562aa2bcb..308bd1cb50c8 100644
--- a/drivers/gpu/drm/kmb/kmb_plane.c
+++ b/drivers/gpu/drm/kmb/kmb_plane.c
@@ -78,7 +78,7 @@ static unsigned int check_pixel_format(struct drm_plane 
*plane, u32 format)
 * plane configuration is not supported.
 */
if (init_disp_cfg.format && init_disp_cfg.format != format) {
-   drm_dbg(>drm, "Cannot change format after initial plane 
configuration");
+   drm_dbg(>drm, "Cannot change format after initial plane 
configuration\n");
return -EINVAL;
}
for (i = 0; i < plane->format_count; i++) {
@@ -124,7 +124,7 @@ static int kmb_plane_atomic_check(struct drm_plane *plane,
if ((init_disp_cfg.width && init_disp_cfg.height) &&
(init_disp_cfg.width != fb->width ||
init_disp_cfg.height != fb->height)) {
-   drm_dbg(>drm, "Cannot change plane height or width after 
initial configuration");
+   drm_dbg(>drm, "Cannot change plane height or width after 
initial configuration\n");
return -EINVAL;
}
can_position = (plane->type == DRM_PLANE_TYPE_OVERLAY);
@@ -375,7 +375,7 @@ static void kmb_plane_atomic_update(struct drm_plane *plane,
spin_lock_irq(>irq_lock);
if (kmb->kmb_under_flow || kmb->kmb_flush_done) {
spin_unlock_irq(>irq_lock);
-   drm_dbg(>drm, "plane_update:underflow returning");
+   drm_dbg(>drm, "plane_update:underflow returning\n");
return;
}
spin_unlock_irq(>irq_lock);
-- 
2.41.0

[PATCH v3 0/5] drm/drm_dbg: add trailing newlines where missing

2023-09-06 Thread Jim Cromie

By at least strong convention, a print-buffer's trailing newline says
"message complete, send it".  The exception (no TNL, followed by a call
to pr_cont) proves the general rule.

Most DRM.debug calls already comport with this rule/convention:
207 DRM_DEV_DEBUG, 1288 drm_dbg.  Clean up the remainders, in
maintainer sized chunks.

V3: adds proper "drm/:" to subject, as suggested by Rodrigo.
drops drm/i915: already applied by Rodrigo.

Jim Cromie (5):
  drm/connector: add trailing newlines to drm_dbg msgs
  drm/kmb: add trailing newlines to drm_dbg msgs
  drm/msm: add trailing newlines to drm_dbg msgs
  drm/vc4: add trailing newlines to drm_dbg msgs
  drm/Makefile: use correct ccflags-y syntax

 drivers/gpu/drm/Makefile|  3 ++-
 drivers/gpu/drm/drm_connector.c |  4 +++-
 drivers/gpu/drm/kmb/kmb_crtc.c  | 10 +-
 drivers/gpu/drm/kmb/kmb_plane.c |  6 +++---
 drivers/gpu/drm/msm/msm_fb.c| 10 +-
 drivers/gpu/drm/vc4/vc4_crtc.c  |  4 ++--
 6 files changed, 20 insertions(+), 17 deletions(-)

-- 
2.41.0

Re: [PATCH v2 34/34] drm/amd/display: Use 3x4 CTM for plane CTM

2023-09-06 Thread Harry Wentland




On 2023-08-10 12:03, Melissa Wen wrote:
> From: Joshua Ashton 
> 
> Signed-off-by: Joshua Ashton 
> Signed-off-by: Melissa Wen 
> ---
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 32 +--
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |  2 +-
>  include/uapi/drm/drm_mode.h   |  8 +
>  3 files changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index 7ff329101fd4..0a51af44efd5 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -412,6 +412,32 @@ static void __drm_ctm_to_dc_matrix(const struct 
> drm_color_ctm *ctm,
>   }
>  }
>  
> +/**
> + * __drm_ctm2_to_dc_matrix - converts a DRM CTM2 to a DC CSC float matrix
> + * @ctm: DRM color transformation matrix
> + * @matrix: DC CSC float matrix
> + *
> + * The matrix needs to be a 3x4 (12 entry) matrix.
> + */
> +static void __drm_ctm2_to_dc_matrix(const struct drm_color_ctm2 *ctm,
> +struct fixed31_32 *matrix)
> +{
> + int i;
> +
> + /*
> +  * DRM gives a 3x3 matrix, but DC wants 3x4. Assuming we're operating
> +  * with homogeneous coordinates, augment the matrix with 0's.
> +  *

Left-over copy-paste comment. This version takes 3x4 as input param.

> +  * The format provided is S31.32, using signed-magnitude representation.
> +  * Our fixed31_32 is also S31.32, but is using 2's complement. We have
> +  * to convert from signed-magnitude to 2's complement.
> +  */
> + for (i = 0; i < 12; i++) {
> + /* gamut_remap_matrix[i] = ctm[i - floor(i/4)] */
> + matrix[i] = dc_fixpt_from_s3132(ctm->matrix[i]);
> + }
> +}
> +
>  /**
>   * __set_legacy_tf - Calculates the legacy transfer function
>   * @func: transfer function
> @@ -1159,7 +1185,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
> dm_crtc_state *crtc,
>  {
>   struct amdgpu_device *adev = drm_to_adev(crtc->base.state->dev);
>   struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
> - struct drm_color_ctm *ctm = NULL;
> + struct drm_color_ctm2 *ctm = NULL;
>   struct dc_color_caps *color_caps = NULL;
>   bool has_crtc_cm_degamma;
>   int ret;
> @@ -1213,7 +1239,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
> dm_crtc_state *crtc,
>  
>   /* Setup CRTC CTM. */
>   if (dm_plane_state->ctm) {
> - ctm = (struct drm_color_ctm *)dm_plane_state->ctm->data;
> + ctm = (struct drm_color_ctm2 *)dm_plane_state->ctm->data;
>  
>   /*
>* So far, if we have both plane and CRTC CTM, plane CTM takes
> @@ -1224,7 +1250,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
> dm_crtc_state *crtc,
>* provide support for both DPP and MPC matrix at the same
>* time.
>*/
> - __drm_ctm_to_dc_matrix(ctm, 
> dc_plane_state->gamut_remap_matrix.matrix);
> + __drm_ctm2_to_dc_matrix(ctm, 
> dc_plane_state->gamut_remap_matrix.matrix);
>  
>   dc_plane_state->gamut_remap_matrix.enable_remap = true;
>   dc_plane_state->input_csc_color_matrix.enable_adjustment = 
> false;
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> index 0b1081c690cb..27962a3d30f5 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> @@ -1543,7 +1543,7 @@ dm_atomic_plane_set_property(struct drm_plane *plane,
>   ret = drm_property_replace_blob_from_id(plane->dev,
>   _plane_state->ctm,
>   val,
> - sizeof(struct 
> drm_color_ctm), -1,
> + sizeof(struct 
> drm_color_ctm2), -1,

We need to update the comment for dm_plane_state.ctm in amdgpu_dm.h
to specify the property is of type drm_color_ctm2 (or drm_color_ctm_3x4).

>   );
>   dm_plane_state->base.color_mgmt_changed |= replaced;
>   return ret;
> diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
> index 46becedf5b2f..402288133e4c 100644
> --- a/include/uapi/drm/drm_mode.h
> +++ b/include/uapi/drm/drm_mode.h
> @@ -838,6 +838,14 @@ struct drm_color_ctm {
>   __u64 matrix[9];
>  };
>  
> +struct drm_color_ctm2 {

Calling this drm_color_ctm_3x4 might be good to make it clear this is
for a 3x4 matrix.
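
Since the sign-magnitude versus two's complement distinction comes up in both the kernel-side comment and the uAPI struct, a short sketch may help whoever documents it. This is an illustration only, with a hypothetical function name; it assumes bit 63 is the sign and bits 62..0 hold the S31.32 magnitude.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an S31.32 sign-magnitude value (bit 63 = sign,
 * bits 62..0 = magnitude) into a two's-complement 64-bit
 * fixed-point value with 32 fractional bits.
 */
static int64_t s3132_sign_mag_to_twos(uint64_t x)
{
	const uint64_t sign_bit = 1ULL << 63;
	int64_t mag = (int64_t)(x & ~sign_bit);

	/* The magnitude fits in 63 bits, so negating it cannot overflow. */
	return (x & sign_bit) ? -mag : mag;
}

/* Example: -1.0 is encoded as the sign bit plus 1.0 (1 << 32). */
static void s3132_example(void)
{
	assert(s3132_sign_mag_to_twos((1ULL << 63) | (1ULL << 32)) ==
	       -((int64_t)1 << 32));
}
```

Note that sign-magnitude has a "negative zero" encoding (only the sign bit set), which this sketch maps to plain zero.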

Harry

> + /*
> +  * Conversion matrix in S31.32 sign-magnitude
> +  * (not two's complement!) format.
> +  */
> + __u64 matrix[12];
> +};
> +
>  struct drm_color_lut

Re: [PATCH v2 33/34] drm/amd/display: add plane CTM support

2023-09-06 Thread Harry Wentland




On 2023-08-10 12:03, Melissa Wen wrote:
> Map the plane CTM driver-specific property to DC plane, instead of DC
> stream. The remaining steps to program DPP block are already implemented
> on DC shared-code.
> 
> Signed-off-by: Melissa Wen 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  1 +
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 25 +++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index dfe61c5ed49e..f239410234b3 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -9578,6 +9578,7 @@ static bool should_reset_plane(struct drm_atomic_state 
> *state,
>   if (dm_old_other_state->degamma_tf != 
> dm_new_other_state->degamma_tf ||
>   dm_old_other_state->degamma_lut != 
> dm_new_other_state->degamma_lut ||
>   dm_old_other_state->hdr_mult != 
> dm_new_other_state->hdr_mult ||
> + dm_old_other_state->ctm != dm_new_other_state->ctm ||
>   dm_old_other_state->shaper_lut != 
> dm_new_other_state->shaper_lut ||
>   dm_old_other_state->shaper_tf != 
> dm_new_other_state->shaper_tf ||
>   dm_old_other_state->lut3d != dm_new_other_state->lut3d ||
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index 86a918ab82be..7ff329101fd4 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -1158,6 +1158,8 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
> dm_crtc_state *crtc,
> struct dc_plane_state *dc_plane_state)
>  {
>   struct amdgpu_device *adev = drm_to_adev(crtc->base.state->dev);
> + struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
> + struct drm_color_ctm *ctm = NULL;
>   struct dc_color_caps *color_caps = NULL;
>   bool has_crtc_cm_degamma;
>   int ret;
> @@ -1209,6 +1211,29 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
> dm_crtc_state *crtc,
>   return ret;
>   }
>  
> + /* Setup CRTC CTM. */
> + if (dm_plane_state->ctm) {
> + ctm = (struct drm_color_ctm *)dm_plane_state->ctm->data;
> +
> + /*
> +  * So far, if we have both plane and CRTC CTM, plane CTM takes
> +  * the priority and we discard data for CRTC CTM, as
> +  * implemented in dcn10_program_gamut_remap().  However, we

Isn't it the opposite? If stream (crtc) has a CTM we program that, only if
stream doesn't have a CTM we program the plane one?

Harry

> +  * have MPC gamut_remap_matrix from DCN3 family, therefore we
> +  * can remap MPC programing of the matrix to MPC block and
> +  * provide support for both DPP and MPC matrix at the same
> +  * time.
> +  */
> + __drm_ctm_to_dc_matrix(ctm, 
> dc_plane_state->gamut_remap_matrix.matrix);
> +
> + dc_plane_state->gamut_remap_matrix.enable_remap = true;
> + dc_plane_state->input_csc_color_matrix.enable_adjustment = 
> false;
> + } else {
> + /* Bypass CTM. */
> + dc_plane_state->gamut_remap_matrix.enable_remap = false;
> + dc_plane_state->input_csc_color_matrix.enable_adjustment = 
> false;
> + }
> +
>   return amdgpu_dm_plane_set_color_properties(plane_state,
>   dc_plane_state, color_caps);
>  }

Re: [PATCH v2 32/34] drm/amd/display: add plane CTM driver-specific property

2023-09-06 Thread Harry Wentland




On 2023-08-10 12:03, Melissa Wen wrote:
> Plane CTM for pre-blending color space conversion. Only enable
> driver-specific plane CTM property on drivers that support both pre- and
> post-blending gamut remap matrix, i.e., DCN3+ family. Otherwise it
> conflicts with DRM CRTC CTM property.
> 
> Signed-off-by: Melissa Wen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  2 ++
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  7 +++
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   |  7 +++
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 20 +++
>  4 files changed, 36 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index abb871a912d7..84bf501b02f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -363,6 +363,8 @@ struct amdgpu_mode_info {
>* @plane_hdr_mult_property:
>*/
>   struct drm_property *plane_hdr_mult_property;
> +
> + struct drm_property *plane_ctm_property;
>   /**
>* @shaper_lut_property: Plane property to set pre-blending shaper LUT
>* that converts color content before 3D LUT.
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index 095f39f04210..6252ee912a63 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -769,6 +769,13 @@ struct dm_plane_state {
>* S31.32 sign-magnitude.
>*/
>   __u64 hdr_mult;
> + /**
> +  * @ctm:
> +  *
> +  * Color transformation matrix. See drm_crtc_enable_color_mgmt(). The
> +  * blob (if not NULL) is a &drm_color_ctm.
> +  */
> + struct drm_property_blob *ctm;
>   /**
>* @shaper_lut: shaper lookup table blob. The blob (if not NULL) is an
>* array of  drm_color_lut.
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index 4356846a2bce..86a918ab82be 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -218,6 +218,13 @@ amdgpu_dm_create_color_properties(struct amdgpu_device 
> *adev)
>   return -ENOMEM;
>   adev->mode_info.plane_hdr_mult_property = prop;
>  
> + prop = drm_property_create(adev_to_drm(adev),
> +DRM_MODE_PROP_BLOB,
> +"AMD_PLANE_CTM", 0);

We'll want to wrap the property creation/attachment with
#ifdef AMD_PRIVATE_COLOR here as well.

Harry

> + if (!prop)
> + return -ENOMEM;
> + adev->mode_info.plane_ctm_property = prop;
> +
>   prop = drm_property_create(adev_to_drm(adev),
>  DRM_MODE_PROP_BLOB,
>  "AMD_PLANE_SHAPER_LUT", 0);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> index 3fd57de7c5be..0b1081c690cb 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> @@ -1355,6 +1355,8 @@ dm_drm_plane_duplicate_state(struct drm_plane *plane)
>  
>   if (dm_plane_state->degamma_lut)
>   drm_property_blob_get(dm_plane_state->degamma_lut);
> + if (dm_plane_state->ctm)
> + drm_property_blob_get(dm_plane_state->ctm);
>   if (dm_plane_state->shaper_lut)
>   drm_property_blob_get(dm_plane_state->shaper_lut);
>   if (dm_plane_state->lut3d)
> @@ -1436,6 +1438,8 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
> *plane,
>  
>   if (dm_plane_state->degamma_lut)
>   drm_property_blob_put(dm_plane_state->degamma_lut);
> + if (dm_plane_state->ctm)
> + drm_property_blob_put(dm_plane_state->ctm);
>   if (dm_plane_state->lut3d)
>   drm_property_blob_put(dm_plane_state->lut3d);
>   if (dm_plane_state->shaper_lut)
> @@ -1473,6 +1477,11 @@ dm_atomic_plane_attach_color_mgmt_properties(struct 
> amdgpu_display_manager *dm,
>  dm->adev->mode_info.plane_hdr_mult_property,
>  AMDGPU_HDR_MULT_DEFAULT);
>  
> + /* Only enable plane CTM if both DPP and MPC gamut remap is available. 
> */
> + if (dm->dc->caps.color.mpc.gamut_remap)
> + drm_object_attach_property(&plane->base,
> +
> dm->adev->mode_info.plane_ctm_property, 0);
> +
>   if (dpp_color_caps.hw_3d_lut) {
>   drm_object_attach_property(&plane->base,
>  mode_info.plane_shaper_lut_property, 
> 0);
> @@ -1530,6 +1539,14 @@ dm_atomic_plane_set_property(struct drm_plane *plane,
>   dm_plane_state->hdr_mult = val;
>

Re: [PATCH v2 31/34] drm/amd/display: set stream gamut remap matrix to MPC for DCN301

2023-09-06 Thread Harry Wentland




On 2023-08-28 04:20, Pekka Paalanen wrote:
> On Fri, 25 Aug 2023 13:37:08 -0100
> Melissa Wen  wrote:
> 
>> On 08/22, Pekka Paalanen wrote:
>>> On Thu, 10 Aug 2023 15:03:11 -0100
>>> Melissa Wen  wrote:
>>>   
>>>> dc->caps.color.mpc.gamut_remap says there is a post-blending color block
>>>> for gamut remap matrix for DCN3 HW family and newer versions. However,
>>>> those drivers still follow DCN10 programming that remap stream
>>>> gamut_remap_matrix to DPP (pre-blending).
>>>
>>> That's ok only as long as CRTC degamma is pass-through. Blending itself
>>> is a linear operation, so it doesn't matter if a matrix is applied to
>>> the blending result or to all blending inputs. But you cannot move a
>>> matrix operation to the other side of a non-linear operation, and you
>>> cannot move a non-linear operation across blending.  
>>
>> Oh, I'm not moving it, what I'm doing here is the opposite and fixing
>> it. This patch puts each pre- and post-blending CTM in their right
>> place, since we have the HW caps for it on DCN3+... Or are you just
>> pointing out the implementation mistake on old driver versions?
> 
> It's just the old mistake.
> 
> I hope no-one complains, forcing you to revert this fix as a regression.
> 

I'm worried this will break other OSes since it's in DC and shared. I'll
check with Kruno when he's back from vacation. But most likely this will
be problematic.

Worst case we can add a new "program_gamut_remap_actually_post_blending"
(with a better name) function to HWSS, expose it in DC, and make sure
amdgpu_dm never calls the old "program_gamut_remap".

I hope nobody relies on the current (IMO broken) behavior on Linux.

Harry

> 
> Thanks,
> pq
> 
> 
>>>> To enable pre-blending and post-blending gamut_remap matrix supports at
>>>> the same time, set stream gamut_remap to MPC and plane gamut_remap to
>>>> DPP for DCN301 that support both.
>>>>
>>>> It was tested using IGT KMS color tests for DRM CRTC CTM property and it
>>>> preserves test results.
>>>>
>>>> Signed-off-by: Melissa Wen 
>>>> ---
>>>>  .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 37 +++
>>>>  .../drm/amd/display/dc/dcn30/dcn30_hwseq.h|  3 ++
>>>>  .../drm/amd/display/dc/dcn301/dcn301_init.c   |  2 +-
>>>>  3 files changed, 41 insertions(+), 1 deletion(-)
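
The linearity argument quoted in this thread (a matrix may be moved across
blending, but not across a non-linear operation) can be checked numerically.
This is a minimal sketch, not driver code: the struct, the helper names, and
the x^2 curve used as a stand-in for a degamma function are all assumptions
made for illustration.

```c
#include <assert.h>

struct rgb { double c[3]; };

/* Apply a 3x3 colour transformation matrix to one pixel. */
static struct rgb apply_ctm(const double m[3][3], struct rgb in)
{
	struct rgb out;

	for (int i = 0; i < 3; i++)
		out.c[i] = m[i][0] * in.c[0] + m[i][1] * in.c[1] + m[i][2] * in.c[2];
	return out;
}

/* Plain alpha blend: alpha * a + (1 - alpha) * b. */
static struct rgb blend(struct rgb a, struct rgb b, double alpha)
{
	struct rgb out;

	assert(alpha >= 0.0 && alpha <= 1.0);
	for (int i = 0; i < 3; i++)
		out.c[i] = alpha * a.c[i] + (1.0 - alpha) * b.c[i];
	return out;
}

/* Stand-in non-linear curve (x^2); real degamma curves differ. */
static struct rgb degamma_square(struct rgb in)
{
	struct rgb out;

	for (int i = 0; i < 3; i++)
		out.c[i] = in.c[i] * in.c[i];
	return out;
}

/* Largest per-channel absolute difference between two pixels. */
static double max_abs_diff(struct rgb a, struct rgb b)
{
	double worst = 0.0;

	for (int i = 0; i < 3; i++) {
		double d = a.c[i] - b.c[i];

		if (d < 0.0)
			d = -d;
		if (d > worst)
			worst = d;
	}
	return worst;
}
```

Applying the matrix to the blended result and blending the two transformed
inputs agree up to rounding, whereas swapping the matrix with the non-linear
curve gives a visibly different pixel.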

Re: [PATCH v2 3/6] drm_dbg: add trailing newlines to msgs

2023-09-06 Thread jim . cromie

On Wed, Sep 6, 2023 at 10:42 AM Rodrigo Vivi  wrote:
>
> On Mon, Sep 04, 2023 at 08:32:40AM +0200, Andi Shyti wrote:
> > Hi Jim,
> >
> > On Sun, Sep 03, 2023 at 12:46:00PM -0600, Jim Cromie wrote:
> > > By at least strong convention, a print-buffer's trailing newline says
> > > "message complete, send it".  The exception (no TNL, followed by a call
> > > to pr_cont) proves the general rule.
> > >
> > > Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
> > > 1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.
> > >
> > > No functional changes.
> > >
> > > Signed-off-by: Jim Cromie 
> >
> > Reviewed-by: Andi Shyti 
>
> I pushed this i915 one to our drm-intel-next.
> While doing it I have changed the subject to make it clear
> this is 'drm/i915:'.
>
> I believe you should do similar change to all the other patches
> to make it clear in the subject about which domain that commit
> is touching... instead of only 'drm_dbg'.
>

I will do that, and drop the one you've already pushed.
Thank you both.


> i.e.: 183670347b06 ("drm/i915: add trailing newlines to msgs")
> https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-next&id=183670347b060521920a81f84ff7f10e227ebe05
>
> Thanks for the patch,
> Rodrigo.
>
> >
> > Andi

Re: [PATCH v2 29/34] drm/amd/display: allow newer DC hardware to use degamma ROM for PQ/HLG

2023-09-06 Thread Harry Wentland




On 2023-08-10 12:03, Melissa Wen wrote:
> From: Joshua Ashton 
> 
> Need to funnel the color caps through to these functions so it can check
> that the hardware is capable.
> 
> v2:
> - remove redundant color caps assignment on plane degamma map (Harry)
> - pass color caps to degamma params
> 
> Signed-off-by: Joshua Ashton 
> Signed-off-by: Melissa Wen 
> ---
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 35 ---
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index f638e5b3a70b..4356846a2bce 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -538,6 +538,7 @@ static int amdgpu_dm_set_atomic_regamma(struct 
> dc_stream_state *stream,
>  /**
>   * __set_input_tf - calculates the input transfer function based on expected
>   * input space.
> + * @caps: dc color capabilities
>   * @func: transfer function
>   * @lut: lookup table that defines the color space
>   * @lut_size: size of respective lut.
> @@ -545,7 +546,7 @@ static int amdgpu_dm_set_atomic_regamma(struct 
> dc_stream_state *stream,
>   * Returns:
>   * 0 in case of success. -ENOMEM if fails.
>   */
> -static int __set_input_tf(struct dc_transfer_func *func,
> +static int __set_input_tf(struct dc_color_caps *caps, struct 
> dc_transfer_func *func,
> const struct drm_color_lut *lut, uint32_t lut_size)
>  {
>   struct dc_gamma *gamma = NULL;
> @@ -562,7 +563,7 @@ static int __set_input_tf(struct dc_transfer_func *func,
>   __drm_lut_to_dc_gamma(lut, gamma, false);
>   }
>  
> - res = mod_color_calculate_degamma_params(NULL, func, gamma, gamma != 
> NULL);
> + res = mod_color_calculate_degamma_params(caps, func, gamma, gamma != 
> NULL);
>  
>   if (gamma)
>   dc_gamma_release(&gamma);
> @@ -725,7 +726,7 @@ static int amdgpu_dm_atomic_blend_lut(const struct 
> drm_color_lut *blend_lut,
>   func_blend->tf = tf;
>   func_blend->sdr_ref_white_level = SDR_WHITE_LEVEL_INIT_VALUE;
>  
> - ret = __set_input_tf(func_blend, blend_lut, blend_size);
> + ret = __set_input_tf(NULL, func_blend, blend_lut, blend_size);
>   } else {
>   func_blend->type = TF_TYPE_BYPASS;
>   func_blend->tf = TRANSFER_FUNCTION_LINEAR;
> @@ -950,7 +951,8 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
> *crtc)
>  
>  static int
>  map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
> -  struct dc_plane_state *dc_plane_state)
> +  struct dc_plane_state *dc_plane_state,
> +  struct dc_color_caps *caps)
>  {
>   const struct drm_color_lut *degamma_lut;
>   enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
> @@ -1005,7 +1007,7 @@ map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
>   dc_plane_state->in_transfer_func->tf =
>   TRANSFER_FUNCTION_LINEAR;
>  
> - r = __set_input_tf(dc_plane_state->in_transfer_func,
> + r = __set_input_tf(caps, dc_plane_state->in_transfer_func,
>  degamma_lut, degamma_size);
>   if (r)
>   return r;
> @@ -1018,7 +1020,7 @@ map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
>   dc_plane_state->in_transfer_func->tf = tf;
>  
>   if (tf != TRANSFER_FUNCTION_SRGB &&
> - !mod_color_calculate_degamma_params(NULL,
> + !mod_color_calculate_degamma_params(caps,
>   
> dc_plane_state->in_transfer_func,
>   NULL, false))
>   return -ENOMEM;
> @@ -1029,7 +1031,8 @@ map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
>  
>  static int
>  __set_dm_plane_degamma(struct drm_plane_state *plane_state,
> -struct dc_plane_state *dc_plane_state)
> +struct dc_plane_state *dc_plane_state,
> +struct dc_color_caps *color_caps)
>  {
>   struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
>   const struct drm_color_lut *degamma_lut;
> @@ -1060,7 +1063,7 @@ __set_dm_plane_degamma(struct drm_plane_state 
> *plane_state,
>   dc_plane_state->in_transfer_func->type =
>   TF_TYPE_DISTRIBUTED_POINTS;
>  
> - ret = __set_input_tf(dc_plane_state->in_transfer_func,
> + ret = __set_input_tf(color_caps, 
> dc_plane_state->in_transfer_func,
>degamma_lut, degamma_size);
>   if (ret)
>   return ret;
> @@ -1068,7 +1071,7 @@ __set_dm_plane_degamma(struct drm_plane_state 
>

Re: [PATCH] drm/amdgpu: fix unsigned error codes

2023-09-06 Thread Deucher, Alexander

[AMD Official Use Only - General]

Reviewed-by: Alex Deucher 

From: Yu, Lang 
Sent: Wednesday, September 6, 2023 7:42 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Gopalakrishnan, 
Veerabadhran (Veera) ; Yu, Lang 
; Dan Carpenter 
Subject: [PATCH] drm/amdgpu: fix unsigned error codes

Fixes: 77b13b916728 ("drm/amdgpu: add selftest framework for UMSCH")

Signed-off-by: Lang Yu 
Reported-by: Dan Carpenter 
Link: https://lore.kernel.org/all/ZPhddADtKmOuVyDq@lang-desktop
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
index 284643e1efeb..9da80b54d63e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
@@ -335,11 +335,10 @@ static int setup_umsch_mm_test(struct amdgpu_device *adev,
 if (r)
 goto error_free_vm;

-   test->pasid = amdgpu_pasid_alloc(16);
-   if (test->pasid < 0) {
-   r = test->pasid;
+   r = amdgpu_pasid_alloc(16);
+   if (r < 0)
 goto error_fini_vm;
-   }
+   test->pasid = r;

 r = amdgpu_bo_create_kernel(adev, sizeof(struct 
umsch_mm_test_ctx_data),
 PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
--
2.25.1
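
The pattern this patch fixes can be reproduced in a few lines of userspace
C. This is a sketch under stated assumptions: fake_pasid_alloc() and
struct test_ctx are hypothetical stand-ins (the real allocator and context
struct are not shown in this mail), and the field is assumed to be unsigned,
as the subject line implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context with an unsigned ID field. */
struct test_ctx {
	uint32_t pasid;
};

/* Hypothetical allocator: returns an ID, or a negative errno on failure. */
static int fake_pasid_alloc(int fail)
{
	return fail ? -ENOSPC : 42;
}

/*
 * What the unsigned field holds if the return value is stored directly:
 * a negative errno wraps to a large positive number, so a later "< 0"
 * test on the field can never be true.
 */
static uint32_t store_unchecked(int fail)
{
	return (uint32_t)fake_pasid_alloc(fail);
}

/* Fixed pattern: check the signed return value first, then store it. */
static int setup_fixed(struct test_ctx *t, int fail)
{
	int r;

	assert(t != NULL);
	r = fake_pasid_alloc(fail);
	if (r < 0)
		return r;
	t->pasid = (uint32_t)r;
	return 0;
}
```

Checking the signed temporary before the assignment is the same shape as
the change in the diff above.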

Re: [PATCH v2 3/6] drm_dbg: add trailing newlines to msgs

2023-09-06 Thread Rodrigo Vivi

On Mon, Sep 04, 2023 at 08:32:40AM +0200, Andi Shyti wrote:
> Hi Jim,
> 
> On Sun, Sep 03, 2023 at 12:46:00PM -0600, Jim Cromie wrote:
> > By at least strong convention, a print-buffer's trailing newline says
> > "message complete, send it".  The exception (no TNL, followed by a call
> > to pr_cont) proves the general rule.
> > 
> > Most DRM.debug calls already comport with this: 207 DRM_DEV_DEBUG,
> > 1288 drm_dbg.  Clean up the remainders, in maintainer sized chunks.
> > 
> > No functional changes.
> > 
> > Signed-off-by: Jim Cromie 
> 
> Reviewed-by: Andi Shyti  

I pushed this i915 one to our drm-intel-next.
While doing it I have changed the subject to make it clear
this is 'drm/i915:'.

I believe you should do similar change to all the other patches
to make it clear in the subject about which domain that commit
is touching... instead of only 'drm_dbg'.

i.e.: 183670347b06 ("drm/i915: add trailing newlines to msgs")
https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-next&id=183670347b060521920a81f84ff7f10e227ebe05

Thanks for the patch,
Rodrigo.

> 
> Andi

Re: [PATCHv3] drm/amdkfd: Fix unaligned 64-bit doorbell warning

2023-09-06 Thread Alex Deucher

+ Shashank

On Wed, Sep 6, 2023 at 11:45 AM Mukul Joshi  wrote:
>
> This patch fixes the following unaligned 64-bit doorbell
> warning seen when submitting packets on HIQ on GFX v9.4.3
> by making the HIQ doorbell 64-bit aligned.
> The warning is seen when GPU is loaded in any mode other
> than SPX mode.
>
> [  +0.000301] [ cut here ]
> [  +0.03] Unaligned 64-bit doorbell
> [  +0.30] WARNING: /amdkfd/kfd_doorbell.c:339 
> write_kernel_doorbell64+0x72/0x80
> [  +0.03] RIP: 0010:write_kernel_doorbell64+0x72/0x80
> [  +0.04] RSP: 0018:c90004287730 EFLAGS: 00010246
> [  +0.05] RAX:  RBX:  RCX: 
> 
> [  +0.03] RDX: 0001 RSI: 82837c71 RDI: 
> 
> [  +0.03] RBP: c90004287748 R08: 0003 R09: 
> 0001
> [  +0.02] R10: 001a R11: 88a034008198 R12: 
> c900013bd004
> [  +0.03] R13: 0008 R14: c900042877b0 R15: 
> 007f
> [  +0.03] FS:  7fa8c7b62000() GS:889f8840() 
> knlGS:
> [  +0.04] CS:  0010 DS:  ES:  CR0: 80050033
> [  +0.03] CR2: 56111c45aaf0 CR3: 0001414f2002 CR4: 
> 00770ee0
> [  +0.03] PKRU: 5554
> [  +0.02] Call Trace:
> [  +0.04]  
> [  +0.06]  kq_submit_packet+0x45/0x50 [amdgpu]
> [  +0.000524]  pm_send_set_resources+0x7f/0xc0 [amdgpu]
> [  +0.000500]  set_sched_resources+0xe4/0x160 [amdgpu]
> [  +0.000503]  start_cpsch+0x1c5/0x2a0 [amdgpu]
> [  +0.000497]  kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu]
> [  +0.000743]  amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu]
> [  +0.000602]  amdgpu_device_init.cold+0x1813/0x2176 [amdgpu]
> [  +0.000684]  ? pci_bus_read_config_word+0x4a/0x80
> [  +0.12]  ? do_pci_enable_device+0xdc/0x110
> [  +0.08]  amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
> [  +0.000545]  amdgpu_pci_probe+0x197/0x400 [amdgpu]
>
> Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
> Signed-off-by: Mukul Joshi 
> ---
> v1->v2:
> - Update the logic to make it work with both 32 bit and
>   64 bit doorbells.
> - Add the Fixes tag
> v2->v3:
> - Revert to the original change to align it with what's done in
>   amdgpu_doorbell_index_on_bar.
>
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index c2e0b79dcc6d..7b38537c7c99 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -162,6 +162,7 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
> return NULL;
>
> *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, 
> kfd->doorbells, inx);
> +   inx *= 2;
>
> pr_debug("Get kernel queue doorbell\n"
> " doorbell offset   == 0x%08X\n"
> @@ -176,6 +177,7 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 
> __iomem *db_addr)
> unsigned int inx;
>
> inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
> +   inx /= 2;
>
> mutex_lock(&kfd->doorbell_mutex);
> __clear_bit(inx, kfd->doorbell_bitmap);
> --
> 2.35.1
>
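
The index arithmetic in the quoted patch (inx *= 2 on get, inx /= 2 on
release) can be sketched in userspace as follows. The helper names are
hypothetical, and the sketch assumes the doorbell base pointer is itself
8-byte aligned and typed as a pointer to 32-bit words, as in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a bitmap slot to a doorbell pointer. Each slot
 * spans two 32-bit words, so every slot starts on an 8-byte boundary
 * whenever the base does.
 */
static uint32_t *slot_to_doorbell(uint32_t *base, unsigned int slot)
{
	assert(base != NULL);
	return base + slot * 2u;
}

/* Hypothetical helper: inverse mapping, as used when releasing a doorbell. */
static unsigned int doorbell_to_slot(const uint32_t *base, const uint32_t *db)
{
	assert(base != NULL && db >= base);
	return (unsigned int)(db - base) / 2u;
}

/* True if the address is suitable for a single 64-bit write. */
static int is_64bit_aligned(const void *p)
{
	return ((uintptr_t)p & 7u) == 0;
}
```

Without the doubling, every odd slot would land on a 4-byte boundary only.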

Re: [RFC, drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

2023-09-06 Thread Alex Deucher

On Tue, Sep 5, 2023 at 1:25 PM suijingfeng  wrote:
>
> Hi,
>
>
> On 2023/9/5 13:50, Christian König wrote:
> > Am 04.09.23 um 21:57 schrieb Sui Jingfeng:
> >> From: Sui Jingfeng 
> >>
> >> On a machine with multiple GPUs, a Linux user has no control over
> >> which one
> >> is primary at boot time.
> >
> > Question is why is that useful? Should we give users the ability to
> > control that?
> >
> > I don't see an use case for this.
> >
>
> On a machine with multiple GPUs installed, only the primary
> graphics device gets POST-ed (initialized) by the firmware.
> Therefore the DRM drivers for the remaining video cards have to
> work without the prerequisite setup done by the firmware. This
> setup is called POST.

I think that should be regarded as a bug in the driver that should be
fixed and this would not help with that case.  If a driver can't
initialize a device without aid from the pre-OS environment, that
should be fixed in the driver.  This solution also doesn't fix which
device is selected as the primary by the pre-OS environment.  That can
only be fixed in the pre-OS environment code.

>
> One of the use cases is to test whether a specific DRM driver
> works properly when it has not been POST-ed. The ast drm driver
> is the first one that refused to work when not POST-ed by the
> firmware.
>
> Before applying this series, I was unable to make drm/ast the
> primary video card easily. The problem is that in a multiple
> video card configuration, the monitor connected to my
> AST2400 card does not light up. This is confusing, and a naive
> programmer may suspect that PRIME is not working.
>
> After applying this series and passing ast.modeset=10 on the
> kernel cmd line, I found that the monitor connected to my
> AST2400 video card was still black; it did not display any
> image.

The problem with adding modeset=10 is that it only helps when you have
one GPU driven by that driver in the system.  If you have multiple
GPUs driven by that driver, which one would that apply to?  E.g., what
if you have 2 AMD GPUs in the system.

>
> While studying drm/ast, I learned that the drm/ast driver
> ships its own POST code, see the ast_post_gpu() function.
> So I wondered why this function did not work.
>
> After a short (hasty) debugging session, I found that the ast_post_gpu()
> function did not get run, because it depends on
> ast->config_mode. Without thinking too much, I hardcoded
> ast->config_mode as ast_use_p2a; the key point is to force the
> ast_post_gpu() function to run.
>
>
> ```
>
> --- a/drivers/gpu/drm/ast/ast_main.c
> +++ b/drivers/gpu/drm/ast/ast_main.c
> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device
> *ast)
>  }
>  }
>
> +   ast->config_mode = ast_use_p2a;
> +
>  switch (ast->config_mode) {
>  case ast_use_defaults:
>  drm_info(dev, "Using default configuration\n");
>
> ```
>
> Then the monitor lit up and displayed the Ubuntu greeter. Therefore
> my patch is useful, at least for Linux drm driver testers and developers.
> It allows programmers to test a specific part of a specific driver without
> changing a line of the source code and without needing sudo authority.
>
> It improves the efficiency of testing and patch verification. I know about
> the PrimaryGPU option of the Xorg conf, but that approach remembers the
> setup that has been made, so you need to modify it with root authority each
> time you want to switch the primary. But when rapidly developing
> and/or testing multiple video drivers with only one computer
> available, what we really want is a one-shot command, as provided
> by this series.  So, this is the first use case.
>
>
> The second use case is that sometimes the firmware is not reliable.
> While there are thousands of ARM64, PowerPC and MIPS server machines,
> most of them don't have good UEFI firmware support. I haven't tested
> drm/amdgpu and drm/radeon on my ARM64 server yet, because this ARM64
> server always uses the platform (BMC) integrated display controller as primary.
> Its UEFI firmware does not provide an options menu to tune this.
> So, at first, the discrete card becomes useless, despite being more
> powerful.
> I will take time to carry on the testing, so I will be able to tell more
> in the future.
>
>
> Even on X86, when selecting PEG as primary in the UEFI BIOS menu,
> there is no way to tell the BIOS which one of my three
> discrete video cards should be the primary. Not to mention some old UEFI
> firmware, which doesn't provide a setting at all.
> The benefit of my approach is its flexibility.
> Yes, i915, amdgpu and radeon are of good quality,
> but there may be programmers who want to try nouveau.
>
>
> The third use case is that VGAARB is also not reliable; it will
> select the wrong device as primary, especially on Arm64, LoongArch
> and MIPS architectures. And the X server will use this

[PATCH 4/4] drm/amdgpu: Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES

2023-09-06 Thread Mukul Joshi

Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES to conform with
the naming convention followed in amdgpu_gfx.h. No functional
change.

Signed-off-by: Mukul Joshi 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   | 6 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 4 ++--
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   | 2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 25d5fda5b243..26ff5f8d9795 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -164,7 +164,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 */
bitmap_complement(gpu_resources.cp_queue_bitmap,
  adev->gfx.mec_bitmap[0].queue_bitmap,
- KGD_MAX_QUEUES);
+ AMDGPU_MAX_QUEUES);
 
/* According to linux/bitmap.h we shouldn't use bitmap_clear if
 * nbits is not compile time constant
@@ -172,7 +172,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
last_valid_bit = 1 /* only first MEC can have compute queues */
* adev->gfx.mec.num_pipe_per_mec
* adev->gfx.mec.num_queue_per_pipe;
-   for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
+   for (i = last_valid_bit; i < AMDGPU_MAX_QUEUES; ++i)
clear_bit(i, gpu_resources.cp_queue_bitmap);
 
amdgpu_doorbell_get_kfd_info(adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 3c45a188b701..04b8c7dacd30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -1037,7 +1037,7 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device 
*adev, int pasid,
int pasid_tmp;
int max_queue_cnt;
int vmid_wave_cnt = 0;
-   DECLARE_BITMAP(cp_queue_bitmap, KGD_MAX_QUEUES);
+   DECLARE_BITMAP(cp_queue_bitmap, AMDGPU_MAX_QUEUES);
 
lock_spi_csq_mutexes(adev);
soc15_grbm_select(adev, 1, 0, 0, 0, inst);
@@ -1047,7 +1047,7 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device 
*adev, int pasid,
 * to get number of waves in flight
 */
bitmap_complement(cp_queue_bitmap, adev->gfx.mec_bitmap[0].queue_bitmap,
- KGD_MAX_QUEUES);
+ AMDGPU_MAX_QUEUES);
max_queue_cnt = adev->gfx.mec.num_pipe_per_mec *
adev->gfx.mec.num_queue_per_pipe;
sh_cnt = adev->gfx.config.max_sh_per_se;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 0ca95c4d4bfb..42ac6d1bf9ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,10 +43,10 @@
 #define AMDGPU_GFX_LBPW_DISABLED_MODE  0x0008L
 
 #define AMDGPU_MAX_GC_INSTANCES8
-#define KGD_MAX_QUEUES 128
+#define AMDGPU_MAX_QUEUES  128
 
-#define AMDGPU_MAX_GFX_QUEUES KGD_MAX_QUEUES
-#define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
+#define AMDGPU_MAX_GFX_QUEUES AMDGPU_MAX_QUEUES
+#define AMDGPU_MAX_COMPUTE_QUEUES AMDGPU_MAX_QUEUES
 
 enum amdgpu_gfx_pipe_priority {
AMDGPU_GFX_PIPE_PRIO_NORMAL = AMDGPU_RING_PRIO_1,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 4170e3d32630..6d07a5dd2648 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -92,7 +92,7 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, 
int mec, int pipe)
 unsigned int get_cp_queues_num(struct device_queue_manager *dqm)
 {
return bitmap_weight(dqm->dev->kfd->shared_resources.cp_queue_bitmap,
-   KGD_MAX_QUEUES);
+   AMDGPU_MAX_QUEUES);
 }
 
 unsigned int get_queues_per_pipe(struct device_queue_manager *dqm)
@@ -1576,7 +1576,7 @@ static int set_sched_resources(struct 
device_queue_manager *dqm)
res.vmid_mask = dqm->dev->compute_vmid_bitmap;
 
res.queue_mask = 0;
-   for (i = 0; i < KGD_MAX_QUEUES; ++i) {
+   for (i = 0; i < AMDGPU_MAX_QUEUES; ++i) {
mec = (i / dqm->dev->kfd->shared_resources.num_queue_per_pipe)
/ dqm->dev->kfd->shared_resources.num_pipe_per_mec;
 
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 3b5a56585c4b..255adc30f802 100644
---

[PATCHv2 1/4] drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3

2023-09-06 Thread Mukul Joshi

Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Incorporate Felix's review comments.

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  4 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c   | 76 +--
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  |  8 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 11 ++-
 .../gpu/drm/amd/include/kgd_kfd_interface.h   |  6 +-
 14 files changed, 60 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index cdf6087706aa..25d5fda5b243 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -478,7 +478,7 @@ void amdgpu_amdkfd_get_cu_info(struct amdgpu_device *adev, 
struct kfd_cu_info *c
cu_info->cu_active_number = acu_info.number;
cu_info->cu_ao_mask = acu_info.ao_cu_mask;
memcpy(&cu_info->cu_bitmap[0], &acu_info.bitmap[0],
-  sizeof(acu_info.bitmap));
+  sizeof(cu_info->cu_bitmap));
cu_info->num_shader_engines = adev->gfx.config.max_shader_engines;
cu_info->num_shader_arrays_per_engine = adev->gfx.config.max_sh_per_se;
cu_info->num_cu_per_sh = adev->gfx.config.max_cu_per_sh;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 395c1768b9fc..0ca95c4d4bfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -43,6 +43,7 @@
 #define AMDGPU_GFX_LBPW_DISABLED_MODE  0x0008L
 
 #define AMDGPU_MAX_GC_INSTANCES 8
+#define KGD_MAX_QUEUES 128
 
 #define AMDGPU_MAX_GFX_QUEUES KGD_MAX_QUEUES
 #define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
@@ -257,7 +258,7 @@ struct amdgpu_cu_info {
uint32_t number;
uint32_t ao_cu_mask;
uint32_t ao_cu_bitmap[4][4];
-   uint32_t bitmap[4][4];
+   uint32_t bitmap[AMDGPU_MAX_GC_INSTANCES][4][4];
 };
 
 struct amdgpu_gfx_ras {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 3a48bec10aea..d462b36adf4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -850,7 +850,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
memcpy(&dev_info->cu_ao_bitmap[0], &adev->gfx.cu_info.ao_cu_bitmap[0],
   sizeof(adev->gfx.cu_info.ao_cu_bitmap));
memcpy(&dev_info->cu_bitmap[0], &adev->gfx.cu_info.bitmap[0],
-  sizeof(adev->gfx.cu_info.bitmap));
+  sizeof(dev_info->cu_bitmap));
dev_info->vram_type = adev->gmc.vram_type;
dev_info->vram_bit_width = adev->gmc.vram_width;
dev_info->vce_harvest_config = adev->vce.harvest_config;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 6ccde07ed63e..62329a822022 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -9442,7 +9442,7 @@ static int gfx_v10_0_get_cu_info(struct amdgpu_device 
*adev,
gfx_v10_0_set_user_wgp_inactive_bitmap_per_sh(
adev, disable_masks[i * 2 + j]);
bitmap = gfx_v10_0_get_cu_active_bitmap_per_sh(adev);
-   cu_info->bitmap[i][j] = bitmap;
+   cu_info->bitmap[0][i][j] = bitmap;
 
for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
if (bitmap & mask) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 337ed771605f..39c434ca0dad 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -6392,7 +6392,7 @@ static int gfx_v11_0_get_cu_info(struct amdgpu_device 
*adev,
 *SE6: {SH0,SH1} --> {bitmap[2][2], bitmap[2][3]}
 *SE7: {SH0,SH1} --> {bitmap[3][2], bitmap[3][3]}
 */
-   cu_info->bitmap[i % 4][j + (i / 4) * 2] = bitmap;
+   cu_info->bitmap[0][i % 4][j + (i / 4) * 2] = bitmap;
 
for (k =
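
The memcpy() hunks in this patch change the copy length from the size of the source to the size of the destination. A minimal standalone sketch of why that matters once the source bitmap gains an XCC dimension; the struct and function names below are stand-ins rather than the driver's types, and the single-instance [4][4] destination is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors AMDGPU_MAX_GC_INSTANCES from the amdgpu_gfx.h hunk. */
#define MAX_GC_INSTANCES 8

/* Stand-in for the producer side after the patch: one bitmap per XCC. */
struct src_cu_info {
	uint32_t bitmap[MAX_GC_INSTANCES][4][4];
};

/* Hypothetical consumer that only has room for a single instance. */
struct dst_cu_info {
	uint32_t cu_bitmap[4][4];
};

/*
 * Copy no more than the destination can hold. With sizeof(src->bitmap)
 * as the length, this would write 8x past the end of dst->cu_bitmap.
 */
static inline void copy_cu_bitmap(struct dst_cu_info *dst,
				  const struct src_cu_info *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(&dst->cu_bitmap[0], &src->bitmap[0], sizeof(dst->cu_bitmap));
}
```

With this length, only the bitmap of the first instance reaches the consumer, which matches the pre-patch layout.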

[PATCHv2 3/4] drm/amdkfd: Update CU masking for GFX 9.4.3

2023-09-06 Thread Mukul Joshi

The CU mask passed from user-space will change based on
different spatial partitioning mode. As a result, update
CU masking code for GFX9.4.3 to work for all partitioning
modes.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Incorporate Felix's review comments.

 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 28 ---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 46 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  2 +-
 7 files changed, 56 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 763966236658..447829c22295 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -97,14 +97,16 @@ void free_mqd_hiq_sdma(struct mqd_manager *mm, void *mqd,
 
 void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
const uint32_t *cu_mask, uint32_t cu_mask_count,
-   uint32_t *se_mask)
+   uint32_t *se_mask, uint32_t inst)
 {
struct kfd_cu_info cu_info;
uint32_t cu_per_sh[KFD_MAX_NUM_SE][KFD_MAX_NUM_SH_PER_SE] = {0};
bool wgp_mode_req = KFD_GC_VERSION(mm->dev) >= IP_VERSION(10, 0, 0);
uint32_t en_mask = wgp_mode_req ? 0x3 : 0x1;
-   int i, se, sh, cu, cu_bitmap_sh_mul, inc = wgp_mode_req ? 2 : 1;
+   int i, se, sh, cu, cu_bitmap_sh_mul, cu_inc = wgp_mode_req ? 2 : 1;
uint32_t cu_active_per_node;
+   int inc = cu_inc * NUM_XCC(mm->dev->xcc_mask);
+   int xcc_inst = inst + ffs(mm->dev->xcc_mask) - 1;
 
amdgpu_amdkfd_get_cu_info(mm->dev->adev, &cu_info);
 
@@ -143,7 +145,8 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
cu_per_sh[se][sh] = hweight32(
-   cu_info.cu_bitmap[0][se % 4][sh + (se / 4) * 
cu_bitmap_sh_mul]);
+   cu_info.cu_bitmap[xcc_inst][se % 4][sh + (se / 
4) *
+   cu_bitmap_sh_mul]);
 
/* Symmetrically map cu_mask to all SEs & SHs:
 * se_mask programs up to 2 SH in the upper and lower 16 bits.
@@ -166,20 +169,33 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
 * cu_mask[0] bit8 -> se_mask[0] bit1 (SE0,SH0,CU1)
 * ...
 *
+* For GFX 9.4.3, the following code only looks at a
+* subset of the cu_mask corresponding to the inst parameter.
+* If we have n XCCs under one GPU node
+* cu_mask[0] bit0 -> XCC0 se_mask[0] bit0 (XCC0,SE0,SH0,CU0)
+* cu_mask[0] bit1 -> XCC1 se_mask[0] bit0 (XCC1,SE0,SH0,CU0)
+* ..
+* cu_mask[0] bitn -> XCCn se_mask[0] bit0 (XCCn,SE0,SH0,CU0)
+* cu_mask[0] bit n+1 -> XCC0 se_mask[1] bit0 (XCC0,SE1,SH0,CU0)
+*
+* For example, if there are 6 XCCs under 1 KFD node, this code
+* running for each inst, will look at the bits as:
+* inst, inst + 6, inst + 12...
+*
 * First ensure all CUs are disabled, then enable user specified CUs.
 */
for (i = 0; i < cu_info.num_shader_engines; i++)
se_mask[i] = 0;
 
-   i = 0;
-   for (cu = 0; cu < 16; cu += inc) {
+   i = inst;
+   for (cu = 0; cu < 16; cu += cu_inc) {
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++) {
for (se = 0; se < cu_info.num_shader_engines; se++) {
if (cu_per_sh[se][sh] > cu) {
if (cu_mask[i / 32] & (en_mask << (i % 
32)))
se_mask[se] |= en_mask << (cu + 
sh * 16);
i += inc;
-   if (i == cu_mask_count)
+   if (i >= cu_mask_count)
return;
}
}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 23158db7da03..57bf5e513f4d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -138,7 +138,7 @@ void free_mqd_hiq_sdma(struct mqd_manager *mm, void *mqd,
 
 void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
const uint32_t *cu_mask, uint32_t cu_mask_count,
-   uint32_t *se_mask);
+   uint32_t *se_mask, uint32_t inst);
 
 int kfd_hiq_load_mqd_kiq(struct mqd_manager *mm, void *mqd,
uint32_t pipe_id, uint32_t queue_id,
diff --git
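
The comment added to mqd_symmetrically_map_cu_mask() describes how the user CU mask is interleaved across XCCs: instance inst looks at bits inst, inst + n, inst + 2n, and so on. A minimal standalone sketch of that stride, assuming non-WGP mode (one bit per CU, as with en_mask 0x1); the helper name xcc_slice_of_cu_mask is hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Collect the bits of cu_mask that belong to one XCC instance.
 * Bit k of the result is bit (inst + k * num_xcc) of the user mask.
 * At most 32 bits are gathered; cu_mask_count is the number of valid
 * bits in cu_mask, as in the patch.
 */
static inline uint32_t xcc_slice_of_cu_mask(const uint32_t *cu_mask,
					    uint32_t cu_mask_count,
					    uint32_t num_xcc, uint32_t inst)
{
	uint32_t slice = 0;
	uint32_t out_bit = 0;
	uint32_t i;

	assert(num_xcc > 0 && inst < num_xcc);

	/* Same walk as the patched loop: start at inst, step by num_xcc. */
	for (i = inst; i < cu_mask_count && out_bit < 32; i += num_xcc) {
		if (cu_mask[i / 32] & (1u << (i % 32)))
			slice |= 1u << out_bit;
		out_bit++;
	}
	return slice;
}
```

For the 6-XCC example in the comment, a mask with bits 0, 6 and 12 set enables the first three CUs of XCC 0 and nothing on the other instances. The ">=" in the patched termination check matters for the same reason as the loop bound here: with a stride larger than 1, the index can step past cu_mask_count without ever being equal to it.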

[PATCHv2 2/4] drm/amdkfd: Update cache info reporting for GFX v9.4.3

2023-09-06 Thread Mukul Joshi

Update cache info reporting in sysfs to report the correct
number of CUs and associated cache information based on
different spatial partitioning modes.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Revert the change in kfd_crat.c
- Add a comment to not change value of CRAT_SIBLINGMAP_SIZE.

 drivers/gpu/drm/amd/amdkfd/kfd_crat.h |  4 ++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 82 +--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +-
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
index 387a8ef49385..74c2d7a0d628 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
@@ -79,6 +79,10 @@ struct crat_header {
 #define CRAT_SUBTYPE_IOLINK_AFFINITY   5
 #define CRAT_SUBTYPE_MAX   6
 
+/*
+ * Do not change the value of CRAT_SIBLINGMAP_SIZE from 32
+ * as it breaks the ABI.
+ */
 #define CRAT_SIBLINGMAP_SIZE   32
 
 /*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index c54795682dfb..b98cc7930e4c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1596,14 +1596,17 @@ static int fill_in_l1_pcache(struct 
kfd_cache_properties **props_ext,
 static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
struct kfd_gpu_cache_info *pcache_info,
struct kfd_cu_info *cu_info,
-   int cache_type, unsigned int cu_processor_id)
+   int cache_type, unsigned int cu_processor_id,
+   struct kfd_node *knode)
 {
unsigned int cu_sibling_map_mask;
int first_active_cu;
-   int i, j, k;
+   int i, j, k, xcc, start, end;
struct kfd_cache_properties *pcache = NULL;
 
-   cu_sibling_map_mask = cu_info->cu_bitmap[0][0][0];
+   start = ffs(knode->xcc_mask) - 1;
+   end = start + NUM_XCC(knode->xcc_mask);
+   cu_sibling_map_mask = cu_info->cu_bitmap[start][0][0];
cu_sibling_map_mask &=
((1 << pcache_info[cache_type].num_cu_shared) - 1);
first_active_cu = ffs(cu_sibling_map_mask);
@@ -1638,16 +1641,18 @@ static int fill_in_l2_l3_pcache(struct 
kfd_cache_properties **props_ext,
cu_sibling_map_mask = cu_sibling_map_mask >> (first_active_cu - 
1);
k = 0;
 
-   for (i = 0; i < cu_info->num_shader_engines; i++) {
-   for (j = 0; j < cu_info->num_shader_arrays_per_engine; 
j++) {
-   pcache->sibling_map[k] = 
(uint8_t)(cu_sibling_map_mask & 0xFF);
-   pcache->sibling_map[k+1] = 
(uint8_t)((cu_sibling_map_mask >> 8) & 0xFF);
-   pcache->sibling_map[k+2] = 
(uint8_t)((cu_sibling_map_mask >> 16) & 0xFF);
-   pcache->sibling_map[k+3] = 
(uint8_t)((cu_sibling_map_mask >> 24) & 0xFF);
-   k += 4;
-
-   cu_sibling_map_mask = cu_info->cu_bitmap[0][i % 
4][j + i / 4];
-   cu_sibling_map_mask &= ((1 << 
pcache_info[cache_type].num_cu_shared) - 1);
+   for (xcc = start; xcc < end; xcc++) {
+   for (i = 0; i < cu_info->num_shader_engines; i++) {
+   for (j = 0; j < 
cu_info->num_shader_arrays_per_engine; j++) {
+   pcache->sibling_map[k] = 
(uint8_t)(cu_sibling_map_mask & 0xFF);
+   pcache->sibling_map[k+1] = 
(uint8_t)((cu_sibling_map_mask >> 8) & 0xFF);
+   pcache->sibling_map[k+2] = 
(uint8_t)((cu_sibling_map_mask >> 16) & 0xFF);
+   pcache->sibling_map[k+3] = 
(uint8_t)((cu_sibling_map_mask >> 24) & 0xFF);
+   k += 4;
+
+   cu_sibling_map_mask = 
cu_info->cu_bitmap[start][i % 4][j + i / 4];
+   cu_sibling_map_mask &= ((1 << 
pcache_info[cache_type].num_cu_shared) - 1);
+   }
}
}
pcache->sibling_map_size = k;
@@ -1665,7 +1670,7 @@ static int fill_in_l2_l3_pcache(struct 
kfd_cache_properties **props_ext,
 static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, 
struct kfd_node *kdev)
 {
struct kfd_gpu_cache_info *pcache_info = NULL;
-   int i, j, k;
+   int i, j, k, xcc, start, end;
int ct = 0;
unsigned int cu_processor_id;
int ret;
@@ -1699,37 +1704,42 @@ static void kfd_fill_cache_non_crat_info(struct 
kfd_topology_device *dev, struct
 *  then it will consider only one CU from
 *
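
The inner loop of fill_in_l2_l3_pcache() stores each 32-bit CU sibling mask as four bytes, least significant byte first, and advances the index by 4. A minimal standalone sketch of that packing step; the helper name and the explicit bounds check are assumptions added for illustration and do not appear in the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Append one 32-bit sibling mask to a byte-wise sibling map at index k
 * and return the next free index. map_size is the capacity of the
 * destination array in bytes.
 */
static inline unsigned int pack_sibling_mask(uint8_t *sibling_map,
					     unsigned int map_size,
					     unsigned int k, uint32_t mask)
{
	/* Illustrative guard: refuse to write past the end of the map. */
	assert(k + 4 <= map_size);

	sibling_map[k]     = (uint8_t)(mask & 0xFF);
	sibling_map[k + 1] = (uint8_t)((mask >> 8) & 0xFF);
	sibling_map[k + 2] = (uint8_t)((mask >> 16) & 0xFF);
	sibling_map[k + 3] = (uint8_t)((mask >> 24) & 0xFF);
	return k + 4;
}
```

Because the patch adds an outer loop over XCCs, the number of 4-byte entries grows with the XCC count, so the capacity of the destination is worth keeping in mind when reading the loop.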

[PATCHv3] drm/amdkfd: Fix unaligned 64-bit doorbell warning

2023-09-06 Thread Mukul Joshi

This patch fixes the following unaligned 64-bit doorbell
warning seen when submitting packets on HIQ on GFX v9.4.3
by making the HIQ doorbell 64-bit aligned.
The warning is seen when GPU is loaded in any mode other
than SPX mode.

[  +0.000301] [ cut here ]
[  +0.03] Unaligned 64-bit doorbell
[  +0.30] WARNING: /amdkfd/kfd_doorbell.c:339 
write_kernel_doorbell64+0x72/0x80
[  +0.03] RIP: 0010:write_kernel_doorbell64+0x72/0x80
[  +0.04] RSP: 0018:c90004287730 EFLAGS: 00010246
[  +0.05] RAX:  RBX:  RCX: 
[  +0.03] RDX: 0001 RSI: 82837c71 RDI: 
[  +0.03] RBP: c90004287748 R08: 0003 R09: 0001
[  +0.02] R10: 001a R11: 88a034008198 R12: c900013bd004
[  +0.03] R13: 0008 R14: c900042877b0 R15: 007f
[  +0.03] FS:  7fa8c7b62000() GS:889f8840() 
knlGS:
[  +0.04] CS:  0010 DS:  ES:  CR0: 80050033
[  +0.03] CR2: 56111c45aaf0 CR3: 0001414f2002 CR4: 00770ee0
[  +0.03] PKRU: 5554
[  +0.02] Call Trace:
[  +0.04]  
[  +0.06]  kq_submit_packet+0x45/0x50 [amdgpu]
[  +0.000524]  pm_send_set_resources+0x7f/0xc0 [amdgpu]
[  +0.000500]  set_sched_resources+0xe4/0x160 [amdgpu]
[  +0.000503]  start_cpsch+0x1c5/0x2a0 [amdgpu]
[  +0.000497]  kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu]
[  +0.000743]  amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu]
[  +0.000602]  amdgpu_device_init.cold+0x1813/0x2176 [amdgpu]
[  +0.000684]  ? pci_bus_read_config_word+0x4a/0x80
[  +0.12]  ? do_pci_enable_device+0xdc/0x110
[  +0.08]  amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
[  +0.000545]  amdgpu_pci_probe+0x197/0x400 [amdgpu]

Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
Signed-off-by: Mukul Joshi 
---
v1->v2:
- Update the logic to make it work with both 32 bit
  and 64 bit doorbells.
- Add the Fixes tag
v2->v3:
- Revert to the original change to align it with what's done in
  amdgpu_doorbell_index_on_bar.

 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index c2e0b79dcc6d..7b38537c7c99 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -162,6 +162,7 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
return NULL;
 
*doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, kfd->doorbells, 
inx);
+   inx *= 2;
 
pr_debug("Get kernel queue doorbell\n"
" doorbell offset   == 0x%08X\n"
@@ -176,6 +177,7 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 
__iomem *db_addr)
unsigned int inx;
 
inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
+   inx /= 2;
 
mutex_lock(&kfd->doorbell_mutex);
__clear_bit(inx, kfd->doorbell_bitmap);
-- 
2.35.1
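
The fix relies on simple index arithmetic: the kernel doorbell pointer is indexed in 32-bit units, so doubling the bitmap index yields a byte offset that is a multiple of 8, and halving it on release recovers the bitmap slot. A minimal standalone sketch of that arithmetic; all helper names are hypothetical and a 64-bit aligned doorbell base is assumed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of a doorbell slot when slots are counted in 32-bit units. */
static inline size_t doorbell_byte_offset(unsigned int dword_index)
{
	return (size_t)dword_index * sizeof(uint32_t);
}

/* True if a 64-bit write at this byte offset is naturally aligned. */
static inline bool is_64bit_aligned(size_t byte_offset)
{
	return (byte_offset % sizeof(uint64_t)) == 0;
}

/* Allocation side: the "inx *= 2" step from the patch. */
static inline unsigned int bitmap_to_dword_index(unsigned int inx)
{
	return inx * 2;
}

/* Release side: the "inx /= 2" step from the patch. */
static inline unsigned int dword_to_bitmap_index(unsigned int dword_index)
{
	/* Only even dword indexes are ever handed out after the fix. */
	assert(dword_index % 2 == 0);
	return dword_index / 2;
}
```

An odd dword index such as 1 gives byte offset 4, the kind of address that trips the warning (R12 in the trace ends in 4), while any doubled index lands on an 8-byte boundary.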

[PATCH] drm/amdgpu/soc21: don't remap HDP registers for SR-IOV

2023-09-06 Thread Alex Deucher

This matches the behavior for soc15 and nv.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/soc21.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index ef297b41623b..2ecc8c9a078b 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -778,7 +778,7 @@ static int soc21_common_hw_init(void *handle)
 * for the purpose of expose those registers
 * to process space
 */
-   if (adev->nbio.funcs->remap_hdp_registers)
+   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
adev->nbio.funcs->enable_doorbell_aperture(adev, true);
-- 
2.41.0

Re: [PATCH] drm/amdgpu: Fix refclk reporting for SMU v13.0.6

2023-09-06 Thread Alex Deucher

On Wed, Sep 6, 2023 at 12:05 AM Lijo Lazar  wrote:
>
> SMU v13.0.6 SOCs have 100MHz reference clock.
>

Do we want to use the vbios value on boards that have a vbios?  If
it's the same on all variants, then this is probably fine as is.

Alex

> Signed-off-by: Lijo Lazar 
> ---
>  drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index f5be40d7ba36..28094cd7d9c2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -325,7 +325,8 @@ static u32 soc15_get_xclk(struct amdgpu_device *adev)
> u32 reference_clock = adev->clock.spll.reference_freq;
>
> if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 0) ||
> -   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 1))
> +   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 1) ||
> +   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 6))
> return 10000;
> if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(10, 0, 0) ||
> adev->ip_versions[MP1_HWIP][0] == IP_VERSION(10, 0, 1))
> --
> 2.25.1
>

Re: [PATCH v2 19/34] drm/amd/display: decouple steps for mapping CRTC degamma to DC plane

2023-09-06 Thread Harry Wentland




On 2023-08-29 04:51, Pekka Paalanen wrote:
> On Mon, 28 Aug 2023 12:56:04 -0100
> Melissa Wen  wrote:
> 
>> On 08/28, Pekka Paalanen wrote:
>>> On Mon, 28 Aug 2023 09:45:44 +0100
>>> Joshua Ashton  wrote:
>>>   
>>>> Degamma has always been on the plane on AMD. CRTC DEGAMMA_LUT has actually
>>>> just been applying it to every plane pre-blend.
>>>
>>> I've never seen that documented anywhere.
>>>
>>> It has seemed obvious, that since we have KMS objects for planes and
>>> CRTCs, stuff on the CRTC does not do plane stuff before blending. That
>>> also has not been documented in the past, but it seemed the most
>>> logical choice.
>>>
>>> Even today
>>> https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#color-management-properties
>>> make no mention of whether they apply before or after blending.  
>>
>> It's mentioned in the next section:
>> https://dri.freedesktop.org/docs/drm/gpu/amdgpu/display/display-manager.html#dc-color-capabilities-between-dcn-generations
>> In hindsight, maybe it isn't the best place...
> 
> That is driver-specific documentation. As a userspace dev, I'd never
> look at driver-specific documentation, because I'm interested in the
> KMS UAPI which is supposed to be generic, and therefore documented with
> the DRM "core".
> 
> Maybe kernel reviewers also never look at driver-specific docs to find
> attempts at redefining common KMS properties?
> 
> (I still don't know which definition is prevalent.)
> 
>>>   
>>>> Degamma makes no sense after blending anyway.
>>>
>>> If the goal is to allow blending in optical or other space, you are
>>> correct. However, APIs do not need to make sense to exist, like most of
>>> the options of "Colorspace" connector property.
>>>
>>> I have always thought the CRTC DEGAMMA only exists to allow the CRTC
>>> CTM to work in linear or other space.
>>>
>>> I have at times been puzzled by what the DEGAMMA and CTM are actually
>>> good for.
>>>   
>>>> The entire point is for it to happen before blending to blend in linear
>>>> space. Otherwise DEGAMMA_LUT and REGAMMA_LUT are the exact same thing...
>>>
>>> The CRTC CTM is between CRTC DEGAMMA and CRTC GAMMA, meaning they are
>>> not interchangeable.
>>>
>>> I have literally believed that DRM KMS UAPI simply does not support
>>> blending in optical space, unless your framebuffers are in optical
>>> which no-one does, until the color management properties are added to

I think Mario Kleiner had a use-case that made use of that and introduced
FP16 format support in amdgpu.

>>> KMS planes. This never even seemed weird, because non-linear blending
>>> is so common.
>>>
>>> So I have been misunderstanding the CRTC DEGAMMA property forever. Am I
>>> the only one? Do all drivers agree today at what point does CRTC
>>> DEGAMMA apply, before blending on all planes or after blending?
>>>   
>>
>> I'd like to know current userspace cases on Linux of this CRTC DEGAMMA
>> LUT.
> 
> I don't know of any, but that doesn't mean anything.
> 
>>> Does anyone know of any doc about that?  
>>
>> From what I retrieved about the introduction of CRTC color props[1], it
>> seems the main concern at that point was getting a linear space for
>> CTM[2] and CRTC degamma property seems to have followed intel
>> requirements, but didn't find anything about the blending space.
> 
> Right. I've always thought CRTC props apply after blending.
> 
>> AFAIU, we have just interpreted that all CRTC color properties for DRM
>> interface are after blending[3]. Can this be seen in another way?
> 
> Joshua did, and he has a logical point.
> 
> I guess if we really want to know, someone would need review all
> drivers exposing these props, and even check if they changed in the
> past.
> 
> FWIW, the usefulness of (RE)GAMMA (not DEGAMMA) LUT is limited by the
> fact that attempting to represent 1/2.2 power function as a uniformly
> distributed LUT is infeasible due to the approximation errors near zero.
> 

IMO, CRTC should be post-blending. Blending is at the plane/crtc boundary
by design, therefore CRTC properties apply post-blending.

Though I can understand why DEGAMMA can be interpreted to be applied
pre-blending. Though, I think that's wrong for the DRM/KMS model and
should be fixed in amdgpu.

Harry

> 
> Thanks,
> pq
> 
>> [1] https://patchwork.freedesktop.org/series/2720/
>> [2] https://codereview.chromium.org/1182063002
>> [3] https://dri.freedesktop.org/docs/drm/_images/dcn3_cm_drm_current.svg
>>
>>>
>>> If drivers do not agree on the behaviour of a KMS property, then that
>>> property is useless for generic userspace.
>>>
>>>
>>> Thanks,
>>> pq
>>>
>>>   
>>>> On Tuesday, 22 August 2023, Pekka Paalanen wrote:
>>>>> On Thu, 10 Aug 2023 15:02:59 -0100
>>>>> Melissa Wen  wrote:
>>>>>
>>>>>> The next patch adds pre-blending degamma to AMD color mgmt pipeline, but
>>>>>> pre-blending degamma caps (DPP) is currently in use to provide DRM CRTC
>>>>>> atomic degamma or implicit degamma on legacy gamma. Detach degamma usage

Re: [PATCH v2 19/34] drm/amd/display: decouple steps for mapping CRTC degamma to DC plane

2023-09-06 Thread Harry Wentland




On 2023-08-28 04:17, Pekka Paalanen wrote:
> On Fri, 25 Aug 2023 13:29:44 -0100
> Melissa Wen  wrote:
> 
>> On 08/22, Pekka Paalanen wrote:
>>> On Thu, 10 Aug 2023 15:02:59 -0100
>>> Melissa Wen  wrote:
>>>   
 The next patch adds pre-blending degamma to AMD color mgmt pipeline, but
 pre-blending degamma caps (DPP) is currently in use to provide DRM CRTC
 atomic degamma or implicit degamma on legacy gamma. Detach degamma usage
 regarding CRTC color properties to manage plane and CRTC color
 correction combinations.

 Reviewed-by: Harry Wentland 
 Signed-off-by: Melissa Wen 
 ---
  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 59 +--
  1 file changed, 41 insertions(+), 18 deletions(-)

 diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
 b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
 index 68e9f2c62f2e..74eb02655d96 100644
 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
 +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
 @@ -764,20 +764,9 @@ int amdgpu_dm_update_crtc_color_mgmt(struct 
 dm_crtc_state *crtc)
return 0;
  }
  
 -/**
 - * amdgpu_dm_update_plane_color_mgmt: Maps DRM color management to DC 
 plane.
 - * @crtc: amdgpu_dm crtc state
 - * @dc_plane_state: target DC surface
 - *
 - * Update the underlying dc_stream_state's input transfer function (ITF) 
 in
 - * preparation for hardware commit. The transfer function used depends on
 - * the preparation done on the stream for color management.
 - *
 - * Returns:
 - * 0 on success. -ENOMEM if mem allocation fails.
 - */
 -int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
 -struct dc_plane_state *dc_plane_state)
 +static int
 +map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
 +   struct dc_plane_state *dc_plane_state)
  {
const struct drm_color_lut *degamma_lut;
enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
 @@ -800,8 +789,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
 dm_crtc_state *crtc,
 &degamma_size);
ASSERT(degamma_size == MAX_COLOR_LUT_ENTRIES);
  
 -  dc_plane_state->in_transfer_func->type =
 -  TF_TYPE_DISTRIBUTED_POINTS;
 +  dc_plane_state->in_transfer_func->type = 
 TF_TYPE_DISTRIBUTED_POINTS;
  
/*
 * This case isn't fully correct, but also fairly
 @@ -837,7 +825,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
 dm_crtc_state *crtc,
   degamma_lut, degamma_size);
if (r)
return r;
 -  } else if (crtc->cm_is_degamma_srgb) {
 +  } else {
/*
 * For legacy gamma support we need the regamma input
 * in linear space. Assume that the input is sRGB.
 @@ -847,8 +835,43 @@ int amdgpu_dm_update_plane_color_mgmt(struct 
 dm_crtc_state *crtc,
  
if (tf != TRANSFER_FUNCTION_SRGB &&
!mod_color_calculate_degamma_params(NULL,
 -  dc_plane_state->in_transfer_func, NULL, false))
 +  
 dc_plane_state->in_transfer_func,
 +  NULL, false))
return -ENOMEM;
 +  }
 +
 +  return 0;
 +}
 +
 +/**
 + * amdgpu_dm_update_plane_color_mgmt: Maps DRM color management to DC 
 plane.
 + * @crtc: amdgpu_dm crtc state
 + * @dc_plane_state: target DC surface
 + *
 + * Update the underlying dc_stream_state's input transfer function (ITF) 
 in
 + * preparation for hardware commit. The transfer function used depends on
 + * the preparation done on the stream for color management.
 + *
 + * Returns:
 + * 0 on success. -ENOMEM if mem allocation fails.
 + */
 +int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
 +struct dc_plane_state *dc_plane_state)
 +{
 +  bool has_crtc_cm_degamma;
 +  int ret;
 +
 +  has_crtc_cm_degamma = (crtc->cm_has_degamma || 
 crtc->cm_is_degamma_srgb);
 +  if (has_crtc_cm_degamma){
 +  /* AMD HW doesn't have post-blending degamma caps. When DRM
 +   * CRTC atomic degamma is set, we maps it to DPP degamma block
 +   * (pre-blending) or, on legacy gamma, we use DPP degamma to
 +   * linearize (implicit degamma) from sRGB/BT709 according to
 +   * the input space.  
>>>
>>> Uhh, you can't just move degamma before blending if KMS userspace
>>> wants it after blending. That would be incorrect behaviour. If you

Re: [PATCH 04/11] drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid

2023-09-06 Thread Shashank Sharma




On 06/09/2023 16:25, Shashank Sharma wrote:


On 05/09/2023 08:04, Christian König wrote:

Testing for reset is pointless since the reset can start right after the
test. Grab the reset semaphore instead.

The same PASID can be used by more than one VMID, build a mask of VMIDs
to reset instead of just resetting the first one.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c

index 6a6929ac2748..9e19a752f94b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -33,6 +33,7 @@
  #include "amdgpu_ucode.h"
  #include "amdgpu_amdkfd.h"
  #include "amdgpu_gem.h"
+#include "amdgpu_reset.h"
    #include "bif/bif_4_1_d.h"
  #include "bif/bif_4_1_sh_mask.h"
@@ -426,23 +427,23 @@ static int gmc_v7_0_flush_gpu_tlb_pasid(struct 
amdgpu_device *adev,

  uint16_t pasid, uint32_t flush_type,
  bool all_hub, uint32_t inst)
  {
+    u32 mask = 0x0;
  int vmid;
-    unsigned int tmp;
  -    if (amdgpu_in_reset(adev))
-    return -EIO;
+    if(!down_read_trylock(&adev->reset_domain->sem))
+    return 0;
    for (vmid = 1; vmid < 16; vmid++) {
+    u32 tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);
  -    tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);
  if ((tmp & ATC_VMID0_PASID_MAPPING__VALID_MASK) &&
-    (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid) {
-    WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
-    RREG32(mmVM_INVALIDATE_RESPONSE);
-    break;
-    }
+    (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid)
+    mask |= 1 << vmid;


I am a bit concerned here about the change in code, in the previous 
code we were writing the 'first match out of 16' of tmp and of mask 
and programming the registers with (1 << vmid), whereas in new code 
set we are writing the 'last match out of 16' of vmid. Is that 
intentional or expected ?



With last, I mean all matching bits until last :)

- Shashank


  }
  +    WREG32(mmVM_INVALIDATE_REQUEST, mask);
+    RREG32(mmVM_INVALIDATE_RESPONSE);
+    up_read(&adev->reset_domain->sem);
  return 0;
  }

Re: [PATCH 05/11] drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid

2023-09-06 Thread Shashank Sharma




On 05/09/2023 08:04, Christian König wrote:

Testing for reset is pointless since the reset can start right after the
test. Grab the reset semaphore instead.

The same PASID can be used by more than one VMID, build a mask of VMIDs
to reset instead of just resetting the first one.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 5af235202513..2d51531a1f2d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -31,6 +31,7 @@
  #include "amdgpu_ucode.h"
  #include "amdgpu_amdkfd.h"
  #include "amdgpu_gem.h"
+#include "amdgpu_reset.h"
  
  #include "gmc/gmc_8_1_d.h"

  #include "gmc/gmc_8_1_sh_mask.h"
@@ -616,25 +617,24 @@ static int gmc_v8_0_flush_gpu_tlb_pasid(struct 
amdgpu_device *adev,
uint16_t pasid, uint32_t flush_type,
bool all_hub, uint32_t inst)
  {
+   u32 mask = 0x0;
int vmid;
-   unsigned int tmp;
  
-	if (amdgpu_in_reset(adev))

-   return -EIO;
+   if(!down_read_trylock(&adev->reset_domain->sem))
+   return 0;
  
  	for (vmid = 1; vmid < 16; vmid++) {

+   u32 tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);
  
-		tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);

if ((tmp & ATC_VMID0_PASID_MAPPING__VALID_MASK) &&
-   (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid) {
-   WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
-   RREG32(mmVM_INVALIDATE_RESPONSE);
-   break;
-   }
+   (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid)
+   mask |= 1 << vmid;


Same comment as previous patch, first vmid match vs last vmid match, is 
that intended logic change ?


- Shashank


}
  
+	WREG32(mmVM_INVALIDATE_REQUEST, mask);

+   RREG32(mmVM_INVALIDATE_RESPONSE);
+   up_read(&adev->reset_domain->sem);
return 0;
-
  }
  
  /*

Re: [PATCH 04/11] drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid

2023-09-06 Thread Shashank Sharma




On 05/09/2023 08:04, Christian König wrote:

Testing for reset is pointless since the reset can start right after the
test. Grab the reset semaphore instead.

The same PASID can be used by more than one VMID, build a mask of VMIDs
to reset instead of just resetting the first one.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 6a6929ac2748..9e19a752f94b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -33,6 +33,7 @@
  #include "amdgpu_ucode.h"
  #include "amdgpu_amdkfd.h"
  #include "amdgpu_gem.h"
+#include "amdgpu_reset.h"
  
  #include "bif/bif_4_1_d.h"

  #include "bif/bif_4_1_sh_mask.h"
@@ -426,23 +427,23 @@ static int gmc_v7_0_flush_gpu_tlb_pasid(struct 
amdgpu_device *adev,
uint16_t pasid, uint32_t flush_type,
bool all_hub, uint32_t inst)
  {
+   u32 mask = 0x0;
int vmid;
-   unsigned int tmp;
  
-	if (amdgpu_in_reset(adev))

-   return -EIO;
+   if(!down_read_trylock(&adev->reset_domain->sem))
+   return 0;
  
  	for (vmid = 1; vmid < 16; vmid++) {

+   u32 tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);
  
-		tmp = RREG32(mmATC_VMID0_PASID_MAPPING + vmid);

if ((tmp & ATC_VMID0_PASID_MAPPING__VALID_MASK) &&
-   (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid) {
-   WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
-   RREG32(mmVM_INVALIDATE_RESPONSE);
-   break;
-   }
+   (tmp & ATC_VMID0_PASID_MAPPING__PASID_MASK) == pasid)
+   mask |= 1 << vmid;


I am a bit concerned here about the change in code, in the previous code 
we were writing the 'first match out of 16' of tmp and of mask and 
programming the registers with (1 << vmid), whereas in new code set we 
are writing the 'last match out of 16' of vmid. Is that intentional or 
expected ?


- Shashank


}
  
+	WREG32(mmVM_INVALIDATE_REQUEST, mask);

+   RREG32(mmVM_INVALIDATE_RESPONSE);
+   up_read(&adev->reset_domain->sem);
return 0;
  }
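
On the first-match versus last-match question: the new loop has no break and ORs a bit into mask for every VMID whose mapping is valid and matches the PASID, so the single invalidate request covers all matches. A minimal standalone sketch of that accumulation, with the register reads replaced by a plain array; the two mask macros are placeholder values for illustration, not taken from the register headers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder field masks for the sketch (assumed layout). */
#define SKETCH_VALID_MASK 0x80000000u
#define SKETCH_PASID_MASK 0x0000ffffu

/*
 * mapping[vmid] stands in for RREG32(mmATC_VMID0_PASID_MAPPING + vmid).
 * Returns one bit per VMID (1..15) that is valid and mapped to pasid.
 */
static inline uint32_t build_vmid_mask(const uint32_t *mapping, uint16_t pasid)
{
	uint32_t mask = 0;
	int vmid;

	assert(mapping != NULL);

	for (vmid = 1; vmid < 16; vmid++) {
		uint32_t tmp = mapping[vmid];

		if ((tmp & SKETCH_VALID_MASK) &&
		    (tmp & SKETCH_PASID_MASK) == pasid)
			mask |= 1u << vmid;	/* accumulate, no break */
	}
	return mask;
}
```

If VMIDs 3 and 9 both map to the same PASID, the result has bits 3 and 9 set, where the old code would have stopped after VMID 3.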

Re: [PATCH] SWDEV-420310 - struct pm4_mes_runlist in amdgpu is conflict with spec struct pm4_mes_runlist is different with mes pm4 packet nv10 spec Modification: add last dword of the design of spec i

2023-09-06 Thread Alex Deucher

On Tue, Sep 5, 2023 at 10:05 PM Lin.Cao  wrote:
>

Please fix up the title and the patch description.  Something like:

drm/amdgpu: update pm4_mes_runlist

struct pm4_mes_runlist in amdgpu is in conflict with the spec.  struct
pm4_mes_runlist is different with mes pm4 packet nv10 spec.  Add last
dword of the design of spec into struct pm4_mes_runlist.

Alex

> Signed-off-by: Lin.Cao 
> Change-Id: I1322c010d1428b2c1df5080b72da94e90cf17fec
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h 
> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
> index 8b6b2bd5c148..d50feaf59b8a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
> @@ -129,6 +129,18 @@ struct pm4_mes_runlist {
> uint32_t ordinal4;
> };
>
> +   union
> +   {
> +   struct
> +   {
> +   uint32_t level_1_static_queue_cnt:4;
> +   uint32_t level_2_static_queue_cnt:4;
> +   uint32_t level_3_static_queue_cnt:4;
> +   uint32_t reserved4:20;
> +   } bitfields5;
> +   uint32_t ordinal5;
> +   };
> +
>  };
>  #endif
>
> --
> 2.25.1
>

Re: [bug report] drm/amdgpu: add selftest framework for UMSCH

2023-09-06 Thread Alex Deucher

On Wed, Sep 6, 2023 at 8:25 AM Lang Yu  wrote:
>
> On 09/06/ , Dan Carpenter wrote:
>
> Thanks for reporting this bug. Can you give a link to this bug report? Commit 
> message requests it.
> ("Reported-by: should be immediately followed by Link: with a URL to the 
> report")

For something reported on the mailing list you can just provide the
link to the mailing list archive:
Link: https://lists.freedesktop.org/archives/amd-gfx/2023-September/098254.html

Alex

>
> Regards,
> Lang
>
> > Hello Lang Yu,
> >
> > The patch 5d5eac7e8303: "drm/amdgpu: add selftest framework for
> > UMSCH" from Jun 21, 2023 (linux-next), leads to the following Smatch
> > static checker warning:
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c:338 setup_umsch_mm_test()
> >   warn: unsigned error codes 'test->pasid'
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
> > 319 static int setup_umsch_mm_test(struct amdgpu_device *adev,
> > 320   struct umsch_mm_test *test)
> > 321 {
> > > 322 struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB0(0)];
> > 323 int r;
> > 324
> > 325 test->vm_cntx_cntl = hub->vm_cntx_cntl;
> > 326
> > 327 test->vm = kzalloc(sizeof(*test->vm), GFP_KERNEL);
> > 328 if (!test->vm) {
> > 329 r = -ENOMEM;
> > 330 return r;
> > 331 }
> > 332
> > 333 r = amdgpu_vm_init(adev, test->vm, -1);
> > 334 if (r)
> > 335 goto error_free_vm;
> > 336
> > 337 test->pasid = amdgpu_pasid_alloc(16);
> > --> 338 if (test->pasid < 0) {
> > ^^^
> > Unsigned can't be less than zero.
> >
> > 339 r = test->pasid;
> > 340 goto error_fini_vm;
> > 341 }
> > 342
> > 343 r = amdgpu_bo_create_kernel(adev, sizeof(struct 
> > umsch_mm_test_ctx_data),
> >
> > regards,
> > dan carpenter

[bug report] drm/amd/display: Add DCN35 CLK_MGR

2023-09-06 Thread Dan Carpenter

Hello Qingqing Zhuo,

This is a semi-automatic email about new static checker warnings.

The patch 8774029f76b9: "drm/amd/display: Add DCN35 CLK_MGR" from Aug
2, 2023, leads to the following Smatch complaint:

drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:980 
dcn35_clk_mgr_construct()
warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' 
(see line 913)

drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
   912  
   913  if (ctx->dc_bios->integrated_info->memory_type == 
LpDdr5MemType) {

Unchecked dereference.  Also why does AMD code have weird indenting
like this?  It's totally unique to AMD.  I guess there was an if
statement which was deleted or maybe this is autogenerated somehow?

   914  dcn35_bw_params.wm_table = lpddr5_wm_table;
   915  } else {
   916  dcn35_bw_params.wm_table = ddr5_wm_table;
   917  }
   918  /* Saved clocks configured at boot for debug purposes */
   919   
dcn35_dump_clk_registers(&clk_mgr->base.base.boot_snapshot, 
&clk_mgr->base.base, _info);
   920  
   921  clk_mgr->base.base.dprefclk_khz = 
dcn35_smu_get_dprefclk(&clk_mgr->base);
   922  clk_mgr->base.base.clks.ref_dtbclk_khz = 
dcn35_smu_get_dtbclk(&clk_mgr->base);
   923  
   924  if (!clk_mgr->base.base.clks.ref_dtbclk_khz)
   925  dcn35_smu_set_dtbclk(&clk_mgr->base, true);
   926  
   927  clk_mgr->base.base.clks.dtbclk_en = true;
   928  dce_clock_read_ss_info(&clk_mgr->base);
   929  /*when clk src is from FCH, it could have ss, same clock src as 
DPREF clk*/
   930  
   931  dcn35_read_ss_info_from_lut(&clk_mgr->base);
   932  clk_mgr->base.base.dprefclk_khz =
   933  dce_adjust_dp_ref_freq_for_ss(&clk_mgr->base, 
clk_mgr->base.base.dprefclk_khz);
   934  
   935  clk_mgr->base.base.bw_params = &dcn35_bw_params;
   936  
   937  if (clk_mgr->base.base.ctx->dc->debug.pstate_enabled) {
   938  int i;
   939  dcn35_get_dpm_table_from_smu(&clk_mgr->base, 
&smu_dpm_clks);
   940  DC_LOG_SMU("NumDcfClkLevelsEnabled: %d\n"
   941 "NumDispClkLevelsEnabled: %d\n"
   942 "NumSocClkLevelsEnabled: %d\n"
   943 "VcnClkLevelsEnabled: %d\n"
   944 "NumDfPstatesEnabled: %d\n"
   945 "MinGfxClk: %d\n"
   946 "MaxGfxClk: %d\n",
   947 
smu_dpm_clks.dpm_clks->NumDcfClkLevelsEnabled,
   948 
smu_dpm_clks.dpm_clks->NumDispClkLevelsEnabled,
   949 
smu_dpm_clks.dpm_clks->NumSocClkLevelsEnabled,
   950 
smu_dpm_clks.dpm_clks->VcnClkLevelsEnabled,
   951 
smu_dpm_clks.dpm_clks->NumDfPstatesEnabled,
   952 smu_dpm_clks.dpm_clks->MinGfxClk,
   953 smu_dpm_clks.dpm_clks->MaxGfxClk);
   954  for (i = 0; i < 
smu_dpm_clks.dpm_clks->NumDcfClkLevelsEnabled; i++) {
   955  
DC_LOG_SMU("smu_dpm_clks.dpm_clks->DcfClocks[%d] = %d\n",
   956 i,
   957 
smu_dpm_clks.dpm_clks->DcfClocks[i]);
   958  }
   959  for (i = 0; i < 
smu_dpm_clks.dpm_clks->NumDispClkLevelsEnabled; i++) {
   960  
DC_LOG_SMU("smu_dpm_clks.dpm_clks->DispClocks[%d] = %d\n",
   961 i, 
smu_dpm_clks.dpm_clks->DispClocks[i]);
   962  }
   963  for (i = 0; i < 
smu_dpm_clks.dpm_clks->NumSocClkLevelsEnabled; i++) {
   964  
DC_LOG_SMU("smu_dpm_clks.dpm_clks->SocClocks[%d] = %d\n",
   965 i, 
smu_dpm_clks.dpm_clks->SocClocks[i]);
   966  }
   967  for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++)
   968  
DC_LOG_SMU("smu_dpm_clks.dpm_clks->SocVoltage[%d] = %d\n",
   969 i, 
smu_dpm_clks.dpm_clks->SocVoltage[i]);
   970  
   971  for (i = 0; i < NUM_DF_PSTATE_LEVELS; i++) {
   972  
DC_LOG_SMU("smu_dpm_clks.dpm_clks.DfPstateTable[%d].FClk = %d\n"
   973 
"smu_dpm_clks.dpm_clks->DfPstateTable[%d].MemClk= %d\n"
   974 
"smu_dpm_clks.dpm_clks->DfPstateTable[%d].Voltage = %d\n",
   975 i,

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng


Hi,

On 2023/9/6 14:45, Christian König wrote:
The firmware framebuffer device is already killed by the
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to
interact with the firmware framebuffer
(or more intelligent framebuffer drivers).  It is for user-space
programs, such as the X server and Wayland
compositors. It is for Linux users or DRM driver testers, allowing
them to direct the graphics display server
to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best
test examples.

If a specific DRM driver can't work with the X server as primary,
then there is probably something wrong.



But what's the use case for overriding this setting?



On a machine with multiple GPUs mounted,
only the primary graphics device gets POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware. This setup is
called POST.


Well, you don't seem to understand the background here. This is 
perfectly normal behavior.


Secondary cards are posted after loading the appropriate DRM driver. 
At least for amdgpu this is done by calling the appropriate functions 
in the BIOS. 



Well, thanks for telling me this. You know more than I do and definitely have a
better understanding.

Are you telling me that the POST function for AMDGPU resides in the BIOS?
Does the kernel call into the BIOS?
Does the BIOS here refer to the UEFI runtime, the ATOM BIOS, or something else?

But the POST function for the DRM ast driver resides in kernel space (in other
words, in ast.ko).
Is this statement correct?

I mean that for the ASpeed BMC chip, if the firmware does not POST the display
controller,
then we have to POST it in kernel space before doing the various modeset operations.
We can only POST this chip by directly operating on the various registers.
Am I correct in my judgement about the ast DRM driver?

Thanks for your reviews.

[bug report] Mass report of new Smatch warnings

2023-09-06 Thread Dan Carpenter

Here is the list of new warnings which were introduced while I was out
of office.  The line numbers are from linux-next next-20230905.

regards,
dan carpenter

drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:292 
dcn35_update_clocks() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:919 
dcn35_clk_mgr_construct() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:921 
dcn35_clk_mgr_construct() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:980 
dcn35_clk_mgr_construct() warn: variable dereferenced before check 
'ctx->dc_bios->integrated_info' (see line 913)
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c:980 
dcn35_clk_mgr_construct() warn: variable dereferenced before check 
'ctx->dc_bios' (see line 913)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_hwseq.c:1751 
dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could 
be null (see line 1710)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:159 
dcn35_init_hw() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:159 
dcn35_init_hw() warn: variable dereferenced before check 'res_pool->dccg' (see 
line 150)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:206 
dcn35_init_hw() error: we previously assumed 'res_pool->hubbub' could be null 
(see line 159)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:285 
dcn35_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see 
line 136)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:977 
dcn35_calc_blocks_to_gate() error: we previously assumed 
'pipe_ctx->plane_res.hubp' could be null (see line 973)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_hwseq.c:980 
dcn35_calc_blocks_to_gate() warn: always true condition 
'(pipe_ctx->plane_res.mpcc_inst >= 0) => (0-255 >= 0)'
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_pg_cntl.c:203 
pg_cntl35_hubp_dpp_pg_control() warn: duplicate check 'power_on' (previous on 
line 193)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_pg_cntl.c:318 
pg_cntl35_io_clk_pg_control() warn: duplicate check 'power_on' (previous on 
line 312)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_pg_cntl.c:404 
pg_cntl35_plane_otg_pg_control() warn: duplicate check 'power_on' (previous on 
line 398)
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn35/dcn35_resource.c:1877 
dcn35_resource_construct() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn35/dcn35_fpu.c:260 
dcn35_update_bw_bounding_box_fpu() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn35/dcn35_fpu.c:351 
dcn35_update_bw_bounding_box_fpu() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_srv.c:355 
dmub_srv_hw_setup() warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c:34 nbio_v7_11_get_rev_id() warn: 
inconsistent indenting
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:351 
smu_v13_0_6_setup_driver_pptable() warn: should this be 'retry == -1'

[PATCH] drm/amdgpu: fix retry loop test

2023-09-06 Thread Dan Carpenter

This loop will exit with "retry" set to -1 if it fails but the code
checks whether "retry" is zero.  Fix this by changing post-op to a
pre-op.  --retry vs retry--.

Fixes: e01eeffc3f86 ("drm/amd/pm: avoid driver getting empty metrics table for 
the first time")
Signed-off-by: Dan Carpenter 
---
Obviously this only loops 99 times now instead of a hundred but that's
fine, this is an approximation.

 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
index ff58ee14a68f..20163a9b2a66 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
@@ -336,7 +336,7 @@ static int smu_v13_0_6_setup_driver_pptable(struct 
smu_context *smu)
 
/* Store one-time values in driver PPTable */
if (!pptable->Init) {
-   while (retry--) {
+   while (--retry) {
ret = smu_v13_0_6_get_metrics_table(smu, NULL, true);
if (ret)
return ret;
-- 
2.39.2
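
The off-by-one described in this patch can be reproduced in isolation. The sketch below is not the driver code: sketch_poll_never_ready() is a hypothetical stand-in for a poll that never succeeds, and the two helpers simply return the value "retry" holds once the loop gives up. A zero check such as "if (!retry)" after the loop only works with the pre-decrement form:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a poll that never succeeds. */
static bool sketch_poll_never_ready(void)
{
	return false;
}

/* Post-decrement: on exhaustion the loop leaves retry at -1, so a
 * following zero check would not detect the timeout.
 * Assumes retry >= 1 on entry. */
static int final_retry_post_decrement(int retry)
{
	while (retry--) {
		if (sketch_poll_never_ready())
			break;
	}
	return retry;
}

/* Pre-decrement: on exhaustion the loop leaves retry at 0, matching a
 * zero check, at the cost of one fewer iteration.
 * Assumes retry >= 1 on entry. */
static int final_retry_pre_decrement(int retry)
{
	while (--retry) {
		if (sketch_poll_never_ready())
			break;
	}
	return retry;
}
```

Starting from 100, the first helper ends at -1 after 100 iterations and the second ends at 0 after 99, which is the trade-off noted under the "---" line of the patch.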

Re: [bug report] drm/amdgpu: add selftest framework for UMSCH

2023-09-06 Thread Dan Carpenter

On Wed, Sep 06, 2023 at 07:07:32PM +0800, Lang Yu wrote:
> On 09/06/ , Dan Carpenter wrote:
> 
> Thanks for reporting this bug. Can you give a link to this bug report? Commit 
> message requests it.
> ("Reported-by: should be immediately followed by Link: with a URL to the 
> report")
> 

My email hasn't hit lore.kernel.org yet...  Presumably it will in a bit.
We could link to yours or swap out the message-id.

https://lore.kernel.org/all/ZPhddADtKmOuVyDq@lang-desktop/

regards,
dan carpenter

[bug report] drm/amdgpu: add selftest framework for UMSCH

2023-09-06 Thread Dan Carpenter

Hello Lang Yu,

The patch 5d5eac7e8303: "drm/amdgpu: add selftest framework for
UMSCH" from Jun 21, 2023 (linux-next), leads to the following Smatch
static checker warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c:338 setup_umsch_mm_test()
warn: unsigned error codes 'test->pasid'

drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
319 static int setup_umsch_mm_test(struct amdgpu_device *adev,
320   struct umsch_mm_test *test)
321 {
322 struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB0(0)];
323 int r;
324 
325 test->vm_cntx_cntl = hub->vm_cntx_cntl;
326 
327 test->vm = kzalloc(sizeof(*test->vm), GFP_KERNEL);
328 if (!test->vm) {
329 r = -ENOMEM;
330 return r;
331 }
332 
333 r = amdgpu_vm_init(adev, test->vm, -1);
334 if (r)
335 goto error_free_vm;
336 
337 test->pasid = amdgpu_pasid_alloc(16);
--> 338 if (test->pasid < 0) {
^^^
Unsigned can't be less than zero.

339 r = test->pasid;
340 goto error_fini_vm;
341 }
342 
343 r = amdgpu_bo_create_kernel(adev, sizeof(struct 
umsch_mm_test_ctx_data),

regards,
dan carpenter
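
One common way to address this class of warning is to check the signed return value before storing it in the unsigned field. The sketch below is only an illustration under stated assumptions: sketch_alloc_id() is a hypothetical allocator that always fails, struct sketch_test is a made-up holder with an unsigned pasid, and none of this is the actual fix applied to amdgpu_umsch_mm.c:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical allocator that always fails with a negative errno,
 * standing in for a call that returns an id or a negative error. */
static int sketch_alloc_id(void)
{
	return -ENOSPC;
}

/* Hypothetical holder with an unsigned id field. */
struct sketch_test {
	uint32_t pasid;
};

/* Check the signed return value first, then store it in the unsigned
 * field. Returns 0 on success or the negative error code. */
static int sketch_setup(struct sketch_test *test)
{
	int r = sketch_alloc_id();

	if (r < 0)
		return r;

	test->pasid = (uint32_t)r;
	return 0;
}
```

Assigning the error directly to the unsigned field would turn it into a large positive number, so a "< 0" test on that field could never be true.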

Re: [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng


Hi,


On 2023/9/5 22:52, Alex Williamson wrote:

On Tue,  5 Sep 2023 03:57:15 +0800
Sui Jingfeng  wrote:


From: Sui Jingfeng 

On a machine with multiple GPUs, a Linux user has no control over which
one is primary at boot time. This series tries to solve the above-mentioned
problem by introducing the ->be_primary() function stub. Specific
device drivers can provide an implementation to hook up with this stub by
calling the vga_client_register() function.

Once the driver has bound the device successfully, VGAARB will call back to
the device driver to query whether it wants to be primary or
not. Device drivers can just pass NULL if they have no such need.

Please note that:

1) ARM64, LoongArch, and MIPS servers have a lot of PCIe slots, and I would
like to mount at least three video cards.

2) Typically, those non-x86 machines don't have good UEFI firmware
support, so the primary GPU cannot be selected at the firmware stage.
Even on x86, there are old UEFI firmwares which have already made an undesired
decision for you.

3) This series is an attempt to solve the remaining problems at the driver level,
while another series[1] of mine targets the majority of the
problems at the device level.

Tested (limited) on x86 with four video cards mounted. Intel UHD Graphics
630 is the default boot VGA, successfully overridden by ast2400 with
ast.modeset=10 appended to the kernel command line.

$ lspci | grep VGA

  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD 
Graphics 630]

In all my previous experiments with VGA routing and IGD I found that
IGD can't actually release VGA routing and Intel confirmed the hardware
doesn't have the ability to do so.


Which model of IGD are you using? Even for the IGD in the Atom D2550,
the legacy 128KB VGA memory range can be tuned to be mapped to the IGD
or to the DMI interface. See section 1.7.3.2 of the N2000 datasheet[1].

If a specific Intel model has a bug in its VGA routing hardware logic,
I would like to ignore it, or switch to UEFI firmware on such hardware.

That is the hardware engineers' responsibility; I will not worry about it.
Thanks for telling me this.

[1] 
https://www.intel.com/content/dam/doc/datasheet/atom-d2000-n2000-vol-2-datasheet.pdf



  It will always be primary from a
VGA routing perspective.  Was this actually tested with non-UEFI?



As you already said, Intel has generously confirmed the
hardware defect.
So this is probably a good chance to switch to UEFI to solve the problem. Then
no testing for legacy is needed.



I suspect it might only work in UEFI mode where we probably don't
actually have a dependency on VGA routing.  This is essentially why
vfio requires UEFI ROMs when assigning GPUs to VMs, VGA routing is too
broken to use on Intel systems with IGD.  Thanks,


Thanks for telling me this.

To be honest, I have only tested my patch on machines with UEFI firmware,
since UEFI has become mainstream. But if this patch is really useful for
the majority of machines, I'm satisfied. The result is not too bad.

Thanks.


Alex

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Sui Jingfeng


Hi,

On 2023/9/5 23:05, Thomas Zimmermann wrote:
You might have found a bug in the ast driver. Ast has means to detect 
if the device has been POSTed and maybe do that. If this doesn't work 
correctly, it needs a fix.



That sounds fine.

The bug is not a big deal; I'm just taking it as an example and reporting it to you.
But a real fix can be complex, because there are quite a lot of servers
shipped with ASpeed BMC hardware.

Honestly, I don't have the time to fix it properly.
I already have tons of patches pending and I will focus on solving VGAARB-related
problems.


Because I want to test your patch occasionally,
this series is useful for me in corner cases.

[PATCH 27/28] drm/amd/display: move odm power optimization decision after subvp optimization

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
ODM power optimization excludes subvp power optimization but subvp
optimization can override ODM power optimization even if subvp optimization
configuration is not found. This happens with 4k144hz + 1 5k desktop plane.
We could have applied ODM power optimization; however, this is overridden by
subvp, but subvp ends up deciding not to apply its optimization.

[how]
Move ODM power optimization decision after subvp so it will try ODM power
optimization after subvp optimization is not possible.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 92e2d1df5b32..1f53883d8f56 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -1441,10 +1441,6 @@ static void dcn32_full_validate_bw_helper(struct dc *dc,
vba->VoltageLevel = *vlevel;
}
 
-   if (should_allow_odm_power_optimization(dc, context, vba, split, merge))
-   try_odm_power_optimization_and_revalidate(
-   dc, context, pipes, split, merge, vlevel, 
*pipe_cnt);
-
/* Conditions for setting up phantom pipes for SubVP:
 * 1. Not force disable SubVP
 * 2. Full update (i.e. !fast_validate)
@@ -1563,6 +1559,11 @@ static void dcn32_full_validate_bw_helper(struct dc *dc,
assign_subvp_index(dc, context);
}
}
+
+   if (should_allow_odm_power_optimization(dc, context, vba, split, merge))
+   try_odm_power_optimization_and_revalidate(
+   dc, context, pipes, split, merge, vlevel, 
*pipe_cnt);
+
 }
 
 static bool is_dtbclk_required(struct dc *dc, struct dc_state *context)
-- 
2.42.0

[PATCH 28/28] drm/amd/display: add skip_implict_edp_power_control flag for dcn32

2023-09-06 Thread Stylon Wang

From: Ian Chen 

Add flag skip_implict_edp_power_control check in function
dcn32_disable_link_output to fix DCN35 issue.

Reviewed-by: Robin Chen 
Acked-by: Stylon Wang 
Signed-off-by: Ian Chen 
---
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index 018376146d97..e8a989a50afa 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -1322,7 +1322,8 @@ void dcn32_disable_link_output(struct dc_link *link,
struct dmcu *dmcu = dc->res_pool->dmcu;
 
if (signal == SIGNAL_TYPE_EDP &&
-   link->dc->hwss.edp_backlight_control)
+   link->dc->hwss.edp_backlight_control &&
+   !link->skip_implict_edp_power_control)
link->dc->hwss.edp_backlight_control(link, false);
else if (dmcu != NULL && dmcu->funcs->lock_phy)
dmcu->funcs->lock_phy(dmcu);
@@ -1331,7 +1332,8 @@ void dcn32_disable_link_output(struct dc_link *link,
link->phy_state.symclk_state = SYMCLK_OFF_TX_OFF;
 
if (signal == SIGNAL_TYPE_EDP &&
-   link->dc->hwss.edp_backlight_control)
+   link->dc->hwss.edp_backlight_control &&
+   !link->skip_implict_edp_power_control)
link->dc->hwss.edp_power_control(link, false);
else if (dmcu != NULL && dmcu->funcs->lock_phy)
dmcu->funcs->unlock_phy(dmcu);
-- 
2.42.0

[PATCH 26/28] drm/amd/display: add seamless pipe topology transition check

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
We have a few cases where we need to perform a pipe topology update
in the dc update interface. However, some of the updates are not seamless.
This could cause user-noticeable glitches. To enforce seamless transitions
we are adding a check and error logging so that corruption
resulting from a non-seamless transition can be easily spotted.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  8 +++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 52 +++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|  4 ++
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |  1 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  3 ++
 5 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index a857de5ebe85..f91d0f6b0d7d 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4370,6 +4370,14 @@ bool dc_update_planes_and_stream(struct dc *dc,
update_type,
context);
} else {
+   if (!stream_update &&
+   dc->hwss.is_pipe_topology_transition_seamless &&
+   !dc->hwss.is_pipe_topology_transition_seamless(
+   dc, dc->current_state, 
context)) {
+
+   DC_LOG_ERROR("performing non-seamless pipe topology 
transition with surface only update!\n");
+   BREAK_TO_DEBUGGER();
+   }
commit_planes_for_stream(
dc,
srf_updates,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index cae5e1e68c86..018376146d97 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -1619,3 +1619,55 @@ void dcn32_blank_phantom(struct dc *dc,
if (tg->funcs->is_tg_enabled(tg))
hws->funcs.wait_for_blank_complete(opp);
 }
+
+bool dcn32_is_pipe_topology_transition_seamless(struct dc *dc,
+   const struct dc_state *cur_ctx,
+   const struct dc_state *new_ctx)
+{
+   int i;
+   const struct pipe_ctx *cur_pipe, *new_pipe;
+   bool is_seamless = true;
+
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   cur_pipe = &cur_ctx->res_ctx.pipe_ctx[i];
+   new_pipe = &new_ctx->res_ctx.pipe_ctx[i];
+
+   if (resource_is_pipe_type(cur_pipe, FREE_PIPE) ||
+   resource_is_pipe_type(new_pipe, FREE_PIPE))
+   /* adding or removing free pipes is always seamless */
+   continue;
+   else if (resource_is_pipe_type(cur_pipe, OTG_MASTER)) {
+   if (resource_is_pipe_type(new_pipe, OTG_MASTER))
+   if (cur_pipe->stream->stream_id == 
new_pipe->stream->stream_id)
+   /* OTG master with the same stream is seamless 
*/
+   continue;
+   } else if (resource_is_pipe_type(cur_pipe, OPP_HEAD)) {
+   if (resource_is_pipe_type(new_pipe, OPP_HEAD)) {
+   if (cur_pipe->stream_res.tg == 
new_pipe->stream_res.tg)
+   /*
+* OPP heads sharing the same timing
+* generator is seamless
+*/
+   continue;
+   }
+   } else if (resource_is_pipe_type(cur_pipe, DPP_PIPE)) {
+   if (resource_is_pipe_type(new_pipe, DPP_PIPE)) {
+   if (cur_pipe->stream_res.opp == 
new_pipe->stream_res.opp)
+   /*
+* DPP pipes sharing the same OPP head 
is
+* seamless
+*/
+   continue;
+   }
+   }
+
+   /*
+* This pipe's transition doesn't fall under any seamless
+* conditions
+*/
+   is_seamless = false;
+   break;
+   }
+
+   return is_seamless;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
index 616d5219119e..9992e40acd21 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
@@ -120,4 +120,8 @@ void dcn32_blank_phantom(struct dc *dc,
int width,
int

[PATCH 25/28] drm/amd/display: minior logging improvements

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[how]
- Add minimal transition log with reason and base state.
- Do not log set dpms interfaces for virtual signal in stream.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c|  7 +++
 drivers/gpu/drm/amd/display/dc/link/link_dpms.c | 10 --
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 8e8362026825..a857de5ebe85 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4069,6 +4069,13 @@ static bool commit_minimal_transition_state(struct dc 
*dc,
return true;
}
 
+   DC_LOG_DC("%s base = %s state, reason = %s\n", __func__,
+   dc->current_state == transition_base_context ? 
"current" : "new",
+   subvp_in_use ? "Subvp In Use" :
+   odm_in_use ? "ODM in Use" :
+   dc->debug.pipe_split_policy != MPC_SPLIT_AVOID ? "MPC 
in Use" :
+   "Unknown");
+
if (!dc->config.is_vmin_only_asic) {
tmp_mpc_policy = dc->debug.pipe_split_policy;
dc->debug.pipe_split_policy = MPC_SPLIT_AVOID;
diff --git a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c 
b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
index cd9dd270b05f..d8327911c467 100644
--- a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
+++ b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c
@@ -2269,6 +2269,8 @@ void link_set_dpms_off(struct pipe_ctx *pipe_ctx)
 
if (dp_is_128b_132b_signal(pipe_ctx))
vpg = pipe_ctx->stream_res.hpo_dp_stream_enc->vpg;
+   if (dc_is_virtual_signal(pipe_ctx->stream->signal))
+   return;
 
DC_LOGGER_INIT(pipe_ctx->stream->ctx->logger);
 
@@ -2281,9 +2283,6 @@ void link_set_dpms_off(struct pipe_ctx *pipe_ctx)
}
}
 
-   if (dc_is_virtual_signal(pipe_ctx->stream->signal))
-   return;
-
if (!pipe_ctx->stream->sink->edid_caps.panel_patch.skip_avmute) {
if (dc_is_hdmi_signal(pipe_ctx->stream->signal))
set_avmute(pipe_ctx, true);
@@ -2382,6 +2381,8 @@ void link_set_dpms_on(
 
if (dp_is_128b_132b_signal(pipe_ctx))
vpg = pipe_ctx->stream_res.hpo_dp_stream_enc->vpg;
+   if (dc_is_virtual_signal(pipe_ctx->stream->signal))
+   return;
 
DC_LOGGER_INIT(pipe_ctx->stream->ctx->logger);
 
@@ -2394,9 +2395,6 @@ void link_set_dpms_on(
}
}
 
-   if (dc_is_virtual_signal(pipe_ctx->stream->signal))
-   return;
-
link_enc = link_enc_cfg_get_link_enc(link);
ASSERT(link_enc);
 
-- 
2.42.0

[PATCH 23/28] drm/amd/display: 3.2.250

2023-09-06 Thread Stylon Wang

From: Aric Cyr 

Acked-by: Stylon Wang 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 05ab24c81041..bece61d2508b 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -47,7 +47,7 @@ struct aux_payload;
 struct set_config_cmd_payload;
 struct dmub_notification;
 
-#define DC_VER "3.2.249"
+#define DC_VER "3.2.250"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.42.0

[PATCH 24/28] drm/amd/display: do not skip ODM minimal transition based on new state

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
During 8k video plane resizing we could transition from MPC combine mode
back to ODM combine 2:1 + 8k video plane. In this transition minimal
transition state is based on new state with ODM combine enabled.
We are skipping this and it causes corruption because we have to reassign
a current DPP pipe to a different MPC blending tree which is not supported
seamlessly.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 0320bc49458c..8e8362026825 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4048,10 +4048,10 @@ static bool commit_minimal_transition_state(struct dc 
*dc,
 * pipe, we must use the minimal transition.
 */
for (i = 0; i < dc->res_pool->pipe_count; i++) {
-   struct pipe_ctx *pipe = &dc->current_state->res_ctx.pipe_ctx[i];
+   struct pipe_ctx *pipe = 
&transition_base_context->res_ctx.pipe_ctx[i];
 
-   if (pipe->stream && pipe->next_odm_pipe) {
-   odm_in_use = true;
+   if (resource_is_pipe_type(pipe, OTG_MASTER)) {
+   odm_in_use = resource_get_odm_slice_count(pipe) > 1;
break;
}
}
-- 
2.42.0

[PATCH 22/28] drm/amd/display: Fix MST recognizes connected displays as one

2023-09-06 Thread Stylon Wang

From: Muhammad Ahmed 

[What]
MST now recognizes both connected displays

Reviewed-by: Charlene Liu 
Acked-by: Stylon Wang 
Signed-off-by: Muhammad Ahmed 
---
 .../display/dc/dce110/dce110_hw_sequencer.c   | 30 +++
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  8 ++---
 .../gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c  |  2 +-
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 31454db00ed5..2701620350af 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1178,12 +1178,15 @@ void dce110_disable_stream(struct pipe_ctx *pipe_ctx)
dto_params.otg_inst = tg->inst;
dto_params.timing = &pipe_ctx->stream->timing;
dp_hpo_inst = pipe_ctx->stream_res.hpo_dp_stream_enc->inst;
-   dccg->funcs->set_dtbclk_dto(dccg, &dto_params);
-   dccg->funcs->disable_symclk32_se(dccg, dp_hpo_inst);
-   dccg->funcs->set_dpstreamclk(dccg, REFCLK, tg->inst, 
dp_hpo_inst);
-   } else if (pipe_ctx->stream->signal == SIGNAL_TYPE_DISPLAY_PORT_MST && 
dccg->funcs->disable_symclk_se)
+   if (dccg) {
+   dccg->funcs->set_dtbclk_dto(dccg, &dto_params);
+   dccg->funcs->disable_symclk32_se(dccg, dp_hpo_inst);
+   dccg->funcs->set_dpstreamclk(dccg, REFCLK, tg->inst, dp_hpo_inst);
+   }
+   } else if (dccg && dccg->funcs->disable_symclk_se) {
dccg->funcs->disable_symclk_se(dccg, 
stream_enc->stream_enc_inst,
link_enc->transmitter - TRANSMITTER_UNIPHY_A);
+   }
 
if (dc->link_srv->dp_is_128b_132b_signal(pipe_ctx)) {
/* TODO: This looks like a bug to me as we are disabling HPO IO 
when
@@ -2655,11 +2658,11 @@ void dce110_prepare_bandwidth(
struct clk_mgr *dccg = dc->clk_mgr;
 
dce110_set_safe_displaymarks(&context->res_ctx, dc->res_pool);
-
-   dccg->funcs->update_clocks(
-   dccg,
-   context,
-   false);
+   if (dccg)
+   dccg->funcs->update_clocks(
+   dccg,
+   context,
+   false);
 }
 
 void dce110_optimize_bandwidth(
@@ -2670,10 +2673,11 @@ void dce110_optimize_bandwidth(
 
dce110_set_displaymarks(dc, context);
 
-   dccg->funcs->update_clocks(
-   dccg,
-   context,
-   true);
+   if (dccg)
+   dccg->funcs->update_clocks(
+   dccg,
+   context,
+   true);
 }
 
 static void dce110_program_front_end_for_pipe(
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 37cab11d1b31..19ab08f5122e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -2710,8 +2710,6 @@ void dcn20_enable_stream(struct pipe_ctx *pipe_ctx)
struct dce_hwseq *hws = dc->hwseq;
unsigned int k1_div = PIXEL_RATE_DIV_NA;
unsigned int k2_div = PIXEL_RATE_DIV_NA;
-   struct link_encoder *link_enc = 
link_enc_cfg_get_link_enc(pipe_ctx->stream->link);
-   struct stream_encoder *stream_enc = pipe_ctx->stream_res.stream_enc;
 
if (dc->link_srv->dp_is_128b_132b_signal(pipe_ctx)) {
if (dc->hwseq->funcs.setup_hpo_hw_control)
@@ -2731,10 +2729,8 @@ void dcn20_enable_stream(struct pipe_ctx *pipe_ctx)
dto_params.timing = &pipe_ctx->stream->timing;
dto_params.ref_dtbclk_khz = dc->clk_mgr->funcs->get_dtb_ref_clk_frequency(dc->clk_mgr);
dccg->funcs->set_dtbclk_dto(dccg, &dto_params);
-   } else if (pipe_ctx->stream->signal == SIGNAL_TYPE_DISPLAY_PORT_MST && 
dccg->funcs->enable_symclk_se)
-   dccg->funcs->enable_symclk_se(dccg,
-   stream_enc->stream_enc_inst, link_enc->transmitter - 
TRANSMITTER_UNIPHY_A);
-
+   } else {
+   }
if (hws->funcs.calculate_dccg_k1_k2_values && dc->res_pool->dccg->funcs->set_pixel_rate_div) {
hws->funcs.calculate_dccg_k1_k2_values(pipe_ctx, &k1_div, &k2_div);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c
index 3082da04a63d..1d052f08aff5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c
@@ -75,7 +75,7 @@ void mpc32_power_on_blnd_lut(
if (power_on) {
REG_UPDATE(MPCC_MCM_MEM_PWR_CTRL[mpcc_id], 
MPCC_MCM_1DLUT_MEM_PWR_FORCE, 0);

[PATCH 21/28] drm/amd/display: fix some non-initialized register mask and setting

2023-09-06 Thread Stylon Wang

From: Charlene Liu 

[why]
fix some non-initialized register mask and update golden setting

Reviewed-by: Duncan Ma 
Acked-by: Stylon Wang 
Signed-off-by: Charlene Liu 
---
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  | 56 ++-
 .../display/dc/dcn10/dcn10_stream_encoder.h   |  5 +-
 .../gpu/drm/amd/display/dc/inc/hw/clk_mgr.h   |  6 +-
 .../amd/display/dc/inc/hw/clk_mgr_internal.h  | 16 +-
 4 files changed, 65 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
index 4fd25bb1ab92..37ffa0050e60 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
@@ -53,6 +53,14 @@
 #define mmCLK1_CLK3_DFS_CNTL0x16E72
 #define mmCLK1_CLK4_DFS_CNTL0x16E75
 
+#define mmCLK1_CLK0_CURRENT_CNT 0x16EE7
+#define mmCLK1_CLK1_CURRENT_CNT 0x16EE8
+#define mmCLK1_CLK2_CURRENT_CNT 0x16EE9
+#define mmCLK1_CLK3_CURRENT_CNT 0x16EEA
+#define mmCLK1_CLK4_CURRENT_CNT 0x16EEB
+
+#define mmCLK4_CLK0_CURRENT_CNT 0x1B0C9
+
 #define CLK1_CLK_PLL_REQ__FbMult_int_MASK   0x01ffUL
 #define CLK1_CLK_PLL_REQ__PllSpineDiv_MASK  0xf000UL
 #define CLK1_CLK_PLL_REQ__FbMult_frac_MASK  0xUL
@@ -452,6 +460,26 @@ static int dcn32_get_dispclk_from_dentist(struct clk_mgr 
*clk_mgr_base)
 
 static void dcn32_auto_dpm_test_log(struct dc_clocks *new_clocks, struct 
clk_mgr_internal *clk_mgr)
 {
+unsigned int dispclk_khz_reg= REG_READ(CLK1_CLK0_CURRENT_CNT); // DISPCLK
+unsigned int dppclk_khz_reg = REG_READ(CLK1_CLK1_CURRENT_CNT); // DPPCLK
+unsigned int dprefclk_khz_reg   = REG_READ(CLK1_CLK2_CURRENT_CNT); // DPREFCLK
+unsigned int dcfclk_khz_reg = REG_READ(CLK1_CLK3_CURRENT_CNT); // DCFCLK
+unsigned int dtbclk_khz_reg = REG_READ(CLK1_CLK4_CURRENT_CNT); // DTBCLK
+unsigned int fclk_khz_reg   = REG_READ(CLK4_CLK0_CURRENT_CNT); // FCLK
+
+// Overrides for these clocks in case there is no p_state change support
+int dramclk_khz_override = new_clocks->dramclk_khz;
+int fclk_khz_override = new_clocks->fclk_khz;
+
+int num_fclk_levels = clk_mgr->base.bw_params->clk_table.num_entries_per_clk.num_fclk_levels - 1;
+
+if (!new_clocks->p_state_change_support) {
+   dramclk_khz_override = clk_mgr->base.bw_params->max_memclk_mhz * 1000;
+}
+if (!new_clocks->fclk_p_state_change_support) {
+   fclk_khz_override = clk_mgr->base.bw_params->clk_table.entries[num_fclk_levels].fclk_mhz * 1000;
+}
+


//  IMPORTANT:  When adding more clocks to these logs, do NOT put a newline
//  anywhere other than at the very end of the string.
@@ -466,20 +494,20 @@ static void dcn32_auto_dpm_test_log(struct dc_clocks 
*new_clocks, struct clk_mgr
new_clocks->dcfclk_khz > 0 &&
new_clocks->dppclk_khz > 0) {
 
-   if (new_clocks->p_state_change_support) {
-   DC_LOG_AUTO_DPM_TEST("AutoDPMTest: dramclk_khz:%d - fclk_khz:%d - "
-"dcfclk_khz:%d - dppclk_khz:%d\n",
-new_clocks->dramclk_khz,
-new_clocks->fclk_khz,
-new_clocks->dcfclk_khz,
-new_clocks->dppclk_khz);
-   } else {
-   DC_LOG_AUTO_DPM_TEST("AutoDPMTest: dramclk_khz:1249000 - fclk_khz:%d - "
-"dcfclk_khz:%d - dppclk_khz:%d\n",
-new_clocks->fclk_khz,
-new_clocks->dcfclk_khz,
-new_clocks->dppclk_khz);
-   }
+   DC_LOG_AUTO_DPM_TEST("AutoDPMTest: dramclk:%d - fclk:%d - "
+   "dcfclk:%d - dppclk:%d - dispclk_hw:%d - "
+   "dppclk_hw:%d - dprefclk_hw:%d - dcfclk_hw:%d - "
+   "dtbclk_hw:%d - fclk_hw:%d\n",
+   dramclk_khz_override,
+   fclk_khz_override,
+   new_clocks->dcfclk_khz,
+   new_clocks->dppclk_khz,
+   dispclk_khz_reg,
+   dppclk_khz_reg,
+   dprefclk_khz_reg,
+   dcfclk_khz_reg,
+   dtbclk_khz_reg,
+   fclk_khz_reg);
}
 }
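
The override logic in the hunk above can be read in isolation: when P-state switching is not supported, the log reports the maximum memory clock (converted from MHz to kHz) instead of the requested DRAM clock. A minimal sketch under that reading follows. `struct clocks_sketch` and `dramclk_khz_for_log()` are hypothetical names that mirror only the two fields the patch reads; they are not driver definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the clock state the log function inspects. */
struct clocks_sketch {
	int dramclk_khz;
	bool p_state_change_support;
};

/*
 * Value to log for dramclk: the requested clock when P-state changes are
 * supported, otherwise the maximum memory clock converted from MHz to kHz.
 */
static int dramclk_khz_for_log(const struct clocks_sketch *clk,
			       int max_memclk_mhz)
{
	assert(clk != NULL);
	if (!clk->p_state_change_support)
		return max_memclk_mhz * 1000;
	return clk->dramclk_khz;
}
```

With a maximum memclk of 1249 MHz the fallback evaluates to 1249000, the same number the removed log string hard-coded; reading it from a table, as the patch does, avoids baking one board's value into the format string.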

[PATCH 20/28] drm/amd/display: Add check for vrr_active_fixed

2023-09-06 Thread Stylon Wang

From: Austin Zheng 

Why:
vrr_active_fixed should also be checked when
determining if DRR is in use

How:
Add check for vrr_active_fixed when allow_freesync
and vrr_active_variable are also checked

Reviewed-by: Alvin Lee 
Acked-by: Stylon Wang 
Signed-off-by: Austin Zheng 
---
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 4 ++--
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
index 979f52ee5604..2f98dfa06dad 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
@@ -554,7 +554,7 @@ static void populate_subvp_cmd_vblank_pipe_info(struct dc 
*dc,
vblank_pipe->stream->timing.v_total - 
vblank_pipe->stream->timing.v_front_porch - 
vblank_pipe->stream->timing.v_addressable;
 
if (vblank_pipe->stream->ignore_msa_timing_param &&
-   (vblank_pipe->stream->allow_freesync || 
vblank_pipe->stream->vrr_active_variable))
+   (vblank_pipe->stream->allow_freesync || 
vblank_pipe->stream->vrr_active_variable || 
vblank_pipe->stream->vrr_active_fixed))
populate_subvp_cmd_drr_info(dc, pipe, vblank_pipe, pipe_data);
 }
 
@@ -648,7 +648,7 @@ static void populate_subvp_cmd_pipe_info(struct dc *dc,
pipe_data->pipe_config.subvp_data.mall_region_lines = 
phantom_timing->v_addressable;
pipe_data->pipe_config.subvp_data.main_pipe_index = 
subvp_pipe->stream_res.tg->inst;
pipe_data->pipe_config.subvp_data.is_drr = 
subvp_pipe->stream->ignore_msa_timing_param &&
-   (subvp_pipe->stream->allow_freesync || 
subvp_pipe->stream->vrr_active_variable);
+   (subvp_pipe->stream->allow_freesync || 
subvp_pipe->stream->vrr_active_variable || 
subvp_pipe->stream->vrr_active_fixed);
 
/* Calculate the scaling factor from the src and dst height.
 * e.g. If 3840x2160 being downscaled to 1920x1080, the scaling factor 
is 1/2.
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
index f5705b3e6e42..bc5f0db23d0c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
@@ -706,7 +706,7 @@ bool dcn32_subvp_drr_admissable(struct dc *dc, struct 
dc_state *context)
non_subvp_pipes++;
drr_psr_capable = (drr_psr_capable || 
dcn32_is_psr_capable(pipe));
if (pipe->stream->ignore_msa_timing_param &&
-   (pipe->stream->allow_freesync 
|| pipe->stream->vrr_active_variable)) {
+   (pipe->stream->allow_freesync 
|| pipe->stream->vrr_active_variable || pipe->stream->vrr_active_fixed)) {
drr_pipe_found = true;
}
}
@@ -764,7 +764,7 @@ bool dcn32_subvp_vblank_admissable(struct dc *dc, struct 
dc_state *context, int
non_subvp_pipes++;
vblank_psr_capable = (vblank_psr_capable || 
dcn32_is_psr_capable(pipe));
if (pipe->stream->ignore_msa_timing_param &&
-   (pipe->stream->allow_freesync 
|| pipe->stream->vrr_active_variable)) {
+   (pipe->stream->allow_freesync 
|| pipe->stream->vrr_active_variable || pipe->stream->vrr_active_fixed)) {
drr_pipe_found = true;
}
}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 2358c9100cff..92e2d1df5b32 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -822,7 +822,7 @@ static bool subvp_drr_schedulable(struct dc *dc, struct 
dc_state *context)
continue;
 
if (drr_pipe->stream->mall_stream_config.type == SUBVP_NONE && 
drr_pipe->stream->ignore_msa_timing_param &&
-   (drr_pipe->stream->allow_freesync || 
drr_pipe->stream->vrr_active_variable))
+   (drr_pipe->stream->allow_freesync || 
drr_pipe->stream->vrr_active_variable || drr_pipe->stream->vrr_active_fixed))
break;
}
 
-- 
2.42.0
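
The condition this patch extends appears in five places with the same shape. As a minimal sketch of the predicate after the change, assuming only the four stream flags named in the diff: `struct stream_sketch` and `stream_uses_drr()` are hypothetical and exist here only for illustration; the patch itself edits each call site in place and adds no helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the stream flags the patch tests. */
struct stream_sketch {
	bool ignore_msa_timing_param;
	bool allow_freesync;
	bool vrr_active_variable;
	bool vrr_active_fixed;
};

/*
 * DRR is treated as in use when the sink ignores MSA timing and any of the
 * three FreeSync/VRR flags is set; vrr_active_fixed is the newly added one.
 */
static bool stream_uses_drr(const struct stream_sketch *s)
{
	assert(s != NULL);
	return s->ignore_msa_timing_param &&
	       (s->allow_freesync || s->vrr_active_variable ||
		s->vrr_active_fixed);
}
```

A stream with only `vrr_active_fixed` set now counts as DRR, which the pre-patch condition would have missed.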

[PATCH 19/28] drm/amd/display: dc cleanup for tests

2023-09-06 Thread Stylon Wang

From: Sridevi Arvindekar 

[WHY]
Code cleanup found in internal tests

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Sridevi Arvindekar 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 5ac85df158b9..37cab11d1b31 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -2855,7 +2855,7 @@ void dcn20_fpga_init_hw(struct dc *dc)
res_pool->mpc->funcs->mpc_init(res_pool->mpc);
 
/* initialize OPP mpc_tree parameter */
-   for (i = 0; i < dc->res_pool->res_cap->num_opp; i++) {
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
res_pool->opps[i]->mpc_tree_params.opp_id = 
res_pool->opps[i]->inst;
res_pool->opps[i]->mpc_tree_params.opp_list = NULL;
for (j = 0; j < MAX_PIPES; j++)
-- 
2.42.0

[PATCH 18/28] drm/amd/display: Drop unused registers

2023-09-06 Thread Stylon Wang

From: Qingqing Zhuo 

[Why & How]
Some registers are never used in the driver
but defined. Remove them.

Reviewed-by: Roman Li 
Acked-by: Stylon Wang 
Signed-off-by: Qingqing Zhuo 
---
 drivers/gpu/drm/amd/display/dc/dcn35/dcn35_hubbub.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_hubbub.h b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_hubbub.h
index 013029f2e257..dc7331dc3b65 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_hubbub.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_hubbub.h
@@ -37,8 +37,6 @@
SR(DCHUBBUB_ARB_SAT_LEVEL),\
SR(DCHUBBUB_ARB_DF_REQ_OUTSTAND),\
SR(DCHUBBUB_GLOBAL_TIMER_CNTL), \
-   SR(DCHUBBUB_TEST_DEBUG_INDEX), \
-   SR(DCHUBBUB_TEST_DEBUG_DATA),\
SR(DCHUBBUB_SOFT_RESET),\
SR(DCHUBBUB_CRC_CTRL), \
SR(DCN_VM_FB_LOCATION_BASE),\
-- 
2.42.0

[PATCH 17/28] drm/amd/display: add dp dto programming function to dccg

2023-09-06 Thread Stylon Wang

From: Dillon Varone 

[WHY]
Add support for programming dp dto via dccg.

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Dillon Varone 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c |  1 +
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
index ed8936405dfa..75cf4ab8ae3c 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
@@ -34,6 +34,7 @@
 
 #include "dce_clock_source.h"
 #include "clk_mgr.h"
+#include "dccg.h"
 
 #include "reg_helper.h"
 
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
index 3e2f0f64c98c..65bb7cd05385 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
@@ -56,6 +56,13 @@ enum dentist_dispclk_change_mode {
DISPCLK_CHANGE_MODE_RAMPING,
 };
 
+struct dp_dto_params {
+   int otg_inst;
+   enum signal_type signal;
+   long long pixclk_hz;
+   long long refclk_hz;
+};
+
 enum pixel_rate_div {
PIXEL_RATE_DIV_BY_1 = 0,
PIXEL_RATE_DIV_BY_2 = 1,
@@ -182,6 +189,9 @@ struct dccg_funcs {
struct dccg *dccg,
uint32_t stream_enc_inst,
uint32_t link_enc_inst);
+   void (*set_dp_dto)(
+   struct dccg *dccg,
+   const struct dp_dto_params *params);
 };
 
 #endif //__DAL_DCCG_H__
-- 
2.42.0
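
The patch only declares the `set_dp_dto` hook and its parameter struct; no caller is shown. A minimal sketch of how an optional function-pointer hook of this shape is typically invoked follows, assuming the hook may be left unset on hardware that does not implement it. Every `*_sketch` type, `program_dp_dto()` and `record_dp_dto()` are hypothetical and only mirror the field names in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the parameter struct added by the patch. */
struct dp_dto_params_sketch {
	int otg_inst;
	long long pixclk_hz;
	long long refclk_hz;
};

struct dccg_sketch;

/* Hypothetical function table with one optional hook. */
struct dccg_funcs_sketch {
	void (*set_dp_dto)(struct dccg_sketch *dccg,
			   const struct dp_dto_params_sketch *params);
};

struct dccg_sketch {
	const struct dccg_funcs_sketch *funcs;
	long long last_pixclk_hz; /* illustration only: records the last request */
};

/* Call the hook if present; report whether anything was programmed. */
static bool program_dp_dto(struct dccg_sketch *dccg,
			   const struct dp_dto_params_sketch *params)
{
	assert(params != NULL);
	if (!dccg || !dccg->funcs || !dccg->funcs->set_dp_dto)
		return false;
	dccg->funcs->set_dp_dto(dccg, params);
	return true;
}

/* Example hook implementation that just records the requested pixel clock. */
static void record_dp_dto(struct dccg_sketch *dccg,
			  const struct dp_dto_params_sketch *params)
{
	dccg->last_pixclk_hz = params->pixclk_hz;
}
```

Checking the pointer before the call keeps older function tables, which never set the new member, working unchanged.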

[PATCH 15/28] drm/amd/display: do not attempt ODM power optimization if minimal transition doesn't exist

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
In some cases, such as an 8k desktop surface with 144Hz timing, we decide to
enable ODM power optimization, but this surface doesn't have a minimal
transition state. Therefore we cannot switch off ODM power optimization
seamlessly. This creates a path dependency in the ODM power optimization
decision, i.e. whether or not we should switch off ODM power optimization
depends on whether the transition to switch it off from the current state
is seamless. We don't want a path-dependent power optimization policy,
as it is too dynamic and difficult to maintain.

[how]
Attempt ODM power optimization only after we can validate the new state
without using pipe combine.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  76 +--
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 487 +++---
 2 files changed, 306 insertions(+), 257 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index 8cb6b94e83d2..a74d4cab5a7d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -1885,67 +1885,6 @@ bool dcn32_validate_bandwidth(struct dc *dc,
return out;
 }
 
-static bool should_allow_odm_power_optimization(struct dc *dc,
-   struct dc_state *context)
-{
-   struct dc_stream_state *stream = context->streams[0];
-
-   /*
-* this debug flag allows us to disable ODM power optimization feature
-* unconditionally. we force the feature off if this is set to false.
-*/
-   if (!dc->debug.enable_single_display_2to1_odm_policy)
-   return false;
-
-   /* current design and test coverage is only limited to allow ODM power
-* optimization for single stream. Supporting it for multiple streams
-* use case would require additional algorithm to decide how to
-* optimize power consumption when there are not enough free pipes to
-* allocate for all the streams. This level of optimization would
-* require multiple attempts of revalidation to make an optimized
-* decision. Unfortunately We do not support revalidation flow in
-* current version of DML.
-*/
-   if (context->stream_count != 1)
-   return false;
-
-   /*
-* Our hardware doesn't support ODM for HDMI TMDS
-*/
-   if (dc_is_hdmi_signal(stream->signal))
-   return false;
-
-   /*
-* ODM Combine 2:1 requires horizontal timing divisible by 2 so each
-* ODM segment has the same size.
-*/
-   if (!is_h_timing_divisible_by_2(stream))
-   return false;
-
-   /*
-* No power benefits if the timing's pixel clock is not high enough to
-* raise display clock from minimum power state.
-*/
-   if (stream->timing.pix_clk_100hz * 100 <= DCN3_2_VMIN_DISPCLK_HZ)
-   return false;
-
-   /* the new ODM power optimization feature reduces software design
-* limitation and allows ODM power optimization to be supported even
-* with presence of overlay planes. The new feature is enabled based on
-* enable_windowed_mpo_odm flag. If the flag is not set, we limit our
-* feature scope due to previous software design limitation */
-   if (!dc->config.enable_windowed_mpo_odm) {
-   if (context->stream_status[0].plane_count != 1)
-   return false;
-
-   if (stream->src.width >= 5120 &&
-   stream->src.width > stream->dst.width)
-   return false;
-   }
-
-   return true;
-}
-
 int dcn32_populate_dml_pipes_from_context(
struct dc *dc, struct dc_state *context,
display_e2e_pipe_params_st *pipes,
@@ -1959,20 +1898,6 @@ int dcn32_populate_dml_pipes_from_context(
 
dcn20_populate_dml_pipes_from_context(dc, context, pipes, 
fast_validate);
 
-   /*
-* Apply pipe split policy first so we can predict the pipe split 
correctly
-* (dcn32_predict_pipe_split).
-*/
-   for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) {
-   if (!res_ctx->pipe_ctx[i].stream)
-   continue;
-   if (should_allow_odm_power_optimization(dc, context))
-   pipes[pipe_cnt].pipe.dest.odm_combine_policy = 
dm_odm_combine_policy_2to1;
-   else
-   pipes[pipe_cnt].pipe.dest.odm_combine_policy = 
dm_odm_combine_policy_dal;
-   pipe_cnt++;
-   }
-
for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) {
 
if (!res_ctx->pipe_ctx[i].stream)
@@ -1985,6 +1910,7 @@ int dcn32_populate_dml_pipes_from_context(
dcn32_zero_pipe_dcc_fraction(pipes, pipe_cnt);

[PATCH 16/28] drm/amd/display: only allow ODM power optimization if surface is within guaranteed viewport size

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
The current dc update design cannot seamlessly transition from ODM
combine, through the minimal transition state, to the MPC combine state
at the capability boundary while an MPO plane is resizing. Removing this
limitation would require a high-level refactor of dc update. The decision
is to block this use case for existing products by limiting ODM power
optimization to surfaces within the guaranteed viewport size. This
prevents us from transitioning to the MPC combine state when ODM power
optimization is enabled.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 36 +++
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 27 ++
 2 files changed, 63 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index dcedda85dcdb..0320bc49458c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3881,6 +3881,7 @@ static void commit_planes_for_stream(struct dc *dc,
  */
 static bool could_mpcc_tree_change_for_active_pipes(struct dc *dc,
struct dc_stream_state *stream,
+   struct dc_surface_update *srf_updates,
int surface_count,
bool *is_plane_addition)
 {
@@ -3918,6 +3919,40 @@ static bool 
could_mpcc_tree_change_for_active_pipes(struct dc *dc,
*is_plane_addition = true;
}
}
+   if (dc->config.enable_windowed_mpo_odm) {
+   const struct rect *guaranteed_viewport = &stream->src;
+   const struct rect *surface_src, *surface_dst;
+   bool are_cur_planes_guaranteed = true;
+   bool are_new_planes_guaranteed = true;
+
+   for (i = 0; i < cur_stream_status->plane_count; i++) {
+   surface_src = &cur_stream_status->plane_states[i]->src_rect;
+   surface_dst = &cur_stream_status->plane_states[i]->dst_rect;
+   if ((surface_src->height > surface_dst->height && surface_src->height > guaranteed_viewport->height) ||
+   (surface_src->width > surface_dst->width && surface_src->width > guaranteed_viewport->width))
+   are_cur_planes_guaranteed = false;
+   }
+
+   for (i = 0; i < surface_count; i++) {
+   if (srf_updates[i].scaling_info) {
+   surface_src = &srf_updates[i].scaling_info->src_rect;
+   surface_dst = &srf_updates[i].scaling_info->dst_rect;
+   } else {
+   surface_src = &srf_updates[i].surface->src_rect;
+   surface_dst = &srf_updates[i].surface->dst_rect;
+   }
+   if ((surface_src->height > surface_dst->height && surface_src->height > guaranteed_viewport->height) ||
+   (surface_src->width > surface_dst->width && surface_src->width > guaranteed_viewport->width))
+   are_new_planes_guaranteed = false;
+   }
+
+   if (are_cur_planes_guaranteed && !are_new_planes_guaranteed) {
+   force_minimal_pipe_splitting = true;
+   *is_plane_addition = true;
+   } else if (!are_cur_planes_guaranteed && are_new_planes_guaranteed) {
+   force_minimal_pipe_splitting = true;
+   }
+   }
}
 
for (i = 0; i < dc->res_pool->pipe_count; i++) {
@@ -4270,6 +4305,7 @@ bool dc_update_planes_and_stream(struct dc *dc,
force_minimal_pipe_splitting = could_mpcc_tree_change_for_active_pipes(
dc,
stream,
+   srf_updates,
surface_count,
&is_plane_addition);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 883e90be2257..2358c9100cff 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -1267,6 +1267,8 @@ static bool should_allow_odm_power_optimization(struct dc 
*dc,
 {
struct dc_stream_state *stream = context->streams[0];
struct pipe_slice_table slice_table;
+   struct dc_plane_state *plane;
+   struct rect guaranteed_viewport;
int i;
 
/*
@@ -1331,6 +1333,31 @@ static bool should_allow_odm_power_optimization(struct 
dc *dc,
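
The per-plane test added to `could_mpcc_tree_change_for_active_pipes()` above can be stated on its own: a plane falls outside the guaranteed viewport when, in either dimension, it is being downscaled and its source extent exceeds the viewport. A minimal sketch of that predicate follows. `struct rect_sketch` and `plane_within_guaranteed_viewport()` are hypothetical names used for illustration; the patch writes the condition inline and inverted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the width/height part of the driver's rect. */
struct rect_sketch {
	int width;
	int height;
};

/*
 * A plane stays "guaranteed" unless, in some dimension, the source is larger
 * than both its destination (downscaling) and the guaranteed viewport.
 */
static bool plane_within_guaranteed_viewport(const struct rect_sketch *src,
					     const struct rect_sketch *dst,
					     const struct rect_sketch *guaranteed)
{
	bool too_tall, too_wide;

	assert(src != NULL && dst != NULL && guaranteed != NULL);
	too_tall = src->height > dst->height &&
		   src->height > guaranteed->height;
	too_wide = src->width > dst->width &&
		   src->width > guaranteed->width;
	return !(too_tall || too_wide);
}
```

In the diff, a change of this result between the current planes and the updated planes is what forces minimal pipe splitting.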

[PATCH 14/28] drm/amd/display: remove a function that does complex calculation in every frame but not used

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
The result of predict_pipe_split calculation is no longer used but the
function is not removed. This will cause unnecessary calculation
of pipe split prediction in every frame update.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  3 -
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 84 ---
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  |  3 -
 3 files changed, 90 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index fd12791995a7..8cb6b94e83d2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -2031,9 +2031,6 @@ int dcn32_populate_dml_pipes_from_context(
}
}
 
-   DC_FP_START();
-   dcn32_predict_pipe_split(context, &pipes[pipe_cnt]);
-   DC_FP_END();
 
pipe_cnt++;
}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 0c68cd97a461..496f0f58fa7d 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -348,90 +348,6 @@ void dcn32_helper_populate_phantom_dlg_params(struct dc 
*dc,
}
 }
 
-/**
- * dcn32_predict_pipe_split - Predict if pipe split will occur for a given DML 
pipe
- * @context: [in] New DC state to be programmed
- * @pipe_e2e: [in] DML pipe end to end context
- *
- * This function takes in a DML pipe (pipe_e2e) and predicts if pipe split is 
required (both
- * ODM and MPC). For pipe split, ODM combine is determined by the ODM mode, 
and MPC combine is
- * determined by DPPClk requirements
- *
- * This function follows the same policy as DML:
- * - Check for ODM combine requirements / policy first
- * - MPC combine is only chosen if there is no ODM combine requirements / 
policy in place, and
- *   MPC is required
- *
- * Return: Number of splits expected (1 for 2:1 split, 3 for 4:1 split, 0 for 
no splits).
- */
-uint8_t dcn32_predict_pipe_split(struct dc_state *context,
- display_e2e_pipe_params_st *pipe_e2e)
-{
-   double pscl_throughput;
-   double pscl_throughput_chroma;
-   double dpp_clk_single_dpp, clock;
-   double clk_frequency = 0.0;
-   double vco_speed = context->bw_ctx.dml.soc.dispclk_dppclk_vco_speed_mhz;
-   bool total_available_pipes_support = false;
-   uint32_t number_of_dpp = 0;
-   enum odm_combine_mode odm_mode = dm_odm_combine_mode_disabled;
-   double req_dispclk_per_surface = 0;
-   uint8_t num_splits = 0;
-
-   dc_assert_fp_enabled();
-
-   
dml32_CalculateODMMode(context->bw_ctx.dml.ip.maximum_pixels_per_line_per_dsc_unit,
-   pipe_e2e->pipe.dest.hactive,
-   pipe_e2e->dout.output_format,
-   pipe_e2e->dout.output_type,
-   pipe_e2e->pipe.dest.odm_combine_policy,
-   
context->bw_ctx.dml.soc.clock_limits[context->bw_ctx.dml.soc.num_states - 
1].dispclk_mhz,
-   
context->bw_ctx.dml.soc.clock_limits[context->bw_ctx.dml.soc.num_states - 
1].dispclk_mhz,
-   pipe_e2e->dout.dsc_enable != 0,
-   0, /* TotalNumberOfActiveDPP can be 0 since we're 
predicting pipe split requirement */
-   context->bw_ctx.dml.ip.max_num_dpp,
-   pipe_e2e->pipe.dest.pixel_rate_mhz,
-   context->bw_ctx.dml.soc.dcn_downspread_percent,
-   context->bw_ctx.dml.ip.dispclk_ramp_margin_percent,
-   context->bw_ctx.dml.soc.dispclk_dppclk_vco_speed_mhz,
-   pipe_e2e->dout.dsc_slices,
-   /* Output */
-   &total_available_pipes_support,
-   &number_of_dpp,
-   &odm_mode,
-   &req_dispclk_per_surface);
-
-   
dml32_CalculateSinglePipeDPPCLKAndSCLThroughput(pipe_e2e->pipe.scale_ratio_depth.hscl_ratio,
-   pipe_e2e->pipe.scale_ratio_depth.hscl_ratio_c,
-   pipe_e2e->pipe.scale_ratio_depth.vscl_ratio,
-   pipe_e2e->pipe.scale_ratio_depth.vscl_ratio_c,
-   context->bw_ctx.dml.ip.max_dchub_pscl_bw_pix_per_clk,
-   context->bw_ctx.dml.ip.max_pscl_lb_bw_pix_per_clk,
-   pipe_e2e->pipe.dest.pixel_rate_mhz,
-   pipe_e2e->pipe.src.source_format,
-   pipe_e2e->pipe.scale_taps.htaps,
-   pipe_e2e->pipe.scale_taps.htaps_c,
-   pipe_e2e->pipe.scale_taps.vtaps,
-   pipe_e2e->pipe.scale_taps.vtaps_c,
-   /* Output */

[PATCH 13/28] drm/amd/display: Add DCHUBBUB callback to report MALL status

2023-09-06 Thread Stylon Wang

From: Aurabindo Pillai 

[Why]
To enable automated testing, add a hook to the DCHUBBUB interface so that
MALL status can be queried by userspace through debugfs. This removes the
dependence on a userspace tool like UMR for querying MALL status in the
MALL static screen IGT test.

Reviewed-by: Alvin Lee 
Acked-by: Stylon Wang 
Signed-off-by: Aurabindo Pillai 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c  | 13 ++---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h|  5 -
 .../gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c| 14 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.h|  6 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.h  |  1 +
 drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h   |  1 +
 6 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 05c1ad98a1f6..1259d6351c50 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -37,6 +37,7 @@
 #include "link_hwss.h"
 #include "dc/dc_dmub_srv.h"
 #include "link/protocols/link_dp_capability.h"
+#include "inc/hw/dchubbub.h"
 
 #ifdef CONFIG_DRM_AMD_SECURE_DISPLAY
 #include "amdgpu_dm_psr.h"
@@ -3642,10 +3643,16 @@ DEFINE_DEBUGFS_ATTRIBUTE(disable_hpd_ops, 
disable_hpd_get,
 static int capabilities_show(struct seq_file *m, void *unused)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)m->private;
-   struct dc_caps caps = adev->dm.dc->caps;
-   bool mall_supported = caps.mall_size_total;
+   struct dc *dc = adev->dm.dc;
+   bool mall_supported = dc->caps.mall_size_total;
+   unsigned int mall_in_use = false;
+   struct hubbub *hubbub = dc->res_pool->hubbub;
+
+   if (hubbub->funcs->get_mall_en)
+   hubbub->funcs->get_mall_en(hubbub, &mall_in_use);
 
-   seq_printf(m, "mall: %s\n", mall_supported ? "yes" : "no");
+   seq_printf(m, "mall supported: %s, enabled: %s\n",
+  mall_supported ? "yes" : "no", mall_in_use ? "yes" : "no");
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
index adc876156d2e..5ddf2b36986e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
@@ -171,6 +171,7 @@ struct dcn_hubbub_registers {
uint32_t DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_B;
uint32_t DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_C;
uint32_t DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_D;
+   uint32_t DCHUBBUB_ARB_MALL_CNTL;
uint32_t SDPIF_REQUEST_RATE_LIMIT;
uint32_t DCHUBBUB_SDPIF_CFG0;
uint32_t DCHUBBUB_SDPIF_CFG1;
@@ -194,7 +195,9 @@ struct dcn_hubbub_registers {
type DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_A;\
type DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_B;\
type DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_C;\
-   type DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_D
+   type DCHUBBUB_ARB_FCLK_PSTATE_CHANGE_WATERMARK_D;\
+   type MALL_PREFETCH_COMPLETE;\
+   type MALL_IN_USE
 
 #define HUBBUB_REG_FIELD_LIST_DCN35(type) \
type DCHUBBUB_FGCG_REP_DIS
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
index 8bfef6d095b2..88dfc907553d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
@@ -945,6 +945,17 @@ void hubbub32_force_wm_propagate_to_pipes(struct hubbub 
*hubbub)
DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_A, prog_wm_value);
 }
 
+void hubbub32_get_mall_en(struct hubbub *hubbub, unsigned int *mall_in_use)
+{
+   struct dcn20_hubbub *hubbub2 = TO_DCN20_HUBBUB(hubbub);
+   uint32_t prefetch_complete, mall_en;
+
+   REG_GET_2(DCHUBBUB_ARB_MALL_CNTL, MALL_IN_USE, &mall_en,
+ MALL_PREFETCH_COMPLETE, &prefetch_complete);
+
+   *mall_in_use = prefetch_complete && mall_en;
+}
+
 void hubbub32_init(struct hubbub *hubbub)
 {
struct dcn20_hubbub *hubbub2 = TO_DCN20_HUBBUB(hubbub);
@@ -995,7 +1006,8 @@ static const struct hubbub_funcs hubbub32_funcs = {
.init_crb = dcn32_init_crb,
.hubbub_read_state = hubbub2_read_state,
.force_usr_retraining_allow = hubbub32_force_usr_retraining_allow,
-   .set_request_limit = hubbub32_set_request_limit
+   .set_request_limit = hubbub32_set_request_limit,
+   .get_mall_en = hubbub32_get_mall_en,
 };
 
 void hubbub32_construct(struct dcn20_hubbub *hubbub2,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.h
index ad33427192c6..f073839a4b6d 100644
---
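
`hubbub32_get_mall_en()` above reports MALL as in use only when both the in-use and prefetch-complete fields of the same register read are set. A minimal sketch of that combination on a raw register value follows. The bit positions are assumptions made up for the example, since the patch does not show the layout of `DCHUBBUB_ARB_MALL_CNTL`, and `mall_in_use_from_reg()` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the real field layout is not in the patch. */
#define SKETCH_MALL_IN_USE_BIT            (1u << 0)
#define SKETCH_MALL_PREFETCH_COMPLETE_BIT (1u << 1)

/* Return 1 only if both fields are set in the same register snapshot. */
static unsigned int mall_in_use_from_reg(uint32_t reg)
{
	unsigned int mall_en = (reg & SKETCH_MALL_IN_USE_BIT) != 0;
	unsigned int prefetch_complete =
		(reg & SKETCH_MALL_PREFETCH_COMPLETE_BIT) != 0;

	return prefetch_complete && mall_en;
}
```

Requiring both fields means a MALL that is enabled but still prefetching is reported as not yet in use.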

[PATCH 12/28] drm/amd/display: Add new logs for AutoDPMTest

2023-09-06 Thread Stylon Wang

From: Ethan Bitnun 

[Description]
 - Add new logs to be used by the AutoDPMTest
 - Enclose AutoDPMTest logs in settings
 - Add logging definition

Reviewed-by: Alvin Lee 
Acked-by: Stylon Wang 
Signed-off-by: Ethan Bitnun 
---
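For illustration, the single-line log format described above can be sketched in user-space C as follows. This is only a sketch under stated assumptions: `format_auto_dpm_log` and `has_single_trailing_newline` are hypothetical helpers, and `snprintf` stands in for the driver's logging macro; only the format string is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: builds one AutoDPMTest line with " - " between
 * entries and a single newline at the very end, as the patch requires.
 */
static int format_auto_dpm_log(char *buf, size_t len, int dramclk_khz,
			       int fclk_khz, int dcfclk_khz, int dppclk_khz)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"AutoDPMTest: dramclk_khz:%d - fclk_khz:%d - "
			"dcfclk_khz:%d - dppclk_khz:%d\n",
			dramclk_khz, fclk_khz, dcfclk_khz, dppclk_khz);
}

/* True when the only newline in s is its last character. */
static bool has_single_trailing_newline(const char *s)
{
	const char *nl = strchr(s, '\n');

	return nl != NULL && nl[1] == '\0';
}
```

A consumer that parses the log line by line can then split each entry on " - " without worrying about embedded newlines.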
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  | 36 +++
 drivers/gpu/drm/amd/display/dc/dc.h   |  1 +
 .../drm/amd/display/include/logger_types.h|  5 ++-
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
index 984b52923534..4fd25bb1ab92 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
@@ -450,6 +450,38 @@ static int dcn32_get_dispclk_from_dentist(struct clk_mgr 
*clk_mgr_base)
return 0;
 }
 
+static void dcn32_auto_dpm_test_log(struct dc_clocks *new_clocks, struct 
clk_mgr_internal *clk_mgr)
+{
+   

+   //  IMPORTANT:  When adding more clocks to these logs, do NOT 
put a newline
+   //  anywhere other than at the very end of 
the string.
+   //
+   //  Formatting example (make sure to have " - " between each entry):
+   //
+   //  AutoDPMTest: clk1:%d - clk2:%d - 
clk3:%d - clk4:%d\n"
+   

+   if (new_clocks &&
+   new_clocks->dramclk_khz > 0 &&
+   new_clocks->fclk_khz > 0 &&
+   new_clocks->dcfclk_khz > 0 &&
+   new_clocks->dppclk_khz > 0) {
+
+   if (new_clocks->p_state_change_support) {
+   DC_LOG_AUTO_DPM_TEST("AutoDPMTest: dramclk_khz:%d - 
fclk_khz:%d - "
+"dcfclk_khz:%d - 
dppclk_khz:%d\n",
+new_clocks->dramclk_khz,
+new_clocks->fclk_khz,
+new_clocks->dcfclk_khz,
+new_clocks->dppclk_khz);
+   } else {
+   DC_LOG_AUTO_DPM_TEST("AutoDPMTest: dramclk_khz:1249000 
- fclk_khz:%d - "
+"dcfclk_khz:%d - 
dppclk_khz:%d\n",
+new_clocks->fclk_khz,
+new_clocks->dcfclk_khz,
+new_clocks->dppclk_khz);
+   }
+   }
+}
 
 static void dcn32_update_clocks(struct clk_mgr *clk_mgr_base,
struct dc_state *context,
@@ -646,6 +678,10 @@ static void dcn32_update_clocks(struct clk_mgr 
*clk_mgr_base,
/*update dmcu for wait_loop count*/
dmcu->funcs->set_psr_wait_loop(dmcu,
clk_mgr_base->clks.dispclk_khz / 1000 / 7);
+
+   if (dc->config.enable_auto_dpm_test_logs) {
+   dcn32_auto_dpm_test_log(new_clocks, clk_mgr);
+   }
 }
 
 static uint32_t dcn32_get_vco_frequency_from_reg(struct clk_mgr_internal 
*clk_mgr)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 7e6f819a9952..05ab24c81041 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -420,6 +420,7 @@ struct dc_config {
int sdpif_request_limit_words_per_umc;
bool use_old_fixed_vs_sequence;
bool dc_mode_clk_limit_support;
+   bool enable_auto_dpm_test_logs;
 };
 
 enum visual_confirm {
diff --git a/drivers/gpu/drm/amd/display/include/logger_types.h 
b/drivers/gpu/drm/amd/display/include/logger_types.h
index 3bf08a60c45c..fb657f7408a7 100644
--- a/drivers/gpu/drm/amd/display/include/logger_types.h
+++ b/drivers/gpu/drm/amd/display/include/logger_types.h
@@ -73,6 +73,7 @@
 #define DC_LOG_SMU(...) pr_debug("[SMU_MSG]:"__VA_ARGS__)
 #define DC_LOG_DWB(...) DRM_DEBUG_KMS(__VA_ARGS__)
 #define DC_LOG_DP2(...) DRM_DEBUG_KMS(__VA_ARGS__)
+#define DC_LOG_AUTO_DPM_TEST(...) pr_debug("[AutoDPMTest]: "__VA_ARGS__)
 
 struct dal_logger;
 
@@ -128,6 +129,7 @@ enum dc_log_type {
LOG_SAMPLE_1DLUT,
LOG_DP2,
LOG_DC2RESERVED12,
+   LOG_AUTO_DPM_TEST,
 };
 
 #define DC_MIN_LOG_MASK ((1 << LOG_ERROR) | \
@@ -157,7 +159,8 @@ enum dc_log_type {
(1ULL << LOG_IF_TRACE) | \
(1ULL << LOG_HDMI_FRL) | \
(1ULL << LOG_SCALER) | \
-   (1ULL << LOG_DTN) /* | \
+   (1ULL << LOG_DTN) | \
+   (1ULL << LOG_AUTO_DPM_TEST)/* | \
(1ULL << LOG_DEBUG) | \
(1ULL << LOG_BIOS) | \
(1ULL << LOG_SURFACE) | \
-- 
2.42.0

[PATCH 11/28] drm/amd/display: support main link off before specific vertical line

2023-09-06 Thread Stylon Wang

From: Paul Hsieh 

[Why]
Some panels require the main link to be turned off before a specific
vertical line. If the source turns off the main link after that vertical
line, a panel defect is exposed.

[How]
Add an interface to support turning off the main link before a specific
vertical line.

Reviewed-by: Robin Chen 
Acked-by: Stylon Wang 
Signed-off-by: Paul Hsieh 
---
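For illustration, the panel match this patch relies on can be sketched in user-space C as follows. This is only a sketch under stated assumptions: `struct sink_caps` and `poweroff_before_vertical_line()` are hypothetical stand-ins rather than the driver's types; only the device ID 0x0022B9, the five-byte ID string, the PSR1 condition and the value 16 come from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the patch; everything else here is illustrative. */
#define DP_DEVICE_ID_0022B9 0x0022B9u
static const uint8_t SINK_STR_ID_3[] = {0x42, 0x61, 0x6c, 0x73, 0x61};

/* Hypothetical stand-in for the capability fields used by the check. */
struct sink_caps {
	uint32_t sink_dev_id;
	uint8_t sink_dev_id_str[6];
	bool psr_version_1;
};

/*
 * Returns the vertical line before which the main link should be off,
 * or 0 when the quirk does not apply to this sink.
 */
static uint16_t poweroff_before_vertical_line(const struct sink_caps *caps)
{
	assert(caps != NULL);
	if (caps->psr_version_1 &&
	    caps->sink_dev_id == DP_DEVICE_ID_0022B9 &&
	    !memcmp(caps->sink_dev_id_str, SINK_STR_ID_3,
		    sizeof(SINK_STR_ID_3)))
		return 16;
	return 0;
}
```

All three conditions must hold; a sink with the same device ID but a different ID string keeps the default of 0.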
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c  | 10 +-
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h|  4 ++--
 .../gpu/drm/amd/display/include/ddc_service_types.h|  1 +
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
index 0f24b6fbd220..f27cc8f9d0aa 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
@@ -35,6 +35,7 @@
 
 static const uint8_t DP_SINK_DEVICE_STR_ID_1[] = {7, 1, 8, 7, 3};
 static const uint8_t DP_SINK_DEVICE_STR_ID_2[] = {7, 1, 8, 7, 5};
+static const uint8_t DP_SINK_DEVICE_STR_ID_3[] = {0x42, 0x61, 0x6c, 0x73, 0x61};
 
 /*
  * Convert dmcub psr state to dmcu psr state.
@@ -295,7 +296,7 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub,
struct psr_context *psr_context,
uint8_t panel_inst)
 {
-   union dmub_rb_cmd cmd;
+   union dmub_rb_cmd cmd = { 0 };
struct dc_context *dc = dmub->ctx;
struct dmub_cmd_psr_copy_settings_data *copy_settings_data
= &cmd.psr_copy_settings.psr_copy_settings_data;
@@ -408,6 +409,13 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub,
else
copy_settings_data->debug.bitfields.force_wakeup_by_tps3 = 0;
 
+   if (link->psr_settings.psr_version == DC_PSR_VERSION_1 &&
+   link->dpcd_caps.sink_dev_id == DP_DEVICE_ID_0022B9 &&
+   !memcmp(link->dpcd_caps.sink_dev_id_str, 
DP_SINK_DEVICE_STR_ID_3,
+   sizeof(DP_SINK_DEVICE_STR_ID_3))) {
+   copy_settings_data->poweroff_before_vertical_line = 16;
+   }
+
//WA for PSR1 on specific TCON, require frame delay for frame re-lock
copy_settings_data->relock_delay_frame_cnt = 0;
if (link->dpcd_caps.sink_dev_id == DP_BRANCH_DEVICE_ID_001CF8)
diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 0367d0850495..6e705b219872 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -2283,9 +2283,9 @@ struct dmub_cmd_psr_copy_settings_data {
 */
uint16_t dsc_slice_height;
/**
-* Explicit padding to 4 byte boundary.
+* Some panels request main link off before xth vertical line
 */
-   uint16_t pad;
+   uint16_t poweroff_before_vertical_line;
 };
 
 /**
diff --git a/drivers/gpu/drm/amd/display/include/ddc_service_types.h 
b/drivers/gpu/drm/amd/display/include/ddc_service_types.h
index 68dfc7968017..1c603b12957f 100644
--- a/drivers/gpu/drm/amd/display/include/ddc_service_types.h
+++ b/drivers/gpu/drm/amd/display/include/ddc_service_types.h
@@ -39,6 +39,7 @@
 #define DP_BRANCH_HW_REV_10 0x10
 #define DP_BRANCH_HW_REV_20 0x20
 
+#define DP_DEVICE_ID_0022B9 0x0022B9
 #define DP_DEVICE_ID_38EC11 0x38EC11
 #define DP_DEVICE_ID_BA4159 0xBA4159
 #define DP_FORCE_PSRSU_CAPABILITY 0x40F
-- 
2.42.0

[PATCH 10/28] drm/amd/display: Adjust the MST resume flow

2023-09-06 Thread Stylon Wang

From: Wayne Lin 

[Why]
Today, drm_dp_mst_topology_mgr_resume() resumes the MST branch so that
it is ready to handle MST mode and then immediately probes the MST
topology. This gives the driver a chance to fire a hotplug event before
the old state has been restored, so userspace reacts to the hotplug
event based on a wrong state.

[How]
Adjust the MST resume flow as follows:
1. Set DPCD to resume the MST branch status
2. Restore the source's old state
3. Do the MST resume topology probing

For drm_dp_mst_topology_mgr_resume(), it would be better to pull the
topology probing work out into a second stage of the MST resume. A
follow-up patch for drm will do that.

Reviewed-by: Chao-kai Wang 
Cc: Mario Limonciello 
Cc: Alex Deucher 
Cc: sta...@vger.kernel.org
Acked-by: Stylon Wang 
Signed-off-by: Wayne Lin 
---
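For illustration, the ordering problem described above can be modelled in user-space C as follows. This is only a sketch under stated assumptions: `struct mst_link` and all functions here are hypothetical and do not touch DPCD or DRM; they only record whether probing (which may raise a hotplug event) ran before the old state was restored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one MST link across resume; not a driver type. */
struct mst_link {
	bool branch_ready;           /* step 1: MST mode re-enabled */
	bool state_restored;         /* step 2: pre-suspend state restored */
	bool topology_probed;        /* step 3: topology re-probed */
	bool hotplug_on_stale_state; /* probing ran before the restore */
};

static void resume_branch_status(struct mst_link *l)
{
	l->branch_ready = true;
}

static void restore_old_state(struct mst_link *l)
{
	l->state_restored = true;
}

static void probe_topology(struct mst_link *l)
{
	assert(l->branch_ready);
	/* Probing may raise a hotplug event; that is only safe after step 2. */
	if (!l->state_restored)
		l->hotplug_on_stale_state = true;
	l->topology_probed = true;
}

/* Flow proposed by this patch: 1 -> 2 -> 3. */
static void resume_new_flow(struct mst_link *l)
{
	resume_branch_status(l);
	restore_old_state(l);
	probe_topology(l);
}

/* Previous flow: branch resume and probing back to back, restore last. */
static void resume_old_flow(struct mst_link *l)
{
	resume_branch_status(l);
	probe_topology(l);
	restore_old_state(l);
}
```

Both flows end with the same three steps completed; only the old flow can report a hotplug against the stale state.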
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 93 ---
 1 file changed, 80 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 93f8ec2acb4a..15bd87200f6d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2350,14 +2350,62 @@ static int dm_late_init(void *handle)
return detect_mst_link_for_all_connectors(adev_to_drm(adev));
 }
 
+static void resume_mst_branch_status(struct drm_dp_mst_topology_mgr *mgr)
+{
+   int ret;
+   u8 guid[16];
+   u64 tmp64;
+
+   mutex_lock(&mgr->lock);
+   if (!mgr->mst_primary)
+   goto out_fail;
+
+   if (drm_dp_read_dpcd_caps(mgr->aux, mgr->dpcd) < 0) {
+   drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during 
suspend?\n");
+   goto out_fail;
+   }
+
+   ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL,
+DP_MST_EN |
+DP_UP_REQ_EN |
+DP_UPSTREAM_IS_SRC);
+   if (ret < 0) {
+   drm_dbg_kms(mgr->dev, "mst write failed - undocked during 
suspend?\n");
+   goto out_fail;
+   }
+
+   /* Some hubs forget their guids after they resume */
+   ret = drm_dp_dpcd_read(mgr->aux, DP_GUID, guid, 16);
+   if (ret != 16) {
+   drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during 
suspend?\n");
+   goto out_fail;
+   }
+
+   if (memchr_inv(guid, 0, 16) == NULL) {
+   tmp64 = get_jiffies_64();
+   memcpy(&guid[0], &tmp64, sizeof(u64));
+   memcpy(&guid[8], &tmp64, sizeof(u64));
+
+   ret = drm_dp_dpcd_write(mgr->aux, DP_GUID, guid, 16);
+
+   if (ret != 16) {
+   drm_dbg_kms(mgr->dev, "check mstb guid failed - 
undocked during suspend?\n");
+   goto out_fail;
+   }
+   }
+
+   memcpy(mgr->mst_primary->guid, guid, 16);
+
+out_fail:
+   mutex_unlock(&mgr->lock);
+}
+
 static void s3_handle_mst(struct drm_device *dev, bool suspend)
 {
struct amdgpu_dm_connector *aconnector;
struct drm_connector *connector;
struct drm_connector_list_iter iter;
struct drm_dp_mst_topology_mgr *mgr;
-   int ret;
-   bool need_hotplug = false;
 
drm_connector_list_iter_begin(dev, &iter);
drm_for_each_connector_iter(connector, &iter) {
@@ -2379,18 +2427,15 @@ static void s3_handle_mst(struct drm_device *dev, bool 
suspend)
if (!dp_is_lttpr_present(aconnector->dc_link))

try_to_configure_aux_timeout(aconnector->dc_link->ddc, 
LINK_AUX_DEFAULT_TIMEOUT_PERIOD);
 
-   ret = drm_dp_mst_topology_mgr_resume(mgr, true);
-   if (ret < 0) {
-   
dm_helpers_dp_mst_stop_top_mgr(aconnector->dc_link->ctx,
-   aconnector->dc_link);
-   need_hotplug = true;
-   }
+   /* TODO: move resume_mst_branch_status() into drm mst 
resume again
+* once topology probing work is pulled out from mst 
resume into mst
+* resume 2nd step. mst resume 2nd step should be 
called after old
+* state getting restored (i.e. 
drm_atomic_helper_resume()).
+*/
+   resume_mst_branch_status(mgr);
}
}
drm_connector_list_iter_end(&iter);
-
-   if (need_hotplug)
-   drm_kms_helper_hotplug_event(dev);
 }
 
 static int amdgpu_dm_smu_write_watermarks_table(struct amdgpu_device *adev)
@@ -2784,7 +2829,8 @@ static int dm_resume(void *handle)
struct dm_atomic_state *dm_state = 
to_dm_atomic_state(dm->atomic_obj.state);
enum dc_connection_type new_connection_type = dc_connection_none;
struct dc_state *dc_state;
-   int i, r, j;
+   int i, r, j, ret;
+   bool need_hotplug = false;

[PATCH 09/28] drm/amd/display: Fix 2nd DPIA encoder Assignment

2023-09-06 Thread Stylon Wang

From: Mustapha Ghaddar 

[HOW & Why]
There seems to be an issue with the 2nd DPIA acquiring a link encoder for
tiled displays.
The solution is to remove the check for eng_id before we get the first
dynamic encoder for it.

Reviewed-by: Cruise Hung 
Reviewed-by: Meenakshikumar Somasundaram 
Cc: Mario Limonciello 
Cc: Alex Deucher 
Cc: sta...@vger.kernel.org
Acked-by: Stylon Wang 
Signed-off-by: Mustapha Ghaddar 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
index b66eeac4d3d2..be5a6d008b29 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_enc_cfg.c
@@ -395,8 +395,7 @@ void link_enc_cfg_link_encs_assign(
stream->link->dpia_preferred_eng_id != 
ENGINE_ID_UNKNOWN)
eng_id_req = 
stream->link->dpia_preferred_eng_id;
 
-   if (eng_id == ENGINE_ID_UNKNOWN)
-   eng_id = find_first_avail_link_enc(stream->ctx, 
state, eng_id_req);
+   eng_id = find_first_avail_link_enc(stream->ctx, state, 
eng_id_req);
}
else
eng_id =  link_enc->preferred_engine;
@@ -501,7 +500,6 @@ struct dc_link *link_enc_cfg_get_link_using_link_enc(
if (stream)
link = stream->link;
 
-   // dm_output_to_console("%s: No link using DIG(%d).\n", __func__, 
eng_id);
return link;
 }
 
-- 
2.42.0

[PATCH 08/28] drm/amd/display: do not block ODM + OPM on one side of the screen

2023-09-06 Thread Stylon Wang

From: Wenjing Liu 

[why]
Building scaling params overrides the validation policy regarding small
viewport support. Even if ODM + windowed MPO is not supported, the decision has
to be made at the time of validation. When building scaling params, we might
be building an initial dc state as an input to DML validation. The initial state
is not supposed to be always valid and we rely on DML to modify the initial
dc state and determine the final validation result. This check prejudges the
validation result when building the initial dc state.

This causes an issue when we are transitioning from desktop-only ODM
combine 2:1 to ODM bypass with 2 planes. In this case we are building
an initial state with ODM 2:1 combine + 2 planes. This is indeed not
supported, but DML is about to modify the state so that it no longer uses ODM
combine. Before it reaches DML, dc resource already fails validation because
it finds that the initial state is not supported by our policy. This overrides
the ODM decision to validate this state with ODM combine disabled, and therefore
causes an unexpected validation failure when the secondary plane is added
on one side of the screen.

Reviewed-by: Dillon Varone 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index c929003825f4..494efbede0b2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1371,13 +1371,6 @@ bool resource_build_scaling_params(struct pipe_ctx 
*pipe_ctx)
/* depends on scaling ratios and recout, does not calculate offset yet 
*/
calculate_viewport_size(pipe_ctx);
 
-   if (!pipe_ctx->stream->ctx->dc->config.enable_windowed_mpo_odm) {
-   /* Stopgap for validation of ODM + MPO on one side of screen 
case */
-   if (pipe_ctx->plane_res.scl_data.viewport.height < 1 ||
-   pipe_ctx->plane_res.scl_data.viewport.width < 1)
-   return false;
-   }
-
/*
 * LB calculations depend on vp size, h/v_active and scaling ratios
 * Setting line buffer pixel depth to 24bpp yields banding
-- 
2.42.0

[PATCH 07/28] drm/amd/display: Fix DML calculation errors

2023-09-06 Thread Stylon Wang

From: Nicholas Susanto 

[Why]
DML calculations differ from the DCN3.1 spreadsheet values due to
translation errors from the Visual Basic code.

[How]
Add missing calculations that set the value for DSCDelay

Reviewed-by: Nicholas Kazlauskas 
Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Nicholas Susanto 
---
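For illustration, the calculation added by this patch can be sketched in user-space C as follows. This is only a sketch under stated assumptions: `adjust_dsc_delay()` and `ceil_whole()` are hypothetical helpers (the latter stands in for `dml_ceil(x, 1)`), and the parameters mirror `DSCDelay`, `HTotal`, `HActive`, `PixelClock` and `PixelClockBackEnd` from the diff.

```c
#include <assert.h>

/* Rounds x up to the next whole number; stand-in for dml_ceil(x, 1). */
static double ceil_whole(double x)
{
	long long t = (long long)x; /* truncates toward zero */

	return ((double)t < x) ? (double)(t + 1) : (double)t;
}

/*
 * Adds the horizontal blanking contribution that was missing, then
 * rescales from the back-end pixel clock to the pixel clock, in the
 * same order as the two statements in the patched code.
 */
static double adjust_dsc_delay(double dsc_delay, double h_total,
			       double h_active, double pixel_clock,
			       double pixel_clock_backend)
{
	assert(h_active > 0.0 && pixel_clock_backend > 0.0);
	dsc_delay += (h_total - h_active) * ceil_whole(dsc_delay / h_active);
	return dsc_delay * pixel_clock / pixel_clock_backend;
}
```

With a raw delay of 3000, HTotal 2200 and HActive 1920, the delay spans two lines (3000 / 1920 rounds up to 2), so 2 * 280 blanking pixels are added before the clock rescale.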
 .../gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c| 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index a94aa0f21a7f..88e56889a68c 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -2311,6 +2311,7 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman

v->OutputFormat[k],
v->Output[k]) + 
dscComputeDelay(v->OutputFormat[k], v->Output[k]));
}
+   v->DSCDelay[k] = v->DSCDelay[k] + (v->HTotal[k] - 
v->HActive[k]) * dml_ceil((double) v->DSCDelay[k] / v->HActive[k], 1);
v->DSCDelay[k] = v->DSCDelay[k] * v->PixelClock[k] / 
v->PixelClockBackEnd[k];
} else {
v->DSCDelay[k] = 0;
@@ -4719,6 +4720,7 @@ void dml314_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_

v->OutputFormat[k],

v->Output[k]) + dscComputeDelay(v->OutputFormat[k], v->Output[k]));
}
+   v->DSCDelayPerState[i][k] = 
v->DSCDelayPerState[i][k] + (v->HTotal[k] - v->HActive[k]) * dml_ceil((double) 
v->DSCDelayPerState[i][k] / v->HActive[k], 1.0);
v->DSCDelayPerState[i][k] = 
v->DSCDelayPerState[i][k] * v->PixelClock[k] / v->PixelClockBackEnd[k];
} else {
v->DSCDelayPerState[i][k] = 0.0;
-- 
2.42.0

[PATCH 06/28] drm/amd/display: [FW Promotion] Release 0.0.181.0

2023-09-06 Thread Stylon Wang

From: Anthony Koo 

 - Add new params to dmub_feature_caps for checking replay
   support in FW

Acked-by: Stylon Wang 
Signed-off-by: Anthony Koo 
---
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index e2aebba29f68..0367d0850495 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -450,6 +450,8 @@ struct dmub_feature_caps {
uint8_t reserved[4];
uint8_t subvp_psr_support;
uint8_t gecc_enable;
+   uint8_t replay_supported;
+   uint8_t replay_reserved[3];
 };
 
 struct dmub_visual_confirm_color {
-- 
2.42.0

[PATCH 05/28] drm/amd/display: Don't check registers, if using AUX BL control

2023-09-06 Thread Stylon Wang

From: Swapnil Patel 

[Why]
Currently the driver reads DCN registers to assess whether the backlight (BL)
is on. This check is not valid if we are using AUX based brightness control.
It causes the driver to not send out the "backlight off" command during the
power-off sequence, as it already thinks the backlight is off.

[How]
Only check DCN registers if we aren't using AUX based brightness control.

Reviewed-by: Wenjing Liu 
Acked-by: Stylon Wang 
Signed-off-by: Swapnil Patel 
---
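For illustration, the decision described above can be sketched in user-space C as follows. This is only a sketch under stated assumptions: `struct sink_ext_caps` and both functions are hypothetical; the three capability bits and the "skip when already in the requested state" rule mirror the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the sink extended capability bits. */
struct sink_ext_caps {
	bool oled;
	bool hdr_aux_backlight_control;
	bool sdr_aux_backlight_control;
};

/* The register state is only trusted when AUX backlight control is not used. */
static bool should_check_backlight_registers(bool has_panel_cntl,
					     const struct sink_ext_caps *caps)
{
	assert(caps != NULL);
	return has_panel_cntl &&
	       !(caps->oled ||
		 caps->hdr_aux_backlight_control ||
		 caps->sdr_aux_backlight_control);
}

/* Returns true when the backlight on/off command still has to be sent. */
static bool must_send_backlight_cmd(bool enable, bool has_panel_cntl,
				    bool reg_says_on,
				    const struct sink_ext_caps *caps)
{
	if (should_check_backlight_registers(has_panel_cntl, caps) &&
	    enable == reg_says_on)
		return false; /* registers say we are already there */
	return true;
}
```

For an AUX-controlled panel the "backlight off" command is always sent, even when the registers report the backlight as off.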
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 602fb149dc10..31454db00ed5 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -964,7 +964,9 @@ void dce110_edp_backlight_control(
return;
}
 
-   if (link->panel_cntl) {
+   if (link->panel_cntl && !(link->dpcd_sink_ext_caps.bits.oled ||
+   link->dpcd_sink_ext_caps.bits.hdr_aux_backlight_control == 1 ||
+   link->dpcd_sink_ext_caps.bits.sdr_aux_backlight_control == 1)) {
bool is_backlight_on = 
link->panel_cntl->funcs->is_panel_backlight_on(link->panel_cntl);
 
if ((enable && is_backlight_on) || (!enable && 
!is_backlight_on)) {
-- 
2.42.0

[PATCH 04/28] drm/amd/display: Add dirty rect support for Replay

2023-09-06 Thread Stylon Wang

From: Bhawanpreet Lakha 

Dirty rects can be used with Replay, so enable them to allow for more
power saving.

Reviewed-by: Sun peng Li 
Acked-by: Stylon Wang 
Signed-off-by: Bhawanpreet Lakha 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1bb1a394f55f..93f8ec2acb4a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8103,7 +8103,8 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
bundle->surface_updates[planes_count].plane_info =
&bundle->plane_infos[planes_count];
 
-   if 
(acrtc_state->stream->link->psr_settings.psr_feature_enabled) {
+   if (acrtc_state->stream->link->psr_settings.psr_feature_enabled 
||
+   
acrtc_state->stream->link->replay_settings.replay_feature_enabled) {
fill_dc_dirty_rects(plane, old_plane_state,
new_plane_state, new_crtc_state,
&bundle->flip_addrs[planes_count],
-- 
2.42.0

[PATCH 03/28] drm/amd/display: set default return value for ODM Combine debugfs

2023-09-06 Thread Stylon Wang

From: Aurabindo Pillai 

[Why]
Set a default return value of -EOPNOTSUPP to indicate that the hardware
does not support querying ODM Combine mode.

Reviewed-by: Rodrigo Siqueira 
Acked-by: Stylon Wang 
Signed-off-by: Aurabindo Pillai 
---
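For illustration, the default-return pattern can be sketched in user-space C as follows. This is only a sketch under stated assumptions: `odm_combine_segments()`, `query_segments_fn` and `always_two_segments()` are hypothetical; the only detail taken from the diff is that the result starts at -EOPNOTSUPP and is overwritten only when the hardware can answer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook: reports the segment count for one pipe. */
typedef int (*query_segments_fn)(int pipe_index);

/* Example hook for hardware that can answer the query. */
static int always_two_segments(int pipe_index)
{
	(void)pipe_index;
	return 2;
}

/*
 * Looks for the pipe driving link_id and asks the hook, if any, for
 * the segment count. Without a matching pipe or a hook the default
 * error code is returned unchanged.
 */
static int odm_combine_segments(const int *pipe_link_ids, size_t num_pipes,
				int link_id, query_segments_fn query)
{
	int segments = -EOPNOTSUPP; /* default when nothing can answer */
	size_t i;

	assert(pipe_link_ids != NULL || num_pipes == 0);
	for (i = 0; i < num_pipes; i++) {
		if (pipe_link_ids[i] != link_id)
			continue;
		if (query)
			segments = query((int)i);
		break;
	}
	return segments;
}
```

A reader of the debugfs file can then tell "0 segments" apart from "query not supported".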
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 17d1990ea832..05c1ad98a1f6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -1211,7 +1211,7 @@ static int odm_combine_segments_show(struct seq_file *m, 
void *unused)
struct amdgpu_dm_connector *aconnector = 
to_amdgpu_dm_connector(connector);
struct dc_link *link = aconnector->dc_link;
struct pipe_ctx *pipe_ctx = NULL;
-   int i, segments = 0;
+   int i, segments = -EOPNOTSUPP;
 
for (i = 0; i < MAX_PIPES; i++) {
pipe_ctx = &link->dc->current_state->res_ctx.pipe_ctx[i];
-- 
2.42.0

[PATCH 02/28] drm/amd/display: Don't lock phantom pipe on disabling

2023-09-06 Thread Stylon Wang

From: Alvin Lee 

[Description]
- When disabling a phantom pipe, we first enable the phantom
  OTG so the double buffer update can successfully take place
- However, we want to avoid locking the phantom pipe; otherwise setting
  DPG_EN=1 for the phantom pipe is blocked (without this we could
  hit underflow due to phantom HUBP being blanked by default)

Reviewed-by: Samson Tam 
Acked-by: Stylon Wang 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 9834b75f1837..79befa17bb03 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -111,7 +111,8 @@ void dcn10_lock_all_pipes(struct dc *dc,
if (pipe_ctx->top_pipe ||
!pipe_ctx->stream ||
(!pipe_ctx->plane_state && !old_pipe_ctx->plane_state) ||
-   !tg->funcs->is_tg_enabled(tg))
+   !tg->funcs->is_tg_enabled(tg) ||
+   pipe_ctx->stream->mall_stream_config.type == 
SUBVP_PHANTOM)
continue;
 
if (lock)
-- 
2.42.0

[PATCH 01/28] drm/amd/display: Blank phantom OTG before enabling

2023-09-06 Thread Stylon Wang

From: Alvin Lee 

[Description]
Before enabling the phantom OTG for an update we
must enable DPG to avoid underflow.

Reviewed-by: Samson Tam 
Acked-by: Stylon Wang 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 50 +--
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c| 10 +++-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 46 +
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|  5 ++
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |  1 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  5 ++
 6 files changed, 68 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 20a3b4c81d4b..dcedda85dcdb 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1070,53 +1070,6 @@ static void apply_ctx_interdependent_lock(struct dc *dc,
}
 }
 
-static void phantom_pipe_blank(
-   struct dc *dc,
-   struct timing_generator *tg,
-   int width,
-   int height)
-{
-   struct dce_hwseq *hws = dc->hwseq;
-   enum dc_color_space color_space;
-   struct tg_color black_color = {0};
-   struct output_pixel_processor *opp = NULL;
-   uint32_t num_opps, opp_id_src0, opp_id_src1;
-   uint32_t otg_active_width, otg_active_height;
-   uint32_t i;
-
-   /* program opp dpg blank color */
-   color_space = COLOR_SPACE_SRGB;
-   color_space_to_black_color(dc, color_space, &black_color);
-
-   otg_active_width = width;
-   otg_active_height = height;
-
-   /* get the OPTC source */
-   tg->funcs->get_optc_source(tg, &num_opps, &opp_id_src0, &opp_id_src1);
-   ASSERT(opp_id_src0 < dc->res_pool->res_cap->num_opp);
-
-   for (i = 0; i < dc->res_pool->res_cap->num_opp; i++) {
-   if (dc->res_pool->opps[i] != NULL && 
dc->res_pool->opps[i]->inst == opp_id_src0) {
-   opp = dc->res_pool->opps[i];
-   break;
-   }
-   }
-
-   if (opp && opp->funcs->opp_set_disp_pattern_generator)
-   opp->funcs->opp_set_disp_pattern_generator(
-   opp,
-   CONTROLLER_DP_TEST_PATTERN_SOLID_COLOR,
-   CONTROLLER_DP_COLOR_SPACE_UDEFINED,
-   COLOR_DEPTH_UNDEFINED,
-   &black_color,
-   otg_active_width,
-   otg_active_height,
-   0);
-
-   if (tg->funcs->is_tg_enabled(tg))
-   hws->funcs.wait_for_blank_complete(opp);
-}
-
 static void dc_update_viusal_confirm_color(struct dc *dc, struct dc_state 
*context, struct pipe_ctx *pipe_ctx)
 {
if (dc->ctx->dce_version >= DCN_VERSION_1_0) {
@@ -1207,7 +1160,8 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
 
main_pipe_width = 
old_stream->mall_stream_config.paired_stream->dst.width;
main_pipe_height = 
old_stream->mall_stream_config.paired_stream->dst.height;
-   phantom_pipe_blank(dc, tg, 
main_pipe_width, main_pipe_height);
+   if (dc->hwss.blank_phantom)
+   dc->hwss.blank_phantom(dc, tg, 
main_pipe_width, main_pipe_height);
tg->funcs->enable_crtc(tg);
}
}
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index ad82f19fe36a..5ac85df158b9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1840,8 +1840,16 @@ void dcn20_program_front_end_for_ctx(

dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.type == 
SUBVP_PHANTOM) {
struct timing_generator *tg = 
dc->current_state->res_ctx.pipe_ctx[i].stream_res.tg;
 
-   if (tg->funcs->enable_crtc)
+   if (tg->funcs->enable_crtc) {
+   if (dc->hwss.blank_phantom) {
+   int main_pipe_width, main_pipe_height;
+
+   main_pipe_width = 
dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.paired_stream->dst.width;
+   main_pipe_height = 
dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.paired_stream->dst.height;
+   dc->hwss.blank_phantom(dc, tg, 
main_pipe_width, main_pipe_height);
+   }
tg->funcs->enable_crtc(tg);
+   }
}

[PATCH 00/28] DC Patches Sep 8, 2023

2023-09-06 Thread Stylon Wang

This DC patchset brings improvements in multiple areas. In summary, we have:
- Fix MST bugs
- Fix ODM combine debugfs
- Fix DML calculations
- Fix 2nd DPIA encoder issue
- Fix AUX-based backlight control
- Fix on MPO+ODM use case
- Fix DCCG clock programming
- Improvements on replay
- Improvements on logging and reporting
- Improvements on pipe and OTG handling
- Improvements and bug fixes on power optimization
- Improvements on VRR
- Code clean up and fix un-initialized values

Cc: Daniel Wheeler 

Alvin Lee (2):
  drm/amd/display: Blank phantom OTG before enabling
  drm/amd/display: Don't lock phantom pipe on disabling

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.181.0

Aric Cyr (1):
  drm/amd/display: 3.2.250

Aurabindo Pillai (2):
  drm/amd/display: set default return value for ODM Combine debugfs
  drm/amd/display: Add DCHUBBUB callback to report MALL status

Austin Zheng (1):
  drm/amd/display: Add check for vrr_active_fixed

Bhawanpreet Lakha (1):
  drm/amd/display: Add dirty rect support for Replay

Charlene Liu (1):
  drm/amd/display: fix some non-initialized register mask and setting

Dillon Varone (1):
  drm/amd/display: add dp dto programming function to dccg

Ethan Bitnun (1):
  drm/amd/display: Add new logs for AutoDPMTest

Ian Chen (1):
  drm/amd/display: add skip_implict_edp_power_control flag for dcn32

Muhammad Ahmed (1):
  drm/amd/display: Fix MST recognizes connected displays as one

Mustapha Ghaddar (1):
  drm/amd/display: Fix 2nd DPIA encoder Assignment

Nicholas Susanto (1):
  drm/amd/display: Fix DML calculation errors

Paul Hsieh (1):
  drm/amd/display: support main link off before specific vertical line

Qingqing Zhuo (1):
  drm/amd/display: Drop unused registers

Sridevi Arvindekar (1):
  drm/amd/display: dc cleanup for tests

Swapnil Patel (1):
  drm/amd/display: Don't check registers, if using AUX BL control

Wayne Lin (1):
  drm/amd/display: Adjust the MST resume flow

Wenjing Liu (8):
  drm/amd/display: do not block ODM + OPM on one side of the screen
  drm/amd/display: remove a function that does complex calculation in
every frame but not used
  drm/amd/display: do not attempt ODM power optimization if minimal
transition doesn't exist
  drm/amd/display: only allow ODM power optimization if surface is
within guaranteed viewport size
  drm/amd/display: do not skip ODM minimal transition based on new state
  drm/amd/display: minior logging improvements
  drm/amd/display: add seamless pipe topology transition check
  drm/amd/display: move odm power optimization decision after subvp
optimization

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  96 ++-
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  15 +-
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  |  64 ++
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 107 ++--
 .../drm/amd/display/dc/core/dc_link_enc_cfg.c |   4 +-
 .../gpu/drm/amd/display/dc/core/dc_resource.c |   7 -
 drivers/gpu/drm/amd/display/dc/dc.h   |   3 +-
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  |   4 +-
 .../drm/amd/display/dc/dce/dce_clock_source.c |   1 +
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |  10 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |  34 +-
 .../drm/amd/display/dc/dcn10/dcn10_hubbub.h   |   5 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |   3 +-
 .../display/dc/dcn10/dcn10_stream_encoder.h   |   5 +-
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  20 +-
 .../drm/amd/display/dc/dcn32/dcn32_hubbub.c   |  14 +-
 .../drm/amd/display/dc/dcn32/dcn32_hubbub.h   |   6 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 104 ++-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|   9 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   2 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_mpc.c  |   2 +-
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  79 +--
 .../drm/amd/display/dc/dcn32/dcn32_resource.h |   1 +
 .../display/dc/dcn32/dcn32_resource_helpers.c |   4 +-
 .../drm/amd/display/dc/dcn35/dcn35_hubbub.h   |   2 -
 .../dc/dml/dcn314/display_mode_vba_314.c  |   2 +
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 601 ++
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  |   3 -
 .../gpu/drm/amd/display/dc/inc/hw/clk_mgr.h   |   6 +-
 .../amd/display/dc/inc/hw/clk_mgr_internal.h  |  16 +-
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  |  10 +
 .../gpu/drm/amd/display/dc/inc/hw/dchubbub.h  |   1 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |   8 +
 .../gpu/drm/amd/display/dc/link/link_dpms.c   |  10 +-
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |   6 +-
 .../amd/display/include/ddc_service_types.h   |   1 +
 .../drm/amd/display/include/logger_types.h|   5 +-
 37 files changed, 793 insertions(+), 477 deletions(-)

-- 
2.42.0

[PATCH] drm/amdgpu: fix unsigned error codes

2023-09-06 Thread Lang Yu

Fixes: 77b13b916728 ("drm/amdgpu: add selftest framework for UMSCH")

Signed-off-by: Lang Yu 
Reported-by: Dan Carpenter 
Link: https://lore.kernel.org/all/ZPhddADtKmOuVyDq@lang-desktop
---
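For illustration, the pattern fixed here can be sketched in user-space C as follows. This is only a sketch under stated assumptions: it assumes the `pasid` field is unsigned, as the subject implies; `struct test_ctx`, `fake_pasid_alloc()`, `as_stored_unsigned()` and `setup_test()` are hypothetical stand-ins for the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context with an unsigned ID field. */
struct test_ctx {
	uint32_t pasid;
};

/* Hypothetical allocator: returns an ID, or a negative errno on failure. */
static int fake_pasid_alloc(int simulate_failure)
{
	return simulate_failure ? -ENOSPC : 42;
}

/*
 * Shows why a "< 0" test on the unsigned field can never fire: a
 * negative errno becomes a large positive value once stored.
 */
static uint32_t as_stored_unsigned(int r)
{
	return (uint32_t)r;
}

/* Fixed pattern: check the signed return value first, then store it. */
static int setup_test(struct test_ctx *test, int simulate_failure)
{
	int r;

	assert(test != NULL);
	r = fake_pasid_alloc(simulate_failure);
	if (r < 0)
		return r;
	test->pasid = (uint32_t)r;
	return 0;
}
```

On failure the field is left untouched and the caller sees the negative error code.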
 drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
index 284643e1efeb..9da80b54d63e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
@@ -335,11 +335,10 @@ static int setup_umsch_mm_test(struct amdgpu_device *adev,
if (r)
goto error_free_vm;
 
-   test->pasid = amdgpu_pasid_alloc(16);
-   if (test->pasid < 0) {
-   r = test->pasid;
+   r = amdgpu_pasid_alloc(16);
+   if (r < 0)
goto error_fini_vm;
-   }
+   test->pasid = r;
 
r = amdgpu_bo_create_kernel(adev, sizeof(struct umsch_mm_test_ctx_data),
PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
-- 
2.25.1
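
The pitfall behind this fix can be reproduced outside the kernel. The sketch below is a minimal userspace illustration, not the driver code: `struct fake_test`, `fake_pasid_alloc`, `setup_buggy` and `setup_fixed` are hypothetical names, and the only assumption carried over from the thread is that the `pasid` field is unsigned while the allocator reports failure as a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the test context; pasid is unsigned, as in the report. */
struct fake_test {
	uint32_t pasid;
};

/* Hypothetical allocator: returns a positive id, or a negative errno on failure. */
static int fake_pasid_alloc(int should_fail)
{
	return should_fail ? -ENOSPC : 42;
}

/* Buggy pattern: the result lands in an unsigned field before it is checked. */
static int setup_buggy(struct fake_test *test, int should_fail)
{
	test->pasid = fake_pasid_alloc(should_fail);
	if (test->pasid < 0)	/* never true for an unsigned field */
		return -1;
	return 0;
}

/* Fixed pattern: check the signed return value first, then store it. */
static int setup_fixed(struct fake_test *test, int should_fail)
{
	int r = fake_pasid_alloc(should_fail);

	if (r < 0)
		return r;
	test->pasid = (uint32_t)r;
	return 0;
}
```

With a failing allocator, `setup_buggy` reports success while `setup_fixed` propagates the error, which is the behaviour difference the Smatch warning points at.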

[PATCH] drm/amdgpu: Use default reset method handler

2023-09-06 Thread Lijo Lazar

When reset method is not passed in reset context, look for the handler
for default reset method. On Aldebaran, default reset method for SOCs
connected to CPU over XGMI is MODE2.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/aldebaran.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index 82e1c83a7ccc..5d2516210a3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
@@ -50,6 +50,13 @@ aldebaran_get_reset_handler(struct amdgpu_reset_control 
*reset_ctl,
struct amdgpu_device *adev = (struct amdgpu_device *)reset_ctl->handle;
int i;
 
+   if (reset_context->method == AMD_RESET_METHOD_NONE) {
+   if (aldebaran_is_mode2_default(reset_ctl))
+   reset_context->method = AMD_RESET_METHOD_MODE2;
+   else
+   reset_context->method = amdgpu_asic_reset_method(adev);
+   }
+
if (reset_context->method != AMD_RESET_METHOD_NONE) {
dev_dbg(adev->dev, "Getting reset handler for method %d\n",
reset_context->method);
@@ -59,15 +66,6 @@ aldebaran_get_reset_handler(struct amdgpu_reset_control 
*reset_ctl,
}
}
 
-   if (aldebaran_is_mode2_default(reset_ctl)) {
-   for_each_handler(i, handler, reset_ctl) {
-   if (handler->reset_method == AMD_RESET_METHOD_MODE2) {
-   reset_context->method = AMD_RESET_METHOD_MODE2;
-   return handler;
-   }
-   }
-   }
-
dev_dbg(adev->dev, "Reset handler not found!\n");
 
return NULL;
-- 
2.25.1
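
The control flow after this change can be sketched in isolation: resolve an unspecified method to a default first, then run a single handler lookup. This is a simplified userspace model under stated assumptions; `enum reset_method`, `struct reset_handler`, the `handlers` table and the two parameters `mode2_is_default` and `asic_default` are hypothetical stand-ins for the driver's types, `aldebaran_is_mode2_default()` and `amdgpu_asic_reset_method()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's reset method ids. */
enum reset_method { METHOD_NONE = -1, METHOD_MODE1 = 1, METHOD_MODE2 = 2 };

struct reset_handler {
	enum reset_method method;
};

/* Assumed handler table: only a MODE2 handler is registered. */
static const struct reset_handler handlers[] = {
	{ METHOD_MODE2 },
};

/* Resolve METHOD_NONE to a default first, then do one lookup. */
static const struct reset_handler *
get_reset_handler(enum reset_method *method, int mode2_is_default,
		  enum reset_method asic_default)
{
	size_t i;

	if (*method == METHOD_NONE)
		*method = mode2_is_default ? METHOD_MODE2 : asic_default;

	for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
		if (handlers[i].method == *method)
			return &handlers[i];
	}
	return NULL;	/* no handler registered for the resolved method */
}
```

Folding the default into the context up front means the lookup loop exists once, instead of once for the explicit method and again for the MODE2 fallback.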

Re: [PATCH v2] drm/amd: Fix the flag setting code for interrupt request

2023-09-06 Thread Ma, Jun




On 9/6/2023 3:23 PM, Christian König wrote:
> On 06.09.23 at 08:55, Ma Jun wrote:
>> [1] Remove the irq flags setting code since pci_alloc_irq_vectors()
>> handles these flags.
>> [2] Free the msi vectors in case of error.
>>
>> v2:
>> - Remove local variable initializing code (Christian)
>> - Use PCI_IRQ_ALL_TYPES (Alex)
>>
>> Signed-off-by: Ma Jun 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 45 ++---
>>   1 file changed, 26 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> index fa6d0adcec20..64c245015e17 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> @@ -270,29 +270,29 @@ static void amdgpu_restore_msix(struct amdgpu_device 
>> *adev)
>>*/
>>   int amdgpu_irq_init(struct amdgpu_device *adev)
>>   {
>> -int r = 0;
>> -unsigned int irq;
>> +int r;
>> +unsigned int irq, flags;
> 
> It's also good style to define variables like "r" and "i" last. Some 
> upstream maintainers even require reverse xmas tree style defines (e.g. 
> longest first, shortest last).
> 
> With that changed the patch is Acked-by: Christian König 
> 
> 

Thanks, I will update it when push.

Regards,
Ma Jun
> Regards,
> Christian.
> 
>>   
>>  spin_lock_init(&adev->irq.lock);
>>   
>>  /* Enable MSI if not disabled by module parameter */
>>  adev->irq.msi_enabled = false;
>>   
>> +if (!amdgpu_msi_ok(adev))
>> +flags = PCI_IRQ_LEGACY;
>> +else
>> +flags = PCI_IRQ_ALL_TYPES;
>> +
>> +/* we only need one vector */
>> +r = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);
>> +if (r < 0) {
>> +dev_err(adev->dev, "Failed to alloc msi vectors\n");
>> +return r;
>> +}
>> +
>>  if (amdgpu_msi_ok(adev)) {
>> -int nvec = pci_msix_vec_count(adev->pdev);
>> -unsigned int flags;
>> -
>> -if (nvec <= 0)
>> -flags = PCI_IRQ_MSI;
>> -else
>> -flags = PCI_IRQ_MSI | PCI_IRQ_MSIX;
>> -
>> -/* we only need one vector */
>> -nvec = pci_alloc_irq_vectors(adev->pdev, 1, 1, flags);
>> -if (nvec > 0) {
>> -adev->irq.msi_enabled = true;
>> -dev_dbg(adev->dev, "using MSI/MSI-X.\n");
>> -}
>> +adev->irq.msi_enabled = true;
>> +dev_dbg(adev->dev, "using MSI/MSI-X.\n");
>>  }
>>   
>>  INIT_WORK(&adev->irq.ih1_work, amdgpu_irq_handle_ih1);
>> @@ -302,22 +302,29 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>>  /* Use vector 0 for MSI-X. */
>>  r = pci_irq_vector(adev->pdev, 0);
>>  if (r < 0)
>> -return r;
>> +goto free_vectors;
>>  irq = r;
>>   
>>  /* PCI devices require shared interrupts. */
>>  r = request_irq(irq, amdgpu_irq_handler, IRQF_SHARED, 
>> adev_to_drm(adev)->driver->name,
>>  adev_to_drm(adev));
>>  if (r)
>> -return r;
>> +goto free_vectors;
>> +
>>  adev->irq.installed = true;
>>  adev->irq.irq = irq;
>>  adev_to_drm(adev)->max_vblank_count = 0x00ffffff;
>>   
>>  DRM_DEBUG("amdgpu: irq initialized.\n");
>>  return 0;
>> -}
>>   
>> +free_vectors:
>> +if (adev->irq.msi_enabled)
>> +pci_free_irq_vectors(adev->pdev);
>> +
>> +adev->irq.msi_enabled = false;
>> +return r;
>> +}
>>   
>>   void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
>>   {
>
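
The error-unwinding shape the patch introduces (a single `free_vectors` label reached from every failure after allocation) can be modelled in userspace. The sketch below is an illustration only: `struct fake_irq_state` and the `fake_*` helpers are hypothetical, and the `fail_*` flags stand in for failures of `pci_alloc_irq_vectors()` and `request_irq()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; the fields mirror roles in the patch, not the real struct. */
struct fake_irq_state {
	bool vectors_allocated;
	bool msi_enabled;
	bool installed;
};

/* Hypothetical allocation step; `fail` forces an error return. */
static int fake_alloc_vectors(struct fake_irq_state *s, bool fail)
{
	if (fail)
		return -1;
	s->vectors_allocated = true;
	return 1;
}

static void fake_free_vectors(struct fake_irq_state *s)
{
	s->vectors_allocated = false;
}

static int fake_irq_init(struct fake_irq_state *s, bool fail_alloc, bool fail_request)
{
	int r;

	r = fake_alloc_vectors(s, fail_alloc);
	if (r < 0)
		return r;	/* nothing to undo yet */
	s->msi_enabled = true;

	r = fail_request ? -2 : 0;	/* stands in for request_irq() */
	if (r)
		goto free_vectors;

	s->installed = true;
	return 0;

free_vectors:
	/* Undo the allocation so a failed init leaves no vectors behind. */
	fake_free_vectors(s);
	s->msi_enabled = false;
	return r;
}
```

Any later step that can fail only needs a `goto free_vectors`, so the cleanup stays in one place.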

Re: [bug report] drm/amdgpu: add selftest framework for UMSCH

2023-09-06 Thread Lang Yu

On 09/06/ , Dan Carpenter wrote:

Thanks for reporting this bug. Can you give a link to this bug report? Commit 
message requests it.
("Reported-by: should be immediately followed by Link: with a URL to the 
report")

Regards,
Lang

> Hello Lang Yu,
> 
> The patch 5d5eac7e8303: "drm/amdgpu: add selftest framework for
> UMSCH" from Jun 21, 2023 (linux-next), leads to the following Smatch
> static checker warning:
> 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c:338 setup_umsch_mm_test()
>   warn: unsigned error codes 'test->pasid'
> 
> drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
> 319 static int setup_umsch_mm_test(struct amdgpu_device *adev,
> 320   struct umsch_mm_test *test)
> 321 {
> 322 struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB0(0)];
> 323 int r;
> 324 
> 325 test->vm_cntx_cntl = hub->vm_cntx_cntl;
> 326 
> 327 test->vm = kzalloc(sizeof(*test->vm), GFP_KERNEL);
> 328 if (!test->vm) {
> 329 r = -ENOMEM;
> 330 return r;
> 331 }
> 332 
> 333 r = amdgpu_vm_init(adev, test->vm, -1);
> 334 if (r)
> 335 goto error_free_vm;
> 336 
> 337 test->pasid = amdgpu_pasid_alloc(16);
> --> 338 if (test->pasid < 0) {
> ^^^
> Unsigned can't be less than zero.
> 
> 339 r = test->pasid;
> 340 goto error_fini_vm;
> 341 }
> 342 
> 343 r = amdgpu_bo_create_kernel(adev, sizeof(struct 
> umsch_mm_test_ctx_data),
> 
> regards,
> dan carpenter

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Thomas Zimmermann


Hi

On 06.09.23 at 11:48, suijingfeng wrote:
[...]


There's 'nomodeset', which disables all native drivers. It's useful 
for debugging or as a quick-fix if the graphics driver breaks. If you 
want to disable a specific driver, please use one of the options for 
blacklisting.



Yeah, the 'nomodeset' disables all native drivers,
this is a good point of it, but this is also the weak point of it.


Well, that's by design. Graphics is at the core of the user experience. 
We often cannot _not_ provide it. And if it's broken, there needs to be 
a reliable fallback. There needs to be at least enough graphics support 
to run a terminal and repair the system. And it also needs to be simple 
enough for the average user. Falling back to serial terminals is often 
not an option.


At least here at SUSE, when users or customers report a broken graphics 
driver, we can tell them to start with 'nomodeset' and get at least the 
basic graphics. That's good enough for most productivity/office 
software. In the meantime, we investigate the problem.


There were concerns about the need for nomodeset, but I think it has 
proven to be useful in practice.



Sometimes, when you are developing a DRM driver for a new device,
you will feel the pain. Too often a programmer's modification
makes the entire Linux kernel hang. The problematic DRM
driver kernel module is already in the initrd. Then the real
need is to disable only the malfunctioning DRM driver kernel
module, while what you recommend disables them all. There
is a subtle difference.


I found that initcall_blacklist= works reliably for me.



Another limitation of the 'nomodeset' parameter is that
it is only available on recent upstream kernels. Older
downstream kernels don't support this parameter yet.
So this creates an inconsistent development experience. I believe that
there are always some people who need to do back-port and upstream work
for various reasons.


Nomodeset used to be there, but in a different form. It forced VGA text 
mode IIRC. 'git grep' for vga_text_force() in an old kernel. We adopted 
the parameter for all of graphics, because it already did what we needed.


Best regards
Thomas



While we are debating (kindly, no offense intended): since we have modprobe.blacklist,
why do we still need the 'nomodeset' parameter?
Why not try
modprobe.blacklist="amdgpu,radeon,i915,ast,nouveau,gma500_gfx, ..."


:-/


But OK, overall I will listen to your advice.



Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83



For the modeset parameter, authors of various device drivers try to
keep its usage from conflicting with others. I believe that this is
a good thing for Linux users.
It is probably the responsibility of the DRM core maintainers to
push the various DRM drivers toward a minimal consensus. It is
probably painful to do so and may not pay off.

But reaching a minimal consensus does benefit Linux users.


You can use modprobe.blacklist or initcall_blacklist on the kernel 
command line.



There are some cases where modprobe.blacklist doesn't work;
I have come across them several times in the past.
Because the device selected by VGAARB is a device-level thing,
it is not the driver's problem.

Sometimes, when VGAARB has a bug, it will select the wrong device as
primary.
The X server will then use this wrong device as primary and crash
completely, due to the lack of a driver. Take my old S3 Graphics card
as an example:

$ lspci | grep VGA

  00:06.1 VGA compatible controller: Loongson Technology LLC DC 
(Display Controller) (rev 01)
  03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
  07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 
(rev 01)
  08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 
(rev 01)


Before applying this patch:

[    0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
[    0.361753] pci :00:06.1: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
[    0.361765] pci :03:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[    0.361773] pci :07:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[    0.361779] pci :08:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none

[    0.361781] vgaarb: loaded
[    0.367838] pci :00:06.1: Overriding boot device as 1002:6778
[    0.367841] pci :00:06.1: Overriding boot device as 5333:9070
[    0.367843] pci :00:06.1: Overriding boot device as 5333:9070


For a known reason, one of my systems selects the S3 Graphics card as the
primary GPU.

But this S3 Graphics card does not even have a decent upstream DRM driver yet.
In such a case, I begin to believe that only a device that has a
driver deserves to be primary.

Under such conditions, I want to reboot and enter the graphical
environment
with the other working video cards, either the platform-integrated or
the discrete GPU.
This doesn't mean I should

Re: [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

2023-09-06 Thread Christian König


On 06.09.23 at 12:31, Sui Jingfeng wrote:

Hi,

On 2023/9/6 14:45, Christian König wrote:
The firmware framebuffer device already gets killed by the 
drm_aperture_remove_conflicting_pci_framebuffers()
function (or its siblings). So this series is definitely not meant to 
interact with the firmware framebuffer
(or more intelligent framebuffer drivers). It is for user-space 
programs, such as the X server and Wayland
compositors. It is for Linux users or DRM driver testers, and allows 
them to direct the graphical display server
to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best 
test examples.

If a specific DRM driver can't work with the X server as primary,
then there is probably something wrong.



But what's the use case for overriding this setting?



On a specific machine with multiple GPUs mounted,
only the primary graphics device gets POST-ed (initialized) by the firmware.
Therefore, the DRM drivers for the remaining video cards have to
work without the prerequisite setup done by the firmware. This setup is 
called POST.


Well, you don't seem to understand the background here. This is 
perfectly normal behavior.


Secondary cards are posted after loading the appropriate DRM driver. 
At least for amdgpu this is done by calling the appropriate functions 
in the BIOS. 



Well, thanks for telling me this. You know more than I do and 
definitely have a better understanding.


Are you telling me that the POST function for AMDGPU resides in the BIOS?
Does the kernel call into the BIOS?


Yes, exactly that.

Does the BIOS here refer to the UEFI runtime or ATOM BIOS or something 
else?


On dGPUs it's the VBIOS on a flashrom on the board, for iGPUs (APUs as 
AMD calls them) it's part of the system BIOS.


UEFI is actually just a small subsystem in the system BIOS which 
replaced the old interface used between system BIOS, video BIOS and 
operating system.




But the POST function for the DRM ast driver resides in kernel space (in 
other words, in ast.ko).

Is this statement correct?


I don't know the ast driver well enough to answer that, but I assume 
they just read the BIOS and execute the appropriate functions.




I mean that for the ASpeed BMC chip, if the firmware does not POST the display 
controller,
then we have to POST it in kernel space before doing the various 
modeset operations.

We can only POST this chip by directly operating on the various registers.
Am I correct in my judgement about the ast DRM driver?


Well POST just means Power On Self Test, but what you mean is 
initializing the hardware.


Some drivers can of course initialize the hardware without the help of 
the BIOS, but I don't think AST can do that. As far as I know it's a 
relatively simple driver.


BTW firmware is not the same as the BIOS (which runs the POST), firmware 
usually refers to something run on microcontrollers inside the ASIC 
while the (system or video) BIOS runs on the host CPU.


Regards,
Christian.



Thanks for your reviews.

RE: [PATCH 3/3] drm/amdgpu: print more address info of UMC bad page

2023-09-06 Thread Zhang, Hawking

[AMD Official Use Only - General]

Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Zhou1, Tao 
Sent: Wednesday, September 6, 2023 18:10
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; 
Yang, Stanley ; Li, Candice ; Chai, 
Thomas 
Cc: Zhou1, Tao 
Subject: [PATCH 3/3] drm/amdgpu: print more address info of UMC bad page

Print out row, column and bank value of UMC error address for UMC v12.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
index 5f056dd7691e..6fde85367272 100644
--- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
@@ -173,7 +173,7 @@ static void umc_v12_0_convert_error_address(struct 
amdgpu_device *adev,  {
uint32_t channel_index, i;
uint64_t soc_pa, na, retired_page, column;
-   uint32_t bank_hash0, bank_hash1, bank_hash2, bank_hash3, col, row;
+   uint32_t bank_hash0, bank_hash1, bank_hash2, bank_hash3, col, row,
+row_xor;
uint32_t bank0, bank1, bank2, bank3, bank;

bank_hash0 = (err_addr >> UMC_V12_0_MCA_B0_BIT) & 0x1ULL; @@ -228,17 
+228,23 @@ static void umc_v12_0_convert_error_address(struct amdgpu_device 
*adev,
/* clear [C4] in soc physical address */
soc_pa &= ~(0x1ULL << UMC_V12_0_PA_C4_BIT);

+   row_xor = row ^ (0x1ULL << 13);
/* loop for all possibilities of [C4 C3 C2] */
for (column = 0; column < UMC_V12_0_NA_MAP_PA_NUM; column++) {
retired_page = soc_pa | ((column & 0x3) << UMC_V12_0_PA_C2_BIT);
retired_page |= (((column & 0x4) >> 2) << UMC_V12_0_PA_C4_BIT);
-   dev_info(adev->dev, "Error Address(PA): 0x%llx\n", 
retired_page);
+   /* include column bit 0 and 1 */
+   col &= 0x3;
+   col |= (column << 2);
+   dev_info(adev->dev, "Error Address(PA):0x%llx Row:0x%x Col:0x%x 
Bank:0x%x\n",
+   retired_page, row, col, bank);
amdgpu_umc_fill_error_record(err_data, err_addr,
retired_page, channel_index, umc_inst);

/* shift R13 bit */
retired_page ^= (0x1ULL << UMC_V12_0_PA_R13_BIT);
-   dev_info(adev->dev, "Error Address(PA): 0x%llx\n", 
retired_page);
+   dev_info(adev->dev, "Error Address(PA):0x%llx Row:0x%x Col:0x%x 
Bank:0x%x\n",
+   retired_page, row_xor, col, bank);
amdgpu_umc_fill_error_record(err_data, err_addr,
retired_page, channel_index, umc_inst);
}
--
2.35.1

RE: [PATCH] drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras counters

2023-09-06 Thread Zhou1, Tao

[AMD Official Use Only - General]

Reviewed-by: Tao Zhou 

> -Original Message-
> From: amd-gfx  On Behalf Of Hawking
> Zhang
> Sent: Wednesday, September 6, 2023 6:12 PM
> To: amd-gfx@lists.freedesktop.org; Zhou1, Tao ; Yang,
> Stanley ; Li, Candice ; Chai,
> Thomas 
> Cc: Zhang, Hawking 
> Subject: [PATCH] drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras
> counters
>
> gfx_v9_4_3_ue|ce_reg_list is an array per gfx core instance correct the 
> settings of
> se_num and reg_inst for some of gfx ras counters so all the available register
> instances can be polled for ras status.
>
> Signed-off-by: Hawking Zhang 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 40 -
>  1 file changed, 20 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> index 0a26a00074a6..a60d1a8405d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> @@ -3653,19 +3653,19 @@ static const struct amdgpu_gfx_ras_reg_entry
> gfx_v9_4_3_ce_reg_list[] = {
>   AMDGPU_GFX_GC_CANE_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSPI_CE_ERR_STATUS_LO,
> regSPI_CE_ERR_STATUS_HI),
>   1, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SPI"},
> - AMDGPU_GFX_SPI_MEM, 8},
> + AMDGPU_GFX_SPI_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP0_CE_ERR_STATUS_LO,
> regSP0_CE_ERR_STATUS_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SP0"},
> - AMDGPU_GFX_SP_MEM, 1},
> + AMDGPU_GFX_SP_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP1_CE_ERR_STATUS_LO,
> regSP1_CE_ERR_STATUS_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SP1"},
> - AMDGPU_GFX_SP_MEM, 1},
> + AMDGPU_GFX_SP_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQ_CE_ERR_STATUS_LO,
> regSQ_CE_ERR_STATUS_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SQ"},
> - AMDGPU_GFX_SQ_MEM, 8},
> + AMDGPU_GFX_SQ_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQC_CE_EDC_LO,
> regSQC_CE_EDC_HI),
>   5, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SQC"},
> - AMDGPU_GFX_SQC_MEM, 8},
> + AMDGPU_GFX_SQC_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCX_CE_ERR_STATUS_LO,
> regTCX_CE_ERR_STATUS_HI),
>   2, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "TCX"},
>   AMDGPU_GFX_TCX_MEM, 1},
> @@ -3674,22 +3674,22 @@ static const struct amdgpu_gfx_ras_reg_entry
> gfx_v9_4_3_ce_reg_list[] = {
>   AMDGPU_GFX_TCC_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regTA_CE_EDC_LO,
> regTA_CE_EDC_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "TA"},
> - AMDGPU_GFX_TA_MEM, 8},
> + AMDGPU_GFX_TA_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCI_CE_EDC_LO_REG,
> regTCI_CE_EDC_HI_REG),
> - 31, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "TCI"},
> + 27, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID),
> +"TCI"},
>   AMDGPU_GFX_TCI_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCP_CE_EDC_LO_REG,
> regTCP_CE_EDC_HI_REG),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "TCP"},
> - AMDGPU_GFX_TCP_MEM, 8},
> + AMDGPU_GFX_TCP_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regTD_CE_EDC_LO,
> regTD_CE_EDC_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "TD"},
> - AMDGPU_GFX_TD_MEM, 8},
> + AMDGPU_GFX_TD_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regGCEA_CE_ERR_STATUS_LO,
> regGCEA_CE_ERR_STATUS_HI),
>   16, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "GCEA"},
>   AMDGPU_GFX_GCEA_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regLDS_CE_ERR_STATUS_LO,
> regLDS_CE_ERR_STATUS_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "LDS"},
> - AMDGPU_GFX_LDS_MEM, 1},
> + AMDGPU_GFX_LDS_MEM, 4},
>  };
>
>  static const struct amdgpu_gfx_ras_reg_entry gfx_v9_4_3_ue_reg_list[] = { @@
> -3713,19 +3713,19 @@ static const struct amdgpu_gfx_ras_reg_entry
> gfx_v9_4_3_ue_reg_list[] = {
>   AMDGPU_GFX_GC_CANE_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSPI_UE_ERR_STATUS_LO,
> regSPI_UE_ERR_STATUS_HI),
>   1, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SPI"},
> - AMDGPU_GFX_SPI_MEM, 8},
> + AMDGPU_GFX_SPI_MEM, 1},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP0_UE_ERR_STATUS_LO,
> regSP0_UE_ERR_STATUS_HI),
>   10, (AMDGPU_RAS_ERR_INFO_VALID |
> AMDGPU_RAS_ERR_STATUS_VALID), "SP0"},
> - AMDGPU_GFX_SP_MEM, 1},
> + AMDGPU_GFX_SP_MEM, 4},
>   {{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP1_UE_ERR_STATUS_LO,
> regSP1_UE_ERR_STATUS_HI),
>   10,

[PATCH] drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras counters

2023-09-06 Thread Hawking Zhang

gfx_v9_4_3_ue|ce_reg_list is an array per gfx core instance.
Correct the settings of se_num and reg_inst for some of
gfx ras counters so all the available register instances
can be polled for ras status.

Signed-off-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 40 -
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index 0a26a00074a6..a60d1a8405d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -3653,19 +3653,19 @@ static const struct amdgpu_gfx_ras_reg_entry 
gfx_v9_4_3_ce_reg_list[] = {
AMDGPU_GFX_GC_CANE_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSPI_CE_ERR_STATUS_LO, 
regSPI_CE_ERR_STATUS_HI),
1, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SPI"},
-   AMDGPU_GFX_SPI_MEM, 8},
+   AMDGPU_GFX_SPI_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP0_CE_ERR_STATUS_LO, 
regSP0_CE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SP0"},
-   AMDGPU_GFX_SP_MEM, 1},
+   AMDGPU_GFX_SP_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP1_CE_ERR_STATUS_LO, 
regSP1_CE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SP1"},
-   AMDGPU_GFX_SP_MEM, 1},
+   AMDGPU_GFX_SP_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQ_CE_ERR_STATUS_LO, 
regSQ_CE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SQ"},
-   AMDGPU_GFX_SQ_MEM, 8},
+   AMDGPU_GFX_SQ_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQC_CE_EDC_LO, regSQC_CE_EDC_HI),
5, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SQC"},
-   AMDGPU_GFX_SQC_MEM, 8},
+   AMDGPU_GFX_SQC_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCX_CE_ERR_STATUS_LO, 
regTCX_CE_ERR_STATUS_HI),
2, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TCX"},
AMDGPU_GFX_TCX_MEM, 1},
@@ -3674,22 +3674,22 @@ static const struct amdgpu_gfx_ras_reg_entry 
gfx_v9_4_3_ce_reg_list[] = {
AMDGPU_GFX_TCC_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regTA_CE_EDC_LO, regTA_CE_EDC_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TA"},
-   AMDGPU_GFX_TA_MEM, 8},
+   AMDGPU_GFX_TA_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCI_CE_EDC_LO_REG, 
regTCI_CE_EDC_HI_REG),
-   31, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TCI"},
+   27, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TCI"},
AMDGPU_GFX_TCI_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regTCP_CE_EDC_LO_REG, 
regTCP_CE_EDC_HI_REG),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TCP"},
-   AMDGPU_GFX_TCP_MEM, 8},
+   AMDGPU_GFX_TCP_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regTD_CE_EDC_LO, regTD_CE_EDC_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"TD"},
-   AMDGPU_GFX_TD_MEM, 8},
+   AMDGPU_GFX_TD_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regGCEA_CE_ERR_STATUS_LO, 
regGCEA_CE_ERR_STATUS_HI),
16, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"GCEA"},
AMDGPU_GFX_GCEA_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regLDS_CE_ERR_STATUS_LO, 
regLDS_CE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"LDS"},
-   AMDGPU_GFX_LDS_MEM, 1},
+   AMDGPU_GFX_LDS_MEM, 4},
 };
 
 static const struct amdgpu_gfx_ras_reg_entry gfx_v9_4_3_ue_reg_list[] = {
@@ -3713,19 +3713,19 @@ static const struct amdgpu_gfx_ras_reg_entry 
gfx_v9_4_3_ue_reg_list[] = {
AMDGPU_GFX_GC_CANE_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSPI_UE_ERR_STATUS_LO, 
regSPI_UE_ERR_STATUS_HI),
1, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SPI"},
-   AMDGPU_GFX_SPI_MEM, 8},
+   AMDGPU_GFX_SPI_MEM, 1},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP0_UE_ERR_STATUS_LO, 
regSP0_UE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SP0"},
-   AMDGPU_GFX_SP_MEM, 1},
+   AMDGPU_GFX_SP_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSP1_UE_ERR_STATUS_LO, 
regSP1_UE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SP1"},
-   AMDGPU_GFX_SP_MEM, 1},
+   AMDGPU_GFX_SP_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQ_UE_ERR_STATUS_LO, 
regSQ_UE_ERR_STATUS_HI),
10, (AMDGPU_RAS_ERR_INFO_VALID | AMDGPU_RAS_ERR_STATUS_VALID), 
"SQ"},
-   AMDGPU_GFX_SQ_MEM, 8},
+   AMDGPU_GFX_SQ_MEM, 4},
{{AMDGPU_RAS_REG_ENTRY(GC, 0, regSQC_UE_EDC_LO, regSQC_UE_EDC_HI),
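
How the two counts in each table entry affect polling can be sketched as a nested loop. This is a reduced userspace model, not the driver's query routine: `struct fake_ras_entry` and `count_polls` are hypothetical, and it assumes the first number in each entry is the register instance count (`reg_inst`) and the trailing number is the shader engine count (`se_num`), as the commit message suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced form of a RAS register entry: only the two counts matter here. */
struct fake_ras_entry {
	const char *block;
	unsigned int reg_inst;	/* register instances per shader engine (assumed meaning) */
	unsigned int se_num;	/* shader engines to visit (assumed meaning) */
};

/* Count how many register reads one polling pass over the list would issue. */
static unsigned int count_polls(const struct fake_ras_entry *list, size_t n)
{
	unsigned int polls = 0;
	unsigned int se, inst;
	size_t i;

	for (i = 0; i < n; i++)
		for (se = 0; se < list[i].se_num; se++)
			for (inst = 0; inst < list[i].reg_inst; inst++)
				polls++;	/* a real driver would select (se, inst) and read here */
	return polls;
}
```

Under this model, an entry whose `se_num` or `reg_inst` is too small leaves some instances unread, and one that is too large visits instances that do not exist, which is why the table values matter.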

[PATCH 3/3] drm/amdgpu: print more address info of UMC bad page

2023-09-06 Thread Tao Zhou

Print out row, column and bank value of UMC error address for UMC v12.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
index 5f056dd7691e..6fde85367272 100644
--- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
@@ -173,7 +173,7 @@ static void umc_v12_0_convert_error_address(struct 
amdgpu_device *adev,
 {
uint32_t channel_index, i;
uint64_t soc_pa, na, retired_page, column;
-   uint32_t bank_hash0, bank_hash1, bank_hash2, bank_hash3, col, row;
+   uint32_t bank_hash0, bank_hash1, bank_hash2, bank_hash3, col, row, 
row_xor;
uint32_t bank0, bank1, bank2, bank3, bank;
 
bank_hash0 = (err_addr >> UMC_V12_0_MCA_B0_BIT) & 0x1ULL;
@@ -228,17 +228,23 @@ static void umc_v12_0_convert_error_address(struct 
amdgpu_device *adev,
/* clear [C4] in soc physical address */
soc_pa &= ~(0x1ULL << UMC_V12_0_PA_C4_BIT);
 
+   row_xor = row ^ (0x1ULL << 13);
/* loop for all possibilities of [C4 C3 C2] */
for (column = 0; column < UMC_V12_0_NA_MAP_PA_NUM; column++) {
retired_page = soc_pa | ((column & 0x3) << UMC_V12_0_PA_C2_BIT);
retired_page |= (((column & 0x4) >> 2) << UMC_V12_0_PA_C4_BIT);
-   dev_info(adev->dev, "Error Address(PA): 0x%llx\n", 
retired_page);
+   /* include column bit 0 and 1 */
+   col &= 0x3;
+   col |= (column << 2);
+   dev_info(adev->dev, "Error Address(PA):0x%llx Row:0x%x Col:0x%x 
Bank:0x%x\n",
+   retired_page, row, col, bank);
amdgpu_umc_fill_error_record(err_data, err_addr,
retired_page, channel_index, umc_inst);
 
/* shift R13 bit */
retired_page ^= (0x1ULL << UMC_V12_0_PA_R13_BIT);
-   dev_info(adev->dev, "Error Address(PA): 0x%llx\n", 
retired_page);
+   dev_info(adev->dev, "Error Address(PA):0x%llx Row:0x%x Col:0x%x 
Bank:0x%x\n",
+   retired_page, row_xor, col, bank);
amdgpu_umc_fill_error_record(err_data, err_addr,
retired_page, channel_index, umc_inst);
}
-- 
2.35.1
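
The row and column values added to the log line can be worked through in isolation. The sketch below is a userspace illustration under stated assumptions: `struct fake_addr_info` and `fake_addr` are hypothetical names, the loop index `column` plays the role of the `[C4 C3 C2]` bits, and row bit 13 is taken from the `row ^ (0x1ULL << 13)` expression in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_R13_BIT 13	/* row bit toggled for the second record */

struct fake_addr_info {
	uint32_t row;
	uint32_t col;
};

/*
 * For loop index `column`, rebuild the full column value by keeping
 * column bits 0-1 from the decoded address and placing the index at
 * bit 2 and up. `flip_r13` selects the second record of each pair,
 * whose row has bit 13 toggled.
 */
static struct fake_addr_info fake_addr(uint32_t row, uint32_t col,
				       uint32_t column, int flip_r13)
{
	struct fake_addr_info info;

	info.col = (col & 0x3) | (column << 2);
	info.row = flip_r13 ? (row ^ (1u << FAKE_R13_BIT)) : row;
	return info;
}
```

For example, a decoded column of 0x1d with loop index 5 gives (0x1d & 0x3) | (5 << 2) = 0x15, and a row of 0x2000 becomes 0x0 in the record with R13 flipped.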
