RE: [PATCH 4/4] Documentation/amdgpu: Add FRU attribute details

2023-10-04 Thread Zhang, Hawking
[AMD Official Use Only - General]

Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Lazar, Lijo 
Sent: Wednesday, October 4, 2023 21:21
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 

Subject: [PATCH 4/4] Documentation/amdgpu: Add FRU attribute details

Add documentation for the newly added manufacturer and fru_id attributes in 
sysfs.

Signed-off-by: Lijo Lazar 
---
 Documentation/gpu/amdgpu/driver-misc.rst  | 12 
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 19 +++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/gpu/amdgpu/driver-misc.rst 
b/Documentation/gpu/amdgpu/driver-misc.rst
index 82b47f1818ac..e40e15f89fd3 100644
--- a/Documentation/gpu/amdgpu/driver-misc.rst
+++ b/Documentation/gpu/amdgpu/driver-misc.rst
@@ -26,6 +26,18 @@ serial_number
 .. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
:doc: serial_number

+fru_id
+-
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+   :doc: fru_id
+
+manufacturer
+-
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+   :doc: manufacturer
+
 unique_id
 -

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 5d627d0e19a4..d635e61805ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -321,6 +321,16 @@ static ssize_t amdgpu_fru_serial_number_show(struct device 
*dev,

 static DEVICE_ATTR(serial_number, 0444, amdgpu_fru_serial_number_show, NULL);

+/**
+ * DOC: fru_id
+ *
+ * The amdgpu driver provides a sysfs API for reporting FRU File Id
+ * for the device.
+ * The file fru_id is used for this and returns the File Id value
+ * as returned from the FRU.
+ * NOTE: This is only available for certain server cards  */
+
 static ssize_t amdgpu_fru_id_show(struct device *dev,
  struct device_attribute *attr, char *buf)  { 
@@ -332,6 +342,15 @@ static ssize_t amdgpu_fru_id_show(struct device *dev,

 static DEVICE_ATTR(fru_id, 0444, amdgpu_fru_id_show, NULL);

+/**
+ * DOC: manufacturer
+ *
+ * The amdgpu driver provides a sysfs API for reporting manufacturer
+name from
+ * FRU information.
+ * The file manufacturer returns the value as returned from the FRU.
+ * NOTE: This is only available for certain server cards  */
+
 static ssize_t amdgpu_fru_manufacturer_name_show(struct device *dev,
 struct device_attribute *attr,
 char *buf)
--
2.25.1
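
For readers who want to poke at the new attributes, here is a minimal user-space sketch. It assumes (not stated in the patch) that the files appear alongside the other amdgpu device attributes under /sys/class/drm/card0/device/ and that card0 is the amdgpu device; per the DOC comments above, they only exist on certain server cards.

/* read_fru_attrs.c - hedged example; the sysfs path is an assumption */
#include <stdio.h>

static void print_attr(const char *name)
{
        char path[256], buf[128];
        FILE *f;

        /* Assumed location of the amdgpu sysfs attributes. */
        snprintf(path, sizeof(path), "/sys/class/drm/card0/device/%s", name);
        f = fopen(path, "r");
        if (!f) {
                printf("%s: not available on this card\n", name);
                return;
        }
        if (fgets(buf, sizeof(buf), f))
                printf("%s: %s", name, buf);
        fclose(f);
}

int main(void)
{
        print_attr("fru_id");
        print_attr("manufacturer");
        return 0;
}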



Re: [PATCH v2 1/5] drm/amdgpu: Move package type enum to amdgpu_smuio

2023-10-04 Thread Deucher, Alexander
[AMD Official Use Only - General]

Series is:
Reviewed-by: Alex Deucher 

From: Lazar, Lijo 
Sent: Wednesday, October 4, 2023 3:39 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Zhang, Hawking ; Deucher, Alexander 

Subject: [PATCH v2 1/5] drm/amdgpu: Move package type enum to amdgpu_smuio

Move definition of package type to amdgpu_smuio header and add new
package types for CEM and OAM.

Signed-off-by: Lijo Lazar 
---

v2: Move definition to amdgpu_smuio.h instead of amdgpu.h (Christian/Hawking)

 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h | 7 +++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 42ac6d1bf9ca..7088c5015675 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -69,11 +69,6 @@ enum amdgpu_gfx_partition {

 #define NUM_XCC(x) hweight16(x)

-enum amdgpu_pkg_type {
-   AMDGPU_PKG_TYPE_APU = 2,
-   AMDGPU_PKG_TYPE_UNKNOWN,
-};
-
 enum amdgpu_gfx_ras_mem_id_type {
 AMDGPU_GFX_CP_MEM = 0,
 AMDGPU_GFX_GCEA_MEM,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
index 89c38d864471..5910d50ac74d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
@@ -23,6 +23,13 @@
 #ifndef __AMDGPU_SMUIO_H__
 #define __AMDGPU_SMUIO_H__

+enum amdgpu_pkg_type {
+   AMDGPU_PKG_TYPE_APU = 2,
+   AMDGPU_PKG_TYPE_CEM = 3,
+   AMDGPU_PKG_TYPE_OAM = 4,
+   AMDGPU_PKG_TYPE_UNKNOWN,
+};
+
 struct amdgpu_smuio_funcs {
 u32 (*get_rom_index_offset)(struct amdgpu_device *adev);
 u32 (*get_rom_data_offset)(struct amdgpu_device *adev);
--
2.25.1



[pull] amdgpu drm-fixes-6.6

2023-10-04 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.6.

The following changes since commit 8a749fd1a8720d4619c91c8b6e7528c0a355c0aa:

  Linux 6.6-rc4 (2023-10-01 14:15:13 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-6.6-2023-10-04

for you to fetch changes up to b206011bf05069797df1f4c5ce639398728978e2:

  drm/amd/display: apply edge-case DISPCLK WDIVIDER changes to master OTG pipes 
only (2023-10-04 22:55:05 -0400)


amd-drm-fixes-6.6-2023-10-04:

amdgpu:
- Add missing unique_id for GC 11.0.3
- Fix memory leak in FRU error path
- Fix PCIe link reporting on some SMU 11 parts
- Fix ACPI _PR3 detection
- Fix DISPCLK WDIVIDER handling in OTG code


Kenneth Feng (1):
  drm/amd/pm: add unique_id for gc 11.0.3

Luben Tuikov (1):
  drm/amdgpu: Fix a memory leak

Mario Limonciello (2):
  drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()
  drm/amd: Fix detection of _PR3 on the PCIe root port

Samson Tam (1):
  drm/amd/display: apply edge-case DISPCLK WDIVIDER changes to master OTG 
pipes only

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c |  1 +
 .../amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c   |  4 +--
 .../amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c   |  4 +--
 drivers/gpu/drm/amd/pm/amdgpu_pm.c |  1 +
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c| 41 --
 6 files changed, 30 insertions(+), 23 deletions(-)


RE: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)

2023-10-04 Thread Zhang, Hawking
[AMD Official Use Only - General]

Hmm... thinking about it more, will it override the profile mode/workload for the 0xC8 or 0xCC SKUs as well? In other words, is the PMFW fix general to all the 13_0_0 SKUs?

Other than that, the patch looks good to me.

Regards,
Hawking

-Original Message-
From: amd-gfx  On Behalf Of Zhang, 
Hawking
Sent: Thursday, October 5, 2023 11:32
To: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Liu, Kun 
Subject: RE: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is 
active (v2)

[AMD Official Use Only - General]


Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Wednesday, October 4, 2023 23:34
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Liu, Kun 
Subject: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is 
active (v2)

From: Kun Liu 

When ROCm is active, enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

v2: move to the swsmu code since we need both bits active in
the workload mask.

Signed-off-by: Alex Deucher 
---
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c| 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 684b4e01fac2..83035fb1839a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -2447,6 +2447,7 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
DpmActivityMonitorCoeffInt_t *activity_monitor =
&(activity_monitor_external.DpmActivityMonitorCoeffInt);
int workload_type, ret = 0;
+   u32 workload_mask;

smu->power_profile_mode = input[size];

@@ -2536,9 +2537,23 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
if (workload_type < 0)
return -EINVAL;

+   workload_mask = 1 << workload_type;
+
+   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
+   if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE &&
+   (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) 
&&
+   ((smu->adev->pm.fw_version == 0x004e6601) ||
+(smu->adev->pm.fw_version >= 0x004e7300))) {
+   workload_type = smu_cmn_to_asic_specific_index(smu,
+  
CMN2ASIC_MAPPING_WORKLOAD,
+  
PP_SMC_POWER_PROFILE_POWERSAVING);
+   if (workload_type >= 0)
+   workload_mask |= 1 << workload_type;
+   }
+
return smu_cmn_send_smc_msg_with_param(smu,
   SMU_MSG_SetWorkloadMask,
-  1 << workload_type,
+  workload_mask,
   NULL);  }

--
2.41.0
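
To make the mask handling in the hunk above easier to follow, here is a stand-alone sketch of the bit combination it performs. The profile indices are made-up placeholders; in the driver they come from smu_cmn_to_asic_specific_index() and the workload mapping table, and the firmware-version gating is as shown in the patch.

/* workload_mask_sketch.c - illustrative only, indices are hypothetical */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        int compute_idx = 5;    /* hypothetical ASIC index for the COMPUTE profile */
        int powersave_idx = 6;  /* hypothetical ASIC index for the POWERSAVING profile */
        uint32_t workload_mask;

        /* Base mask: only the profile the user selected. */
        workload_mask = 1u << compute_idx;

        /* On SMU 13.0.0 with a new enough PMFW, the patch also ORs in the
         * powersave bit so both workloads are active at the same time. */
        if (powersave_idx >= 0)
                workload_mask |= 1u << powersave_idx;

        /* This value is what gets passed with SMU_MSG_SetWorkloadMask. */
        printf("SetWorkloadMask argument: 0x%08x\n", workload_mask);
        return 0;
}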



RE: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)

2023-10-04 Thread Zhang, Hawking
[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Wednesday, October 4, 2023 23:34
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Liu, Kun 
Subject: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is 
active (v2)

From: Kun Liu 

When ROCm is active, enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

v2: move to the swsmu code since we need both bits active in
the workload mask.

Signed-off-by: Alex Deucher 
---
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c| 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 684b4e01fac2..83035fb1839a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -2447,6 +2447,7 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
DpmActivityMonitorCoeffInt_t *activity_monitor =
&(activity_monitor_external.DpmActivityMonitorCoeffInt);
int workload_type, ret = 0;
+   u32 workload_mask;

smu->power_profile_mode = input[size];

@@ -2536,9 +2537,23 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
if (workload_type < 0)
return -EINVAL;

+   workload_mask = 1 << workload_type;
+
+   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
+   if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE &&
+   (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) 
&&
+   ((smu->adev->pm.fw_version == 0x004e6601) ||
+(smu->adev->pm.fw_version >= 0x004e7300))) {
+   workload_type = smu_cmn_to_asic_specific_index(smu,
+  
CMN2ASIC_MAPPING_WORKLOAD,
+  
PP_SMC_POWER_PROFILE_POWERSAVING);
+   if (workload_type >= 0)
+   workload_mask |= 1 << workload_type;
+   }
+
return smu_cmn_send_smc_msg_with_param(smu,
   SMU_MSG_SetWorkloadMask,
-  1 << workload_type,
+  workload_mask,
   NULL);
 }

--
2.41.0



RE: [PATCH v2 5/5] Documentation/amdgpu: Add board info details

2023-10-04 Thread Zhang, Hawking
[AMD Official Use Only - General]

Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Lazar, Lijo 
Sent: Wednesday, October 4, 2023 15:40
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Deucher, Alexander 
Subject: [PATCH v2 5/5] Documentation/amdgpu: Add board info details

Add documentation for board info sysfs attribute.

Signed-off-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
---
 Documentation/gpu/amdgpu/driver-misc.rst   |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 
 2 files changed, 22 insertions(+)

diff --git a/Documentation/gpu/amdgpu/driver-misc.rst 
b/Documentation/gpu/amdgpu/driver-misc.rst
index 4321c38fef21..82b47f1818ac 100644
--- a/Documentation/gpu/amdgpu/driver-misc.rst
+++ b/Documentation/gpu/amdgpu/driver-misc.rst
@@ -32,6 +32,12 @@ unique_id
 .. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: unique_id

+board_info
+--
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+   :doc: board_info
+
 Accelerated Processing Units (APU) Info
 ---

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 10f1641aede9..27c95bb02411 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -162,6 +162,22 @@ static ssize_t amdgpu_device_get_pcie_replay_count(struct 
device *dev,  static DEVICE_ATTR(pcie_replay_count, 0444,
amdgpu_device_get_pcie_replay_count, NULL);

+/**
+ * DOC: board_info
+ *
+ * The amdgpu driver provides a sysfs API for giving board related information.
+ * It provides the form factor information in the format
+ *
+ *   type : form factor
+ *
+ * Possible form factor values
+ *
+ * - "cem" - PCIE CEM card
+ * - "oam" - Open Compute Accelerator Module
+ * - "unknown" - Not known
+ *
+ */
+
 static ssize_t amdgpu_device_get_board_info(struct device *dev,
struct device_attribute *attr,
char *buf)
--
2.25.1
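
The body of amdgpu_device_get_board_info() is not part of this hunk, so the mapping below is only a sketch of how the documented "type : form factor" string could be derived from the package types added earlier in the series; it is not the driver's actual implementation.

/* board_info_sketch.c - illustration of the documented output format */
#include <stdio.h>

enum amdgpu_pkg_type {
        AMDGPU_PKG_TYPE_APU = 2,
        AMDGPU_PKG_TYPE_CEM = 3,
        AMDGPU_PKG_TYPE_OAM = 4,
        AMDGPU_PKG_TYPE_UNKNOWN,
};

static const char *pkg_type_to_form_factor(enum amdgpu_pkg_type type)
{
        switch (type) {
        case AMDGPU_PKG_TYPE_CEM:
                return "cem";      /* PCIE CEM card */
        case AMDGPU_PKG_TYPE_OAM:
                return "oam";      /* Open Compute Accelerator Module */
        default:
                return "unknown";
        }
}

int main(void)
{
        /* Documented sysfs format: "type : form factor" */
        printf("type : %s\n", pkg_type_to_form_factor(AMDGPU_PKG_TYPE_OAM));
        return 0;
}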



Re: [PATCH 2/3] power: supply: Don't count 'unknown' scope power supplies

2023-10-04 Thread Sebastian Reichel
Hi,

On Sun, Oct 01, 2023 at 07:00:11PM -0500, Mario Limonciello wrote:
> Let me try to add more detail.
> 
> This is an OEM system that has 3 USB type C ports.  It's an Intel system,
> but this doesn't matter for the issue.
> * when ucsi_acpi is not loaded there are no power supplies in the system and
> it reports power_supply_is_system_supplied() as AC.
> * When ucsi_acpi is loaded 3 power supplies will be registered.
> power_supply_is_system_supplied() reports as DC.
> 
> Now when you add in a Navi3x AMD dGPU to the system the power supplies don't
> change.  This particular dGPU model doesn't contain a USB-C port, so there
> is no UCSI power supply registered.
> 
> As amdgpu is loaded it looks at device initialization whether the system is
> powered by AC or DC.  Here is how it looks:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c?h=linux-6.5.y#n3834
> 
> On the OEM system if amdgpu loads before the ucsi_acpi driver (such as in
> the initramfs) then the right value is returned for
> power_supply_is_system_supplied() - AC.
> 
> If amdgpu is loaded after the ucsi_acpi driver, the wrong value is returned
> for power_supply_is_system_supplied() - DC.
> 
> This value is very important to set up the dGPU properly.  If the wrong
> value is returned, the wrong value will be notified to the hardware and the
> hardware will not behave properly.  On the OEM system this is a "black
> screen" at bootup along with RAS errors emitted by the dGPU.
> 
> With no changes to a malfunctioning kernel or initramfs binaries I can add
> modprobe.blacklist=ucsi_acpi to kernel command line avoid registering those
> 3 power supplies and the system behaves properly.
> 
> So I think it's inappropriate for "UNKNOWN" scope power supplies to be
> registered and treated as system supplies, at least as it pertains to
> power_supply_is_system_supplied().

So the main issue is that ucsi_acpi registers a bunch of power-supply
chargers with unknown scope on desktop systems, and that results in the
system being assumed to be supplied from battery.

The problem with your change is that many of the charger drivers
don't set a scope at all (and thus report unknown scope). Those
obviously should not be skipped. Probably most of these drivers
could be changed to properly set the scope, but it needs to be
checked on a case-by-case basis. With your current patch they would
regress in the opposite direction of your use case.

Ideally ucsi is changed to properly describe the scope, but I
suppose this information is not available in ACPI?

Assuming that the above are not easily solvable, my idea would be to
only count POWER_SUPPLY_TYPE_BATTERY devices which have
!POWER_SUPPLY_SCOPE_DEVICE, and exit early if there are none.
Basically change __power_supply_is_system_supplied(), so that it
looks like this:

...
        if (!psy->desc->get_property(psy, POWER_SUPPLY_PROP_SCOPE, &ret))
                if (ret.intval == POWER_SUPPLY_SCOPE_DEVICE)
                        return 0;

        if (psy->desc->type == POWER_SUPPLY_TYPE_BATTERY)
                (*count)++;
        else if (!psy->desc->get_property(psy, POWER_SUPPLY_PROP_ONLINE, &ret))
                return ret.intval;
...

That should work in both cases.

-- Sebastian
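
A stand-alone sketch of the counting rule proposed above, using simplified stand-in types rather than the real power-supply core structures: device-scoped supplies are skipped, only batteries bump the count, non-battery supplies report their online state, and a zero battery count makes the caller assume AC. That last "no batteries means AC" step is an assumption about how power_supply_is_system_supplied() would use the count.

/* scope_count_sketch.c - simplified model, not kernel code */
#include <stdio.h>

enum scope { SCOPE_UNKNOWN, SCOPE_SYSTEM, SCOPE_DEVICE };
enum type  { TYPE_BATTERY, TYPE_MAINS, TYPE_USB };

struct supply {
        enum scope scope;
        enum type type;
        int online;
};

/* Mirrors the proposed per-supply check. */
static int check_supply(const struct supply *s, unsigned int *count)
{
        if (s->scope == SCOPE_DEVICE)
                return 0;            /* peripheral battery: ignore */
        if (s->type == TYPE_BATTERY)
                (*count)++;          /* system battery: count it */
        else if (s->online)
                return 1;            /* wall power detected */
        return 0;
}

int main(void)
{
        /* Desktop with three unknown-scope UCSI ports and no battery. */
        struct supply ucsi[3] = {
                { SCOPE_UNKNOWN, TYPE_USB, 0 },
                { SCOPE_UNKNOWN, TYPE_USB, 0 },
                { SCOPE_UNKNOWN, TYPE_USB, 0 },
        };
        unsigned int count = 0;
        int supplied = 0;

        for (int i = 0; i < 3; i++)
                supplied |= check_supply(&ucsi[i], &count);

        if (!count)
                supplied = 1;        /* no batteries counted: assume AC */
        printf("system supplied: %d (batteries counted: %u)\n", supplied, count);
        return 0;
}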

> > >   drivers/power/supply/power_supply_core.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/power/supply/power_supply_core.c 
> > > b/drivers/power/supply/power_supply_core.c
> > > index d325e6dbc770..3de6e6d00815 100644
> > > --- a/drivers/power/supply/power_supply_core.c
> > > +++ b/drivers/power/supply/power_supply_core.c
> > > @@ -349,7 +349,7 @@ static int __power_supply_is_system_supplied(struct 
> > > device *dev, void *data)
> > >   unsigned int *count = data;
> > >   if (!psy->desc->get_property(psy, POWER_SUPPLY_PROP_SCOPE, 
> > > &ret))
> > > - if (ret.intval == POWER_SUPPLY_SCOPE_DEVICE)
> > > + if (ret.intval != POWER_SUPPLY_SCOPE_SYSTEM)
> > >   return 0;
> > >   (*count)++;
> > > -- 
> > > 2.34.1
> > > 
> 




Re: [PATCH] drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga

2023-10-04 Thread Alex Deucher
On Wed, Oct 4, 2023 at 5:42 PM Mario Limonciello
 wrote:
>
> For pptable structs that use flexible array sizes, use flexible arrays.
>
> Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742
> Signed-off-by: Mario Limonciello 

Acked-by: Alex Deucher 

> ---
> From this bug report there are more to fix
>  .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h| 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
> index 57bca1e81d3a..9fcad69a9f34 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
> @@ -164,7 +164,7 @@ typedef struct _ATOM_Tonga_State {
>  typedef struct _ATOM_Tonga_State_Array {
> UCHAR ucRevId;
> UCHAR ucNumEntries; /* Number of entries. */
> -   ATOM_Tonga_State entries[1];/* Dynamically allocate entries. */
> +   ATOM_Tonga_State entries[]; /* Dynamically allocate entries. */
>  } ATOM_Tonga_State_Array;
>
>  typedef struct _ATOM_Tonga_MCLK_Dependency_Record {
> @@ -210,7 +210,7 @@ typedef struct _ATOM_Polaris_SCLK_Dependency_Record {
>  typedef struct _ATOM_Polaris_SCLK_Dependency_Table {
> UCHAR ucRevId;
> UCHAR ucNumEntries;   
>   /* Number of entries. */
> -   ATOM_Polaris_SCLK_Dependency_Record entries[1];   
>/* Dynamically allocate entries. */
> +   ATOM_Polaris_SCLK_Dependency_Record entries[];
>/* Dynamically allocate entries. */
>  } ATOM_Polaris_SCLK_Dependency_Table;
>
>  typedef struct _ATOM_Tonga_PCIE_Record {
> @@ -222,7 +222,7 @@ typedef struct _ATOM_Tonga_PCIE_Record {
>  typedef struct _ATOM_Tonga_PCIE_Table {
> UCHAR ucRevId;
> UCHAR ucNumEntries;   
>   /* Number of entries. */
> -   ATOM_Tonga_PCIE_Record entries[1];
>   /* Dynamically allocate entries. */
> +   ATOM_Tonga_PCIE_Record entries[]; 
>   /* Dynamically allocate entries. */
>  } ATOM_Tonga_PCIE_Table;
>
>  typedef struct _ATOM_Polaris10_PCIE_Record {
> @@ -235,7 +235,7 @@ typedef struct _ATOM_Polaris10_PCIE_Record {
>  typedef struct _ATOM_Polaris10_PCIE_Table {
> UCHAR ucRevId;
> UCHAR ucNumEntries; /* Number 
> of entries. */
> -   ATOM_Polaris10_PCIE_Record entries[1];  /* 
> Dynamically allocate entries. */
> +   ATOM_Polaris10_PCIE_Record entries[];  /* 
> Dynamically allocate entries. */
>  } ATOM_Polaris10_PCIE_Table;
>
>
> @@ -252,7 +252,7 @@ typedef struct _ATOM_Tonga_MM_Dependency_Record {
>  typedef struct _ATOM_Tonga_MM_Dependency_Table {
> UCHAR ucRevId;
> UCHAR ucNumEntries;   
>   /* Number of entries. */
> -   ATOM_Tonga_MM_Dependency_Record entries[1];/* 
> Dynamically allocate entries. */
> +   ATOM_Tonga_MM_Dependency_Record entries[]; /* 
> Dynamically allocate entries. */
>  } ATOM_Tonga_MM_Dependency_Table;
>
>  typedef struct _ATOM_Tonga_Voltage_Lookup_Record {
> @@ -265,7 +265,7 @@ typedef struct _ATOM_Tonga_Voltage_Lookup_Record {
>  typedef struct _ATOM_Tonga_Voltage_Lookup_Table {
> UCHAR ucRevId;
> UCHAR ucNumEntries;   
>   /* Number of entries. */
> -   ATOM_Tonga_Voltage_Lookup_Record entries[1];  
>   /* Dynamically allocate entries. */
> +   ATOM_Tonga_Voltage_Lookup_Record entries[];   
>   /* Dynamically allocate entries. */
>  } ATOM_Tonga_Voltage_Lookup_Table;
>
>  typedef struct _ATOM_Tonga_Fan_Table {
> --
> 2.34.1
>
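
As a side note on the pattern being applied: a flexible array member lets the compiler (and UBSAN) treat the trailing array as having a size determined at run time, instead of a fake one-element array. The snippet below is a generic, stand-alone illustration of that pattern with a stubbed entry type; the real pptable structures are parsed out of the VBIOS blob rather than allocated like this.

/* flex_array_sketch.c - generic illustration, not the driver's parsing path */
#include <stdio.h>
#include <stdlib.h>

typedef struct { unsigned char ucDummy; } ATOM_Entry_Stub; /* stand-in entry */

typedef struct {
        unsigned char ucRevId;
        unsigned char ucNumEntries;   /* Number of entries. */
        ATOM_Entry_Stub entries[];    /* Flexible array: size known at run time. */
} ATOM_Entry_Array;

int main(void)
{
        unsigned char n = 8;
        /* Userspace equivalent of the kernel's struct_size(arr, entries, n). */
        ATOM_Entry_Array *arr = malloc(sizeof(*arr) + n * sizeof(arr->entries[0]));

        if (!arr)
                return 1;
        arr->ucNumEntries = n;
        for (unsigned char i = 0; i < n; i++)
                arr->entries[i].ucDummy = i;

        /* Indexing past entry 0 is now well-defined as far as UBSAN is concerned. */
        printf("last entry: %u of %u\n", arr->entries[n - 1].ucDummy, arr->ucNumEntries);
        free(arr);
        return 0;
}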


RE: [PATCH v3 03/16] drm/amd/display: Move bw_fixed outside DML folder

2023-10-04 Thread Zhuo, Lillian
[AMD Official Use Only - General]

Reviewed-by: Qingqing Zhuo 

-Original Message-
From: Siqueira, Rodrigo 
Sent: Wednesday, October 4, 2023 5:21 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Wentland, Harry 
; Li, Sun peng (Leo) ; Siqueira, 
Rodrigo ; Pillai, Aurabindo 
; Zhuo, Lillian ; Li, Roman 
; Lin, Wayne ; Zuo, Jerry 
; Mahfooz, Hamza ; Gong, Richard 

Subject: [PATCH v3 03/16] drm/amd/display: Move bw_fixed outside DML folder

bw_fixed does not need any FPU operations, and it is used on both DCE and DCN.
For this reason, this commit moves bw_fixed to the basics folder outside DML.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/basics/Makefile  |  3 ++-
 .../amd/display/dc/{dml/calcs => basics}/bw_fixed.c | 13 ++---
 drivers/gpu/drm/amd/display/dc/dml/Makefile |  2 --
 3 files changed, 8 insertions(+), 10 deletions(-)  rename 
drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/bw_fixed.c (94%)

diff --git a/drivers/gpu/drm/amd/display/dc/basics/Makefile 
b/drivers/gpu/drm/amd/display/dc/basics/Makefile
index 65d713aff407..aabcebf69049 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/basics/Makefile
@@ -30,7 +30,8 @@ BASICS := \
vector.o \
dc_common.o \
dce_calcs.o \
-   custom_float.o
+   custom_float.o \
+   bw_fixed.o

 AMD_DAL_BASICS = $(addprefix $(AMDDALPATH)/dc/basics/,$(BASICS))

diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c 
b/drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
similarity index 94%
rename from drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c
rename to drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
index 3aa8dd0acd5e..c8cb89e0d4d0 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c
+++ b/drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: MIT
 /*
- * Copyright 2015 Advanced Micro Devices, Inc.
+ * Copyright 2023 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"), 
@@ -106,9 +107,8 @@ struct bw_fixed bw_frc_to_fixed(int64_t numerator, int64_t 
denominator)
return res;
 }

-struct bw_fixed bw_floor2(
-   const struct bw_fixed arg,
-   const struct bw_fixed significance)
+struct bw_fixed bw_floor2(const struct bw_fixed arg,
+ const struct bw_fixed significance)
 {
struct bw_fixed result;
int64_t multiplicand;
@@ -119,9 +119,8 @@ struct bw_fixed bw_floor2(
return result;
 }

-struct bw_fixed bw_ceil2(
-   const struct bw_fixed arg,
-   const struct bw_fixed significance)
+struct bw_fixed bw_ceil2(const struct bw_fixed arg,
+const struct bw_fixed significance)
 {
struct bw_fixed result;
int64_t multiplicand;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 2fe8588a070a..ea7d60f9a9b4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -134,8 +134,6 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
$(dml_rcflags)  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := 
$(dml_rcflags)  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := 
$(dml_rcflags)

-DML = calcs/bw_fixed.o
-
 ifdef CONFIG_DRM_AMD_DC_FP
 DML += display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o  
DML += dcn10/dcn10_fpu.o
--
2.40.1



Re: [PATCH v4] drm/amdkfd: Use partial migrations in GPU page faults

2023-10-04 Thread Chen, Xiaogang



On 10/4/2023 1:47 PM, Felix Kuehling wrote:


On 2023-10-03 19:31, Xiaogang.Chen wrote:

From: Xiaogang Chen 

This patch implements partial migration in GPU page fault handling
according to the migration granularity (default 2MB), and does not split
the svm range in CPU page fault handling. An svm range may now include
pages from both system RAM and the VRAM of one GPU. These changes are
expected to improve migration performance and reduce MMU callback and
TLB flush workloads.

Signed-off-by: Xiaogang Chen


Minor (mostly cosmetic) nit-picks inline. With those fixed, the patch is

Reviewed-by: Felix Kuehling 

Thanks for the review. These indentation issues were due to my editor on
Linux, which does not show some special characters correctly. I fixed
them with vi.


I need to use a different editor now.

Regards

Xiaogang
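
For readers following the granularity logic, here is a stand-alone sketch of how a faulting page could be turned into the [start_mgr, last_mgr] window that svm_migrate_ram_to_vram() now takes. The 2MB default granularity is from the commit message; the alignment math and the helper name below are illustrative assumptions, not the exact code in kfd_svm.c.

/* migration_window_sketch.c - illustrative only */
#include <stdio.h>

#define PAGE_SHIFT 12UL
/* 2MB migration granularity expressed in 4K pages. */
#define GRANULARITY_PAGES ((2UL << 20) >> PAGE_SHIFT)

struct svm_range_stub { unsigned long start, last; }; /* page numbers, inclusive */

static void fault_window(const struct svm_range_stub *prange, unsigned long fault_page,
                         unsigned long *start_mgr, unsigned long *last_mgr)
{
        unsigned long start = fault_page & ~(GRANULARITY_PAGES - 1);
        unsigned long last = start + GRANULARITY_PAGES - 1;

        /* Clamp to the range so the new bounds check in the patch passes. */
        if (start < prange->start)
                start = prange->start;
        if (last > prange->last)
                last = prange->last;
        *start_mgr = start;
        *last_mgr = last;
}

int main(void)
{
        struct svm_range_stub prange = { .start = 0x1000, .last = 0x17ff }; /* 8MB range */
        unsigned long s, l;

        fault_window(&prange, 0x1234, &s, &l);
        printf("migrate pages [0x%lx 0x%lx] within [0x%lx 0x%lx]\n",
               s, l, prange.start, prange.last);
        return 0;
}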




---
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 156 +--
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   6 +-
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c |  83 +---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.h |   6 +-
  4 files changed, 162 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c

index 6c25dab051d5..6a059e4aff86 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -442,10 +442,10 @@ svm_migrate_vma_to_vram(struct kfd_node *node, 
struct svm_range *prange,

  goto out_free;
  }
  if (cpages != npages)
-    pr_debug("partial migration, 0x%lx/0x%llx pages migrated\n",
+    pr_debug("partial migration, 0x%lx/0x%llx pages collected\n",
   cpages, npages);
  else
-    pr_debug("0x%lx pages migrated\n", cpages);
+    pr_debug("0x%lx pages collected\n", cpages);
    r = svm_migrate_copy_to_vram(node, prange, &migrate, &mfence, 
scratch, ttm_res_offset);

  migrate_vma_pages(&migrate);
@@ -479,6 +479,8 @@ svm_migrate_vma_to_vram(struct kfd_node *node, 
struct svm_range *prange,

   * svm_migrate_ram_to_vram - migrate svm range from system to device
   * @prange: range structure
   * @best_loc: the device to migrate to
+ * @start_mgr: start page to migrate
+ * @last_mgr: last page to migrate
   * @mm: the process mm structure
   * @trigger: reason of migration
   *
@@ -489,6 +491,7 @@ svm_migrate_vma_to_vram(struct kfd_node *node, 
struct svm_range *prange,

   */
  static int
  svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc,
+    unsigned long start_mgr, unsigned long last_mgr,
  struct mm_struct *mm, uint32_t trigger)
  {
  unsigned long addr, start, end;
@@ -498,23 +501,30 @@ svm_migrate_ram_to_vram(struct svm_range 
*prange, uint32_t best_loc,

  unsigned long cpages = 0;
  long r = 0;
  -    if (prange->actual_loc == best_loc) {
-    pr_debug("svms 0x%p [0x%lx 0x%lx] already on best_loc 0x%x\n",
- prange->svms, prange->start, prange->last, best_loc);
+    if (!best_loc) {
+    pr_debug("svms 0x%p [0x%lx 0x%lx] migrate to sys ram\n",
+    prange->svms, start_mgr, last_mgr);
  return 0;
  }
  +    if (start_mgr < prange->start || last_mgr > prange->last) {
+    pr_debug("range [0x%lx 0x%lx] out prange [0x%lx 0x%lx]\n",
+ start_mgr, last_mgr, prange->start, prange->last);
+    return -EFAULT;
+    }
+
  node = svm_range_get_node_by_id(prange, best_loc);
  if (!node) {
  pr_debug("failed to get kfd node by id 0x%x\n", best_loc);
  return -ENODEV;
  }
  -    pr_debug("svms 0x%p [0x%lx 0x%lx] to gpu 0x%x\n", prange->svms,
- prange->start, prange->last, best_loc);
+    pr_debug("svms 0x%p [0x%lx 0x%lx] in [0x%lx 0x%lx] to gpu 0x%x\n",
+    prange->svms, start_mgr, last_mgr, prange->start, prange->last,
+    best_loc);
  -    start = prange->start << PAGE_SHIFT;
-    end = (prange->last + 1) << PAGE_SHIFT;
+    start = start_mgr << PAGE_SHIFT;
+    end = (last_mgr + 1) << PAGE_SHIFT;
    r = svm_range_vram_node_new(node, prange, true);
  if (r) {
@@ -544,8 +554,11 @@ svm_migrate_ram_to_vram(struct svm_range 
*prange, uint32_t best_loc,

    if (cpages) {
  prange->actual_loc = best_loc;
-    svm_range_dma_unmap(prange);
-    } else {
+    prange->vram_pages = prange->vram_pages + cpages;
+    } else if (!prange->actual_loc) {
+    /* if no page migrated and all pages from prange are at
+ * sys ram drop svm_bo got from svm_range_vram_node_new
+ */
  svm_range_vram_node_free(prange);
  }
  @@ -663,19 +676,19 @@ svm_migrate_copy_to_ram(struct amdgpu_device 
*adev, struct svm_range *prange,
   * Context: Process context, caller hold mmap read lock, 
prange->migrate_mutex

   *
   * Return:
- *   0 - success with all pages migrated
   *   negative values - indicate error
- *   positive values - partial migration, number of pages not migrated
+ *   positive values or zero - number of pages got

RE: [PATCH v3 04/16] drm/amd/display: Move dml code under CONFIG_DRM_AMD_DC_FP guard

2023-10-04 Thread Zhuo, Lillian
[AMD Official Use Only - General]

Reviewed-by: Qingqing Zhuo 

-Original Message-
From: Siqueira, Rodrigo 
Sent: Wednesday, October 4, 2023 5:21 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Wentland, Harry 
; Li, Sun peng (Leo) ; Siqueira, 
Rodrigo ; Pillai, Aurabindo 
; Zhuo, Lillian ; Li, Roman 
; Lin, Wayne ; Zuo, Jerry 
; Mahfooz, Hamza ; Gong, Richard 

Subject: [PATCH v3 04/16] drm/amd/display: Move dml code under 
CONFIG_DRM_AMD_DC_FP guard

For some reason, the dml code is not guarded under CONFIG_DRM_AMD_DC_FP in the 
Makefile. This commit moves the dml code under the DC_FP guard.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
b/drivers/gpu/drm/amd/display/dc/Makefile
index 2f3d9602b7a0..dafa34bc2782 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -22,7 +22,7 @@
 #
 # Makefile for Display Core (dc) component.

-DC_LIBS = basics bios dml clk_mgr dce gpio irq link virtual dsc
+DC_LIBS = basics bios clk_mgr dce gpio irq link virtual dsc

 ifdef CONFIG_DRM_AMD_DC_FP

@@ -43,6 +43,7 @@ DC_LIBS += dcn316
 DC_LIBS += dcn32
 DC_LIBS += dcn321
 DC_LIBS += dcn35
+DC_LIBS += dml
 endif

 DC_LIBS += dce120
--
2.40.1



Re: [PATCH v6 7/9] drm/amdgpu: map wptr BO into GART

2023-10-04 Thread Felix Kuehling



On 2023-09-18 06:32, Christian König wrote:

Am 08.09.23 um 18:04 schrieb Shashank Sharma:

To support oversubscription, MES FW expects WPTR BOs to
be mapped into GART before they are submitted to usermode
queues. This patch adds a function for that.

V4: fix the wptr value before mapping lookup (Bas, Christian).
V5: Addressed review comments from Christian:
 - Either pin object or allocate from GART, but not both.
 - All the handling must be done with the VM locks held.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Shashank Sharma 
Signed-off-by: Arvind Yadav 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c    | 81 +++
  .../gpu/drm/amd/include/amdgpu_userqueue.h    |  1 +
  2 files changed, 82 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c

index e266674e0d44..c0eb622dfc37 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -6427,6 +6427,79 @@ const struct amdgpu_ip_block_version 
gfx_v11_0_ip_block =

  .funcs = &gfx_v11_0_ip_funcs,
  };
  +static int
+gfx_v11_0_map_gtt_bo_to_gart(struct amdgpu_device *adev, struct 
amdgpu_bo *bo)

+{
+    int ret;
+
+    ret = amdgpu_bo_reserve(bo, true);
+    if (ret) {
+    DRM_ERROR("Failed to reserve bo. ret %d\n", ret);
+    goto err_reserve_bo_failed;
+    }
+
+    ret = amdgpu_ttm_alloc_gart(&bo->tbo);
+    if (ret) {
+    DRM_ERROR("Failed to bind bo to GART. ret %d\n", ret);
+    goto err_map_bo_gart_failed;
+    }
+
+    amdgpu_bo_unreserve(bo);


The GART mapping can become invalid as soon as you unlock the BOs.

You need to attach an eviction fence for this to work correctly.


Don't you need an eviction fence on the WPTR BO regardless of the GTT 
mapping?


Regards,
  Felix





+    bo = amdgpu_bo_ref(bo);
+
+    return 0;
+
+err_map_bo_gart_failed:
+    amdgpu_bo_unreserve(bo);
+err_reserve_bo_failed:
+    return ret;
+}
+
+static int
+gfx_v11_0_create_wptr_mapping(struct amdgpu_device *adev,
+  struct amdgpu_usermode_queue *queue,
+  uint64_t wptr)
+{
+    struct amdgpu_bo_va_mapping *wptr_mapping;
+    struct amdgpu_vm *wptr_vm;
+    struct amdgpu_bo *wptr_bo = NULL;
+    int ret;
+
+    mutex_lock(&queue->vm->eviction_lock);


Never ever touch the eviction lock outside of the VM code! That lock 
is completely unrelated to what you do here.



+    wptr_vm = queue->vm;
+    ret = amdgpu_bo_reserve(wptr_vm->root.bo, false);
+    if (ret)
+    goto unlock;
+
+    wptr &= AMDGPU_GMC_HOLE_MASK;
+    wptr_mapping = amdgpu_vm_bo_lookup_mapping(wptr_vm, wptr >> 
PAGE_SHIFT);

+    amdgpu_bo_unreserve(wptr_vm->root.bo);
+    if (!wptr_mapping) {
+    DRM_ERROR("Failed to lookup wptr bo\n");
+    ret = -EINVAL;
+    goto unlock;
+    }
+
+    wptr_bo = wptr_mapping->bo_va->base.bo;
+    if (wptr_bo->tbo.base.size > PAGE_SIZE) {
+    DRM_ERROR("Requested GART mapping for wptr bo larger than 
one page\n");

+    ret = -EINVAL;
+    goto unlock;
+    }


We probably also want to enforce that this BO is a per VM BO.


+
+    ret = gfx_v11_0_map_gtt_bo_to_gart(adev, wptr_bo);
+    if (ret) {
+    DRM_ERROR("Failed to map wptr bo to GART\n");
+    goto unlock;
+    }
+
+    queue->wptr_mc_addr = wptr_bo->tbo.resource->start << PAGE_SHIFT;


This needs to be amdgpu_bo_gpu_offset() instead.

Regards,
Christian.


+
+unlock:
+    mutex_unlock(&queue->vm->eviction_lock);
+    return ret;
+}
+
  static void gfx_v11_0_userq_unmap(struct amdgpu_userq_mgr *uq_mgr,
    struct amdgpu_usermode_queue *queue)
  {
@@ -6475,6 +6548,7 @@ static int gfx_v11_0_userq_map(struct 
amdgpu_userq_mgr *uq_mgr,

  queue_input.queue_size = userq_props->queue_size >> 2;
  queue_input.doorbell_offset = userq_props->doorbell_index;
  queue_input.page_table_base_addr = 
amdgpu_gmc_pd_addr(queue->vm->root.bo);

+    queue_input.wptr_mc_addr = queue->wptr_mc_addr;
    amdgpu_mes_lock(&adev->mes);
  r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
@@ -6601,6 +6675,13 @@ static int gfx_v11_0_userq_mqd_create(struct 
amdgpu_userq_mgr *uq_mgr,

  goto free_mqd;
  }
  +    /* FW expects WPTR BOs to be mapped into GART */
+    r = gfx_v11_0_create_wptr_mapping(adev, queue, 
userq_props.wptr_gpu_addr);

+    if (r) {
+    DRM_ERROR("Failed to create WPTR mapping\n");
+    goto free_ctx;
+    }
+
  /* Map userqueue into FW using MES */
  r = gfx_v11_0_userq_map(uq_mgr, queue, &userq_props);
  if (r) {
diff --git a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h 
b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h

index 34e20daa06c8..ae155de62560 100644
--- a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
+++ b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
@@ -39,6 +39,7 @@ struct amdgpu_usermode_queue {
  int    queue_type;
  uint64_t    doorbell_handle;
  uint64_t  

Re: [PATCH v6 1/9] drm/amdgpu: UAPI for user queue management

2023-10-04 Thread Felix Kuehling



On 2023-09-08 12:04, Shashank Sharma wrote:

From: Alex Deucher 

This patch introduces a new UAPI/IOCTL for usermode graphics
queues. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.

This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.

V2: Addressed review comments from Alex and Christian
 - Make the doorbell offset's comment clearer
 - Change the output parameter name to queue_id

V3: Integration with doorbell manager

V4:
 - Updated the UAPI doc (Pierre-Eric)
 - Created a Union for engine specific MQDs (Alex)
 - Added Christian's R-B
V5:
 - Add variables for GDS and CSA in MQD structure (Alex)
 - Make MQD data a ptr-size pair instead of union (Alex)

Cc: Alex Deucher 
Cc: Christian Koenig 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 
---
  include/uapi/drm/amdgpu_drm.h | 110 ++
  1 file changed, 110 insertions(+)

diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 79b14828d542..627b4a38c855 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
  #define DRM_AMDGPU_VM 0x13
  #define DRM_AMDGPU_FENCE_TO_HANDLE0x14
  #define DRM_AMDGPU_SCHED  0x15
+#define DRM_AMDGPU_USERQ   0x16
  
  #define DRM_IOCTL_AMDGPU_GEM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)

  #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
  #define DRM_IOCTL_AMDGPU_VM   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_VM, union drm_amdgpu_vm)
  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
  #define DRM_IOCTL_AMDGPU_SCHEDDRM_IOW(DRM_COMMAND_BASE + 
DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE + 
DRM_AMDGPU_USERQ, union drm_amdgpu_userq)
  
  /**

   * DOC: memory domains
@@ -304,6 +306,114 @@ union drm_amdgpu_ctx {
union drm_amdgpu_ctx_out out;
  };
  
+/* user queue IOCTL */

+#define AMDGPU_USERQ_OP_CREATE 1
+#define AMDGPU_USERQ_OP_FREE   2
+
+/* Flag to indicate secure buffer related workload, unused for now */
+#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
+/* Flag to indicate AQL workload, unused for now */
+#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
+
+/*
+ * MQD (memory queue descriptor) is a set of parameters which allow


I find the term MQD misleading. For the firmware, the MQD is a very
different data structure from what you are defining here. It's a
persistent data structure in kernel address space (VMID0), shared
between the driver and the firmware, that gets loaded or updated when
queues are mapped or unmapped. I'd want to avoid confusing the firmware
MQD with this structure.


Regards,
  Felix



+ * the GPU to uniquely define and identify a usermode queue. This
+ * structure defines the MQD for GFX-V11 IP ver 0.
+ */
+struct drm_amdgpu_userq_mqd_gfx_v11_0 {
+   /**
+* @queue_va: Virtual address of the GPU memory which holds the queue
+* object. The queue holds the workload packets.
+*/
+   __u64   queue_va;
+   /**
+* @queue_size: Size of the queue in bytes, this needs to be 256-byte
+* aligned.
+*/
+   __u64   queue_size;
+   /**
+* @rptr_va : Virtual address of the GPU memory which holds the ring 
RPTR.
+* This object must be at least 8 byte in size and aligned to 8-byte 
offset.
+*/
+   __u64   rptr_va;
+   /**
+* @wptr_va : Virtual address of the GPU memory which holds the ring 
WPTR.
+* This object must be at least 8 byte in size and aligned to 8-byte 
offset.
+*
+* Queue, RPTR and WPTR can come from the same object, as long as the 
size
+* and alignment related requirements are met.
+*/
+   __u64   wptr_va;
+   /**
+* @shadow_va: Virtual address of the GPU memory to hold the shadow 
buffer.
+* This must be a from a separate GPU object, and must be at least 
4-page
+* sized.
+*/
+   __u64   shadow_va;
+   /**
+* @gds_va: Virtual address of the GPU memory to hold the GDS buffer.
+* This must be a from a separate GPU object, and must be at least 
1-page
+* sized.
+*/
+   __u64   gds_va;
+   /**
+* @csa_va: Virtual address of the GPU memory to hold the CSA buffer.
+* This must be a from a separate GPU object, and must be at least 
1-page
+* sized.
+*/
+   __u64   csa_va;
+};
+
+struct drm_amdgpu_userq_in {
+   /** AMDGPU_USE
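
As a usage illustration of the descriptor above, here is a user-space sketch filling drm_amdgpu_userq_mqd_gfx_v11_0 per the documented size and alignment rules. The struct is copied from the patch; the virtual addresses are placeholders an application would obtain from its own GPU VA allocations, and the ioctl plumbing through drm_amdgpu_userq_in is omitted since that part of the patch is not quoted here.

/* userq_mqd_sketch.c - placeholder VAs, no ioctl submission */
#include <stdint.h>
#include <string.h>

struct drm_amdgpu_userq_mqd_gfx_v11_0 {
        uint64_t queue_va;
        uint64_t queue_size;
        uint64_t rptr_va;
        uint64_t wptr_va;
        uint64_t shadow_va;
        uint64_t gds_va;
        uint64_t csa_va;
};

int main(void)
{
        struct drm_amdgpu_userq_mqd_gfx_v11_0 mqd;

        memset(&mqd, 0, sizeof(mqd));
        mqd.queue_va   = 0x400000000ull; /* ring that holds the workload packets */
        mqd.queue_size = 64 * 1024;      /* must be 256-byte aligned              */
        mqd.rptr_va    = 0x400100000ull; /* >= 8 bytes, 8-byte aligned            */
        mqd.wptr_va    = 0x400100008ull; /* may share an object with the rptr     */
        mqd.shadow_va  = 0x400200000ull; /* separate object, at least 4 pages     */
        mqd.gds_va     = 0x400300000ull; /* separate object, at least 1 page      */
        mqd.csa_va     = 0x400400000ull; /* separate object, at least 1 page      */
        (void)mqd;
        return 0;
}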

[PATCH v3 08/16] drm/amd/display: Handle multiple streams sourcing same surface

2023-10-04 Thread Rodrigo Siqueira
From: Sung Joon Kim 

[why]
There are cases where more than 1 stream can be mapped to the same
surface. DML2.0 does not seem to handle these cases.

[how]
Make sure to account for the stream id when deriving the plane id. By
doing this, each plane id will be unique based on the stream id.

Reviewed-by: Charlene Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Sung Joon Kim 
Signed-off-by: Qingqing Zhuo 
---
 .../display/dc/dml2/dml2_dc_resource_mgmt.c   | 41 ---
 .../display/dc/dml2/dml2_translation_helper.c | 25 ++-
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.c  | 19 +
 3 files changed, 53 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
index 7fd0e1c3d552..8da145fd4d7b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
@@ -53,7 +53,8 @@ struct dc_pipe_mapping_scratch {
struct dc_plane_pipe_pool pipe_pool;
 };
 
-static bool get_plane_id(const struct dc_state *state, const struct 
dc_plane_state *plane, unsigned int *plane_id)
+static bool get_plane_id(const struct dc_state *state, const struct 
dc_plane_state *plane,
+   unsigned int stream_id, unsigned int *plane_id)
 {
int i, j;
 
@@ -61,10 +62,12 @@ static bool get_plane_id(const struct dc_state *state, 
const struct dc_plane_sta
return false;
 
for (i = 0; i < state->stream_count; i++) {
-   for (j = 0; j < state->stream_status[i].plane_count; j++) {
-   if (state->stream_status[i].plane_states[j] == plane) {
-   *plane_id = (i << 16) | j;
-   return true;
+   if (state->streams[i]->stream_id == stream_id) {
+   for (j = 0; j < state->stream_status[i].plane_count; 
j++) {
+   if (state->stream_status[i].plane_states[j] == 
plane) {
+   *plane_id = (i << 16) | j;
+   return true;
+   }
}
}
}
@@ -111,13 +114,15 @@ static struct pipe_ctx *find_master_pipe_of_stream(struct 
dml2_context *ctx, str
return NULL;
 }
 
-static struct pipe_ctx *find_master_pipe_of_plane(struct dml2_context *ctx, 
struct dc_state *state, unsigned int plane_id)
+static struct pipe_ctx *find_master_pipe_of_plane(struct dml2_context *ctx,
+   struct dc_state *state, unsigned int plane_id)
 {
int i;
unsigned int plane_id_assigned_to_pipe;
 
for (i = 0; i < ctx->config.dcn_pipe_count; i++) {
-   if (state->res_ctx.pipe_ctx[i].plane_state && 
get_plane_id(state, state->res_ctx.pipe_ctx[i].plane_state, 
&plane_id_assigned_to_pipe)) {
+   if (state->res_ctx.pipe_ctx[i].plane_state && 
get_plane_id(state, state->res_ctx.pipe_ctx[i].plane_state,
+   state->res_ctx.pipe_ctx[i].stream->stream_id, 
&plane_id_assigned_to_pipe)) {
if (plane_id_assigned_to_pipe == plane_id)
return &state->res_ctx.pipe_ctx[i];
}
@@ -126,14 +131,16 @@ static struct pipe_ctx *find_master_pipe_of_plane(struct 
dml2_context *ctx, stru
return NULL;
 }
 
-static unsigned int find_pipes_assigned_to_plane(struct dml2_context *ctx, 
struct dc_state *state, unsigned int plane_id, unsigned int *pipes)
+static unsigned int find_pipes_assigned_to_plane(struct dml2_context *ctx,
+   struct dc_state *state, unsigned int plane_id, unsigned int *pipes)
 {
int i;
unsigned int num_found = 0;
unsigned int plane_id_assigned_to_pipe;
 
for (i = 0; i < ctx->config.dcn_pipe_count; i++) {
-   if (state->res_ctx.pipe_ctx[i].plane_state && 
get_plane_id(state, state->res_ctx.pipe_ctx[i].plane_state, 
&plane_id_assigned_to_pipe)) {
+   if (state->res_ctx.pipe_ctx[i].plane_state && 
get_plane_id(state, state->res_ctx.pipe_ctx[i].plane_state,
+   state->res_ctx.pipe_ctx[i].stream->stream_id, 
&plane_id_assigned_to_pipe)) {
if (plane_id_assigned_to_pipe == plane_id)
pipes[num_found++] = i;
}
@@ -499,7 +506,7 @@ static struct pipe_ctx *assign_pipes_to_plane(struct 
dml2_context *ctx, struct d
unsigned int next_pipe_to_assign;
int odm_slice, mpc_slice;
 
-   if (!get_plane_id(state, plane, &plane_id)) {
+   if (!get_plane_id(state, plane, stream->stream_id, &plane_id)) {
ASSERT(false);
return master_pipe;
}
@@ -545,11 +552,14 @@ static void free_pipe(struct pipe_ctx *pipe)
memset(pipe, 0, sizeof(struct pipe_ctx));
 }
 
-static void free_unused_pipes_for_plane(struct dml2_context *ctx, struct 
dc_state *

[PATCH v3 16/16] drm/amd/display: add check in validate_only in dml2

2023-10-04 Thread Rodrigo Siqueira
From: Gabe Teeger 

[what]
The does_configuration_meet_sw_policies check was not done in the
validate_only portion of dml2, so some unsupported modes were passing BW
validation, only to fail the same check later in validate_and_build. Now
we add the check to validate_only as well.

Also add a line in dcn35_resource to ensure that the value set for
enable_windowed_mpo_odm gets passed to DML.

[why]
Immediate black screen during video playback at 4K 144Hz. The debugger
showed that we were failing validation in dml on every updateplanes().

Reviewed-by: Wenjing Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Qingqing Zhuo 
Signed-off-by: Gabe Teeger 
---
 drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c | 1 +
 drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c| 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index 2283daa45318..828846538a92 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -2072,6 +2072,7 @@ static bool dcn35_resource_construct(
dc->dml2_options.use_native_soc_bb_construction = true;
if (dc->config.EnableMinDispClkODM)
dc->dml2_options.minimize_dispclk_using_odm = true;
+   dc->dml2_options.enable_windowed_mpo_odm = 
dc->config.enable_windowed_mpo_odm;
 
dc->dml2_options.callbacks.dc = dc;
dc->dml2_options.callbacks.build_scaling_params = 
&resource_build_scaling_params;
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
index 11c131f6cf26..9a5e145168bc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
@@ -659,6 +659,9 @@ static bool dml2_validate_only(const struct dc_state 
*context)
&dml2->v20.scratch.cur_display_config,
&dml2->v20.scratch.mode_support_info);
 
+   if (result)
+   result = does_configuration_meet_sw_policies(dml2, 
&dml2->v20.scratch.cur_display_config, &dml2->v20.scratch.mode_support_info);
+
return (result == 1) ? true : false;
 }
 
-- 
2.40.1



[PATCH v3 11/16] drm/amd/display: Move stereo timing check to helper

2023-10-04 Thread Rodrigo Siqueira
From: Taimur Hassan 

Rework dml2_map_dc_pipes to keep the logic clean.

Reviewed-by: Chaitanya Dhere 
Acked-by: Qingqing Zhuo 
Signed-off-by: Qingqing Zhuo 
Signed-off-by: Taimur Hassan 
---
 .../amd/display/dc/dml2/dml2_dc_resource_mgmt.c |  9 +
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.c| 17 +
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.h|  1 +
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
index 116b78a5107c..e22b5106df8f 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
@@ -710,14 +710,7 @@ bool dml2_map_dc_pipes(struct dml2_context *ctx, struct 
dc_state *state, const s
scratch.mpc_info.mpc_factor = 
DPPPerSurface[plane_disp_cfg_index];
 
//For stereo timings, we need to pipe 
split
-   if 
((state->streams[stream_index]->view_format ==
-   
VIEW_3D_FORMAT_SIDE_BY_SIDE ||
-   
state->streams[stream_index]->view_format ==
-   
VIEW_3D_FORMAT_TOP_AND_BOTTOM) &&
-   
(state->streams[stream_index]->timing.timing_3d_format ==
-   
TIMING_3D_FORMAT_TOP_AND_BOTTOM ||
-   
state->streams[stream_index]->timing.timing_3d_format ==
-   
TIMING_3D_FORMAT_SIDE_BY_SIDE))
+   if 
(dml2_is_stereo_timing(state->streams[stream_index]))
scratch.mpc_info.mpc_factor = 2;
} else {
// If ODM combine is enabled, then we 
use at most 1 pipe per
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
index 4c3661fbecbc..ac6bf776bad0 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
@@ -461,3 +461,20 @@ bool dml2_verify_det_buffer_configuration(struct 
dml2_context *in_ctx, struct dc
 
return need_recalculation;
 }
+
+bool dml2_is_stereo_timing(struct dc_stream_state *stream)
+{
+   bool is_stereo = false;
+
+   if ((stream->view_format ==
+   VIEW_3D_FORMAT_SIDE_BY_SIDE ||
+   stream->view_format ==
+   VIEW_3D_FORMAT_TOP_AND_BOTTOM) &&
+   (stream->timing.timing_3d_format ==
+   TIMING_3D_FORMAT_TOP_AND_BOTTOM ||
+   stream->timing.timing_3d_format ==
+   TIMING_3D_FORMAT_SIDE_BY_SIDE))
+   is_stereo = true;
+
+   return is_stereo;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.h 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.h
index 342d64039f9a..23b9028337d4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.h
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.h
@@ -42,6 +42,7 @@ void dml2_copy_clocks_to_dc_state(struct dml2_dcn_clocks 
*out_clks, struct dc_st
 void dml2_extract_watermark_set(struct dcn_watermarks *watermark, struct 
display_mode_lib_st *dml_core_ctx);
 int dml2_helper_find_dml_pipe_idx_by_stream_id(struct dml2_context *ctx, 
unsigned int stream_id);
 bool is_dtbclk_required(const struct dc *dc, struct dc_state *context);
+bool dml2_is_stereo_timing(struct dc_stream_state *stream);
 
 /*
  * dml2_dc_construct_pipes - This function will determine if we need 
additional pipes based
-- 
2.40.1



[PATCH v3 15/16] drm/amd/display: Port replay vblank logic to DML2

2023-10-04 Thread Rodrigo Siqueira
From: Daniel Miess 

Update DML2 with replay vblank logic found in DML1.

Reviewed-by: Charlene Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Daniel Miess 
Signed-off-by: Qingqing Zhuo 
---
 .../amd/display/dc/dml2/display_mode_core.c   | 25 ---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c 
b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
index 0d446d850313..fddd52f3f601 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
@@ -6153,6 +6153,13 @@ static void CalculateImmediateFlipBandwithSupport(
 #endif
 }
 
+static dml_uint_t MicroSecToVertLines(dml_uint_t num_us, dml_uint_t h_total, 
dml_float_t pixel_clock)
+{
+   dml_uint_t lines_time_in_ns = 1000.0 * (h_total * 1000.0) / 
(pixel_clock * 1000.0);
+
+   return dml_ceil(1000.0 * num_us / lines_time_in_ns, 1.0);
+}
+
 /// @brief Calculate the maximum vstartup for mode support and mode 
programming consideration
 /// Bounded by min of actual vblank and input vblank_nom, dont want 
vstartup/ready to start too early if actual vbllank is huge
 static dml_uint_t CalculateMaxVStartup(
@@ -6164,12 +6171,24 @@ static dml_uint_t CalculateMaxVStartup(
 {
dml_uint_t vblank_size = 0;
dml_uint_t max_vstartup_lines = 0;
+   const dml_uint_t max_allowed_vblank_nom = 1023;
 
dml_float_t line_time_us = (dml_float_t) timing->HTotal[plane_idx] / 
timing->PixelClock[plane_idx];
dml_uint_t vblank_actual = timing->VTotal[plane_idx] - 
timing->VActive[plane_idx];
-   dml_uint_t vblank_nom_default_in_line = (dml_uint_t) 
dml_floor((dml_float_t) vblank_nom_default_us/line_time_us, 1.0);
-   dml_uint_t vblank_nom_input = (dml_uint_t) 
dml_min(timing->VBlankNom[plane_idx], vblank_nom_default_in_line);
-   dml_uint_t vblank_avail = (vblank_nom_input == 0) ? 
vblank_nom_default_in_line : vblank_nom_input;
+
+   dml_uint_t vblank_nom_default_in_line = 
MicroSecToVertLines(vblank_nom_default_us, timing->HTotal[plane_idx],
+   timing->PixelClock[plane_idx]);
+   dml_uint_t vblank_nom_input = (dml_uint_t)dml_min(vblank_actual, 
vblank_nom_default_in_line);
+
+   // vblank_nom should not be smaller than (VSync (VTotal - VActive - 
VFrontPorch) + 2)
+   // + 2 is because
+   // 1 -> VStartup_start should be 1 line before VSync
+   // 1 -> always reserve 1 line between start of VBlank to VStartup signal
+   dml_uint_t vblank_nom_vsync_capped = dml_max(vblank_nom_input,
+   timing->VTotal[plane_idx] - timing->VActive[plane_idx] 
- timing->VFrontPorch[plane_idx] + 2);
+   dml_uint_t vblank_nom_max_allowed_capped = 
dml_min(vblank_nom_vsync_capped, max_allowed_vblank_nom);
+   dml_uint_t vblank_avail = (vblank_nom_max_allowed_capped == 0) ?
+   vblank_nom_default_in_line : 
vblank_nom_max_allowed_capped;
 
vblank_size = (dml_uint_t) dml_min(vblank_actual, vblank_avail);
 
-- 
2.40.1



[PATCH v3 13/16] drm/amd/display: correct dml2 input and dlg_refclk

2023-10-04 Thread Rodrigo Siqueira
From: Charlene Liu 

The dc->dml2_options.use_native_pstate_optimization flag will make the
driver use the dcn32 legacy_svp_drr related tuning. Setting this to false
fixed the stutter underflow issue. Also, based on the HW team's
suggestion, disable ODM by default and let DML choose it.

Reviewed-by: Zhan Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Qingqing Zhuo 
Signed-off-by: Charlene Liu 
---
 drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c | 7 +++
 drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c| 8 ++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index e2f3ddb3f225..2283daa45318 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -733,8 +733,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.support_eDP1_5 = true,
.enable_hpo_pg_support = false,
.enable_legacy_fast_update = true,
-   .disable_stutter = true,
-   .enable_single_display_2to1_odm_policy = true,
+   .enable_single_display_2to1_odm_policy = false,
.disable_idle_power_optimizations = true,
.dmcub_emulation = false,
.disable_boot_optimizations = false,
@@ -1835,6 +1834,7 @@ static bool dcn35_resource_construct(
 
/* Use pipe context based otg sync logic */
dc->config.use_pipe_ctx_sync_logic = true;
+   dc->config.use_default_clock_table = true;
/* read VBIOS LTTPR caps */
{
if (ctx->dc_bios->funcs->get_lttpr_caps) {
@@ -2065,11 +2065,10 @@ static bool dcn35_resource_construct(
 
dc->cap_funcs = cap_funcs;
 
-
dc->dcn_ip->max_num_dpp = pool->base.pipe_count;
 
dc->dml2_options.dcn_pipe_count = pool->base.pipe_count;
-   dc->dml2_options.use_native_pstate_optimization = false;
+   dc->dml2_options.use_native_pstate_optimization = true;
dc->dml2_options.use_native_soc_bb_construction = true;
if (dc->config.EnableMinDispClkODM)
dc->dml2_options.minimize_dispclk_using_odm = true;
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
index 552d5cffce2d..11c131f6cf26 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
@@ -67,8 +67,12 @@ static void map_hw_resources(struct dml2_context *dml2,
in_out_display_cfg->hw.DPPPerSurface[i] = 
mode_support_info->DPPPerSurface[i];
in_out_display_cfg->hw.DSCEnabled[i] = 
mode_support_info->DSCEnabled[i];
in_out_display_cfg->hw.NumberOfDSCSlices[i] = 
mode_support_info->NumberOfDSCSlices[i];
-   in_out_display_cfg->hw.DLGRefClkFreqMHz = 50;
-
+   in_out_display_cfg->hw.DLGRefClkFreqMHz = 24;
+   if (dml2->v20.dml_core_ctx.project != dml_project_dcn35 &&
+   dml2->v20.dml_core_ctx.project != dml_project_dcn351) {
+   /*dGPU default as 50Mhz*/
+   in_out_display_cfg->hw.DLGRefClkFreqMHz = 50;
+   }
for (j = 0; j < mode_support_info->DPPPerSurface[i]; j++) {

dml2->v20.scratch.dml_to_dc_pipe_mapping.dml_pipe_idx_to_stream_id[num_pipes] = 
dml2->v20.scratch.dml_to_dc_pipe_mapping.disp_cfg_to_stream_id[i];

dml2->v20.scratch.dml_to_dc_pipe_mapping.dml_pipe_idx_to_stream_id_valid[num_pipes]
 = true;
-- 
2.40.1



[PATCH v3 14/16] drm/amd/display: Modify Pipe Selection for Policy for ODM

2023-10-04 Thread Rodrigo Siqueira
From: Saaem Rizvi 

[Why]
There are certain cases during a transition to ODM that might cause
corruption on the display. This occurs when we choose certain pipes in a
particular state.

[How]
We will now store the pipe indexes of any pipes that might be
problematic to switch to during an ODM transition, and only use them as
a last resort.

Reviewed-by: Dmytro Laktyushkin 
Acked-by: Qingqing Zhuo 
Signed-off-by: Qingqing Zhuo 
Signed-off-by: Saaem Rizvi 
---
 .../display/dc/dml2/dml2_dc_resource_mgmt.c   | 140 --
 1 file changed, 126 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
index e22b5106df8f..36baf35bb170 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
@@ -213,6 +213,82 @@ static bool is_pipe_free(const struct pipe_ctx *pipe)
return false;
 }
 
+static unsigned int find_preferred_pipe_candidates(const struct dc_state 
*existing_state,
+   const int pipe_count,
+   const unsigned int stream_id,
+   unsigned int *preferred_pipe_candidates)
+{
+   unsigned int num_preferred_candidates = 0;
+   int i;
+
+   /* There is only one case which we consider for adding a pipe to the 
preferred
+* pipe candidate array:
+*
+* 1. If the existing stream id of the pipe is equivalent to the stream 
id
+* of the stream we are trying to achieve MPC/ODM combine for. This 
allows
+* us to minimize the changes in pipe topology during the transition.
+*
+* However this condition comes with a caveat. We need to ignore pipes 
that will
+* require a change in OPP but still have the same stream id. For 
example during
+* an MPC to ODM transition.
+*/
+   if (existing_state) {
+   for (i = 0; i < pipe_count; i++) {
+   if (existing_state->res_ctx.pipe_ctx[i].stream && 
existing_state->res_ctx.pipe_ctx[i].stream->stream_id == stream_id) {
+   if 
(existing_state->res_ctx.pipe_ctx[i].plane_res.hubp &&
+   
existing_state->res_ctx.pipe_ctx[i].plane_res.hubp->opp_id != i)
+   continue;
+
+   
preferred_pipe_candidates[num_preferred_candidates++] = i;
+   }
+   }
+   }
+
+   return num_preferred_candidates;
+}
+
+static unsigned int find_last_resort_pipe_candidates(const struct dc_state 
*existing_state,
+   const int pipe_count,
+   const unsigned int stream_id,
+   unsigned int *last_resort_pipe_candidates)
+{
+   unsigned int num_last_resort_candidates = 0;
+   int i;
+
+   /* There are two cases where we would like to add a given pipe into the 
last
+* candidate array:
+*
+* 1. If the pipe requires a change in OPP, for example during an MPC
+* to ODM transition.
+*
+* 2. If the pipe already has an enabled OTG.
+*/
+   if (existing_state) {
+   for (i  = 0; i < pipe_count; i++) {
+   if ((existing_state->res_ctx.pipe_ctx[i].plane_res.hubp 
&&
+   
existing_state->res_ctx.pipe_ctx[i].plane_res.hubp->opp_id != i) ||
+   
existing_state->res_ctx.pipe_ctx[i].stream_res.tg)
+   
last_resort_pipe_candidates[num_last_resort_candidates++] = i;
+   }
+   }
+
+   return num_last_resort_candidates;
+}
+
+static bool is_pipe_in_candidate_array(const unsigned int pipe_idx,
+   const unsigned int *candidate_array,
+   const unsigned int candidate_array_size)
+{
+   int i;
+
+   for (i = 0; i < candidate_array_size; i++) {
+   if (candidate_array[i] == pipe_idx)
+   return true;
+   }
+
+   return false;
+}
+
 static bool find_more_pipes_for_stream(struct dml2_context *ctx,
struct dc_state *state, // The state we want to find a free mapping in
unsigned int stream_id, // The stream we want this pipe to drive
@@ -222,16 +298,18 @@ static bool find_more_pipes_for_stream(struct 
dml2_context *ctx,
const struct dc_state *existing_state) // The state (optional) that we 
want to minimize remapping relative to
 {
struct pipe_ctx *pipe = NULL;
-   unsigned int preferred_pipe_candidates[MAX_PIPES];
+   unsigned int preferred_pipe_candidates[MAX_PIPES] = {0};
+   unsigned int last_resort_pipe_candidates[MAX_PIPES] = {0};
unsigned int num_preferred_candidates = 0;
+   unsigned int num_last_resort_candidates = 0;
int i;
 
if (existing_state) {
-   // To minimize prioritize candidates from existing stream
-   for (i = 0; i < ctx->config.dcn_pipe_count; i++) {

[PATCH v3 10/16] drm/amd/display: Split pipe for stereo timings

2023-10-04 Thread Rodrigo Siqueira
From: Taimur Hassan 

[Why & How]
DML2 did not carry over the DML1 logic that splits pipes for stereo timings. Pipe
splitting is needed in this case to pass stereo tests.

Reviewed-by: Charlene Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Qingqing Zhuo 
Signed-off-by: Taimur Hassan 
---
 .../drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c   | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
index 8da145fd4d7b..116b78a5107c 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_dc_resource_mgmt.c
@@ -708,6 +708,17 @@ bool dml2_map_dc_pipes(struct dml2_context *ctx, struct 
dc_state *state, const s
// If ODM combine is not inuse, then 
the number of pipes
// per plane is determined by MPC 
combine factor
scratch.mpc_info.mpc_factor = 
DPPPerSurface[plane_disp_cfg_index];
+
+   //For stereo timings, we need to pipe 
split
+   if 
((state->streams[stream_index]->view_format ==
+   
VIEW_3D_FORMAT_SIDE_BY_SIDE ||
+   
state->streams[stream_index]->view_format ==
+   
VIEW_3D_FORMAT_TOP_AND_BOTTOM) &&
+   
(state->streams[stream_index]->timing.timing_3d_format ==
+   
TIMING_3D_FORMAT_TOP_AND_BOTTOM ||
+   
state->streams[stream_index]->timing.timing_3d_format ==
+   
TIMING_3D_FORMAT_SIDE_BY_SIDE))
+   scratch.mpc_info.mpc_factor = 2;
} else {
// If ODM combine is enabled, then we 
use at most 1 pipe per
// odm slice per plane, i.e. MPC 
combine is never used
-- 
2.40.1
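
The condition added above only forces a pipe split when both the stream
view format and the 3D timing format are side-by-side or top-and-bottom.
A compact way to picture the check, with placeholder enums rather than the
dc_stream_state fields:

#include <stdbool.h>
#include <stdio.h>

enum view_fmt  { VIEW_NONE, VIEW_SBS, VIEW_TAB };
enum timing_3d { T3D_NONE, T3D_SBS, T3D_TAB };

/* Pipe split (MPC factor 2) is only needed when both the view format and
 * the 3D timing are side-by-side or top-and-bottom. */
static bool needs_stereo_pipe_split(enum view_fmt v, enum timing_3d t)
{
	bool stereo_view   = (v == VIEW_SBS || v == VIEW_TAB);
	bool stereo_timing = (t == T3D_SBS || t == T3D_TAB);

	return stereo_view && stereo_timing;
}

int main(void)
{
	int mpc = needs_stereo_pipe_split(VIEW_SBS, T3D_SBS) ? 2 : 1;

	printf("mpc_factor = %d\n", mpc);
	return 0;
}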



[PATCH v3 12/16] drm/amd/display: Fix Chroma Surface height/width initialization

2023-10-04 Thread Rodrigo Siqueira
From: Sung Joon Kim 

[why]
Surface height/width for Chroma has its own variable, chroma_size, that it
should be initialized from. Fixing this will help pass DML2.0 validation
for YCbCr420 tests, DCHB006.109,129, DCHB014.011,012.

[how]
Initialize SurfaceHeightC/WidthC from chroma_size.height/width

Reviewed-by: Charlene Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Sung Joon Kim 
---
 drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
index 2dd8eedfc17d..e5ccd2887c94 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
@@ -721,8 +721,8 @@ static void populate_dml_surface_cfg_from_plane_state(enum 
dml_project_id dml2_p
out->PitchY[location] = in->plane_size.surface_pitch;
out->SurfaceHeightY[location] = in->plane_size.surface_size.height;
out->SurfaceWidthY[location] = in->plane_size.surface_size.width;
-   out->SurfaceHeightC[location] = in->plane_size.surface_size.height;
-   out->SurfaceWidthC[location] = in->plane_size.surface_size.width;
+   out->SurfaceHeightC[location] = in->plane_size.chroma_size.height;
+   out->SurfaceWidthC[location] = in->plane_size.chroma_size.width;
out->PitchC[location] = in->plane_size.chroma_pitch;
out->DCCEnable[location] = in->dcc.enable;
out->DCCMetaPitchY[location] = in->dcc.meta_pitch;
-- 
2.40.1
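
For subsampled formats such as YCbCr 4:2:0 the chroma plane is half the
luma plane in each dimension, which is why reusing the luma surface size
for SurfaceHeightC/WidthC over-reports the chroma plane. A rough sketch of
how the two plane sizes typically relate (illustrative struct, not dc's
plane_size):

#include <stdio.h>

struct plane_size {
	unsigned int luma_w, luma_h;
	unsigned int chroma_w, chroma_h;
};

/* For YCbCr 4:2:0, chroma is subsampled by two both horizontally and
 * vertically; round up for odd luma dimensions. */
static void fill_420_chroma(struct plane_size *p)
{
	p->chroma_w = (p->luma_w + 1) / 2;
	p->chroma_h = (p->luma_h + 1) / 2;
}

int main(void)
{
	struct plane_size p = { .luma_w = 3840, .luma_h = 2160 };

	fill_420_chroma(&p);
	printf("luma %ux%u -> chroma %ux%u\n",
	       p.luma_w, p.luma_h, p.chroma_w, p.chroma_h);
	return 0;
}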



[PATCH v3 09/16] drm/amd/display: Use fixed DET Buffer Size

2023-10-04 Thread Rodrigo Siqueira
From: Sung Joon Kim 

[why]
Regression from DML1.0 where we used different DET buffer sizes for each
pipe. From the spec, we need to use a DET buffer size of 384 KB for each
pipe.

[how]
Ensure a 384 KB DET buffer size is used for each available pipe.

Reviewed-by: Charlene Liu 
Acked-by: Qingqing Zhuo 
Signed-off-by: Sung Joon Kim 
---
 .../drm/amd/display/dc/dcn35/dcn35_resource.c |  3 ++-
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.c  | 22 +++
 .../drm/amd/display/dc/dml2/dml2_wrapper.h|  1 +
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index 2fa876d9e1f7..e2f3ddb3f225 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -2078,7 +2078,8 @@ static bool dcn35_resource_construct(
dc->dml2_options.callbacks.build_scaling_params = 
&resource_build_scaling_params;

dc->dml2_options.callbacks.can_support_mclk_switch_using_fw_based_vblank_stretch
 = &dcn30_can_support_mclk_switch_using_fw_based_vblank_stretch;
dc->dml2_options.callbacks.acquire_secondary_pipe_for_mpc_odm = 
&dc_resource_acquire_secondary_pipe_for_mpc_odm_legacy;
-   dc->dml2_options.max_segments_per_hubp = 18;
+   dc->dml2_options.max_segments_per_hubp = 24;
+
dc->dml2_options.det_segment_size = DCN3_2_DET_SEG_SIZE;/*todo*/
 
if (dc->config.sdpif_request_limit_words_per_umc == 0)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
index 946a98af0020..4c3661fbecbc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
@@ -414,17 +414,21 @@ void dml2_apply_det_buffer_allocation_policy(struct 
dml2_context *in_ctx, struct
 
for (plane_index = 0; plane_index < dml_dispcfg->num_surfaces; 
plane_index++) {
 
-   dml_dispcfg->plane.DETSizeOverride[plane_index] = 
((max_det_size / num_of_streams) / num_of_planes_per_stream[stream_index] / 
in_ctx->det_helper_scratch.dpps_per_surface[plane_index]);
+   if (in_ctx->config.override_det_buffer_size_kbytes)
+   dml_dispcfg->plane.DETSizeOverride[plane_index] = 
max_det_size / in_ctx->config.dcn_pipe_count;
+   else {
+   dml_dispcfg->plane.DETSizeOverride[plane_index] = 
((max_det_size / num_of_streams) / num_of_planes_per_stream[stream_index] / 
in_ctx->det_helper_scratch.dpps_per_surface[plane_index]);
+
+   /* If the override size is not divisible by 
det_segment_size then round off to nearest number divisible by det_segment_size 
as
+   * this is a requirement.
+   */
+   if (dml_dispcfg->plane.DETSizeOverride[plane_index] % 
in_ctx->config.det_segment_size != 0) {
+   dml_dispcfg->plane.DETSizeOverride[plane_index] 
= dml_dispcfg->plane.DETSizeOverride[plane_index] & ~0x3F;
+   }
 
-   /* If the override size is not divisible by det_segment_size 
then round off to nearest number divisible by det_segment_size as
-* this is a requirement.
-*/
-   if (dml_dispcfg->plane.DETSizeOverride[plane_index] % 
in_ctx->config.det_segment_size != 0) {
-   dml_dispcfg->plane.DETSizeOverride[plane_index] = 
dml_dispcfg->plane.DETSizeOverride[plane_index] & ~0x3F;
+   if (plane_index + 1 < dml_dispcfg->num_surfaces && 
dml_dispcfg->plane.BlendingAndTiming[plane_index] != 
dml_dispcfg->plane.BlendingAndTiming[plane_index + 1])
+   stream_index++;
}
-
-   if (plane_index + 1 < dml_dispcfg->num_surfaces && 
dml_dispcfg->plane.BlendingAndTiming[plane_index] != 
dml_dispcfg->plane.BlendingAndTiming[plane_index + 1])
-   stream_index++;
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
index 4d0377354bdd..f3b85b0891d3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
@@ -137,6 +137,7 @@ struct dml2_configuration_options {
bool skip_hw_state_mapping;
bool optimize_odm_4to1;
bool minimize_dispclk_using_odm;
+   bool override_det_buffer_size_kbytes;
struct dml2_dc_callbacks callbacks;
struct {
bool force_disable_subvp;
-- 
2.40.1
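
The override path above splits the maximum DET space evenly across the
configured pipe count, while the non-override path divides by streams,
planes and DPPs and then rounds down to a multiple of the segment size
(the & ~0x3F mask corresponds to a 64 KB segment). A standalone sketch of
that arithmetic with made-up numbers:

#include <stdio.h>

/* Round a per-plane DET allocation down to a multiple of the segment size. */
static unsigned int det_override_kb(unsigned int max_det_kb,
				    unsigned int num_streams,
				    unsigned int planes_per_stream,
				    unsigned int dpp_per_surface,
				    unsigned int segment_kb)
{
	unsigned int kb = max_det_kb / num_streams / planes_per_stream /
			  dpp_per_surface;

	/* same effect as the & ~0x3F mask when segment_kb is 64 */
	return kb - (kb % segment_kb);
}

int main(void)
{
	/* e.g. 1536 KB of DET, 2 streams, 1 plane each, 2 DPPs, 64 KB segments */
	printf("override = %u KB\n", det_override_kb(1536, 2, 1, 2, 64));
	return 0;
}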



[PATCH v3 07/16] drm/amd/display: Add z8_marks in dml

2023-10-04 Thread Rodrigo Siqueira
From: Charlene Liu 

Add z8 watermarks to struct for later ASIC use.

Reviewed-by: Alvin Lee 
Acked-by: Qingqing Zhuo 
Signed-off-by: Charlene Liu 
---
 drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c | 2 ++
 drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.h | 2 ++
 drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c| 2 ++
 drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c  | 2 ++
 drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h  | 1 +
 5 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c 
b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
index e65e86c84745..0d446d850313 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
@@ -10169,6 +10169,8 @@ dml_get_var_func(wm_memory_trip, dml_float_t, 
mode_lib->mp.UrgentLatency);
 dml_get_var_func(wm_fclk_change, dml_float_t, 
mode_lib->mp.Watermark.FCLKChangeWatermark);
 dml_get_var_func(wm_usr_retraining, dml_float_t, 
mode_lib->mp.Watermark.USRRetrainingWatermark);
 dml_get_var_func(wm_dram_clock_change, dml_float_t, 
mode_lib->mp.Watermark.DRAMClockChangeWatermark);
+dml_get_var_func(wm_z8_stutter_enter_exit, dml_float_t, 
mode_lib->mp.Watermark.Z8StutterEnterPlusExitWatermark);
+dml_get_var_func(wm_z8_stutter, dml_float_t, 
mode_lib->mp.Watermark.Z8StutterExitWatermark);
 dml_get_var_func(fraction_of_urgent_bandwidth, dml_float_t, 
mode_lib->mp.FractionOfUrgentBandwidth);
 dml_get_var_func(fraction_of_urgent_bandwidth_imm_flip, dml_float_t, 
mode_lib->mp.FractionOfUrgentBandwidthImmediateFlip);
 dml_get_var_func(urgent_latency, dml_float_t, mode_lib->mp.UrgentLatency);
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.h 
b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.h
index 2a0545801f77..8452485684f5 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.h
+++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.h
@@ -85,6 +85,8 @@ dml_get_var_decl(wm_stutter_exit, dml_float_t);
 dml_get_var_decl(wm_stutter_enter_exit, dml_float_t);
 dml_get_var_decl(wm_memory_trip, dml_float_t);
 dml_get_var_decl(wm_dram_clock_change, dml_float_t);
+dml_get_var_decl(wm_z8_stutter_enter_exit, dml_float_t);
+dml_get_var_decl(wm_z8_stutter, dml_float_t);
 dml_get_var_decl(urgent_latency, dml_float_t);
 dml_get_var_decl(clk_dcf_deepsleep, dml_float_t);
 dml_get_var_decl(wm_fclk_change, dml_float_t);
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
index 5bd695628ce8..da18c4b8c257 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
@@ -369,6 +369,8 @@ void dml2_extract_watermark_set(struct dcn_watermarks 
*watermark, struct display
watermark->urgent_latency_ns = dml_get_urgent_latency(dml_core_ctx) * 
1000;
watermark->cstate_pstate.fclk_pstate_change_ns = 
dml_get_wm_fclk_change(dml_core_ctx) * 1000;
watermark->usr_retraining_ns = dml_get_wm_usr_retraining(dml_core_ctx) 
* 1000;
+   watermark->cstate_pstate.cstate_enter_plus_exit_z8_ns = 
dml_get_wm_z8_stutter_enter_exit(dml_core_ctx) * 1000;
+   watermark->cstate_pstate.cstate_exit_z8_ns = 
dml_get_wm_z8_stutter(dml_core_ctx) * 1000;
 }
 
 void dml2_initialize_det_scratch(struct dml2_context *in_ctx)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
index e4f2f3eb9b32..552d5cffce2d 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c
@@ -717,6 +717,8 @@ bool dml2_create(const struct dc *in_dc, const struct 
dml2_configuration_options
 
initialize_dml2_soc_states(*dml2, in_dc, 
&(*dml2)->v20.dml_core_ctx.soc, &(*dml2)->v20.dml_core_ctx.states);
 
+   /*Initialize DML20 instance which calls dml2_core_create, and 
core_dcn3_populate_informative*/
+   //dml2_initialize_instance(&(*dml_ctx)->v20.dml_init);
return true;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
index e76726018927..4d0377354bdd 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.h
@@ -63,6 +63,7 @@ struct dml2_dcn_clocks {
unsigned int ref_dtbclk_khz;
bool p_state_supported;
unsigned int cab_num_ways_required;
+   unsigned int dcfclk_khz_ds;
 };
 
 struct dml2_dc_callbacks {
-- 
2.40.1
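
The dml2_extract_watermark_set() hunk follows the existing pattern in that
function: the mode-library watermarks appear to be reported in microseconds
and are stored in nanoseconds, hence the * 1000. A trivial illustration of
that unit handling (the value is hypothetical):

#include <stdio.h>

/* DML reports watermarks in microseconds; dcn_watermarks stores nanoseconds. */
static unsigned int us_to_ns(double us)
{
	return (unsigned int)(us * 1000.0);
}

int main(void)
{
	double z8_enter_exit_us = 210.0;   /* hypothetical mode-library value */

	printf("cstate_enter_plus_exit_z8_ns = %u\n", us_to_ns(z8_enter_exit_us));
	return 0;
}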



[PATCH v3 04/16] drm/amd/display: Move dml code under CONFIG_DRM_AMD_DC_FP guard

2023-10-04 Thread Rodrigo Siqueira
For some reason, the dml code is not guarded under CONFIG_DRM_AMD_DC_FP
in the Makefile. This commit moves the dml code under the DC_FP guard.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
b/drivers/gpu/drm/amd/display/dc/Makefile
index 2f3d9602b7a0..dafa34bc2782 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -22,7 +22,7 @@
 #
 # Makefile for Display Core (dc) component.
 
-DC_LIBS = basics bios dml clk_mgr dce gpio irq link virtual dsc
+DC_LIBS = basics bios clk_mgr dce gpio irq link virtual dsc
 
 ifdef CONFIG_DRM_AMD_DC_FP
 
@@ -43,6 +43,7 @@ DC_LIBS += dcn316
 DC_LIBS += dcn32
 DC_LIBS += dcn321
 DC_LIBS += dcn35
+DC_LIBS += dml
 endif
 
 DC_LIBS += dce120
-- 
2.40.1



[PATCH v3 06/16] drm/amd/display: Add DCN35 DML2 support

2023-10-04 Thread Rodrigo Siqueira
From: Qingqing Zhuo 

Enable DML2 for DCN35.

Changes since V1:
- Remove hard coded values

Acked-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Roman Li 
Signed-off-by: Qingqing Zhuo 
---
 .../drm/amd/display/dc/dcn35/dcn35_resource.c | 13 +--
 .../dc/dml2/display_mode_core_structs.h   |  2 +
 .../gpu/drm/amd/display/dc/dml2/dml2_policy.c |  7 ++
 .../display/dc/dml2/dml2_translation_helper.c | 89 ++-
 .../drm/amd/display/dc/dml2/dml2_wrapper.c|  6 ++
 5 files changed, 110 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index 693c7ba4b34d..2fa876d9e1f7 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -31,7 +31,7 @@
 #include "resource.h"
 #include "include/irq_service_interface.h"
 #include "dcn35_resource.h"
-/*#include "dml2/dml2_wrapper.h"*/
+#include "dml2/dml2_wrapper.h"
 
 #include "dcn20/dcn20_resource.h"
 #include "dcn30/dcn30_resource.h"
@@ -729,7 +729,7 @@ static const struct dc_debug_options debug_defaults_drv = {
},
.seamless_boot_odm_combine = DML_FAIL_SOURCE_PIXEL_FORMAT,
.enable_z9_disable_interface = true, /* Allow support for the PMFW 
interface for disable Z9*/
-   /* .using_dml2 = true, */
+   .using_dml2 = true,
.support_eDP1_5 = true,
.enable_hpo_pg_support = false,
.enable_legacy_fast_update = true,
@@ -1694,7 +1694,7 @@ static bool dcn35_validate_bandwidth(struct dc *dc,
 {
bool out = false;
 
-   /*out = dml2_validate(dc, context, fast_validate);*/
+   out = dml2_validate(dc, context, fast_validate);
 
return out;
 }
@@ -2067,18 +2067,19 @@ static bool dcn35_resource_construct(
 
 
dc->dcn_ip->max_num_dpp = pool->base.pipe_count;
-#if 0
+
dc->dml2_options.dcn_pipe_count = pool->base.pipe_count;
dc->dml2_options.use_native_pstate_optimization = false;
dc->dml2_options.use_native_soc_bb_construction = true;
+   if (dc->config.EnableMinDispClkODM)
+   dc->dml2_options.minimize_dispclk_using_odm = true;
 
dc->dml2_options.callbacks.dc = dc;
dc->dml2_options.callbacks.build_scaling_params = 
&resource_build_scaling_params;

dc->dml2_options.callbacks.can_support_mclk_switch_using_fw_based_vblank_stretch
 = &dcn30_can_support_mclk_switch_using_fw_based_vblank_stretch;
-   dc->dml2_options.callbacks.acquire_secondary_pipe_for_mpc_odm = 
&dc_resource_acquire_secondary_pipe_for_mpc_odm;
+   dc->dml2_options.callbacks.acquire_secondary_pipe_for_mpc_odm = 
&dc_resource_acquire_secondary_pipe_for_mpc_odm_legacy;
dc->dml2_options.max_segments_per_hubp = 18;
dc->dml2_options.det_segment_size = DCN3_2_DET_SEG_SIZE;/*todo*/
-#endif
 
if (dc->config.sdpif_request_limit_words_per_umc == 0)
dc->config.sdpif_request_limit_words_per_umc = 16;/*todo*/
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h 
b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h
index d2e1510a504f..c2fa28ff57ab 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h
+++ b/drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h
@@ -32,6 +32,8 @@ enum dml_project_id {
dml_project_default = 1,
dml_project_dcn32 = dml_project_default,
dml_project_dcn321 = 2,
+   dml_project_dcn35 = 3,
+   dml_project_dcn351 = 4,
 };
 enum dml_prefetch_modes {
dml_prefetch_support_uclk_fclk_and_stutter_if_possible = 0,
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_policy.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_policy.c
index 0c71a8aa5587..f8e9aa32ceab 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_policy.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_policy.c
@@ -298,4 +298,11 @@ void build_unoptimized_policy_settings(enum dml_project_id 
project, struct dml_m
policy->SynchronizeDRRDisplaysForUCLKPStateChangeFinal = true;
policy->AssumeModeSupportAtMaxPwrStateEvenDRAMClockChangeNotSupported = 
true; // TOREVIEW: What does this mean?
policy->AssumeModeSupportAtMaxPwrStateEvenFClockChangeNotSupported = 
true; // TOREVIEW: What does this mean?
+   if (project == dml_project_dcn35 ||
+   project == dml_project_dcn351) {
+   policy->DCCProgrammingAssumesScanDirectionUnknownFinal = false;
+   policy->EnhancedPrefetchScheduleAccelerationFinal = 0;
+   policy->AllowForPStateChangeOrStutterInVBlankFinal = 
dml_prefetch_support_uclk_fclk_and_stutter_if_possible; /*new*/
+   policy->UseOnlyMaxPrefetchModes = 1;
+   }
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
index 353f22a6b7f8..de58c7b867e8 100644
--- a/drivers/gpu/drm/am

[PATCH v3 01/16] drm/amd/display: Move dce_calcs from DML folder

2023-10-04 Thread Rodrigo Siqueira
dce_calcs does not have FPU operations, and it is required for both DCE and
DCN. Remove this file from the DML folder and add it to the basics folder,
which is visible to both DCE and DCN.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/basics/Makefile | 7 ++-
 .../amd/display/dc/{dml/calcs => basics}/calcs_logger.h| 0
 .../drm/amd/display/dc/{dml/calcs => basics}/dce_calcs.c   | 0
 drivers/gpu/drm/amd/display/dc/dml/Makefile| 2 +-
 4 files changed, 7 insertions(+), 2 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/calcs_logger.h 
(100%)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/dce_calcs.c (100%)

diff --git a/drivers/gpu/drm/amd/display/dc/basics/Makefile 
b/drivers/gpu/drm/amd/display/dc/basics/Makefile
index 01b99e0d788e..ee611b03dc48 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/basics/Makefile
@@ -24,7 +24,12 @@
 # It provides the general basic services required by other DAL
 # subcomponents.
 
-BASICS = conversion.o fixpt31_32.o vector.o dc_common.o
+BASICS := \
+   conversion.o \
+   fixpt31_32.o \
+   vector.o \
+   dc_common.o \
+   dce_calcs.o
 
 AMD_DAL_BASICS = $(addprefix $(AMDDALPATH)/dc/basics/,$(BASICS))
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/calcs_logger.h 
b/drivers/gpu/drm/amd/display/dc/basics/calcs_logger.h
similarity index 100%
rename from drivers/gpu/drm/amd/display/dc/dml/calcs/calcs_logger.h
rename to drivers/gpu/drm/amd/display/dc/basics/calcs_logger.h
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dce_calcs.c 
b/drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
similarity index 100%
rename from drivers/gpu/drm/amd/display/dc/dml/calcs/dce_calcs.c
rename to drivers/gpu/drm/amd/display/dc/basics/dce_calcs.c
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index b06c3983af36..8621dfe9a68c 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -134,7 +134,7 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
$(dml_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := $(dml_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := $(dml_rcflags)
 
-DML = calcs/dce_calcs.o calcs/custom_float.o calcs/bw_fixed.o
+DML = calcs/custom_float.o calcs/bw_fixed.o
 
 ifdef CONFIG_DRM_AMD_DC_FP
 DML += display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o
-- 
2.40.1



[PATCH v3 02/16] drm/amd/display: Move custom_float outside DML

2023-10-04 Thread Rodrigo Siqueira
The custom_float file does not have any FPU operation, so it does not need
to be inside DML. This commit moves the file to the basics folder.

Signed-off-by: Rodrigo Siqueira 
---
 .../gpu/drm/amd/display/dc/basics/Makefile|  3 +-
 .../dc/{dml/calcs => basics}/custom_float.c   | 90 +++
 drivers/gpu/drm/amd/display/dc/dml/Makefile   |  2 +-
 3 files changed, 36 insertions(+), 59 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/custom_float.c 
(66%)

diff --git a/drivers/gpu/drm/amd/display/dc/basics/Makefile 
b/drivers/gpu/drm/amd/display/dc/basics/Makefile
index ee611b03dc48..65d713aff407 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/basics/Makefile
@@ -29,7 +29,8 @@ BASICS := \
fixpt31_32.o \
vector.o \
dc_common.o \
-   dce_calcs.o
+   dce_calcs.o \
+   custom_float.o
 
 AMD_DAL_BASICS = $(addprefix $(AMDDALPATH)/dc/basics/,$(BASICS))
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/custom_float.c 
b/drivers/gpu/drm/amd/display/dc/basics/custom_float.c
similarity index 66%
rename from drivers/gpu/drm/amd/display/dc/dml/calcs/custom_float.c
rename to drivers/gpu/drm/amd/display/dc/basics/custom_float.c
index 31d167bc548f..ae05ded9a7f3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/calcs/custom_float.c
+++ b/drivers/gpu/drm/amd/display/dc/basics/custom_float.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: MIT
 /*
- * Copyright 2017 Advanced Micro Devices, Inc.
+ * Copyright 2023 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -25,52 +26,41 @@
 #include "dm_services.h"
 #include "custom_float.h"
 
-
-static bool build_custom_float(
-   struct fixed31_32 value,
-   const struct custom_float_format *format,
-   bool *negative,
-   uint32_t *mantissa,
-   uint32_t *exponenta)
+static bool build_custom_float(struct fixed31_32 value,
+  const struct custom_float_format *format,
+  bool *negative,
+  uint32_t *mantissa,
+  uint32_t *exponenta)
 {
uint32_t exp_offset = (1 << (format->exponenta_bits - 1)) - 1;
 
const struct fixed31_32 mantissa_constant_plus_max_fraction =
-   dc_fixpt_from_fraction(
-   (1LL << (format->mantissa_bits + 1)) - 1,
-   1LL << format->mantissa_bits);
+   dc_fixpt_from_fraction((1LL << (format->mantissa_bits + 1)) - 1,
+  1LL << format->mantissa_bits);
 
struct fixed31_32 mantiss;
 
-   if (dc_fixpt_eq(
-   value,
-   dc_fixpt_zero)) {
+   if (dc_fixpt_eq(value, dc_fixpt_zero)) {
*negative = false;
*mantissa = 0;
*exponenta = 0;
return true;
}
 
-   if (dc_fixpt_lt(
-   value,
-   dc_fixpt_zero)) {
+   if (dc_fixpt_lt(value, dc_fixpt_zero)) {
*negative = format->sign;
value = dc_fixpt_neg(value);
} else {
*negative = false;
}
 
-   if (dc_fixpt_lt(
-   value,
-   dc_fixpt_one)) {
+   if (dc_fixpt_lt(value, dc_fixpt_one)) {
uint32_t i = 1;
 
do {
value = dc_fixpt_shl(value, 1);
++i;
-   } while (dc_fixpt_lt(
-   value,
-   dc_fixpt_one));
+   } while (dc_fixpt_lt(value, dc_fixpt_one));
 
--i;
 
@@ -81,54 +71,40 @@ static bool build_custom_float(
}
 
*exponenta = exp_offset - i;
-   } else if (dc_fixpt_le(
-   mantissa_constant_plus_max_fraction,
-   value)) {
+   } else if (dc_fixpt_le(mantissa_constant_plus_max_fraction, value)) {
uint32_t i = 1;
 
do {
value = dc_fixpt_shr(value, 1);
++i;
-   } while (dc_fixpt_lt(
-   mantissa_constant_plus_max_fraction,
-   value));
+   } while (dc_fixpt_lt(mantissa_constant_plus_max_fraction, 
value));
 
*exponenta = exp_offset + i - 1;
} else {
*exponenta = exp_offset;
}
 
-   mantiss = dc_fixpt_sub(
-   value,
-   dc_fixpt_one);
+   mantiss = dc_fixpt_sub(value, dc_fixpt_one);
 
-   if (dc_fixpt_lt(
-   mantiss,
-   dc_fixpt_zero) ||
-   dc_fixpt_lt(
-   dc_fixpt_one,
-   mantiss))
+   if (dc_fixpt_lt(mantiss, dc_fixpt_zero) ||
+   dc_fixpt_lt(dc_fixpt_one, m
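
For reference, the exponent bias used by build_custom_float() is the usual
2^(exponent_bits - 1) - 1, and values are normalized into [1, 2) before the
mantissa is extracted. A tiny userspace sketch of the same idea using plain
doubles instead of the driver's fixed31_32 type (positive inputs only,
rounding simplified):

#include <math.h>
#include <stdint.h>
#include <stdio.h>

static void encode_custom_float(double value, int mantissa_bits,
				int exponent_bits,
				uint32_t *mantissa, uint32_t *exponent)
{
	uint32_t bias = (1u << (exponent_bits - 1)) - 1;
	int exp;
	double frac = frexp(value, &exp);  /* value = frac * 2^exp, frac in [0.5, 1) */

	*exponent = (uint32_t)(exp - 1 + bias);                 /* normalize to [1, 2) */
	*mantissa = (uint32_t)((frac * 2.0 - 1.0) * (1u << mantissa_bits));
}

int main(void)
{
	uint32_t m, e;

	encode_custom_float(0.75, 12, 6, &m, &e);
	printf("mantissa=0x%x exponent=%u\n", (unsigned)m, (unsigned)e);
	return 0;
}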

[PATCH v3 03/16] drm/amd/display: Move bw_fixed outside DML folder

2023-10-04 Thread Rodrigo Siqueira
bw_fixed does not need any FPU operation, and it is used by both DCE and DCN.
For this reason, this commit moves bw_fixed to the basics folder outside
DML.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/basics/Makefile  |  3 ++-
 .../amd/display/dc/{dml/calcs => basics}/bw_fixed.c | 13 ++---
 drivers/gpu/drm/amd/display/dc/dml/Makefile |  2 --
 3 files changed, 8 insertions(+), 10 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/bw_fixed.c (94%)

diff --git a/drivers/gpu/drm/amd/display/dc/basics/Makefile 
b/drivers/gpu/drm/amd/display/dc/basics/Makefile
index 65d713aff407..aabcebf69049 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/basics/Makefile
@@ -30,7 +30,8 @@ BASICS := \
vector.o \
dc_common.o \
dce_calcs.o \
-   custom_float.o
+   custom_float.o \
+   bw_fixed.o
 
 AMD_DAL_BASICS = $(addprefix $(AMDDALPATH)/dc/basics/,$(BASICS))
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c 
b/drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
similarity index 94%
rename from drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c
rename to drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
index 3aa8dd0acd5e..c8cb89e0d4d0 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/calcs/bw_fixed.c
+++ b/drivers/gpu/drm/amd/display/dc/basics/bw_fixed.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: MIT
 /*
- * Copyright 2015 Advanced Micro Devices, Inc.
+ * Copyright 2023 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -106,9 +107,8 @@ struct bw_fixed bw_frc_to_fixed(int64_t numerator, int64_t 
denominator)
return res;
 }
 
-struct bw_fixed bw_floor2(
-   const struct bw_fixed arg,
-   const struct bw_fixed significance)
+struct bw_fixed bw_floor2(const struct bw_fixed arg,
+ const struct bw_fixed significance)
 {
struct bw_fixed result;
int64_t multiplicand;
@@ -119,9 +119,8 @@ struct bw_fixed bw_floor2(
return result;
 }
 
-struct bw_fixed bw_ceil2(
-   const struct bw_fixed arg,
-   const struct bw_fixed significance)
+struct bw_fixed bw_ceil2(const struct bw_fixed arg,
+const struct bw_fixed significance)
 {
struct bw_fixed result;
int64_t multiplicand;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 2fe8588a070a..ea7d60f9a9b4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -134,8 +134,6 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
$(dml_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := $(dml_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := $(dml_rcflags)
 
-DML = calcs/bw_fixed.o
-
 ifdef CONFIG_DRM_AMD_DC_FP
 DML += display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o
 DML += dcn10/dcn10_fpu.o
-- 
2.40.1
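
bw_floor2() and bw_ceil2() round a fixed-point value down or up to the
nearest multiple of "significance". The same operation on a raw 64-bit
fixed-point integer, just to make the intent concrete (positive values
only; the fractional-bit count is arbitrary here):

#include <stdint.h>
#include <stdio.h>

static int64_t fixed_floor2(int64_t arg, int64_t significance)
{
	return (arg / significance) * significance;
}

static int64_t fixed_ceil2(int64_t arg, int64_t significance)
{
	return ((arg + significance - 1) / significance) * significance;
}

int main(void)
{
	/* With 8 fractional bits: round 2.75 to a multiple of 0.5 */
	int64_t arg = (2 << 8) + 192;   /* 2.75 */
	int64_t sig = 1 << 7;           /* 0.5  */

	printf("floor2 = %lld/256, ceil2 = %lld/256\n",
	       (long long)fixed_floor2(arg, sig),
	       (long long)fixed_ceil2(arg, sig));
	return 0;
}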



[PATCH v3 00/16] Introduce DML version 2

2023-10-04 Thread Rodrigo Siqueira
This patchset introduces a new version of DML that will be used for some
already available ASICs based on DCN3x and for future devices. This new
version of the DML is more reliable, provides a better programming model
for hardware/software, and is more flexible for creating new tools for
automation/validation (e.g., unit tests). This first version is a
transition step for new ASICs, meaning that we will keep improving this
new component. Finally, it is important to highlight that a large part
of the DML code is generated.

This patchset starts with a code refactor in the DML to improve code
isolation and avoid compilation issues on some 32-bit architectures such
as ARM and PPC. Next, the basic code for DML2 is introduced. Finally,
the end of this patchset enables DML2 for some specific ASICs, followed
by patches that improve DML when used with specific devices.

Thanks
Siqueira

Charlene Liu (2):
  drm/amd/display: Add z8_marks in dml
  drm/amd/display: correct dml2 input and dlg_refclk

Daniel Miess (1):
  drm/amd/display: Port replay vblank logic to DML2

Gabe Teeger (1):
  drm/amd/display: add check in validate_only in dml2

Qingqing Zhuo (2):
  drm/amd/display: Introduce DML2
  drm/amd/display: Add DCN35 DML2 support

Rodrigo Siqueira (4):
  drm/amd/display: Move dce_calcs from DML folder
  drm/amd/display: Move custom_float outside DML
  drm/amd/display: Move bw_fixed outside DML folder
  drm/amd/display: Move dml code under CONFIG_DRM_AMD_DC_FP guard

Saaem Rizvi (1):
  drm/amd/display: Modify Pipe Selection for Policy for ODM

Sung Joon Kim (3):
  drm/amd/display: Handle multiple streams sourcing same surface
  drm/amd/display: Use fixed DET Buffer Size
  drm/amd/display: Fix Chroma Surface height/width initialization

Taimur Hassan (2):
  drm/amd/display: Split pipe for stereo timings
  drm/amd/display: Move stereo timing check to helper

 drivers/gpu/drm/amd/display/dc/Makefile   | 6 +-
 .../gpu/drm/amd/display/dc/basics/Makefile| 9 +-
 .../dc/{dml/calcs => basics}/bw_fixed.c   |13 +-
 .../dc/{dml/calcs => basics}/calcs_logger.h   | 0
 .../dc/{dml/calcs => basics}/custom_float.c   |90 +-
 .../dc/{dml/calcs => basics}/dce_calcs.c  | 0
 drivers/gpu/drm/amd/display/dc/core/dc.c  |39 +
 .../gpu/drm/amd/display/dc/core/dc_resource.c |20 +
 drivers/gpu/drm/amd/display/dc/dc.h   | 5 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |61 +-
 .../amd/display/dc/dcn321/dcn321_resource.c   |41 +
 .../drm/amd/display/dc/dcn35/dcn35_resource.c |24 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 2 -
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |80 +
 .../amd/display/dc/dml/dcn321/dcn321_fpu.c|81 +
 drivers/gpu/drm/amd/display/dc/dml2/Makefile  |91 +
 .../gpu/drm/amd/display/dc/dml2/cmntypes.h|92 +
 .../amd/display/dc/dml2/display_mode_core.c   | 10296 
 .../amd/display/dc/dml2/display_mode_core.h   |   201 +
 .../dc/dml2/display_mode_core_structs.h   |  1970 +++
 .../dc/dml2/display_mode_lib_defines.h|75 +
 .../amd/display/dc/dml2/display_mode_util.c   |   796 ++
 .../amd/display/dc/dml2/display_mode_util.h   |74 +
 .../display/dc/dml2/dml2_dc_resource_mgmt.c   |   861 ++
 .../display/dc/dml2/dml2_dc_resource_mgmt.h   |48 +
 .../drm/amd/display/dc/dml2/dml2_dc_types.h   |40 +
 .../amd/display/dc/dml2/dml2_internal_types.h |   121 +
 .../amd/display/dc/dml2/dml2_mall_phantom.c   |   913 ++
 .../amd/display/dc/dml2/dml2_mall_phantom.h   |50 +
 .../gpu/drm/amd/display/dc/dml2/dml2_policy.c |   308 +
 .../gpu/drm/amd/display/dc/dml2/dml2_policy.h |47 +
 .../display/dc/dml2/dml2_translation_helper.c |  1201 ++
 .../display/dc/dml2/dml2_translation_helper.h |39 +
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.c  |   480 +
 .../gpu/drm/amd/display/dc/dml2/dml2_utils.h  |   144 +
 .../drm/amd/display/dc/dml2/dml2_wrapper.c|   745 ++
 .../drm/amd/display/dc/dml2/dml2_wrapper.h|   212 +
 .../gpu/drm/amd/display/dc/dml2/dml_assert.h  |30 +
 .../drm/amd/display/dc/dml2/dml_depedencies.h |31 +
 .../display/dc/dml2/dml_display_rq_dlg_calc.c |   585 +
 .../display/dc/dml2/dml_display_rq_dlg_calc.h |63 +
 .../gpu/drm/amd/display/dc/dml2/dml_logging.h |29 +
 .../gpu/drm/amd/display/dc/inc/core_types.h   | 1 +
 43 files changed, 19931 insertions(+), 83 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/bw_fixed.c (94%)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/calcs_logger.h 
(100%)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/custom_float.c 
(66%)
 rename drivers/gpu/drm/amd/display/dc/{dml/calcs => basics}/dce_calcs.c (100%)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dml2/Makefile
 create mode 100644 drivers/gpu/drm/amd/display/dc/dml2/cmntypes.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c
 create mode 100644 drivers/gpu/drm/amd/display/dc

Re: [PATCH 0/5] drm/amd/display: Remove migrate-disable and move memory allocation.

2023-10-04 Thread Hamza Mahfooz

On 9/21/23 10:15, Sebastian Andrzej Siewior wrote:

Hi,

I stumbled upon the amdgpu driver via a bugzilla report. The actual fix
is #4 + #5 and the rest was made while looking at the code.

Sebastian


I have applied the series, thanks!





--
Hamza



Re: [PATCH 0/5] drm/amd/display: Remove migrate-disable and move memory allocation.

2023-10-04 Thread Rodrigo Siqueira Jordao




On 9/21/23 08:15, Sebastian Andrzej Siewior wrote:

Hi,

I stumbled upon the amdgpu driver via a bugzilla report. The actual fix
is #4 + #5 and the rest was made while looking at the code.

Sebastian




Hi Sebastian,

Thanks a lot for this patchset. We tested it on multiple devices, and 
everything looks good. I also reviewed it and lgtm.


Reviewed-by: Rodrigo Siqueira 

Thanks
Siqueira


[PATCH] drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga

2023-10-04 Thread Mario Limonciello
For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742
Signed-off-by: Mario Limonciello 
---
From this bug report there are more to fix
 .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h| 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
index 57bca1e81d3a..9fcad69a9f34 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
@@ -164,7 +164,7 @@ typedef struct _ATOM_Tonga_State {
 typedef struct _ATOM_Tonga_State_Array {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number of entries. */
-   ATOM_Tonga_State entries[1];/* Dynamically allocate entries. */
+   ATOM_Tonga_State entries[]; /* Dynamically allocate entries. */
 } ATOM_Tonga_State_Array;
 
 typedef struct _ATOM_Tonga_MCLK_Dependency_Record {
@@ -210,7 +210,7 @@ typedef struct _ATOM_Polaris_SCLK_Dependency_Record {
 typedef struct _ATOM_Polaris_SCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Polaris_SCLK_Dependency_Record entries[1]; 
 /* Dynamically allocate entries. */
+   ATOM_Polaris_SCLK_Dependency_Record entries[];  
 /* Dynamically allocate entries. */
 } ATOM_Polaris_SCLK_Dependency_Table;
 
 typedef struct _ATOM_Tonga_PCIE_Record {
@@ -222,7 +222,7 @@ typedef struct _ATOM_Tonga_PCIE_Record {
 typedef struct _ATOM_Tonga_PCIE_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Tonga_PCIE_Record entries[1];  
/* Dynamically allocate entries. */
+   ATOM_Tonga_PCIE_Record entries[];   
/* Dynamically allocate entries. */
 } ATOM_Tonga_PCIE_Table;
 
 typedef struct _ATOM_Polaris10_PCIE_Record {
@@ -235,7 +235,7 @@ typedef struct _ATOM_Polaris10_PCIE_Record {
 typedef struct _ATOM_Polaris10_PCIE_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Polaris10_PCIE_Record entries[1];  /* 
Dynamically allocate entries. */
+   ATOM_Polaris10_PCIE_Record entries[];  /* 
Dynamically allocate entries. */
 } ATOM_Polaris10_PCIE_Table;
 
 
@@ -252,7 +252,7 @@ typedef struct _ATOM_Tonga_MM_Dependency_Record {
 typedef struct _ATOM_Tonga_MM_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Tonga_MM_Dependency_Record entries[1];/* 
Dynamically allocate entries. */
+   ATOM_Tonga_MM_Dependency_Record entries[]; /* 
Dynamically allocate entries. */
 } ATOM_Tonga_MM_Dependency_Table;
 
 typedef struct _ATOM_Tonga_Voltage_Lookup_Record {
@@ -265,7 +265,7 @@ typedef struct _ATOM_Tonga_Voltage_Lookup_Record {
 typedef struct _ATOM_Tonga_Voltage_Lookup_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Tonga_Voltage_Lookup_Record entries[1];
/* Dynamically allocate entries. */
+   ATOM_Tonga_Voltage_Lookup_Record entries[]; 
/* Dynamically allocate entries. */
 } ATOM_Tonga_Voltage_Lookup_Table;
 
 typedef struct _ATOM_Tonga_Fan_Table {
-- 
2.34.1
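
The [1]-sized trailing arrays in these headers predate C99; with a true
flexible array member the allocation size is stated explicitly and
UBSAN/FORTIFY can bounds-check indexing. A minimal userspace illustration
of the before/after allocation pattern (made-up struct, not the pptable
ones; in the kernel the size would typically come from struct_size()):

#include <stdio.h>
#include <stdlib.h>

struct entry { unsigned char clk; unsigned char vddc; };

struct table {
	unsigned char rev_id;
	unsigned char num_entries;
	struct entry entries[];   /* flexible array member, was entries[1] */
};

int main(void)
{
	unsigned char n = 4;
	/* sizeof(*t) no longer includes a phantom first entry. */
	struct table *t = malloc(sizeof(*t) + n * sizeof(t->entries[0]));

	if (!t)
		return 1;
	t->num_entries = n;
	for (unsigned char i = 0; i < n; i++)
		t->entries[i] = (struct entry){ .clk = i, .vddc = i };
	printf("last entry clk = %u\n", (unsigned)t->entries[t->num_entries - 1].clk);
	free(t);
	return 0;
}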



[PATCH 2/2] drm/radeon: Fix UBSAN array-index-out-of-bounds for Radeon HD 5430

2023-10-04 Thread Mario Limonciello
For pptable structs that use flexible array sizes, use flexible arrays.

Suggested-by: Felix Held 
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2894
Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/radeon/pptable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/pptable.h b/drivers/gpu/drm/radeon/pptable.h
index 4c2eec49dadc..94947229888b 100644
--- a/drivers/gpu/drm/radeon/pptable.h
+++ b/drivers/gpu/drm/radeon/pptable.h
@@ -74,7 +74,7 @@ typedef struct _ATOM_PPLIB_THERMALCONTROLLER
 typedef struct _ATOM_PPLIB_STATE
 {
 UCHAR ucNonClockStateIndex;
-UCHAR ucClockStateIndices[1]; // variable-sized
+UCHAR ucClockStateIndices[]; // variable-sized
 } ATOM_PPLIB_STATE;
 
 
-- 
2.34.1



[PATCH 1/2] drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7

2023-10-04 Thread Mario Limonciello
For pptable structs that use flexible array sizes, use flexible arrays.

Suggested-by: Felix Held 
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874
Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/amd/include/pptable.h | 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/pptable.h 
b/drivers/gpu/drm/amd/include/pptable.h
index 0b6a057e0a4c..5aac8d545bdc 100644
--- a/drivers/gpu/drm/amd/include/pptable.h
+++ b/drivers/gpu/drm/amd/include/pptable.h
@@ -78,7 +78,7 @@ typedef struct _ATOM_PPLIB_THERMALCONTROLLER
 typedef struct _ATOM_PPLIB_STATE
 {
 UCHAR ucNonClockStateIndex;
-UCHAR ucClockStateIndices[1]; // variable-sized
+UCHAR ucClockStateIndices[]; // variable-sized
 } ATOM_PPLIB_STATE;
 
 
@@ -473,7 +473,7 @@ typedef struct _ATOM_PPLIB_STATE_V2
   /**
   * Driver will read the first ucNumDPMLevels in this array
   */
-  UCHAR clockInfoIndex[1];
+  UCHAR clockInfoIndex[];
 } ATOM_PPLIB_STATE_V2;
 
 typedef struct _StateArray{
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
index 7a31cfa5e7fb..57bca1e81d3a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
@@ -179,7 +179,7 @@ typedef struct _ATOM_Tonga_MCLK_Dependency_Record {
 typedef struct _ATOM_Tonga_MCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Tonga_MCLK_Dependency_Record entries[1];   
/* Dynamically allocate entries. */
+   ATOM_Tonga_MCLK_Dependency_Record entries[];
/* Dynamically allocate entries. */
 } ATOM_Tonga_MCLK_Dependency_Table;
 
 typedef struct _ATOM_Tonga_SCLK_Dependency_Record {
@@ -194,7 +194,7 @@ typedef struct _ATOM_Tonga_SCLK_Dependency_Record {
 typedef struct _ATOM_Tonga_SCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; 
/* Number of entries. */
-   ATOM_Tonga_SCLK_Dependency_Record entries[1];   
 /* Dynamically allocate entries. */
+   ATOM_Tonga_SCLK_Dependency_Record entries[];
 /* Dynamically allocate entries. */
 } ATOM_Tonga_SCLK_Dependency_Table;
 
 typedef struct _ATOM_Polaris_SCLK_Dependency_Record {
-- 
2.34.1



Re: [PATCH v4] drm/amdkfd: Use partial migrations in GPU page faults

2023-10-04 Thread Felix Kuehling



On 2023-10-03 19:31, Xiaogang.Chen wrote:

From: Xiaogang Chen 

This patch implements partial migration in gpu page faults according to migration
granularity (default 2MB) and does not split the svm range in cpu page fault handling.
An svm range may now include pages from both system ram and vram of one gpu.
These changes are expected to improve migration performance and reduce mmu
callback and TLB flush workloads.

Signed-off-by: Xiaogang Chen


Minor (mostly cosmetic) nit-picks inline. With those fixed, the patch is

Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 156 +--
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   6 +-
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c |  83 +---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.h |   6 +-
  4 files changed, 162 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 6c25dab051d5..6a059e4aff86 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -442,10 +442,10 @@ svm_migrate_vma_to_vram(struct kfd_node *node, struct 
svm_range *prange,
goto out_free;
}
if (cpages != npages)
-   pr_debug("partial migration, 0x%lx/0x%llx pages migrated\n",
+   pr_debug("partial migration, 0x%lx/0x%llx pages collected\n",
 cpages, npages);
else
-   pr_debug("0x%lx pages migrated\n", cpages);
+   pr_debug("0x%lx pages collected\n", cpages);
  
  	r = svm_migrate_copy_to_vram(node, prange, &migrate, &mfence, scratch, ttm_res_offset);

migrate_vma_pages(&migrate);
@@ -479,6 +479,8 @@ svm_migrate_vma_to_vram(struct kfd_node *node, struct 
svm_range *prange,
   * svm_migrate_ram_to_vram - migrate svm range from system to device
   * @prange: range structure
   * @best_loc: the device to migrate to
+ * @start_mgr: start page to migrate
+ * @last_mgr: last page to migrate
   * @mm: the process mm structure
   * @trigger: reason of migration
   *
@@ -489,6 +491,7 @@ svm_migrate_vma_to_vram(struct kfd_node *node, struct 
svm_range *prange,
   */
  static int
  svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc,
+   unsigned long start_mgr, unsigned long last_mgr,
struct mm_struct *mm, uint32_t trigger)
  {
unsigned long addr, start, end;
@@ -498,23 +501,30 @@ svm_migrate_ram_to_vram(struct svm_range *prange, 
uint32_t best_loc,
unsigned long cpages = 0;
long r = 0;
  
-	if (prange->actual_loc == best_loc) {

-   pr_debug("svms 0x%p [0x%lx 0x%lx] already on best_loc 0x%x\n",
-prange->svms, prange->start, prange->last, best_loc);
+   if (!best_loc) {
+   pr_debug("svms 0x%p [0x%lx 0x%lx] migrate to sys ram\n",
+   prange->svms, start_mgr, last_mgr);
return 0;
}
  
+	if (start_mgr < prange->start || last_mgr > prange->last) {

+   pr_debug("range [0x%lx 0x%lx] out prange [0x%lx 0x%lx]\n",
+start_mgr, last_mgr, prange->start, 
prange->last);
+   return -EFAULT;
+   }
+
node = svm_range_get_node_by_id(prange, best_loc);
if (!node) {
pr_debug("failed to get kfd node by id 0x%x\n", best_loc);
return -ENODEV;
}
  
-	pr_debug("svms 0x%p [0x%lx 0x%lx] to gpu 0x%x\n", prange->svms,

-prange->start, prange->last, best_loc);
+   pr_debug("svms 0x%p [0x%lx 0x%lx] in [0x%lx 0x%lx] to gpu 0x%x\n",
+   prange->svms, start_mgr, last_mgr, prange->start, prange->last,
+   best_loc);
  
-	start = prange->start << PAGE_SHIFT;

-   end = (prange->last + 1) << PAGE_SHIFT;
+   start = start_mgr << PAGE_SHIFT;
+   end = (last_mgr + 1) << PAGE_SHIFT;
  
  	r = svm_range_vram_node_new(node, prange, true);

if (r) {
@@ -544,8 +554,11 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t 
best_loc,
  
  	if (cpages) {

prange->actual_loc = best_loc;
-   svm_range_dma_unmap(prange);
-   } else {
+   prange->vram_pages = prange->vram_pages + cpages;
+   } else if (!prange->actual_loc) {
+   /* if no page migrated and all pages from prange are at
+* sys ram drop svm_bo got from svm_range_vram_node_new
+*/
svm_range_vram_node_free(prange);
}
  
@@ -663,19 +676,19 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,

   * Context: Process context, caller hold mmap read lock, prange->migrate_mutex
   *
   * Return:
- *   0 - success with all pages migrated
   *   negative values - indicate error
- *   positive values - partial migration, number of pages not migrated
+ *   positive values or zero - number of pages got migrated
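
With partial migration the fault handler only migrates the
granularity-sized window (2 MB by default) around the faulting page,
clamped to the prange bounds, instead of the whole range. A rough sketch of
that window computation in page units (names are illustrative):

#include <stdio.h>

/* Clamp a migration window of 2^align_order pages around @fault_pg to the
 * [range_start, range_last] interval (all values are page numbers). */
static void migration_window(unsigned long fault_pg, unsigned int align_order,
			     unsigned long range_start, unsigned long range_last,
			     unsigned long *start_mgr, unsigned long *last_mgr)
{
	unsigned long mask = (1UL << align_order) - 1;

	*start_mgr = fault_pg & ~mask;
	*last_mgr = fault_pg | mask;
	if (*start_mgr < range_start)
		*start_mgr = range_start;
	if (*last_mgr > range_last)
		*last_mgr = range_last;
}

int main(void)
{
	unsigned long s, l;

	/* 2 MB granularity with 4 KB pages is 512 pages, i.e. align_order 9 */
	migration_window(0x1234, 9, 0x1200, 0x12ff, &s, &l);
	printf("migrate pages [0x%lx 0x%lx]\n", s, l);
	return 0;
}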
  

RE: [PATCH] drm/amdgpu: Improve MES responsiveness during oversubscription

2023-10-04 Thread Kasiviswanathan, Harish
[AMD Official Use Only - General]

Reviewed-by: Harish Kasiviswanathan 

-Original Message-
From: amd-gfx  On Behalf Of Jay Cornwall
Sent: Wednesday, October 4, 2023 12:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Cornwall, Jay ; Tudor, Alexandru 

Subject: [PATCH] drm/amdgpu: Improve MES responsiveness during oversubscription

When MES is oversubscribed it may not frequently check for new
command submissions from the driver if the scheduling load is high.
Response latency as high as 5 seconds has been observed.

Enable a flag which adds a check for new commands between
scheduling quantums.

Signed-off-by: Jay Cornwall 
Cc: Alexandru Tudor 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 4a3020b5b30f..31b26e6f0b30 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -406,6 +406,7 @@ static int mes_v11_0_set_hw_resources(struct amdgpu_mes 
*mes)
mes_set_hw_res_pkt.disable_mes_log = 1;
mes_set_hw_res_pkt.use_different_vmid_compute = 1;
mes_set_hw_res_pkt.enable_reg_active_poll = 1;
+   mes_set_hw_res_pkt.enable_level_process_quantum_check = 1;
mes_set_hw_res_pkt.oversubscription_timer = 50;

return mes_v11_0_submit_pkt_and_poll_completion(mes,
--
2.25.1



Re: [PATCH v2] drm/amdkfd: Fix EXT_COHERENT memory allocation crash

2023-10-04 Thread Francis, David
[AMD Official Use Only - General]



On 2023-10-03 17:37, Felix Kuehling wrote:
On 2023-10-03 16:50, Philip Yang wrote:
If there is no VRAM domain, bo_node is NULL and this causes a crash.
Refactor the change, and treat the module parameter as the higher-privilege setting.

Need another patch to support override PTE flag on APU.

Fixes: 55d7e2001c7e ("drm/amdgpu: Add EXT_COHERENT memory allocation flags")
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 +++---
  1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 0d88698ae33f..305b2c54edfa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1248,26 +1248,22 @@ svm_range_get_pte_flags(struct kfd_node *node,
  break;
  case IP_VERSION(9, 4, 3):
  mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC :
-  (amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : 
AMDGPU_VM_MTYPE_RW);
+   (amdgpu_mtype_local == 2 || ext_coherent ?
+ AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW);

We had some offline discussion where I thought that MTYPE_NC should
become MTYPE_UC when ext_coherent is enabled to get the desired memory
semantics. With that idea in mind, this would become a bit more messy,
but here it goes, as clean as I can make it:

-   mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC :
-(amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : 
AMDGPU_VM_MTYPE_RW);
+   mtype_local = amdgpu_mtype_local == 1 && !ext_coherent ? 
AMDGPU_VM_MTYPE_NC :
+(amdgpu_mtype_local == 1 &&  ext_coherent ? 
AMDGPU_VM_MTYPE_UC :
+(amdgpu_mtype_local == 2 ||  ext_coherent ? 
AMDGPU_VM_MTYPE_CC :
+
AMDGPU_VM_MTYPE_RW));


That ternary looks fairly gnarly. I think it would be worth the extra ink to 
write

   mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC :
(amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : 
AMDGPU_VM_MTYPE_RW);

if (ext_coherent) {
if (amdgpu_mtype_local == 1)
mtype_local = AMDGPU_VM_MTYPE_UC;
else
mtype_local = AMDGPU_VM_MTYPE_CC;
}

But maybe that could be fixed up in a follow up patch. Either way, for
the purpose of fixing the crash, this patch is

Reviewed-by: Felix Kuehling 



  snoop = true;
  if (uncached) {
  mapping_flags |= AMDGPU_VM_MTYPE_UC;
- } else if (ext_coherent) {
- /* local HBM region close to partition */
- if (bo_node->adev == node->adev &&
- (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id 
== node->xcp->mem_id))
- mapping_flags |= AMDGPU_VM_MTYPE_CC;
- else
- mapping_flags |= AMDGPU_VM_MTYPE_UC;
  } else if (domain == SVM_RANGE_VRAM_DOMAIN) {
  /* local HBM region close to partition */
  if (bo_node->adev == node->adev &&
  (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id 
== node->xcp->mem_id))
  mapping_flags |= mtype_local;
- /* local HBM region far from partition or remote XGMI GPU 
*/
- else if (svm_nodes_in_same_hive(bo_node, node))
+ /* local HBM region far from partition or remote XGMI GPU
+  * with regular system scope coherence
+  */
+ else if (svm_nodes_in_same_hive(bo_node, node) && 
!ext_coherent)
  mapping_flags |= AMDGPU_VM_MTYPE_NC;
- /* PCIe P2P */
+ /* PCIe P2P or extended system scope coherence */
  else
  mapping_flags |= AMDGPU_VM_MTYPE_UC;

Would probably be clearer if these two branches were swapped so the first was

(!svm_nodes_in_same_hive(bo_node, node) || ext_coherent)

Not a required change, though.

  /* system memory accessed by the APU */

This patch as written causes ext_coherent to no longer affect gfx9.4.3 APU 
devices, which it should.

The following (or equivalent) needs to be added just below this hunk

if (num_possible_nodes() <= 1)
mapping_flags |= mtype_local;
else
- mapping_flags |= AMDGPU_VM_MTYPE_NC;
+mapping_flags |= ext_coherent ? AMDGPU_VM_MTYPE_UC : 
AMDGPU_VM_MTYPE_NC;


Re: [PATCH] drm/amdgpu: Annotate struct amdgpu_bo_list with __counted_by

2023-10-04 Thread Luben Tuikov
On 2023-10-03 19:29, Kees Cook wrote:
> Prepare for the coming implementation by GCC and Clang of the __counted_by
> attribute. Flexible array members annotated with __counted_by can have
> their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
> array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
> functions).
> 
> As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list.
> Additionally, since the element count member must be set before accessing
> the annotated flexible array member, move its initialization earlier.
> 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: "Gustavo A. R. Silva" 
> Cc: Luben Tuikov 
> Cc: Christophe JAILLET 
> Cc: Felix Kuehling 
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: linux-harden...@vger.kernel.org
> Link: 
> https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
>  [1]
> Signed-off-by: Kees Cook 

Reviewed-by: Luben Tuikov 
-- 
Regards,
Luben

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index 6f5b641b631e..781e5c5ce04d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -84,6 +84,7 @@ int amdgpu_bo_list_create(struct amdgpu_device *adev, 
> struct drm_file *filp,
>  
>   kref_init(&list->refcount);
>  
> + list->num_entries = num_entries;
>   array = list->entries;
>  
>   for (i = 0; i < num_entries; ++i) {
> @@ -129,7 +130,6 @@ int amdgpu_bo_list_create(struct amdgpu_device *adev, 
> struct drm_file *filp,
>   }
>  
>   list->first_userptr = first_userptr;
> - list->num_entries = num_entries;
>   sort(array, last_entry, sizeof(struct amdgpu_bo_list_entry),
>amdgpu_bo_list_entry_cmp, NULL);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> index 6a703be45d04..555cd6d877c3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> @@ -56,7 +56,7 @@ struct amdgpu_bo_list {
>*/
>   struct mutex bo_list_mutex;
>  
> - struct amdgpu_bo_list_entry entries[];
> + struct amdgpu_bo_list_entry entries[] __counted_by(num_entries);
>  };
>  
>  int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id,
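
For readers unfamiliar with the attribute: __counted_by(m) ties a flexible
array to the struct member holding its element count, so FORTIFY/UBSAN can
bounds-check accesses at run time, which is also why the count has to be
assigned before the array is touched (hence the move in the patch above).
A small illustration outside the driver; the fallback macro keeps it
buildable with compilers that lack the attribute:

#include <stdlib.h>

#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define __counted_by(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef __counted_by
# define __counted_by(m)
#endif

struct bo_list {
	unsigned int first_userptr;
	unsigned int num_entries;
	int entries[] __counted_by(num_entries);
};

static struct bo_list *bo_list_create(unsigned int n)
{
	struct bo_list *l = calloc(1, sizeof(*l) + n * sizeof(l->entries[0]));

	if (!l)
		return NULL;
	l->num_entries = n;   /* must be set before entries[] is accessed */
	for (unsigned int i = 0; i < n; i++)
		l->entries[i] = -1;
	return l;
}

int main(void)
{
	struct bo_list *l = bo_list_create(8);
	int ok = l != NULL;

	free(l);
	return ok ? 0 : 1;
}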




[PATCH v4 3/3] drm/amd/display: make dc_set_power_state() return type `void` again

2023-10-04 Thread Mario Limonciello
As dc_set_power_state() no longer allocates memory, it's not necessary
to have a return value and check the return code, as it can't fail anymore.

Change it back to `void`.

Signed-off-by: Mario Limonciello 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 17 +
 drivers/gpu/drm/amd/display/dc/core/dc.c|  6 ++
 drivers/gpu/drm/amd/display/dc/dc.h |  2 +-
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a59a11ae42db..df9d9437f149 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2685,11 +2685,6 @@ static void hpd_rx_irq_work_suspend(struct 
amdgpu_display_manager *dm)
}
 }
 
-static int dm_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state 
power_state)
-{
-   return dc_set_power_state(dc, power_state) ? 0 : -ENOMEM;
-}
-
 static int dm_suspend(void *handle)
 {
struct amdgpu_device *adev = handle;
@@ -2723,7 +2718,9 @@ static int dm_suspend(void *handle)
 
hpd_rx_irq_work_suspend(dm);
 
-   return dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
+   dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
+
+   return 0;
 }
 
 struct drm_connector *
@@ -2917,9 +2914,7 @@ static int dm_resume(void *handle)
if (r)
DRM_ERROR("DMUB interface failed to initialize: 
status=%d\n", r);
 
-   r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
-   if (r)
-   return r;
+   dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
 
dc_resume(dm->dc);
 
@@ -2969,9 +2964,7 @@ static int dm_resume(void *handle)
}
 
/* power on hardware */
-   r = dm_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
-   if (r)
-   return r;
+dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
 
/* program HPD filter */
dc_resume(dm->dc);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index cb8c7c5a8807..2645d59dc58e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4724,12 +4724,12 @@ void dc_power_down_on_boot(struct dc *dc)
dc->hwss.power_down_on_boot(dc);
 }
 
-bool dc_set_power_state(
+void dc_set_power_state(
struct dc *dc,
enum dc_acpi_cm_power_state power_state)
 {
if (!dc->current_state)
-   return true;
+   return;
 
switch (power_state) {
case DC_ACPI_CM_POWER_STATE_D0:
@@ -4752,8 +4752,6 @@ bool dc_set_power_state(
 
break;
}
-
-   return true;
 }
 
 void dc_resume(struct dc *dc)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index b140eb240ad7..b6002b11a745 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -2330,7 +2330,7 @@ void dc_notify_vsync_int_state(struct dc *dc, struct 
dc_stream_state *stream, bo
 
 /* Power Interfaces */
 
-bool dc_set_power_state(
+void dc_set_power_state(
struct dc *dc,
enum dc_acpi_cm_power_state power_state);
 void dc_resume(struct dc *dc);
-- 
2.34.1



[PATCH v4 1/3] drm/amd: Evict resources during PM ops prepare() callback

2023-10-04 Thread Mario Limonciello
The Linux PM core has a prepare() callback that runs before suspend.

If the system is under high memory pressure, the evicted resources may
need to go to swap instead.  If the storage backing swap is offlined
during the suspend() step, then such eviction may fail.

So duplicate this step into prepare() to evict the majority of
resources early, while leaving all existing steps that put the GPU
into a low power state in suspend().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  7 +++---
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d23fb4b5ad95..6643d0ed6b1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1413,6 +1413,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 void amdgpu_driver_release_kms(struct drm_device *dev);
 
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
+int amdgpu_device_prepare(struct drm_device *dev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
 u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bad2b5577e96..67acee569c08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4259,6 +4259,31 @@ static int amdgpu_device_evict_resources(struct 
amdgpu_device *adev)
 /*
  * Suspend & resume.
  */
+/**
+ * amdgpu_device_prepare - prepare for device suspend
+ *
+ * @dev: drm dev pointer
+ *
+ * Prepare to put the hw in the suspend state (all asics).
+ * Returns 0 for success or an error on failure.
+ * Called at driver suspend.
+ */
+int amdgpu_device_prepare(struct drm_device *dev)
+{
+   struct amdgpu_device *adev = drm_to_adev(dev);
+   int r;
+
+   if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
+   return 0;
+
+   /* Evict the majority of BOs before starting suspend sequence */
+   r = amdgpu_device_evict_resources(adev);
+   if (r)
+   return r;
+
+   return 0;
+}
+
 /**
  * amdgpu_device_suspend - initiate device suspend
  *
@@ -4279,7 +4304,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
 
adev->in_suspend = true;
 
-   /* Evict the majority of BOs before grabbing the full access */
r = amdgpu_device_evict_resources(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index e3471293846f..175167582db0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device *dev)
/* Return a positive number here so
 * DPM_FLAG_SMART_SUSPEND works properly
 */
-   if (amdgpu_device_supports_boco(drm_dev))
-   return pm_runtime_suspended(dev);
+   if (amdgpu_device_supports_boco(drm_dev) &&
+   pm_runtime_suspended(dev))
+   return 1;
 
/* if we will not support s3 or s2i for the device
 *  then skip suspend
@@ -2435,7 +2436,7 @@ static int amdgpu_pmops_prepare(struct device *dev)
!amdgpu_acpi_is_s3_active(adev))
return 1;
 
-   return 0;
+   return amdgpu_device_prepare(drm_dev);
 }
 
 static void amdgpu_pmops_complete(struct device *dev)
-- 
2.34.1
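
As background, a bare-bones sketch of the dev_pm_ops arrangement this patch builds
on: prepare() runs before suspend(), while the storage backing swap is still
reachable, so the memory-heavy eviction can be done there. The mydrv_* names are
placeholders rather than amdgpu symbols; the real wiring is amdgpu_pmops_prepare()
calling amdgpu_device_prepare(), as in the diff above.

/* Placeholder driver skeleton, illustrative only. */
#include <linux/device.h>
#include <linux/pm.h>

static int mydrv_prepare(struct device *dev)
{
	/* Runs before ->suspend().  Do the memory-heavy work (evicting
	 * buffers) here.  Returning a positive number tells the PM core a
	 * runtime-suspended device may stay suspended (the
	 * DPM_FLAG_SMART_SUSPEND case); a negative errno aborts the
	 * system-wide suspend.
	 */
	return 0;
}

static int mydrv_suspend(struct device *dev)
{
	/* Only the low-power-state programming is left for this step. */
	return 0;
}

static const struct dev_pm_ops mydrv_pm_ops = {
	.prepare = mydrv_prepare,
	.suspend = mydrv_suspend,
};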



[PATCH v4 2/3] drm/amd/display: Destroy DC context while keeping DML

2023-10-04 Thread Mario Limonciello
If there is memory pressure at suspend time, then dynamically
allocating a large structure as part of the DC suspend code can
fail.

Instead re-use the same structure and clear all members except
those that should be maintained.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 25 ---
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 39e291a467e2..cb8c7c5a8807 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4728,9 +4728,6 @@ bool dc_set_power_state(
struct dc *dc,
enum dc_acpi_cm_power_state power_state)
 {
-   struct kref refcount;
-   struct display_mode_lib *dml;
-
if (!dc->current_state)
return true;
 
@@ -4750,30 +4747,8 @@ bool dc_set_power_state(
break;
default:
ASSERT(dc->current_state->stream_count == 0);
-   /* Zero out the current context so that on resume we start with
-* clean state, and dc hw programming optimizations will not
-* cause any trouble.
-*/
-   dml = kzalloc(sizeof(struct display_mode_lib),
-   GFP_KERNEL);
-
-   ASSERT(dml);
-   if (!dml)
-   return false;
-
-   /* Preserve refcount */
-   refcount = dc->current_state->refcount;
-   /* Preserve display mode lib */
-   memcpy(dml, &dc->current_state->bw_ctx.dml, sizeof(struct 
display_mode_lib));
 
dc_resource_state_destruct(dc->current_state);
-   memset(dc->current_state, 0,
-   sizeof(*dc->current_state));
-
-   dc->current_state->refcount = refcount;
-   dc->current_state->bw_ctx.dml = *dml;
-
-   kfree(dml);
 
break;
}
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index aa7b5db83644..e487c966c118 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -4350,6 +4350,18 @@ void dc_resource_state_destruct(struct dc_state *context)
context->streams[i] = NULL;
}
context->stream_count = 0;
+   context->stream_mask = 0;
+   memset(&context->res_ctx, 0, sizeof(context->res_ctx));
+   memset(&context->pp_display_cfg, 0, sizeof(context->pp_display_cfg));
+   memset(&context->dcn_bw_vars, 0, sizeof(context->dcn_bw_vars));
+   context->clk_mgr = NULL;
+   memset(&context->bw_ctx.bw, 0, sizeof(context->bw_ctx.bw));
+   memset(context->block_sequence, 0, sizeof(context->block_sequence));
+   context->block_sequence_steps = 0;
+   memset(context->dc_dmub_cmd, 0, sizeof(context->dc_dmub_cmd));
+   context->dmub_cmd_count = 0;
+   memset(&context->perf_params, 0, sizeof(context->perf_params));
+   memset(&context->scratch, 0, sizeof(context->scratch));
 }
 
 void dc_resource_state_copy_construct(
-- 
2.34.1
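
To make the simplification concrete, here is a small sketch of the two approaches,
using placeholder fields rather than the real dc_state layout: the old path needed
a scratch copy (a kzalloc() in the real code, since display_mode_lib is large) to
survive a full memset(), while the new path clears only the members that have to
be reset.

#include <string.h>

struct fake_state {
	int refcount;		/* must survive a reset */
	long mode_lib[4];	/* stands in for bw_ctx.dml, must survive */
	int stream_count;	/* everything below must be cleared */
	int res_ctx[8];
};

/* Old pattern: save the survivors, wipe the whole struct, restore them. */
void reset_by_memset(struct fake_state *s)
{
	struct fake_state saved = *s;	/* the real code had to allocate this */

	memset(s, 0, sizeof(*s));
	s->refcount = saved.refcount;
	memcpy(s->mode_lib, saved.mode_lib, sizeof(s->mode_lib));
}

/* New pattern: clear only what must be reset; no temporary copy needed. */
void reset_in_place(struct fake_state *s)
{
	s->stream_count = 0;
	memset(s->res_ctx, 0, sizeof(s->res_ctx));
}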



[PATCH v4 0/3] Better handle memory pressure at suspend

2023-10-04 Thread Mario Limonciello
At suspend time, if there is memory pressure, then dynamically allocating
memory can cause failures that don't clean up properly when trying to
suspend a second time.

Move the bigger memory allocations into Linux PM prepare() callback and
drop allocations that aren't really needed in DC code.

v1: 
https://lore.kernel.org/amd-gfx/20230925143359.14932-1-mario.limoncie...@amd.com/
v2: 
https://lore.kernel.org/amd-gfx/20231002224449.95565-1-mario.limoncie...@amd.com/T/#mc800319a05df821cd1875234b09bf212e2e3282b
v3: 
https://lore.kernel.org/amd-gfx/20231003205437.123426-1-mario.limoncie...@amd.com/T/#m00a49b75cd2638bf8a0ebd549d6a6010bfb7328b

v3->v4:
 * Combine patches 1/2
 * Drop adev->in_suspend references
v2->v3:
 * Handle adev->in_suspend in prepare() and complete()
 * Add missing scratch variable in dc_resource_state_destruct()
 * Revert error code propagation in same series
v1->v2:
 * Handle DC code too
 * Add prepare callback rather than moving symbol calls
Mario Limonciello (3):
  drm/amd: Evict resources during PM ops prepare() callback
  drm/amd/display: Destroy DC context while keeping DML
  drm/amd/display: make dc_set_power_state() return type `void` again

 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 26 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  7 +++--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +++---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 31 ++-
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 12 +++
 drivers/gpu/drm/amd/display/dc/dc.h   |  2 +-
 7 files changed, 50 insertions(+), 46 deletions(-)

-- 
2.34.1



Re: [PATCH v3 2/2] drm/amdkfd: get doorbell's absolute offset based on the db size

2023-10-04 Thread Yadav, Arvind



On 10/4/2023 10:29 PM, Felix Kuehling wrote:


On 2023-10-04 12:16, Arvind Yadav wrote:

This patch is to align the absolute doorbell offset
based on the doorbell's size. So that doorbell offset
will be aligned for both 32 bit and 64 bit.

v2:
- Addressed the review comment from Felix.
v3:
- Adding doorbell_size as parameter to get db absolute offset.

Cc: Christian Koenig 
Cc: Alex Deucher 
Signed-off-by: Shashank Sharma 
Signed-off-by: Arvind Yadav 


The final result looks good to me. But please squash the two patches 
into one. The first patch on its own breaks the build, and that's 
something we don't want to commit to the branch history as it makes 
tracking regressions (e.g. with git bisect) very hard or impossible.


More nit-picks inline.

Sure, we can have one patch.




---
  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   |  6 +-
  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c   | 13 +++--
  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  4 +++-
  3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c

index 0d3d538b64eb..690ff131fe4b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -346,6 +346,7 @@ static int allocate_doorbell(struct 
qcm_process_device *qpd,

   uint32_t const *restore_id)
  {
  struct kfd_node *dev = qpd->dqm->dev;
+    uint32_t doorbell_size;
    if (!KFD_IS_SOC15(dev)) {
  /* On pre-SOC15 chips we need to use the queue ID to
@@ -405,9 +406,12 @@ static int allocate_doorbell(struct 
qcm_process_device *qpd,

  }
  }
  +    doorbell_size = dev->kfd->device_info.doorbell_size;
+
  q->properties.doorbell_off = 
amdgpu_doorbell_index_on_bar(dev->adev,

    qpd->proc_doorbells,
-  q->doorbell_id);
+  q->doorbell_id,
+  doorbell_size);


You don't need a local variable for doorbell size that's only used 
once. Just pass dev->kfd->device_info.doorbell_size directly.


I used a local variable to make the code cleaner, but I will remove it.



  return 0;
  }
  diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c

index 7b38537c7c99..59dd76c4b138 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -161,7 +161,10 @@ void __iomem *kfd_get_kernel_doorbell(struct 
kfd_dev *kfd,

  if (inx >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
  return NULL;
  -    *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, 
kfd->doorbells, inx);

+    *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev,
+ kfd->doorbells,
+ inx,
+ kfd->device_info.doorbell_size);
  inx *= 2;
    pr_debug("Get kernel queue doorbell\n"
@@ -233,6 +236,7 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)

  {
  struct amdgpu_device *adev = pdd->dev->adev;
  uint32_t first_db_index;
+    uint32_t doorbell_size;
    if (!pdd->qpd.proc_doorbells) {
  if (kfd_alloc_process_doorbells(pdd->dev->kfd, pdd))
@@ -240,7 +244,12 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)

  return 0;
  }
  -    first_db_index = amdgpu_doorbell_index_on_bar(adev, 
pdd->qpd.proc_doorbells, 0);

+    doorbell_size = pdd->dev->kfd->device_info.doorbell_size;
+
+    first_db_index = amdgpu_doorbell_index_on_bar(adev,
+  pdd->qpd.proc_doorbells,
+  0,
+  doorbell_size);


Same as above, no local variable needed.

Noted,




  return adev->doorbell.base + first_db_index * sizeof(uint32_t);
  }
  diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c

index adb5e4bdc0b2..010cd8e8e6a1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -375,9 +375,11 @@ int pqm_create_queue(struct 
process_queue_manager *pqm,

   * relative doorbell index = Absolute doorbell index -
   * absolute index of first doorbell in the page.
   */
+    uint32_t doorbell_size = 
pdd->dev->kfd->device_info.doorbell_size;
  uint32_t first_db_index = 
amdgpu_doorbell_index_on_bar(pdd->dev->adev,

pdd->qpd.proc_doorbells,
-   0);
+   0,
+   doorbell_size);


No local variable needed.


Noted,

Thanks
~Arvind


Regards,
  Felix



    *p_doorbell_offset_in_process = (q->properties.doorbell_off
  - first_db_i

Re: [PATCH v3 2/2] drm/amdkfd: get doorbell's absolute offset based on the db size

2023-10-04 Thread Felix Kuehling



On 2023-10-04 12:16, Arvind Yadav wrote:

This patch is to align the absolute doorbell offset
based on the doorbell's size. So that doorbell offset
will be aligned for both 32 bit and 64 bit.

v2:
- Addressed the review comment from Felix.
v3:
- Adding doorbell_size as parameter to get db absolute offset.

Cc: Christian Koenig 
Cc: Alex Deucher 
Signed-off-by: Shashank Sharma 
Signed-off-by: Arvind Yadav 


The final result looks good to me. But please squash the two patches 
into one. The first patch on its own breaks the build, and that's 
something we don't want to commit to the branch history as it makes 
tracking regressions (e.g. with git bisect) very hard or impossible.


More nit-picks inline.



---
  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   |  6 +-
  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c   | 13 +++--
  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  4 +++-
  3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 0d3d538b64eb..690ff131fe4b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -346,6 +346,7 @@ static int allocate_doorbell(struct qcm_process_device *qpd,
 uint32_t const *restore_id)
  {
struct kfd_node *dev = qpd->dqm->dev;
+   uint32_t doorbell_size;
  
  	if (!KFD_IS_SOC15(dev)) {

/* On pre-SOC15 chips we need to use the queue ID to
@@ -405,9 +406,12 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd,
}
}
  
+	doorbell_size = dev->kfd->device_info.doorbell_size;

+
q->properties.doorbell_off = amdgpu_doorbell_index_on_bar(dev->adev,
  
qpd->proc_doorbells,
- 
q->doorbell_id);
+ 
q->doorbell_id,
+ 
doorbell_size);


You don't need a local variable for doorbell size that's only used once. 
Just pass dev->kfd->device_info.doorbell_size directly.




return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c

index 7b38537c7c99..59dd76c4b138 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -161,7 +161,10 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
if (inx >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
return NULL;
  
-	*doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, kfd->doorbells, inx);

+   *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev,
+kfd->doorbells,
+inx,
+
kfd->device_info.doorbell_size);
inx *= 2;
  
  	pr_debug("Get kernel queue doorbell\n"

@@ -233,6 +236,7 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)
  {
struct amdgpu_device *adev = pdd->dev->adev;
uint32_t first_db_index;
+   uint32_t doorbell_size;
  
  	if (!pdd->qpd.proc_doorbells) {

if (kfd_alloc_process_doorbells(pdd->dev->kfd, pdd))
@@ -240,7 +244,12 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)
return 0;
}
  
-	first_db_index = amdgpu_doorbell_index_on_bar(adev, pdd->qpd.proc_doorbells, 0);

+   doorbell_size = pdd->dev->kfd->device_info.doorbell_size;
+
+   first_db_index = amdgpu_doorbell_index_on_bar(adev,
+ pdd->qpd.proc_doorbells,
+ 0,
+ doorbell_size);


Same as above, no local variable needed.



return adev->doorbell.base + first_db_index * sizeof(uint32_t);
  }
  
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c

index adb5e4bdc0b2..010cd8e8e6a1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -375,9 +375,11 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 * relative doorbell index = Absolute doorbell index -
 * absolute index of first doorbell in the page.
 */
+   uint32_t doorbell_size = 
pdd->dev->kfd->device_info.doorbell_size;
uint32_t first_db_index = 
amdgpu_doorbell_index_on_bar(pdd->dev->adev,
   
pdd->qpd.proc_doorbells,
- 

RE: [PATCH 13/16] drm/amd/display: Don't set dpms_off for seamless boot

2023-10-04 Thread Limonciello, Mario
[AMD Official Use Only - General]

> From: Daniel Miess 
>
> [Why]
> eDPs fail to light up with seamless boot enabled
>
> [How]
> When seamless boot is enabled don't configure dpms_off
> in disable_vbios_mode_if_required.
>
> Reviewed-by: Charlene Liu 
> Cc: Mario Limonciello 
> Cc: Alex Deucher 
> Cc: sta...@vger.kernel.org
> Acked-by: Tom Chung 
> Signed-off-by: Daniel Miess 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++
>  1 file changed, 3 insertions(+)

Feifei,

Can you recheck seamless boot on DCN3.2 after this lands into 
amd-staging-drm-next?
If it works, we may remove the check to only apply it to APUs.

>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index bd4834f921c1..88d41bf6d53a 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -1230,6 +1230,9 @@ static void disable_vbios_mode_if_required(
>   if (stream == NULL)
>   continue;
>
> + if (stream->apply_seamless_boot_optimization)
> + continue;
> +
>   // only looking for first odm pipe
>   if (pipe->prev_odm_pipe)
>   continue;
> --
> 2.25.1



[PATCH v3 2/2] drm/amdkfd: get doorbell's absolute offset based on the db size

2023-10-04 Thread Arvind Yadav
Align the absolute doorbell offset based on the doorbell's size,
so that the doorbell offset is computed correctly for both
32-bit and 64-bit doorbells.

v2:
- Addressed the review comment from Felix.
v3:
- Adding doorbell_size as parameter to get db absolute offset.

Cc: Christian Koenig 
Cc: Alex Deucher 
Signed-off-by: Shashank Sharma 
Signed-off-by: Arvind Yadav 
---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   |  6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c   | 13 +++--
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  4 +++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 0d3d538b64eb..690ff131fe4b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -346,6 +346,7 @@ static int allocate_doorbell(struct qcm_process_device *qpd,
 uint32_t const *restore_id)
 {
struct kfd_node *dev = qpd->dqm->dev;
+   uint32_t doorbell_size;
 
if (!KFD_IS_SOC15(dev)) {
/* On pre-SOC15 chips we need to use the queue ID to
@@ -405,9 +406,12 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd,
}
}
 
+   doorbell_size = dev->kfd->device_info.doorbell_size;
+
q->properties.doorbell_off = amdgpu_doorbell_index_on_bar(dev->adev,
  
qpd->proc_doorbells,
- 
q->doorbell_id);
+ 
q->doorbell_id,
+ 
doorbell_size);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 7b38537c7c99..59dd76c4b138 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -161,7 +161,10 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
if (inx >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
return NULL;
 
-   *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev, kfd->doorbells, 
inx);
+   *doorbell_off = amdgpu_doorbell_index_on_bar(kfd->adev,
+kfd->doorbells,
+inx,
+
kfd->device_info.doorbell_size);
inx *= 2;
 
pr_debug("Get kernel queue doorbell\n"
@@ -233,6 +236,7 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)
 {
struct amdgpu_device *adev = pdd->dev->adev;
uint32_t first_db_index;
+   uint32_t doorbell_size;
 
if (!pdd->qpd.proc_doorbells) {
if (kfd_alloc_process_doorbells(pdd->dev->kfd, pdd))
@@ -240,7 +244,12 @@ phys_addr_t kfd_get_process_doorbells(struct 
kfd_process_device *pdd)
return 0;
}
 
-   first_db_index = amdgpu_doorbell_index_on_bar(adev, 
pdd->qpd.proc_doorbells, 0);
+   doorbell_size = pdd->dev->kfd->device_info.doorbell_size;
+
+   first_db_index = amdgpu_doorbell_index_on_bar(adev,
+ pdd->qpd.proc_doorbells,
+ 0,
+ doorbell_size);
return adev->doorbell.base + first_db_index * sizeof(uint32_t);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index adb5e4bdc0b2..010cd8e8e6a1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -375,9 +375,11 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 * relative doorbell index = Absolute doorbell index -
 * absolute index of first doorbell in the page.
 */
+   uint32_t doorbell_size = 
pdd->dev->kfd->device_info.doorbell_size;
uint32_t first_db_index = 
amdgpu_doorbell_index_on_bar(pdd->dev->adev,
   
pdd->qpd.proc_doorbells,
-  0);
+  0,
+  
doorbell_size);
 
*p_doorbell_offset_in_process = (q->properties.doorbell_off
- first_db_index) * 
sizeof(uint32_t);
-- 
2.34.1



[PATCH v3 1/2] drm/amdgpu: Adding db_size to get doorbell absolute offset

2023-10-04 Thread Arvind Yadav
Pass db_size in bytes when computing the doorbell's absolute
offset, so that the offset is aligned correctly for both
32-bit and 64-bit doorbells.

v3:
- Adding db_size as parameter to get db absolute offset.

Cc: Christian Koenig 
Cc: Alex Deucher 
Signed-off-by: Shashank Sharma 
Signed-off-by: Arvind Yadav 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h |  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c | 13 +
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 09f6727e7c73..4a8b33f55f6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -357,8 +357,9 @@ int amdgpu_doorbell_init(struct amdgpu_device *adev);
 void amdgpu_doorbell_fini(struct amdgpu_device *adev);
 int amdgpu_doorbell_create_kernel_doorbells(struct amdgpu_device *adev);
 uint32_t amdgpu_doorbell_index_on_bar(struct amdgpu_device *adev,
-  struct amdgpu_bo *db_bo,
-  uint32_t doorbell_index);
+ struct amdgpu_bo *db_bo,
+ uint32_t doorbell_index,
+ uint32_t db_size);
 
 #define RDOORBELL32(index) amdgpu_mm_rdoorbell(adev, (index))
 #define WDOORBELL32(index, v) amdgpu_mm_wdoorbell(adev, (index), (v))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
index da4be0bbb446..6690f5a72f4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
@@ -114,19 +114,24 @@ void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, 
u32 index, u64 v)
  * @adev: amdgpu_device pointer
  * @db_bo: doorbell object's bo
  * @db_index: doorbell relative index in this doorbell object
+ * @db_size: doorbell size is in byte
  *
  * returns doorbell's absolute index in BAR
  */
 uint32_t amdgpu_doorbell_index_on_bar(struct amdgpu_device *adev,
-  struct amdgpu_bo *db_bo,
-  uint32_t doorbell_index)
+ struct amdgpu_bo *db_bo,
+ uint32_t doorbell_index,
+ uint32_t db_size)
 {
int db_bo_offset;
 
db_bo_offset = amdgpu_bo_gpu_offset_no_check(db_bo);
 
-   /* doorbell index is 32 bit but doorbell's size is 64-bit, so *2 */
-   return db_bo_offset / sizeof(u32) + doorbell_index * 2;
+   /* doorbell index is 32 bit but doorbell's size can be 32 bit
+* or 64 bit, so *db_size(in byte)/4 for alignment.
+*/
+   return db_bo_offset / sizeof(u32) + doorbell_index *
+  DIV_ROUND_UP(db_size, 4);
 }
 
 /**
-- 
2.34.1
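
A standalone worked example of the index math added above, using invented offsets.
It assumes the same convention as the patch: the doorbell BAR is addressed in
32-bit dwords, and DIV_ROUND_UP(db_size, 4) converts the per-doorbell byte size
into a dword stride (2 for 64-bit doorbells, 1 for the 32-bit doorbells of older
chips).

#include <stdint.h>
#include <stdio.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Same arithmetic as amdgpu_doorbell_index_on_bar() in the patch. */
static uint32_t index_on_bar(uint32_t db_bo_offset, uint32_t doorbell_index,
			     uint32_t db_size)
{
	return db_bo_offset / sizeof(uint32_t) +
	       doorbell_index * DIV_ROUND_UP(db_size, 4);
}

int main(void)
{
	/* 64-bit doorbells: queue index 3 lands 6 dwords into the BO. */
	printf("%u\n", (unsigned)index_on_bar(4096, 3, 8));	/* 1024 + 6 = 1030 */
	/* 32-bit doorbells (gfx8): queue index 3 lands 3 dwords in. */
	printf("%u\n", (unsigned)index_on_bar(4096, 3, 4));	/* 1024 + 3 = 1027 */
	return 0;
}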



[PATCH v3 0/2] drm/amdkfd: Fix unaligned doorbell absolute offset for gfx8

2023-10-04 Thread Arvind Yadav
On older chips, the absolute doorbell offset within
the doorbell page is based on the queue ID.
KFD uses the queue ID and the doorbell size to compute the
absolute doorbell offset exposed to userspace.

Pass db_size in bytes when computing the doorbell's absolute
offset, so that the offset is aligned correctly for both
32-bit and 64-bit doorbells.

v2:
- Addressed the review comment from Felix.

v3:
- Adding doorbell_size as parameter to get db absolute offset. 

Arvind Yadav (2):
  drm/amdgpu: Adding db_size to get doorbell absolute offset
  drm/amdkfd: get doorbell's absolute offset based on the db size

 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h|  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c| 13 +
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   |  6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c   | 13 +++--
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  4 +++-
 5 files changed, 31 insertions(+), 10 deletions(-)

-- 
2.34.1



[PATCH] drm/amdgpu: Improve MES responsiveness during oversubscription

2023-10-04 Thread Jay Cornwall
When MES is oversubscribed, it may not check frequently for new
command submissions from the driver if the scheduling load is high.
Response latency as high as 5 seconds has been observed.

Enable a flag which adds a check for new commands between
scheduling quantums.

Signed-off-by: Jay Cornwall 
Cc: Alexandru Tudor 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 4a3020b5b30f..31b26e6f0b30 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -406,6 +406,7 @@ static int mes_v11_0_set_hw_resources(struct amdgpu_mes 
*mes)
mes_set_hw_res_pkt.disable_mes_log = 1;
mes_set_hw_res_pkt.use_different_vmid_compute = 1;
mes_set_hw_res_pkt.enable_reg_active_poll = 1;
+   mes_set_hw_res_pkt.enable_level_process_quantum_check = 1;
mes_set_hw_res_pkt.oversubscription_timer = 50;
 
return mes_v11_0_submit_pkt_and_poll_completion(mes,
-- 
2.25.1



[PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)

2023-10-04 Thread Alex Deucher
From: Kun Liu 

When ROCm is active enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

v2: move to the swsmu code since we need both bits active in
the workload mask.

Signed-off-by: Alex Deucher 
---
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c| 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 684b4e01fac2..83035fb1839a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -2447,6 +2447,7 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
DpmActivityMonitorCoeffInt_t *activity_monitor =
&(activity_monitor_external.DpmActivityMonitorCoeffInt);
int workload_type, ret = 0;
+   u32 workload_mask;
 
smu->power_profile_mode = input[size];
 
@@ -2536,9 +2537,23 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
if (workload_type < 0)
return -EINVAL;
 
+   workload_mask = 1 << workload_type;
+
+   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
+   if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE &&
+   (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) 
&&
+   ((smu->adev->pm.fw_version == 0x004e6601) ||
+(smu->adev->pm.fw_version >= 0x004e7300))) {
+   workload_type = smu_cmn_to_asic_specific_index(smu,
+  
CMN2ASIC_MAPPING_WORKLOAD,
+  
PP_SMC_POWER_PROFILE_POWERSAVING);
+   if (workload_type >= 0)
+   workload_mask |= 1 << workload_type;
+   }
+
return smu_cmn_send_smc_msg_with_param(smu,
   SMU_MSG_SetWorkloadMask,
-  1 << workload_type,
+  workload_mask,
   NULL);
 }
 
-- 
2.41.0
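
To make the point of the discussion below concrete, a tiny sketch of the mask being
built: both the compute and powersave bits end up set in a single SetWorkloadMask
value rather than the profile being switched twice. The bit positions here are
assumptions for illustration only; the real indices come from
smu_cmn_to_asic_specific_index().

#include <stdint.h>
#include <stdio.h>

/* Assumed bit positions, purely for illustration. */
enum { WORKLOAD_COMPUTE_BIT = 5, WORKLOAD_POWERSAVING_BIT = 1 };

int main(void)
{
	uint32_t workload_mask = 1u << WORKLOAD_COMPUTE_BIT;

	/* On the firmware versions listed in the patch, also fold in the
	 * otherwise-unused powersave profile bit. */
	workload_mask |= 1u << WORKLOAD_POWERSAVING_BIT;

	printf("mask = 0x%02x\n", (unsigned)workload_mask);	/* prints 0x22 */
	return 0;
}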



Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active

2023-10-04 Thread Alex Deucher
On Wed, Oct 4, 2023 at 10:51 AM Alex Deucher  wrote:
>
> On Wed, Oct 4, 2023 at 10:46 AM Wang, Yang(Kevin)
>  wrote:
> >
> > Hi Alex,
> >
> > why need to switch profile twice for smu 13.0.0 ? in idle state : set 
> > compute profile then set power save profile?
> > Afaik, Pmfw always uses the last set result to represent the current 
> > profile.
> > But it shouldn't affect the results. Anyway.
>
> We want to be setting both workload bits (powersave and compute).  I
> guess this won't work as is?  I thought the code or'ed both bits.

Confirming this won't work as expected.  Will rework and resend.

Alex

>
> Alex
>
>
> >
> > Reviewed-by: Yang Wang 
> >
> > Best Regards,
> > Kevin
> >
-----Original Message-----
> > From: amd-gfx  On Behalf Of Alex 
> > Deucher
> > Sent: Wednesday, October 4, 2023 10:04 PM
> > To: Deucher, Alexander 
> > Cc: amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm 
> > is active
> >
> > Ping?
> >
> > On Tue, Oct 3, 2023 at 6:47 PM Alex Deucher  
> > wrote:
> > >
> > > When ROCm is active enable additional SMU 13.0.0 optimizations.
> > > This reuses the unused powersave profile on PMFW.
> > >
> > > Signed-off-by: Alex Deucher 
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > > index 38b5457baded..b6c0c42de725 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > > @@ -714,6 +714,14 @@ void amdgpu_amdkfd_set_compute_idle(struct 
> > > amdgpu_device *adev, bool idle)
> > > amdgpu_dpm_switch_power_profile(adev,
> > > PP_SMC_POWER_PROFILE_COMPUTE,
> > > !idle);
> > > +   /* Add optimizations for SMU13.0.0.  Reuse the power saving 
> > > profile */
> > > +   if ((amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 
> > > 0)) &&
> > > +   ((adev->pm.fw_version == 0x004e6601) ||
> > > +(adev->pm.fw_version >= 0x004e7300))) {
> > > +   amdgpu_dpm_switch_power_profile(adev,
> > > +   
> > > PP_SMC_POWER_PROFILE_POWERSAVING,
> > > +   !idle);
> > > +   }
> > >  }
> > >
> > >  bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
> > > --
> > > 2.41.0
> > >


Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active

2023-10-04 Thread Christian König

Am 03.10.23 um 21:07 schrieb Alex Deucher:

When ROCm is active enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

Signed-off-by: Alex Deucher 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 38b5457baded..b6c0c42de725 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -714,6 +714,14 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device 
*adev, bool idle)
amdgpu_dpm_switch_power_profile(adev,
PP_SMC_POWER_PROFILE_COMPUTE,
!idle);
+   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
+   if ((amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) &&
+   ((adev->pm.fw_version == 0x004e6601) ||
+(adev->pm.fw_version >= 0x004e7300))) {
+   amdgpu_dpm_switch_power_profile(adev,
+   
PP_SMC_POWER_PROFILE_POWERSAVING,
+   !idle);
+   }
  }
  
  bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)




Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active

2023-10-04 Thread Alex Deucher
On Wed, Oct 4, 2023 at 10:46 AM Wang, Yang(Kevin)
 wrote:
>
> Hi Alex,
>
> why need to switch profile twice for smu 13.0.0 ? in idle state : set compute 
> profile then set power save profile?
> Afaik, Pmfw always uses the last set result to represent the current profile.
> But it shouldn't affect the results. Anyway.

We want to be setting both workload bits (powersave and compute).  I
guess this won't work as is?  I thought the code or'ed both bits.

Alex


>
> Reviewed-by: Yang Wang 
>
> Best Regards,
> Kevin
>
-----Original Message-----
> From: amd-gfx  On Behalf Of Alex 
> Deucher
> Sent: Wednesday, October 4, 2023 10:04 PM
> To: Deucher, Alexander 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is 
> active
>
> Ping?
>
> On Tue, Oct 3, 2023 at 6:47 PM Alex Deucher  wrote:
> >
> > When ROCm is active enable additional SMU 13.0.0 optimizations.
> > This reuses the unused powersave profile on PMFW.
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index 38b5457baded..b6c0c42de725 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -714,6 +714,14 @@ void amdgpu_amdkfd_set_compute_idle(struct 
> > amdgpu_device *adev, bool idle)
> > amdgpu_dpm_switch_power_profile(adev,
> > PP_SMC_POWER_PROFILE_COMPUTE,
> > !idle);
> > +   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile 
> > */
> > +   if ((amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) 
> > &&
> > +   ((adev->pm.fw_version == 0x004e6601) ||
> > +(adev->pm.fw_version >= 0x004e7300))) {
> > +   amdgpu_dpm_switch_power_profile(adev,
> > +   
> > PP_SMC_POWER_PROFILE_POWERSAVING,
> > +   !idle);
> > +   }
> >  }
> >
> >  bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
> > --
> > 2.41.0
> >


RE: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active

2023-10-04 Thread Wang, Yang(Kevin)
Hi Alex,

Why do we need to switch the profile twice for SMU 13.0.0? In the idle
state: set the compute profile and then the power save profile?
AFAIK, PMFW always uses the last set result to represent the current profile.
But it shouldn't affect the results, anyway.

Reviewed-by: Yang Wang 

Best Regards,
Kevin

-----Original Message-----
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Wednesday, October 4, 2023 10:04 PM
To: Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is 
active

Ping?

On Tue, Oct 3, 2023 at 6:47 PM Alex Deucher  wrote:
>
> When ROCm is active enable additional SMU 13.0.0 optimizations.
> This reuses the unused powersave profile on PMFW.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 38b5457baded..b6c0c42de725 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -714,6 +714,14 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device 
> *adev, bool idle)
> amdgpu_dpm_switch_power_profile(adev,
> PP_SMC_POWER_PROFILE_COMPUTE,
> !idle);
> +   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
> +   if ((amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) &&
> +   ((adev->pm.fw_version == 0x004e6601) ||
> +(adev->pm.fw_version >= 0x004e7300))) {
> +   amdgpu_dpm_switch_power_profile(adev,
> +   
> PP_SMC_POWER_PROFILE_POWERSAVING,
> +   !idle);
> +   }
>  }
>
>  bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
> --
> 2.41.0
>


Re: [PATCH] drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active

2023-10-04 Thread Alex Deucher
Ping?

On Tue, Oct 3, 2023 at 6:47 PM Alex Deucher  wrote:
>
> When ROCm is active enable additional SMU 13.0.0 optimizations.
> This reuses the unused powersave profile on PMFW.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 38b5457baded..b6c0c42de725 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -714,6 +714,14 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device 
> *adev, bool idle)
> amdgpu_dpm_switch_power_profile(adev,
> PP_SMC_POWER_PROFILE_COMPUTE,
> !idle);
> +   /* Add optimizations for SMU13.0.0.  Reuse the power saving profile */
> +   if ((amdgpu_ip_version(adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0)) &&
> +   ((adev->pm.fw_version == 0x004e6601) ||
> +(adev->pm.fw_version >= 0x004e7300))) {
> +   amdgpu_dpm_switch_power_profile(adev,
> +   
> PP_SMC_POWER_PROFILE_POWERSAVING,
> +   !idle);
> +   }
>  }
>
>  bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
> --
> 2.41.0
>


Re: [PATCH v2] drm/amdgpu: fix ip count query for xcp partitions

2023-10-04 Thread Sundararaju, Sathishkumar

Hi Christian,

Thank you for explaining, I understand it now.

Regards,

Sathish

On 10/4/2023 7:23 PM, Christian König wrote:

Hi Sathish,

an ack from a maintainer basically means "go ahead, push it to a 
branch" (in this case to amd-staging-drm-next).


A reviewed-by means "I've verified the technical background and think 
that this is correct".


A RB is indeed better, but not always necessary.

Regards,
Christian.

Am 03.10.23 um 18:43 schrieb Sundararaju, Sathishkumar:


Hi Alex,

My apology, I was under the impression that RB is a must. I 
understand now that ACK is good, checked with Leo after your 
response. Thank you.



Regards,

Sathish


On 10/3/2023 10:01 PM, Alex Deucher wrote:

On Tue, Oct 3, 2023 at 12:22 PM Sundararaju, Sathishkumar
  wrote:

Hi ,

Kind request to help review the change. Thank you.

I acked this change back when you sent it out, but if it didn't come
through for some reason:
Acked-by: Alex Deucher


Regards,

Sathish

On 9/21/2023 8:17 PM, Alex Deucher wrote:

On Thu, Sep 21, 2023 at 9:07 AM Sathishkumar S
  wrote:

fix wrong ip count INFO on spatial partitions. update the query
to return the instance count corresponding to the partition id.

v2:
   initialize variables only when required to be (Christian)
   move variable declarations to the beginning of function (Christian)

Signed-off-by: Sathishkumar S

Acked-by: Alex Deucher


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 44 -
   1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 081bd28e2443..d4ccbe7c78d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -595,11 +595,16 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  struct drm_amdgpu_info *info = data;
  struct amdgpu_mode_info *minfo = &adev->mode_info;
  void __user *out = (void __user *)(uintptr_t)info->return_pointer;
+   struct amdgpu_fpriv *fpriv;
+   struct amdgpu_ip_block *ip_block;
+   enum amd_ip_block_type type;
+   struct amdgpu_xcp *xcp;
+   uint32_t count, inst_mask;
  uint32_t size = info->return_size;
  struct drm_crtc *crtc;
  uint32_t ui32 = 0;
  uint64_t ui64 = 0;
-   int i, found;
+   int i, found, ret;
  int ui32_size = sizeof(ui32);

  if (!info->return_size || !info->return_pointer)
@@ -627,7 +632,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  return copy_to_user(out, &ui32, min(size, 4u)) ? -EFAULT : 0;
  case AMDGPU_INFO_HW_IP_INFO: {
  struct drm_amdgpu_info_hw_ip ip = {};
-   int ret;

  ret = amdgpu_hw_ip_info(adev, info, &ip);
  if (ret)
@@ -637,15 +641,41 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  return ret ? -EFAULT : 0;
  }
  case AMDGPU_INFO_HW_IP_COUNT: {
-   enum amd_ip_block_type type;
-   struct amdgpu_ip_block *ip_block = NULL;
-   uint32_t count = 0;
-
+   fpriv = (struct amdgpu_fpriv *) filp->driver_priv;
  type = amdgpu_ip_get_block_type(adev, info->query_hw_ip.type);
  ip_block = amdgpu_device_ip_get_ip_block(adev, type);
+
  if (!ip_block || !ip_block->status.valid)
  return -EINVAL;

+   if (adev->xcp_mgr && adev->xcp_mgr->num_xcps > 0 &&
+   fpriv->xcp_id >= 0 && fpriv->xcp_id < 
adev->xcp_mgr->num_xcps) {
+   xcp = &adev->xcp_mgr->xcp[fpriv->xcp_id];
+   switch (type) {
+   case AMD_IP_BLOCK_TYPE_GFX:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_GFX, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   case AMD_IP_BLOCK_TYPE_SDMA:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_SDMA, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   case AMD_IP_BLOCK_TYPE_JPEG:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_VCN, &inst_mask);
+   count = hweight32(inst_mask) * 
adev->jpeg.num_jpeg_rings;
+   break;
+   case AMD_IP_BLOCK_TYPE_VCN:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_VCN, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   default:
+   return -EINVAL;
+   }
+   

Re: [PATCH v2] drm/amdgpu: fix ip count query for xcp partitions

2023-10-04 Thread Christian König

Hi Sathish,

an ack from a maintainer basically means "go ahead, push it to a branch" 
(in this case to amd-staging-drm-next).


A reviewed-by means "I've verified the technical background and think 
that this is correct".


A RB is indeed better, but not always necessary.

Regards,
Christian.

Am 03.10.23 um 18:43 schrieb Sundararaju, Sathishkumar:


Hi Alex,

My apology, I was under the impression that RB is a must. I understand 
now that ACK is good, checked with Leo after your response. Thank you.



Regards,

Sathish


On 10/3/2023 10:01 PM, Alex Deucher wrote:

On Tue, Oct 3, 2023 at 12:22 PM Sundararaju, Sathishkumar
  wrote:

Hi ,

Kind request to help review the change. Thank you.

I acked this change back when you sent it out, but if it didn't come
through for some reason:
Acked-by: Alex Deucher


Regards,

Sathish

On 9/21/2023 8:17 PM, Alex Deucher wrote:

On Thu, Sep 21, 2023 at 9:07 AM Sathishkumar S
  wrote:

fix wrong ip count INFO on spatial partitions. update the query
to return the instance count corresponding to the partition id.

v2:
   initialize variables only when required to be (Christian)
   move variable declarations to the beginning of function (Christian)

Signed-off-by: Sathishkumar S

Acked-by: Alex Deucher


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 44 -
   1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 081bd28e2443..d4ccbe7c78d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -595,11 +595,16 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  struct drm_amdgpu_info *info = data;
  struct amdgpu_mode_info *minfo = &adev->mode_info;
  void __user *out = (void __user *)(uintptr_t)info->return_pointer;
+   struct amdgpu_fpriv *fpriv;
+   struct amdgpu_ip_block *ip_block;
+   enum amd_ip_block_type type;
+   struct amdgpu_xcp *xcp;
+   uint32_t count, inst_mask;
  uint32_t size = info->return_size;
  struct drm_crtc *crtc;
  uint32_t ui32 = 0;
  uint64_t ui64 = 0;
-   int i, found;
+   int i, found, ret;
  int ui32_size = sizeof(ui32);

  if (!info->return_size || !info->return_pointer)
@@ -627,7 +632,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  return copy_to_user(out, &ui32, min(size, 4u)) ? -EFAULT : 0;
  case AMDGPU_INFO_HW_IP_INFO: {
  struct drm_amdgpu_info_hw_ip ip = {};
-   int ret;

  ret = amdgpu_hw_ip_info(adev, info, &ip);
  if (ret)
@@ -637,15 +641,41 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
  return ret ? -EFAULT : 0;
  }
  case AMDGPU_INFO_HW_IP_COUNT: {
-   enum amd_ip_block_type type;
-   struct amdgpu_ip_block *ip_block = NULL;
-   uint32_t count = 0;
-
+   fpriv = (struct amdgpu_fpriv *) filp->driver_priv;
  type = amdgpu_ip_get_block_type(adev, info->query_hw_ip.type);
  ip_block = amdgpu_device_ip_get_ip_block(adev, type);
+
  if (!ip_block || !ip_block->status.valid)
  return -EINVAL;

+   if (adev->xcp_mgr && adev->xcp_mgr->num_xcps > 0 &&
+   fpriv->xcp_id >= 0 && fpriv->xcp_id < 
adev->xcp_mgr->num_xcps) {
+   xcp = &adev->xcp_mgr->xcp[fpriv->xcp_id];
+   switch (type) {
+   case AMD_IP_BLOCK_TYPE_GFX:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_GFX, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   case AMD_IP_BLOCK_TYPE_SDMA:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_SDMA, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   case AMD_IP_BLOCK_TYPE_JPEG:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_VCN, &inst_mask);
+   count = hweight32(inst_mask) * 
adev->jpeg.num_jpeg_rings;
+   break;
+   case AMD_IP_BLOCK_TYPE_VCN:
+   ret = amdgpu_xcp_get_inst_details(xcp, 
AMDGPU_XCP_VCN, &inst_mask);
+   count = hweight32(inst_mask);
+   break;
+   default:
+   return -EINVAL;
+   }
+   if (ret)
+   return ret;
+   return copy_to_user(out, &count, min(size, 4

[PATCH 3/4] drm/amdgpu: Add more FRU field information

2023-10-04 Thread Lijo Lazar
Add support to read Manufacturer Name and FRU File Id fields. Also add
sysfs device attributes for external usage.

Signed-off-by: Lijo Lazar 
---
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 52 +--
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h|  2 +
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 79ba74dfc576..5d627d0e19a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -200,10 +200,19 @@ int amdgpu_fru_get_product_info(struct amdgpu_device 
*adev)
 
/* Now extract useful information from the PIA.
 *
-* Skip the Manufacturer Name at [3] and go directly to
-* the Product Name field.
+* Read Manufacturer Name field whose length is [3].
 */
-   addr = 3 + 1 + (pia[3] & 0x3F);
+   addr = 3;
+   if (addr + 1 >= len)
+   goto Out;
+   memcpy(fru_info->manufacturer_name, pia + addr + 1,
+  min_t(size_t, sizeof(fru_info->manufacturer_name),
+pia[addr] & 0x3F));
+   fru_info->manufacturer_name[sizeof(fru_info->manufacturer_name) - 1] =
+   '\0';
+
+   /* Read Product Name field. */
+   addr += 1 + (pia[addr] & 0x3F);
if (addr + 1 >= len)
goto Out;
memcpy(fru_info->product_name, pia + addr + 1,
@@ -229,6 +238,18 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
memcpy(fru_info->serial, pia + addr + 1,
   min_t(size_t, sizeof(fru_info->serial), pia[addr] & 0x3F));
fru_info->serial[sizeof(fru_info->serial) - 1] = '\0';
+
+   /* Asset Tag field */
+   addr += 1 + (pia[addr] & 0x3F);
+
+   /* FRU File Id field. This could be 'null'. */
+   addr += 1 + (pia[addr] & 0x3F);
+   if ((addr + 1 >= len) || !(pia[addr] & 0x3F))
+   goto Out;
+   memcpy(fru_info->fru_id, pia + addr + 1,
+  min_t(size_t, sizeof(fru_info->fru_id), pia[addr] & 0x3F));
+   fru_info->fru_id[sizeof(fru_info->fru_id) - 1] = '\0';
+
 Out:
kfree(pia);
return 0;
@@ -300,10 +321,35 @@ static ssize_t amdgpu_fru_serial_number_show(struct 
device *dev,
 
 static DEVICE_ATTR(serial_number, 0444, amdgpu_fru_serial_number_show, NULL);
 
+static ssize_t amdgpu_fru_id_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+
+   return sysfs_emit(buf, "%s\n", adev->fru_info->fru_id);
+}
+
+static DEVICE_ATTR(fru_id, 0444, amdgpu_fru_id_show, NULL);
+
+static ssize_t amdgpu_fru_manufacturer_name_show(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+
+   return sysfs_emit(buf, "%s\n", adev->fru_info->manufacturer_name);
+}
+
+static DEVICE_ATTR(manufacturer, 0444, amdgpu_fru_manufacturer_name_show, 
NULL);
+
 static const struct attribute *amdgpu_fru_attributes[] = {
&dev_attr_product_name.attr,
&dev_attr_product_number.attr,
&dev_attr_serial_number.attr,
+   &dev_attr_fru_id.attr,
+   &dev_attr_manufacturer.attr,
NULL
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h
index c99c74811c78..bc58dca18035 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h
@@ -31,6 +31,8 @@ struct amdgpu_fru_info {
charproduct_number[20];
charproduct_name[AMDGPU_PRODUCT_NAME_LEN];
charserial[20];
+   charmanufacturer_name[32];
+   charfru_id[32];
 };
 
 int amdgpu_fru_get_product_info(struct amdgpu_device *adev);
-- 
2.25.1
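
For reference, a self-contained sketch of the type/length walk this patch performs:
each Product Info Area field starts with a byte whose low six bits give the field
length (hence the pia[addr] & 0x3F masks above), the data follows immediately, and
the next field begins at addr + 1 + length. The sample buffer below is invented for
illustration and is not a real FRU image, which also carries header bytes before
the first field.

#include <stdio.h>

/* Copy one field out of a FRU-style buffer; returns the offset of the
 * next field, or 0 if the field would run past the end of the buffer. */
static size_t copy_field(const unsigned char *pia, size_t len, size_t addr,
			 char *out, size_t out_sz)
{
	size_t field_len = pia[addr] & 0x3F;	/* low 6 bits = field length */

	if (addr + 1 + field_len > len)
		return 0;
	snprintf(out, out_sz, "%.*s", (int)field_len, &pia[addr + 1]);
	return addr + 1 + field_len;
}

int main(void)
{
	/* [len|data] fields laid out back to back: "AMD" then "Navi". */
	const unsigned char pia[] = { 0x03, 'A', 'M', 'D',
				      0x04, 'N', 'a', 'v', 'i' };
	char manufacturer[32], product[32];
	size_t next;

	next = copy_field(pia, sizeof(pia), 0, manufacturer, sizeof(manufacturer));
	copy_field(pia, sizeof(pia), next, product, sizeof(product));
	printf("%s / %s\n", manufacturer, product);	/* prints "AMD / Navi" */
	return 0;
}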



[PATCH 2/4] drm/amdgpu: Refactor FRU product information

2023-10-04 Thread Lijo Lazar
Keep FRU-related information together in a separate structure.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  8 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|  3 ++
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 46 +++
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h|  9 
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 --
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  2 -
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  2 -
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  |  2 -
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c  |  2 -
 9 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d23fb4b5ad95..6b5dd5f9964a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -771,8 +771,8 @@ struct amdgpu_mqd {
 
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
-#define AMDGPU_PRODUCT_NAME_LEN 64
 struct amdgpu_reset_domain;
+struct amdgpu_fru_info;
 
 /*
  * Non-zero (true) if the GPU has VRAM. Zero (false) otherwise.
@@ -1056,11 +1056,7 @@ struct amdgpu_device {
 
boolucode_sysfs_en;
 
-   /* Chip product information */
-   charproduct_number[20];
-   charproduct_name[AMDGPU_PRODUCT_NAME_LEN];
-   charserial[20];
-
+   struct amdgpu_fru_info  *fru_info;
atomic_tthrottling_logging_enabled;
struct ratelimit_state  throttling_logging_rs;
uint32_tras_hw_enabled;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 27c95bb02411..0cb702c3046a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4274,6 +4274,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
kfree(adev->bios);
adev->bios = NULL;
 
+   kfree(adev->fru_info);
+   adev->fru_info = NULL;
+
px = amdgpu_device_supports_px(adev_to_drm(adev));
 
if (px || (!pci_is_thunderbolt_attached(adev->pdev) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index d0ae9cada110..79ba74dfc576 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -109,6 +109,7 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev, u32 *fru_addr)
 
 int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
 {
+   struct amdgpu_fru_info *fru_info;
unsigned char buf[8], *pia;
u32 addr, fru_addr;
int size, len;
@@ -117,6 +118,19 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
if (!is_fru_eeprom_supported(adev, &fru_addr))
return 0;
 
+   if (!adev->fru_info) {
+   adev->fru_info = kzalloc(sizeof(*adev->fru_info), GFP_KERNEL);
+   if (!adev->fru_info)
+   return -ENOMEM;
+   }
+
+   fru_info = adev->fru_info;
+   /* For Arcturus-and-later, default value of serial_number is unique_id
+* so convert it to a 16-digit HEX string for convenience and
+* backwards-compatibility.
+*/
+   sprintf(fru_info->serial, "%llx", adev->unique_id);
+
/* If algo exists, it means that the i2c_adapter's initialized */
if (!adev->pm.fru_eeprom_i2c_bus || !adev->pm.fru_eeprom_i2c_bus->algo) 
{
DRM_WARN("Cannot access FRU, EEPROM accessor not initialized");
@@ -192,21 +206,18 @@ int amdgpu_fru_get_product_info(struct amdgpu_device 
*adev)
addr = 3 + 1 + (pia[3] & 0x3F);
if (addr + 1 >= len)
goto Out;
-   memcpy(adev->product_name, pia + addr + 1,
-  min_t(size_t,
-sizeof(adev->product_name),
-pia[addr] & 0x3F));
-   adev->product_name[sizeof(adev->product_name) - 1] = '\0';
+   memcpy(fru_info->product_name, pia + addr + 1,
+  min_t(size_t, sizeof(fru_info->product_name), pia[addr] & 0x3F));
+   fru_info->product_name[sizeof(fru_info->product_name) - 1] = '\0';
 
/* Go to the Product Part/Model Number field. */
addr += 1 + (pia[addr] & 0x3F);
if (addr + 1 >= len)
goto Out;
-   memcpy(adev->product_number, pia + addr + 1,
-  min_t(size_t,
-sizeof(adev->product_number),
+   memcpy(fru_info->product_number, pia + addr + 1,
+  min_t(size_t, sizeof(fru_info->product_number),
 pia[addr] & 0x3F));
-   adev->product_number[sizeof(adev->product_number) - 1] = '\0';
+   fru_info->product_number[sizeof(fru_info->product_number) - 1] = '\0';
 
/* Go to the Product Version field. */

[PATCH 4/4] Documentation/amdgpu: Add FRU attribute details

2023-10-04 Thread Lijo Lazar
Add documentation for the newly added manufacturer and fru_id attributes
in sysfs.

Signed-off-by: Lijo Lazar 
---
 Documentation/gpu/amdgpu/driver-misc.rst  | 12 
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 19 +++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/gpu/amdgpu/driver-misc.rst 
b/Documentation/gpu/amdgpu/driver-misc.rst
index 82b47f1818ac..e40e15f89fd3 100644
--- a/Documentation/gpu/amdgpu/driver-misc.rst
+++ b/Documentation/gpu/amdgpu/driver-misc.rst
@@ -26,6 +26,18 @@ serial_number
 .. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
:doc: serial_number
 
+fru_id
+------
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+   :doc: fru_id
+
+manufacturer
+-
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+   :doc: manufacturer
+
 unique_id
 -
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 5d627d0e19a4..d635e61805ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -321,6 +321,16 @@ static ssize_t amdgpu_fru_serial_number_show(struct device 
*dev,
 
 static DEVICE_ATTR(serial_number, 0444, amdgpu_fru_serial_number_show, NULL);
 
+/**
+ * DOC: fru_id
+ *
+ * The amdgpu driver provides a sysfs API for reporting FRU File Id
+ * for the device.
+ * The file fru_id is used for this and returns the File Id value
+ * as returned from the FRU.
+ * NOTE: This is only available for certain server cards
+ */
+
 static ssize_t amdgpu_fru_id_show(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
@@ -332,6 +342,15 @@ static ssize_t amdgpu_fru_id_show(struct device *dev,
 
 static DEVICE_ATTR(fru_id, 0444, amdgpu_fru_id_show, NULL);
 
+/**
+ * DOC: manufacturer
+ *
+ * The amdgpu driver provides a sysfs API for reporting manufacturer name from
+ * FRU information.
+ * The file manufacturer returns the value as returned from the FRU.
+ * NOTE: This is only available for certain server cards
+ */
+
 static ssize_t amdgpu_fru_manufacturer_name_show(struct device *dev,
 struct device_attribute *attr,
 char *buf)
-- 
2.25.1



[PATCH 1/4] drm/amdgpu: enable FRU device for SMU v13.0.6

2023-10-04 Thread Lijo Lazar
From: Yang Wang 

v1:
enable GFX v9.4.3 FRU device to query board information.

v2:
use MP1 version to identify different ASICs

Signed-off-by: Yang Wang 
Reviewed-by: Lijo Lazar 
Acked-by: Alex Deucher 
---
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 48 +++
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 7cd0dfaeee20..d0ae9cada110 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -57,27 +57,26 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev, u32 *fru_addr)
 * for ease/speed/readability. For now, 2 string comparisons are
 * reasonable and not too expensive
 */
-   switch (adev->asic_type) {
-   case CHIP_VEGA20:
-   /* D161 and D163 are the VG20 server SKUs */
-   if (strnstr(atom_ctx->vbios_pn, "D161",
-   sizeof(atom_ctx->vbios_pn)) ||
-   strnstr(atom_ctx->vbios_pn, "D163",
-   sizeof(atom_ctx->vbios_pn))) {
-   if (fru_addr)
-   *fru_addr = FRU_EEPROM_MADDR_6;
-   return true;
-   } else {
+   switch (amdgpu_ip_version(adev, MP1_HWIP, 0)) {
+   case IP_VERSION(11, 0, 2):
+   switch (adev->asic_type) {
+   case CHIP_VEGA20:
+   /* D161 and D163 are the VG20 server SKUs */
+   if (strnstr(atom_ctx->vbios_pn, "D161",
+   sizeof(atom_ctx->vbios_pn)) ||
+   strnstr(atom_ctx->vbios_pn, "D163",
+   sizeof(atom_ctx->vbios_pn))) {
+   if (fru_addr)
+   *fru_addr = FRU_EEPROM_MADDR_6;
+   return true;
+   } else {
+   return false;
+   }
+   case CHIP_ARCTURUS:
+   default:
return false;
}
-   case CHIP_ALDEBARAN:
-   /* All Aldebaran SKUs have an FRU */
-   if (!strnstr(atom_ctx->vbios_pn, "D673",
-sizeof(atom_ctx->vbios_pn)))
-   if (fru_addr)
-   *fru_addr = FRU_EEPROM_MADDR_6;
-   return true;
-   case CHIP_SIENNA_CICHLID:
+   case IP_VERSION(11, 0, 7):
if (strnstr(atom_ctx->vbios_pn, "D603",
sizeof(atom_ctx->vbios_pn))) {
if (strnstr(atom_ctx->vbios_pn, "D603GLXE",
@@ -92,6 +91,17 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev, u32 *fru_addr)
} else {
return false;
}
+   case IP_VERSION(13, 0, 2):
+   /* All Aldebaran SKUs have an FRU */
+   if (!strnstr(atom_ctx->vbios_pn, "D673",
+sizeof(atom_ctx->vbios_pn)))
+   if (fru_addr)
+   *fru_addr = FRU_EEPROM_MADDR_6;
+   return true;
+   case IP_VERSION(13, 0, 6):
+   if (fru_addr)
+   *fru_addr = FRU_EEPROM_MADDR_8;
+   return true;
default:
return false;
}
-- 
2.25.1



Re: [PATCH v2 01/16] platform/x86/amd/pmf: Add PMF TEE interface

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> AMD PMF driver loads the PMF TA (Trusted Application) into the AMD
> ASP's (AMD Security Processor) TEE (Trusted Execution Environment).
> 
> PMF Trusted Application is a secured firmware placed under
> /lib/firmware/amdtee that gets loaded only when the TEE environment is
> initialized. Add the initial code path to build these pipes.
> 
> Reviewed-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/Makefile |   3 +-
>  drivers/platform/x86/amd/pmf/core.c   |  11 ++-
>  drivers/platform/x86/amd/pmf/pmf.h|  16 
>  drivers/platform/x86/amd/pmf/tee-if.c | 112 ++
>  4 files changed, 138 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/platform/x86/amd/pmf/tee-if.c
> 
> diff --git a/drivers/platform/x86/amd/pmf/Makefile 
> b/drivers/platform/x86/amd/pmf/Makefile
> index fdededf54392..d2746ee7369f 100644
> --- a/drivers/platform/x86/amd/pmf/Makefile
> +++ b/drivers/platform/x86/amd/pmf/Makefile
> @@ -6,4 +6,5 @@
>  
>  obj-$(CONFIG_AMD_PMF) += amd-pmf.o
>  amd-pmf-objs := core.o acpi.o sps.o \
> - auto-mode.o cnqf.o
> + auto-mode.o cnqf.o \
> + tee-if.o
> diff --git a/drivers/platform/x86/amd/pmf/core.c 
> b/drivers/platform/x86/amd/pmf/core.c
> index 78ed3ee22555..68f1389dda3e 100644
> --- a/drivers/platform/x86/amd/pmf/core.c
> +++ b/drivers/platform/x86/amd/pmf/core.c
> @@ -309,8 +309,11 @@ static void amd_pmf_init_features(struct amd_pmf_dev 
> *dev)
>   dev_dbg(dev->dev, "SPS enabled and Platform Profiles 
> registered\n");
>   }
>  
> - /* Enable Auto Mode */
> - if (is_apmf_func_supported(dev, APMF_FUNC_AUTO_MODE)) {
> + if (amd_pmf_init_smart_pc(dev)) {
> + /* Enable Smart PC Solution builder */
> + dev_dbg(dev->dev, "Smart PC Solution Enabled\n");
> + } else if (is_apmf_func_supported(dev, APMF_FUNC_AUTO_MODE)) {
> + /* Enable Auto Mode */

I'm pretty certain neither of these two comments adds any information to 
what's readily visible from the code itself, so they can be dropped.

>   amd_pmf_init_auto_mode(dev);
>   dev_dbg(dev->dev, "Auto Mode Init done\n");
>   } else if (is_apmf_func_supported(dev, APMF_FUNC_DYN_SLIDER_AC) ||
> @@ -330,7 +333,9 @@ static void amd_pmf_deinit_features(struct amd_pmf_dev 
> *dev)
>   amd_pmf_deinit_sps(dev);
>   }
>  
> - if (is_apmf_func_supported(dev, APMF_FUNC_AUTO_MODE)) {
> + if (dev->smart_pc_enabled) {
> + amd_pmf_deinit_smart_pc(dev);
> + } else if (is_apmf_func_supported(dev, APMF_FUNC_AUTO_MODE)) {
>   amd_pmf_deinit_auto_mode(dev);
>   } else if (is_apmf_func_supported(dev, APMF_FUNC_DYN_SLIDER_AC) ||
> is_apmf_func_supported(dev, APMF_FUNC_DYN_SLIDER_DC)) 
> {
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index deba88e6e4c8..02460c2a31ea 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -179,6 +179,12 @@ struct amd_pmf_dev {
>   bool cnqf_enabled;
>   bool cnqf_supported;
>   struct notifier_block pwr_src_notifier;
> + /* Smart PC solution builder */
> + struct tee_context *tee_ctx;
> + struct tee_shm *fw_shm_pool;
> + u32 session_id;
> + void *shbuf;
> + bool smart_pc_enabled;
>  };
>  
>  struct apmf_sps_prop_granular {
> @@ -389,6 +395,13 @@ struct apmf_dyn_slider_output {
>   struct apmf_cnqf_power_set ps[APMF_CNQF_MAX];
>  } __packed;
>  
> +struct ta_pmf_shared_memory {
> + int command_id;
> + int resp_id;
> + u32 pmf_result;
> + u32 if_version;
> +};
> +
>  /* Core Layer */
>  int apmf_acpi_init(struct amd_pmf_dev *pmf_dev);
>  void apmf_acpi_deinit(struct amd_pmf_dev *pmf_dev);
> @@ -433,4 +446,7 @@ void amd_pmf_deinit_cnqf(struct amd_pmf_dev *dev);
>  int amd_pmf_trans_cnqf(struct amd_pmf_dev *dev, int socket_power, ktime_t 
> time_lapsed_ms);
>  extern const struct attribute_group cnqf_feature_attribute_group;
>  
> +/* Smart PC builder Layer*/

Missing space.

> +int amd_pmf_init_smart_pc(struct amd_pmf_dev *dev);
> +void amd_pmf_deinit_smart_pc(struct amd_pmf_dev *dev);
>  #endif /* PMF_H */
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> new file mode 100644
> index ..4db80ca59a11
> --- /dev/null
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -0,0 +1,112 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * AMD Platform Management Framework Driver - TEE Interface
> + *
> + * Copyright (c) 2023, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Shyam Sundar S K 
> + */
> +
> +#include 
> +#include 
> +#include "pmf.h"
> +
> +#define MAX_TEE_PARAM4
> +static const uuid_t amd_pmf_ta_uuid = UUID_INIT(0x6fd93b77, 0x3fb8, 0x524d,
> +  

Re: [PATCH v2 11/16] platform/x86/amd/pmf: dump policy binary data

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> Sometimes the policy binary retrieved from the BIOS may be incorrect, which
> can end up failing to enable the Smart PC solution feature.
> 
> Use print_hex_dump_debug() to dump the policy binary in hex, so that we can
> debug issues related to the binary even before sending it to the TA.
> 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/tee-if.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> index 01f974b55a6a..d16bdecfd43a 100644
> --- a/drivers/platform/x86/amd/pmf/tee-if.c
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -290,6 +290,9 @@ static ssize_t amd_pmf_get_pb_data(struct file *filp, 
> const char __user *buf,
>   if (copy_from_user(dev->policy_buf, buf, dev->policy_sz))
>   return -EFAULT;
>  
> + print_hex_dump_debug("(pb):  ", DUMP_PREFIX_OFFSET, 16, 1, 
> dev->policy_buf,
> +  dev->policy_sz, false);
> +
>   ret = amd_pmf_start_policy_engine(dev);
>   if (ret)
>   return -EINVAL;
> @@ -341,6 +344,10 @@ static int amd_pmf_get_bios_buffer(struct amd_pmf_dev 
> *dev)
>   return -ENOMEM;
>  
>   memcpy(dev->policy_buf, dev->policy_base, dev->policy_sz);
> +#ifdef CONFIG_AMD_PMF_DEBUG
> + print_hex_dump_debug("(pb):  ", DUMP_PREFIX_OFFSET, 16, 1, 
> dev->policy_buf,
> +  dev->policy_sz, false);
> +#endif

Create a wrapper for print_hex_dump_debug() inside #ifdef and #else blocks
for this too, so you don't need the #ifdef here.
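
Something along these lines would work, as a rough sketch (the helper name
amd_pmf_hex_dump_pb is just a placeholder):

#ifdef CONFIG_AMD_PMF_DEBUG
static void amd_pmf_hex_dump_pb(struct amd_pmf_dev *dev)
{
        print_hex_dump_debug("(pb):  ", DUMP_PREFIX_OFFSET, 16, 1,
                             dev->policy_buf, dev->policy_sz, false);
}
#else
static void amd_pmf_hex_dump_pb(struct amd_pmf_dev *dev) {}
#endif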

-- 
 i.



Re: [PATCH v2 09/16] platform/x86/amd/pmf: Add facility to dump TA inputs

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> PMF driver sends constant inputs to the TA which it gets via the other
> subsystems in the kernel. To debug certain TA issues, knowing what inputs
> are being sent to the TA becomes critical. Add a debug facility to the
> driver which can isolate Smart PC and TA related issues.
> 
> Also, make source_as_str() a non-static function as this helper is
> required outside of the sps.c file.
> 
> Reviewed-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/pmf.h|  3 +++
>  drivers/platform/x86/amd/pmf/spc.c| 37 +++
>  drivers/platform/x86/amd/pmf/sps.c|  2 +-
>  drivers/platform/x86/amd/pmf/tee-if.c |  1 +
>  4 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index 34778192432e..2ad5ece47601 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -595,6 +595,7 @@ int apmf_get_static_slider_granular(struct amd_pmf_dev 
> *pdev,
>  bool is_pprof_balanced(struct amd_pmf_dev *pmf);
>  int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev);
>  
> +const char *source_as_str(unsigned int state);

Too generic name, add prefix to the name.
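
For example, something like this in pmf.h (the amd_pmf_ prefix is only a
suggestion):

const char *amd_pmf_source_as_str(unsigned int state);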

-- 
 i.

>  int apmf_update_fan_idx(struct amd_pmf_dev *pdev, bool manual, u32 idx);
>  int amd_pmf_set_sps_power_limits(struct amd_pmf_dev *pmf);
> @@ -625,4 +626,6 @@ int apmf_check_smart_pc(struct amd_pmf_dev *pmf_dev);
>  
>  /* Smart PC - TA interfaces */
>  void amd_pmf_populate_ta_inputs(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in);
> +void amd_pmf_dump_ta_inputs(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in);
> +
>  #endif /* PMF_H */
> diff --git a/drivers/platform/x86/amd/pmf/spc.c 
> b/drivers/platform/x86/amd/pmf/spc.c
> index 3113bde051d9..3aee78629cce 100644
> --- a/drivers/platform/x86/amd/pmf/spc.c
> +++ b/drivers/platform/x86/amd/pmf/spc.c
> @@ -14,6 +14,43 @@
>  #include 
>  #include "pmf.h"
>  
> +#ifdef CONFIG_AMD_PMF_DEBUG
> +static const char *ta_slider_as_str(unsigned int state)
> +{
> + switch (state) {
> + case TA_BEST_PERFORMANCE:
> + return "PERFORMANCE";
> + case TA_BETTER_PERFORMANCE:
> + return "BALANCED";
> + case TA_BEST_BATTERY:
> + return "POWER_SAVER";
> + default:
> + return "Unknown TA Slider State";
> + }
> +}
> +
> +void amd_pmf_dump_ta_inputs(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in)
> +{
> + dev_dbg(dev->dev, " TA inputs START \n");
> + dev_dbg(dev->dev, "Slider State : %s\n", 
> ta_slider_as_str(in->ev_info.power_slider));
> + dev_dbg(dev->dev, "Power Source : %s\n", 
> source_as_str(in->ev_info.power_source));
> + dev_dbg(dev->dev, "Battery Percentage : %u\n", 
> in->ev_info.bat_percentage);
> + dev_dbg(dev->dev, "Designed Battery Capacity : %u\n", 
> in->ev_info.bat_design);
> + dev_dbg(dev->dev, "Fully Charged Capacity : %u\n", 
> in->ev_info.full_charge_capacity);
> + dev_dbg(dev->dev, "Drain Rate : %d\n", in->ev_info.drain_rate);
> + dev_dbg(dev->dev, "Socket Power : %u\n", in->ev_info.socket_power);
> + dev_dbg(dev->dev, "Skin Temperature : %u\n", 
> in->ev_info.skin_temperature);
> + dev_dbg(dev->dev, "Avg C0 Residency : %u\n", 
> in->ev_info.avg_c0residency);
> + dev_dbg(dev->dev, "Max C0 Residency : %u\n", 
> in->ev_info.max_c0residency);
> + dev_dbg(dev->dev, "GFX Busy : %u\n", in->ev_info.gfx_busy);
> + dev_dbg(dev->dev, "Connected Display Count : %u\n", 
> in->ev_info.monitor_count);
> + dev_dbg(dev->dev, "LID State : %s\n", in->ev_info.lid_state ? "Close" : 
> "Open");
> + dev_dbg(dev->dev, " TA inputs END \n");
> +}
> +#else
> +void amd_pmf_dump_ta_inputs(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in) {}
> +#endif
> +
>  static void amd_pmf_get_smu_info(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in)
>  {
>   u16 max, avg = 0;
> diff --git a/drivers/platform/x86/amd/pmf/sps.c 
> b/drivers/platform/x86/amd/pmf/sps.c
> index a70e67749be3..13e36b52dfe8 100644
> --- a/drivers/platform/x86/amd/pmf/sps.c
> +++ b/drivers/platform/x86/amd/pmf/sps.c
> @@ -27,7 +27,7 @@ static const char *slider_as_str(unsigned int state)
>   }
>  }
>  
> -static const char *source_as_str(unsigned int state)
> +const char *source_as_str(unsigned int state)
>  {
>   switch (state) {
>   case POWER_SOURCE_AC:
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> index 961011530c1b..b0711b2f8c8f 100644
> --- a/drivers/platform/x86/amd/pmf/tee-if.c
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -187,6 +187,7 @@ static int amd_pmf_invoke_cmd_enact(struct amd_pmf_dev 
> *dev)
>   }
>  
>   if (ta_sm->pmf_result == TA_PMF_TYPE_SUCCESS && out->actions_count) {
> + amd_pmf_dump_ta_inputs(dev, in);
>   dev_dbg(de

Re: [PATCH v2 08/16] platform/x86/amd/pmf: Add support to update system state

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> Based on the output actions from the TA, the PMF driver can request updates
> to system states like entering s0i3, locking the screen, etc. by generating
> an uevent. Based on the udev rules set in userspace, the event id matching
> the uevent gets acted upon accordingly using systemctl.
> 
> Sample udev rules under Documentation/admin-guide/pmf.rst.
> 
> Reported-by: kernel test robot 
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202309260515.5xbr6n0g-...@intel.com/

Please don't put lkp tags for patches that are still under development 
(even if the email you get misleadingly instructs you to). Only use them 
when you fix code that's already in tree based on LKP's report.

> Signed-off-by: Shyam Sundar S K 
> ---
>  Documentation/admin-guide/index.rst   |  1 +
>  Documentation/admin-guide/pmf.rst | 25 
>  drivers/platform/x86/amd/pmf/pmf.h|  9 ++
>  drivers/platform/x86/amd/pmf/tee-if.c | 41 ++-
>  4 files changed, 75 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/admin-guide/pmf.rst
> 
> diff --git a/Documentation/admin-guide/index.rst 
> b/Documentation/admin-guide/index.rst
> index 43ea35613dfc..fb40a1f6f79e 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -119,6 +119,7 @@ configure specific aspects of kernel behavior to your 
> liking.
> parport
> perf-security
> pm/index
> +   pmf
> pnp
> rapidio
> ras
> diff --git a/Documentation/admin-guide/pmf.rst 
> b/Documentation/admin-guide/pmf.rst
> new file mode 100644
> index ..90072add511e
> --- /dev/null
> +++ b/Documentation/admin-guide/pmf.rst
> @@ -0,0 +1,25 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Set udev rules for PMF Smart PC Builder
> +---
> +
> +AMD PMF(Platform Management Framework) Smart PC Solution builder has to set 
> the system states
> +like S0i3, Screen lock, hibernate etc, based on the output actions provided 
> by the PMF
> +TA (Trusted Application).
> +
> +In order for this to work the PMF driver generates a uevent for userspace to 
> react to. Below are
> +sample udev rules that can facilitate this experience when a machine has PMF 
> Smart PC solution builder
> +enabled.
> +
> +Please add the following line(s) to
> +``/etc/udev/rules.d/99-local.rules``::
> +
> +DRIVERS=="amd-pmf", ACTION=="change", ENV{EVENT_ID}=="1", 
> RUN+="/usr/bin/systemctl suspend"
> +DRIVERS=="amd-pmf", ACTION=="change", ENV{EVENT_ID}=="2", 
> RUN+="/usr/bin/systemctl hibernate"
> +DRIVERS=="amd-pmf", ACTION=="change", ENV{EVENT_ID}=="3", 
> RUN+="/bin/loginctl lock-sessions"
> +
> +EVENT_ID values:
> +1= Put the system to S0i3/S2Idle
> +2= Put the system to hibernate
> +3= Lock the screen
> +
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index d5e410c41e81..34778192432e 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -73,6 +73,7 @@
>  #define PMF_POLICY_STT_MIN   6
>  #define PMF_POLICY_STT_SKINTEMP_APU  7
>  #define PMF_POLICY_STT_SKINTEMP_HS2  8
> +#define PMF_POLICY_SYSTEM_STATE  9
>  #define PMF_POLICY_P3T   38
>  
>  /* TA macros */
> @@ -439,6 +440,13 @@ struct apmf_dyn_slider_output {
>  } __packed;
>  
>  /* Smart PC - TA internals */
> +enum system_state {
> + SYSTEM_STATE__S0i3 = 1,
> + SYSTEM_STATE__S4,
> + SYSTEM_STATE__SCREEN_LOCK,
> + SYSTEM_STATE__MAX
> +};
> +
>  enum ta_slider {
>   TA_BEST_BATTERY, /* Best Battery */
>   TA_BETTER_BATTERY, /* Better Battery */
> @@ -470,6 +478,7 @@ enum ta_pmf_error_type {
>  };
>  
>  struct pmf_action_table {
> + enum system_state system_state;
>   unsigned long spl; /* in mW */
>   unsigned long sppt; /* in mW */
>   unsigned long sppt_apuonly; /* in mW */
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> index 315e3d2eacdf..961011530c1b 100644
> --- a/drivers/platform/x86/amd/pmf/tee-if.c
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -24,6 +24,20 @@ MODULE_PARM_DESC(pb_actions_ms, "Policy binary actions 
> sampling frequency (defau
>  static const uuid_t amd_pmf_ta_uuid = UUID_INIT(0x6fd93b77, 0x3fb8, 0x524d,
>   0xb1, 0x2d, 0xc5, 0x29, 0xb1, 
> 0x3d, 0x85, 0x43);
>  
> +static const char *amd_pmf_uevent_as_str(unsigned int state)
> +{
> + switch (state) {
> + case SYSTEM_STATE__S0i3:
> + return "S0i3";
> + case SYSTEM_STATE__S4:
> + return "S4";
> + case SYSTEM_STATE__SCREEN_LOCK:
> + return "SCREEN_LOCK";
> + default:
> + return "Unknown Smart PC event";
> + }
> +}
> +
>  

Re: [PATCH 1/5] drm/amd/display: Remove migrate_en/dis from dc_fpu_begin().

2023-10-04 Thread Sebastian Andrzej Siewior
On 2023-10-03 15:53:41 [-0400], Harry Wentland wrote:
> On 2023-09-21 10:15, Sebastian Andrzej Siewior wrote:
> > This is a revert of the commit mentioned below. While it is not wrong, as
> > in the kernel will not explode, having migrate_disable() here is a
> > complete waste of resources.
> > 
> > Additionally, the commit message is plain wrong; the review tag does not make
> 
> Not sure I follow what's unhelpful about the review tag with
> 0c316556d1249 ("drm/amd/display: Disable migration to ensure consistency of 
> per-CPU variable")

I explained it below with two points that the reviewer should have
noticed while reading the commit message, even if he does not know what
migrate_disable() itself does.

> I do wish the original patch showed the splat it's attempting
> to fix. It apparently made a difference for something, whether
> inadvertently or not. I wish I knew what that "something" was.

As far as I can tell the patch does make a difference.

> Harry

Sebastian


Re: [PATCH v2 04/16] platform/x86/amd/pmf: Add support for PMF Policy Binary

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> PMF Policy binary is an encrypted and signed binary that will be part
> of the BIOS. PMF driver via the ACPI interface checks the existence
> of the Smart PC bit. If the advertised bit is found, PMF driver walks
> the ACPI namespace to find out the policy binary size and the address
> which has to be passed to the TA during the TA init sequence.
> 
> The policy binary is comprised of inputs (or the events) and outputs
> (or the actions). With the PMF ecosystem, OEMs generate the policy
> binary (or could be multiple binaries) that contains a supported set
> of inputs and outputs which could be specifically carved out for each
> usage segment (or for each user also) that could influence the system
> behavior either by enriching the user experience or/and boost/throttle
> power limits.
> 
> Once the TA init command succeeds, the PMF driver sends the changing
> events in the current environment to the TA for a constant sampling
> frequency time (the event here could be a lid close or open) and
> if the policy binary has corresponding action built within it, the
> TA sends the action for it in the subsequent enact command.
> 
> If the inputs sent to the TA has no output defined in the policy
> binary generated by OEMs, there will be no action to be performed
> by the PMF driver.
> 
> Example policies:
> 
> 1) if slider is performance ; set the SPL to 40W
> Here PMF driver registers with the platform profile interface and
> when the slider position is changed, PMF driver lets the TA know
> about this. TA sends back an action to update the Sustained
> Power Limit (SPL). PMF driver updates this limit via the PMFW mailbox.
> 
> 2) if user_away ; then lock the system
> Here PMF driver hooks to the AMD SFH driver to know the user presence
> and send the inputs to TA and if the condition is met, the TA sends
> the action of locking the system. PMF driver generates a uevent and
> based on the udev rule in the userland the system gets locked with
> systemctl.
> 
> The intent here is to provide the OEM's to make a policy to lock the
> system when the user is away ; but the userland can make a choice to
> ignore it.
> 
> and so on.
> 
> The OEMs will have an utility to create numerous such policies and
> the policies shall be reviewed by AMD before signing and encrypting
> them. Policies are shared between operating systems to have seemless user
> experience.
> 
> Since all this action has to happen via the "amdtee" driver, currently
> there is no caller for it in the kernel which can load the amdtee driver.
> Without amdtee driver loading onto the system the "tee" calls shall fail
> from the PMF driver. Hence an explicit "request_module" has been added
> to address this.
> 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/Kconfig  |   1 +
>  drivers/platform/x86/amd/pmf/acpi.c   |  37 +++
>  drivers/platform/x86/amd/pmf/core.c   |  12 +++
>  drivers/platform/x86/amd/pmf/pmf.h| 135 
>  drivers/platform/x86/amd/pmf/tee-if.c | 141 +-
>  5 files changed, 324 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/amd/pmf/Kconfig 
> b/drivers/platform/x86/amd/pmf/Kconfig
> index 3064bc8ea167..437b78c6d1c5 100644
> --- a/drivers/platform/x86/amd/pmf/Kconfig
> +++ b/drivers/platform/x86/amd/pmf/Kconfig
> @@ -9,6 +9,7 @@ config AMD_PMF
>   depends on POWER_SUPPLY
>   depends on AMD_NB
>   select ACPI_PLATFORM_PROFILE
> + depends on AMDTEE
>   help
> This driver provides support for the AMD Platform Management 
> Framework.
> The goal is to enhance end user experience by making AMD PCs smarter,
> diff --git a/drivers/platform/x86/amd/pmf/acpi.c 
> b/drivers/platform/x86/amd/pmf/acpi.c
> index 3fc5e4547d9f..d0512af2cd42 100644
> --- a/drivers/platform/x86/amd/pmf/acpi.c
> +++ b/drivers/platform/x86/amd/pmf/acpi.c
> @@ -286,6 +286,43 @@ int apmf_install_handler(struct amd_pmf_dev *pmf_dev)
>   return 0;
>  }
>  
> +static acpi_status apmf_walk_resources(struct acpi_resource *res, void *data)
> +{
> + struct amd_pmf_dev *dev = data;
> +
> + switch (res->type) {
> + case ACPI_RESOURCE_TYPE_ADDRESS64:
> + dev->policy_addr = res->data.address64.address.minimum;
> + dev->policy_sz = res->data.address64.address.address_length;
> + break;
> + case ACPI_RESOURCE_TYPE_FIXED_MEMORY32:
> + dev->policy_addr = res->data.fixed_memory32.address;
> + dev->policy_sz = res->data.fixed_memory32.address_length;
> + break;
> + }
> +
> + if (!dev->policy_addr || dev->policy_sz > POLICY_BUF_MAX_SZ || 
> dev->policy_sz == 0) {
> + pr_err("Incorrect Policy params, possibly a SBIOS bug\n");
> + return AE_ERROR;
> + }
> +
> + return AE_OK;
> +}
> +
> +int apmf_check_smart_pc(struct amd_pmf_dev *pmf_dev)
> +{
> + acpi_handle ahandle = ACPI_HANDLE(pmf_dev->

Re: [PATCH 0/5] drm/amd/display: Remove migrate-disable and move memory allocation.

2023-10-04 Thread Sebastian Andrzej Siewior
On 2023-10-03 15:54:58 [-0400], Harry Wentland wrote:
> On 2023-10-02 06:58, Sebastian Andrzej Siewior wrote:
> > On 2023-09-22 07:33:26 [+0200], Christian König wrote:
> >> Am 21.09.23 um 16:15 schrieb Sebastian Andrzej Siewior:
> >>> Hi,
> >>>
> >>> I stumbled upon the amdgpu driver via a bugzilla report. The actual fix
> >>> is #4 + #5 and the rest was made while looking at the code.
> >>
> >> Oh, yes please :)
> >>
> >> Rodrigo and I have been trying to sort those things out previously, but
> >> that's Sisyphean work.
> >>
> >> In general the DC team needs to judge, but of hand it looks good to me.
> > 
> > Any way to get this merged? There was no reply from the DC team… No
> > reply from the person breaking it either. The bugzilla reporter stated
> > that it solves his trouble. He didn't report anything new ;)
> > 
> 
> Apologies for the slow progress. We're feeding it through our CI and
> will let you know the verdict soon.
> 
> Do you happen to have the bugzilla link that this is fixing? It would
> be helpful to include that as a link in the patches as well, to give
> them context.
The bugzilla report is at
  https://bugzilla.kernel.org/show_bug.cgi?id=217928

but the patches explain the situation, too. Even more verbose than the
report…

> Harry

Sebastian


Re: [PATCH v2 03/16] platform/x86/amd/pmf: Change return type of amd_pmf_set_dram_addr()

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> In the current code, the metrics table information was required only
> for auto-mode or CnQF at a given time. Hence keeping
> amd_pmf_set_dram_addr() static made sense.
> 
> But with the addition of Smart PC builder feature, the metrics table
> information has to be shared by the Smart PC also and this feature
> resides outside of core.c.
> 
> To make amd_pmf_set_dram_addr() visible outside of core.c, make it
> a non-static function and move the allocation of memory for
> metrics table from amd_pmf_init_metrics_table() to amd_pmf_set_dram_addr()
> as amd_pmf_set_dram_addr() is the common function to set the DRAM
> address.
> 
> Reviewed-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/core.c | 26 ++
>  drivers/platform/x86/amd/pmf/pmf.h  |  1 +
>  2 files changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/platform/x86/amd/pmf/core.c 
> b/drivers/platform/x86/amd/pmf/core.c
> index 68f1389dda3e..678dce4fea08 100644
> --- a/drivers/platform/x86/amd/pmf/core.c
> +++ b/drivers/platform/x86/amd/pmf/core.c
> @@ -251,29 +251,35 @@ static const struct pci_device_id pmf_pci_ids[] = {
>   { }
>  };
>  
> -static void amd_pmf_set_dram_addr(struct amd_pmf_dev *dev)
> +int amd_pmf_set_dram_addr(struct amd_pmf_dev *dev)
>  {
>   u64 phys_addr;
>   u32 hi, low;
>  
> + /* Get Metrics Table Address */
> + dev->buf = kzalloc(sizeof(dev->m_table), GFP_KERNEL);
> + if (!dev->buf)
> + return -ENOMEM;
> +
>   phys_addr = virt_to_phys(dev->buf);
>   hi = phys_addr >> 32;
>   low = phys_addr & GENMASK(31, 0);
>  
>   amd_pmf_send_cmd(dev, SET_DRAM_ADDR_HIGH, 0, hi, NULL);
>   amd_pmf_send_cmd(dev, SET_DRAM_ADDR_LOW, 0, low, NULL);
> +
> + return 0;
>  }
>  
>  int amd_pmf_init_metrics_table(struct amd_pmf_dev *dev)
>  {
> - /* Get Metrics Table Address */
> - dev->buf = kzalloc(sizeof(dev->m_table), GFP_KERNEL);
> - if (!dev->buf)
> - return -ENOMEM;
> + int ret;
>  
>   INIT_DELAYED_WORK(&dev->work_buffer, amd_pmf_get_metrics);
>  
> - amd_pmf_set_dram_addr(dev);
> + ret = amd_pmf_set_dram_addr(dev);
> + if (ret)
> + return ret;
>  
>   /*
>* Start collecting the metrics data after a small delay
> @@ -287,9 +293,13 @@ int amd_pmf_init_metrics_table(struct amd_pmf_dev *dev)
>  static int amd_pmf_resume_handler(struct device *dev)
>  {
>   struct amd_pmf_dev *pdev = dev_get_drvdata(dev);
> + int ret;
>  
> - if (pdev->buf)
> - amd_pmf_set_dram_addr(pdev);
> + if (pdev->buf) {
> + ret = amd_pmf_set_dram_addr(pdev);

Won't this now leak the previous ->buf?
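
One way to avoid that, as a rough and untested sketch, would be to allocate
only when no buffer exists yet in amd_pmf_set_dram_addr():

        /* Get Metrics Table Address; reuse the buffer if already allocated */
        if (!dev->buf) {
                dev->buf = kzalloc(sizeof(dev->m_table), GFP_KERNEL);
                if (!dev->buf)
                        return -ENOMEM;
        }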

> + if (ret)
> + return ret;
> + }
>  
>   return 0;
>  }
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index e0837799f521..3930b8ed8333 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -421,6 +421,7 @@ int amd_pmf_init_metrics_table(struct amd_pmf_dev *dev);
>  int amd_pmf_get_power_source(void);
>  int apmf_install_handler(struct amd_pmf_dev *pmf_dev);
>  int apmf_os_power_slider_update(struct amd_pmf_dev *dev, u8 flag);
> +int amd_pmf_set_dram_addr(struct amd_pmf_dev *dev);
>  
>  /* SPS Layer */
>  int amd_pmf_get_pprof_modes(struct amd_pmf_dev *pmf);
> 

-- 
 i.



Re: [PATCH v2 12/16] platform/x86/amd/pmf: Add PMF-AMDGPU get interface

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> In order to provide GPU inputs to the TA for the Smart PC solution to work,
> we need to have an interface between the PMF driver and the AMDGPU driver.
> 
> Add the initial code path for the get interface from AMDGPU.
> 
> Co-developed-by: Mario Limonciello 
> Signed-off-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 

> @@ -355,6 +356,21 @@ static int amd_pmf_get_bios_buffer(struct amd_pmf_dev 
> *dev)
>   return amd_pmf_start_policy_engine(dev);
>  }
>  
> +static int amd_pmf_get_gpu_handle(struct pci_dev *pdev, void *data)
> +{
> + struct amd_pmf_dev *dev = data;
> +
> + if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->devfn == 0) {
> + /* get the amdgpu handle from the pci root after walking 
> through the pci bus */

I can see from the code that you assign to the amdgpu handle, so this
comment adds no information.

It doesn't really answer at all why you're doing this second step. Based
on the given parameters to pci_get_device(), it looks as if you're asking
for the same device you already have in pdev to be searched for you.
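
If the intent is only to take a reference on the device the bus walk
already found, a simpler sketch (untested) could be:

        if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->devfn == 0) {
                dev->gfx_data.gpu_dev = pci_dev_get(pdev);
                return 1; /* stop walking */
        }
        return 0; /* continue walking */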

> + dev->gfx_data.gpu_dev = pci_get_device(pdev->vendor, 
> pdev->device, NULL);
> + if (dev->gfx_data.gpu_dev) {
> + pci_dev_put(pdev);
> + return 1; /* stop walking */
> + }
> + }
> + return 0; /* continue walking */
> +}
> +
>  static int amd_pmf_amdtee_ta_match(struct tee_ioctl_version_data *ver, const 
> void *data)
>  {
>   return ver->impl_id == TEE_IMPL_ID_AMDTEE;
> @@ -451,6 +467,15 @@ int amd_pmf_init_smart_pc(struct amd_pmf_dev *dev)
>   INIT_DELAYED_WORK(&dev->pb_work, amd_pmf_invoke_cmd);
>   amd_pmf_set_dram_addr(dev);
>   amd_pmf_get_bios_buffer(dev);
> +
> + /* get amdgpu handle */
> + pci_walk_bus(dev->root->bus, amd_pmf_get_gpu_handle, dev);
> + if (!dev->gfx_data.gpu_dev)
> + dev_err(dev->dev, "GPU handle not found!\n");
> +
> + if (!amd_pmf_gpu_init(&dev->gfx_data))
> + dev->gfx_data.gpu_dev_en = true;
> +


-- 
 i.



Re: [PATCH v2 06/16] platform/x86/amd/pmf: Add support to get inputs from other subsystems

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> PMF driver sends changing inputs from each subsystem to the TA for evaluating
> the conditions in the policy binary.
> 
> Add initial plumbing support in the PMF driver for Smart PC to get
> information from other subsystems in the kernel.
> 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/Makefile |   2 +-
>  drivers/platform/x86/amd/pmf/pmf.h|  18 
>  drivers/platform/x86/amd/pmf/spc.c| 119 ++
>  drivers/platform/x86/amd/pmf/tee-if.c |   3 +
>  4 files changed, 141 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/platform/x86/amd/pmf/spc.c
> 
> diff --git a/drivers/platform/x86/amd/pmf/Makefile 
> b/drivers/platform/x86/amd/pmf/Makefile
> index d2746ee7369f..6b26e48ce8ad 100644
> --- a/drivers/platform/x86/amd/pmf/Makefile
> +++ b/drivers/platform/x86/amd/pmf/Makefile
> @@ -7,4 +7,4 @@
>  obj-$(CONFIG_AMD_PMF) += amd-pmf.o
>  amd-pmf-objs := core.o acpi.o sps.o \
>   auto-mode.o cnqf.o \
> - tee-if.o
> + tee-if.o spc.o
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index 6f4b6f4ecee4..60b11455dadf 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -149,6 +149,21 @@ struct smu_pmf_metrics {
>   u16 infra_gfx_maxfreq; /* in MHz */
>   u16 skin_temp; /* in centi-Celsius */
>   u16 device_state;
> + u16 curtemp; /* in centi-Celsius */
> + u16 filter_alpha_value;
> + u16 avg_gfx_clkfrequency;
> + u16 avg_fclk_frequency;
> + u16 avg_gfx_activity;
> + u16 avg_socclk_frequency;
> + u16 avg_vclk_frequency;
> + u16 avg_vcn_activity;
> + u16 avg_dram_reads;
> + u16 avg_dram_writes;
> + u16 avg_socket_power;
> + u16 avg_core_power[2];
> + u16 avg_core_c0residency[16];
> + u16 spare1;
> + u32 metrics_counter;
>  } __packed;
>  
>  enum amd_stt_skin_temp {
> @@ -595,4 +610,7 @@ extern const struct attribute_group 
> cnqf_feature_attribute_group;
>  int amd_pmf_init_smart_pc(struct amd_pmf_dev *dev);
>  void amd_pmf_deinit_smart_pc(struct amd_pmf_dev *dev);
>  int apmf_check_smart_pc(struct amd_pmf_dev *pmf_dev);
> +
> +/* Smart PC - TA interfaces */
> +void amd_pmf_populate_ta_inputs(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in);
>  #endif /* PMF_H */
> diff --git a/drivers/platform/x86/amd/pmf/spc.c 
> b/drivers/platform/x86/amd/pmf/spc.c
> new file mode 100644
> index ..3113bde051d9
> --- /dev/null
> +++ b/drivers/platform/x86/amd/pmf/spc.c
> @@ -0,0 +1,119 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * AMD Platform Management Framework Driver - Smart PC Capabilities
> + *
> + * Copyright (c) 2023, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Authors: Shyam Sundar S K 
> + *  Patil Rajesh Reddy 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "pmf.h"
> +
> +static void amd_pmf_get_smu_info(struct amd_pmf_dev *dev, struct 
> ta_pmf_enact_table *in)
> +{
> + u16 max, avg = 0;
> + int i;
> +
> + memset(dev->buf, 0, sizeof(dev->m_table));
> + amd_pmf_send_cmd(dev, SET_TRANSFER_TABLE, 0, 7, NULL);
> + memcpy(&dev->m_table, dev->buf, sizeof(dev->m_table));
> +
> + in->ev_info.socket_power = dev->m_table.apu_power + 
> dev->m_table.dgpu_power;
> + in->ev_info.skin_temperature = dev->m_table.skin_temp;
> +
> + /* get the avg C0 residency of all the cores */
> + for (i = 0; i < ARRAY_SIZE(dev->m_table.avg_core_c0residency); i++)
> + avg += dev->m_table.avg_core_c0residency[i];
> +
> + /* get the max C0 residency of all the cores */
> + max = dev->m_table.avg_core_c0residency[0];
> + for (i = 1; i < ARRAY_SIZE(dev->m_table.avg_core_c0residency); i++) {
> + if (dev->m_table.avg_core_c0residency[i] > max)
> + max = dev->m_table.avg_core_c0residency[i];
> + }

My comments were either not answered adequately or the changes were not
made here. Please check the v1 comments. I hope it's not because you feel
in a hurry to get the next version out...

I'm still unsure if the u16 accumulator can overflow because I don't know
what's the max value for avg_core_c0residency[i].
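
If it can, a wider accumulator would sidestep the question, e.g. (rough
sketch):

        u32 avg = 0;
        int i;

        /* sum in a 32-bit accumulator so the u16 per-core values cannot overflow it */
        for (i = 0; i < ARRAY_SIZE(dev->m_table.avg_core_c0residency); i++)
                avg += dev->m_table.avg_core_c0residency[i];

        in->ev_info.avg_c0residency = avg / ARRAY_SIZE(dev->m_table.avg_core_c0residency);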

> +
> + in->ev_info.avg_c0residency = avg / 
> ARRAY_SIZE(dev->m_table.avg_core_c0residency);
> + in->ev_info.max_c0residency = max;
> + in->ev_info.gfx_busy = dev->m_table.avg_gfx_activity;
> +}
> +
> +static const char * const pmf_battery_supply_name[] = {
> + "BATT",
> + "BAT0",
> +};
> +
> +static int get_battery_prop(enum power_supply_property prop)
> +{
> + union power_supply_propval value;
> + struct power_supply *psy;
> + int i, ret = -EINVAL;
> +
> + for (i = 0; i < ARRAY_SIZE(pmf_battery_supply_name); i++) {
> + psy = power_supply_get_by_name(pmf_battery_supply_name[i]);
> + if (!psy)
> + continue;
> +
> + 

Re: [PATCH v2 02/16] platform/x86/amd/pmf: Add support PMF-TA interaction

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> PMF TA (Trusted Application) loads via the TEE environment into the
> AMD ASP.
> 
> PMF-TA supports two commands:
> 1) Init: Initialize the TA with the PMF Smart PC policy binary and
> start the policy engine. A policy is a combination of inputs and
> outputs, where;
>  - the inputs are the changing dynamics of the system like the user
>behaviour, system heuristics etc.
>  - the outputs, which are the actions to be set on the system which
>lead to better power management and enhanced user experience.
> 
> PMF driver acts as a central manager in this case to supply the
> inputs required to the TA (either by getting the information from
> the other kernel subsystems or from userland)
> 
> 2) Enact: Enact the output actions from the TA. The action could be
> applying a new thermal limit to boost/throttle the power limits or
> change system behavior.
> 
> Reviewed-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/pmf.h| 10 +++
>  drivers/platform/x86/amd/pmf/tee-if.c | 97 ++-
>  2 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index 02460c2a31ea..e0837799f521 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -59,6 +59,9 @@
>  #define ARG_NONE 0
>  #define AVG_SAMPLE_SIZE 3
>  
> +/* TA macros */
> +#define PMF_TA_IF_VERSION_MAJOR  1
> +
>  /* AMD PMF BIOS interfaces */
>  struct apmf_verify_interface {
>   u16 size;
> @@ -184,6 +187,7 @@ struct amd_pmf_dev {
>   struct tee_shm *fw_shm_pool;
>   u32 session_id;
>   void *shbuf;
> + struct delayed_work pb_work;
>   bool smart_pc_enabled;
>  };
>  
> @@ -395,6 +399,12 @@ struct apmf_dyn_slider_output {
>   struct apmf_cnqf_power_set ps[APMF_CNQF_MAX];
>  } __packed;
>  
> +/* cmd ids for TA communication */
> +enum ta_pmf_command {
> + TA_PMF_COMMAND_POLICY_BUILDER_INITIALIZE,
> + TA_PMF_COMMAND_POLICY_BUILDER_ENACT_POLICIES,
> +};
> +
>  struct ta_pmf_shared_memory {
>   int command_id;
>   int resp_id;
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> index 4db80ca59a11..1b3985cd7c08 100644
> --- a/drivers/platform/x86/amd/pmf/tee-if.c
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -13,9 +13,96 @@
>  #include "pmf.h"
>  
>  #define MAX_TEE_PARAM4
> +
> +/* Policy binary actions sampling frequency (in ms) */
> +static int pb_actions_ms = 1000;

MSEC_PER_SEC (from #include <linux/time64.h>, don't include the vdso one).
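
i.e. something like this (assuming <linux/time64.h> is the intended header):

#include <linux/time64.h>

/* Policy binary actions sampling frequency (in ms) */
static int pb_actions_ms = MSEC_PER_SEC;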

> +#ifdef CONFIG_AMD_PMF_DEBUG
> +module_param(pb_actions_ms, int, 0644);
> +MODULE_PARM_DESC(pb_actions_ms, "Policy binary actions sampling frequency 
> (default = 1000ms)");
> +#endif

-- 
 i.



Re: [PATCH v2 02/16] platform/x86/amd/pmf: Add support PMF-TA interaction

2023-10-04 Thread Ilpo Järvinen
On Sat, 30 Sep 2023, Shyam Sundar S K wrote:

> PMF TA (Trusted Application) loads via the TEE environment into the
> AMD ASP.
> 
> PMF-TA supports two commands:
> 1) Init: Initialize the TA with the PMF Smart PC policy binary and
> start the policy engine. A policy is a combination of inputs and
> outputs, where;
>  - the inputs are the changing dynamics of the system like the user
>behaviour, system heuristics etc.
>  - the outputs, which are the actions to be set on the system which
>lead to better power management and enhanced user experience.
> 
> PMF driver acts as a central manager in this case to supply the
> inputs required to the TA (either by getting the information from
> the other kernel subsystems or from userland)
> 
> 2) Enact: Enact the output actions from the TA. The action could be
> applying a new thermal limit to boost/throttle the power limits or
> change system behavior.
> 
> Reviewed-by: Mario Limonciello 
> Signed-off-by: Shyam Sundar S K 
> ---
>  drivers/platform/x86/amd/pmf/pmf.h| 10 +++
>  drivers/platform/x86/amd/pmf/tee-if.c | 97 ++-
>  2 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/x86/amd/pmf/pmf.h 
> b/drivers/platform/x86/amd/pmf/pmf.h
> index 02460c2a31ea..e0837799f521 100644
> --- a/drivers/platform/x86/amd/pmf/pmf.h
> +++ b/drivers/platform/x86/amd/pmf/pmf.h
> @@ -59,6 +59,9 @@
>  #define ARG_NONE 0
>  #define AVG_SAMPLE_SIZE 3
>  
> +/* TA macros */
> +#define PMF_TA_IF_VERSION_MAJOR  1
> +
>  /* AMD PMF BIOS interfaces */
>  struct apmf_verify_interface {
>   u16 size;
> @@ -184,6 +187,7 @@ struct amd_pmf_dev {
>   struct tee_shm *fw_shm_pool;
>   u32 session_id;
>   void *shbuf;
> + struct delayed_work pb_work;
>   bool smart_pc_enabled;
>  };
>  
> @@ -395,6 +399,12 @@ struct apmf_dyn_slider_output {
>   struct apmf_cnqf_power_set ps[APMF_CNQF_MAX];
>  } __packed;
>  
> +/* cmd ids for TA communication */
> +enum ta_pmf_command {
> + TA_PMF_COMMAND_POLICY_BUILDER_INITIALIZE,
> + TA_PMF_COMMAND_POLICY_BUILDER_ENACT_POLICIES,
> +};
> +
>  struct ta_pmf_shared_memory {
>   int command_id;
>   int resp_id;
> diff --git a/drivers/platform/x86/amd/pmf/tee-if.c 
> b/drivers/platform/x86/amd/pmf/tee-if.c
> index 4db80ca59a11..1b3985cd7c08 100644
> --- a/drivers/platform/x86/amd/pmf/tee-if.c
> +++ b/drivers/platform/x86/amd/pmf/tee-if.c
> @@ -13,9 +13,96 @@
>  #include "pmf.h"
>  
>  #define MAX_TEE_PARAM4
> +
> +/* Policy binary actions sampling frequency (in ms) */
> +static int pb_actions_ms = 1000;
> +#ifdef CONFIG_AMD_PMF_DEBUG
> +module_param(pb_actions_ms, int, 0644);
> +MODULE_PARM_DESC(pb_actions_ms, "Policy binary actions sampling frequency 
> (default = 1000ms)");
> +#endif
> +
>  static const uuid_t amd_pmf_ta_uuid = UUID_INIT(0x6fd93b77, 0x3fb8, 0x524d,
>   0xb1, 0x2d, 0xc5, 0x29, 0xb1, 
> 0x3d, 0x85, 0x43);
>  
> +static void amd_pmf_prepare_args(struct amd_pmf_dev *dev, int cmd,
> +  struct tee_ioctl_invoke_arg *arg,
> +  struct tee_param *param)
> +{
> + memset(arg, 0, sizeof(*arg));
> + memset(param, 0, MAX_TEE_PARAM * sizeof(*param));
> +
> + arg->func = cmd;
> + arg->session = dev->session_id;
> + arg->num_params = MAX_TEE_PARAM;
> +
> + /* Fill invoke cmd params */
> + param[0].u.memref.size = sizeof(struct ta_pmf_shared_memory);
> + param[0].attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT;
> + param[0].u.memref.shm = dev->fw_shm_pool;
> + param[0].u.memref.shm_offs = 0;
> +}
> +
> +static int amd_pmf_invoke_cmd_enact(struct amd_pmf_dev *dev)
> +{
> + struct ta_pmf_shared_memory *ta_sm = NULL;
> + struct tee_param param[MAX_TEE_PARAM];
> + struct tee_ioctl_invoke_arg arg;
> + int ret = 0;
> +
> + if (!dev->tee_ctx)
> + return -ENODEV;
> +
> + ta_sm = (struct ta_pmf_shared_memory *)dev->shbuf;

Don't cast from void * to a typed pointer, it's unnecessary as the
compiler will handle that for you.

> + memset(ta_sm, 0, sizeof(struct ta_pmf_shared_memory));

sizeof(*ta_sm) to be on the safer side of things.
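
Roughly, i.e. (sketch):

        struct ta_pmf_shared_memory *ta_sm;

        ta_sm = dev->shbuf;
        memset(ta_sm, 0, sizeof(*ta_sm));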

> + ta_sm->command_id = TA_PMF_COMMAND_POLICY_BUILDER_ENACT_POLICIES;
> + ta_sm->if_version = PMF_TA_IF_VERSION_MAJOR;
> +
> + amd_pmf_prepare_args(dev, TA_PMF_COMMAND_POLICY_BUILDER_ENACT_POLICIES, 
> &arg, param);
> +
> + ret = tee_client_invoke_func(dev->tee_ctx, &arg, param);
> + if (ret < 0 || arg.ret != 0) {
> + dev_err(dev->dev, "TEE enact cmd failed. err: %x, ret:%x\n", 
> arg.ret, ret);

An -Exx error code should not be printed as %x.

> + return -EINVAL;

This overrides the original error code if ret < 0.
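
A rough, untested sketch of an error path that keeps the original error
code and avoids printing ret in hex:

        ret = tee_client_invoke_func(dev->tee_ctx, &arg, param);
        if (ret < 0 || arg.ret != 0) {
                dev_err(dev->dev, "TEE enact cmd failed. err: %x, ret: %d\n", arg.ret, ret);
                return ret ?: -EINVAL;
        }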

> + }
> +
> + return 0;
> +}
> +
> +static int amd_pmf_invoke_cmd_init(struct amd_pmf_dev *dev)
> +{
> + struct ta_pmf_shared_memory *ta_sm = NULL;
> + struct tee_param param

Re: [PATCH v6 5/9] drm/amdgpu: create context space for usermode queue

2023-10-04 Thread Alex Deucher
On Fri, Sep 29, 2023 at 1:50 PM Shashank Sharma  wrote:
>
>
> On 20/09/2023 17:21, Alex Deucher wrote:
> > On Fri, Sep 8, 2023 at 12:45 PM Shashank Sharma  
> > wrote:
> >> The FW expects us to allocate at least one page as context
> >> space to process gang, process, GDS and FW  related work.
> >> This patch creates a joint object for the same, and calculates
> >> GPU space offsets of these spaces.
> >>
> >> V1: Addressed review comments on RFC patch:
> >>  Alex: Make this function IP specific
> >>
> >> V2: Addressed review comments from Christian
> >>  - Allocate only one object for total FW space, and calculate
> >>offsets for each of these objects.
> >>
> >> V3: Integration with doorbell manager
> >>
> >> V4: Review comments:
> >>  - Remove shadow from FW space list from cover letter (Alex)
> >>  - Alignment of macro (Luben)
> >>
> >> V5: Merged patches 5 and 6 into this single patch
> >>  Addressed review comments:
> >>  - Use lower_32_bits instead of mask (Christian)
> >>  - gfx_v11_0 instead of gfx_v11 in function names (Alex)
> >>  - Shadow and GDS objects are now coming from userspace (Christian,
> >>Alex)
> >>
> >> V6:
> >>  - Add a comment to replace amdgpu_bo_create_kernel() with
> >>amdgpu_bo_create() during fw_ctx object creation (Christian).
> >>  - Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out
> >>of generic queue structure and make it gen11 specific (Alex).
> >>
> >> Cc: Alex Deucher 
> >> Cc: Christian Koenig 
> >> Signed-off-by: Shashank Sharma 
> >> Signed-off-by: Arvind Yadav 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c| 61 +++
> >>   .../gpu/drm/amd/include/amdgpu_userqueue.h|  1 +
> >>   2 files changed, 62 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
> >> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >> index 6760abda08df..8ffb5dee72a9 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >> @@ -61,6 +61,9 @@
> >>   #define regCGTT_WD_CLK_CTRL_BASE_IDX   1
> >>   #define regRLC_RLCS_BOOTLOAD_STATUS_gc_11_0_1  0x4e7e
> >>   #define regRLC_RLCS_BOOTLOAD_STATUS_gc_11_0_1_BASE_IDX 1
> >> +#define AMDGPU_USERQ_PROC_CTX_SZ   PAGE_SIZE
> >> +#define AMDGPU_USERQ_GANG_CTX_SZ   PAGE_SIZE
> >> +#define AMDGPU_USERQ_FW_CTX_SZ PAGE_SIZE
> >>
> >>   MODULE_FIRMWARE("amdgpu/gc_11_0_0_pfp.bin");
> >>   MODULE_FIRMWARE("amdgpu/gc_11_0_0_me.bin");
> >> @@ -6424,6 +6427,56 @@ const struct amdgpu_ip_block_version 
> >> gfx_v11_0_ip_block =
> >>  .funcs = &gfx_v11_0_ip_funcs,
> >>   };
> >>
> >> +static void gfx_v11_0_userq_destroy_ctx_space(struct amdgpu_userq_mgr 
> >> *uq_mgr,
> >> + struct amdgpu_usermode_queue 
> >> *queue)
> >> +{
> >> +   struct amdgpu_userq_obj *ctx = &queue->fw_obj;
> >> +
> >> +   amdgpu_bo_free_kernel(&ctx->obj, &ctx->gpu_addr, &ctx->cpu_ptr);
> >> +}
> >> +
> >> +static int gfx_v11_0_userq_create_ctx_space(struct amdgpu_userq_mgr 
> >> *uq_mgr,
> >> +   struct amdgpu_usermode_queue 
> >> *queue,
> >> +   struct 
> >> drm_amdgpu_userq_mqd_gfx_v11_0 *mqd_user)
> >> +{
> >> +   struct amdgpu_device *adev = uq_mgr->adev;
> >> +   struct amdgpu_userq_obj *ctx = &queue->fw_obj;
> >> +   struct v11_gfx_mqd *mqd = queue->mqd.cpu_ptr;
> >> +   uint64_t fw_ctx_gpu_addr;
> >> +   int r, size;
> >> +
> >> +   /*
> >> +* The FW expects at least one page space allocated for
> >> +* process ctx, gang ctx and fw ctx each. Create an object
> >> +* for the same.
> >> +*/
> >> +   size = AMDGPU_USERQ_PROC_CTX_SZ + AMDGPU_USERQ_FW_CTX_SZ +
> >> +  AMDGPU_USERQ_GANG_CTX_SZ;
> >> +   r = amdgpu_bo_create_kernel(adev, size, PAGE_SIZE,
> >> +   AMDGPU_GEM_DOMAIN_GTT,
> >> +   &ctx->obj,
> >> +   &ctx->gpu_addr,
> >> +   &ctx->cpu_ptr);
> >> +   if (r) {
> >> +   DRM_ERROR("Failed to allocate ctx space bo for userqueue, 
> >> err:%d\n", r);
> >> +   return r;
> >> +   }
> >> +
> >> +   fw_ctx_gpu_addr = ctx->gpu_addr + AMDGPU_USERQ_PROC_CTX_SZ +
> >> + AMDGPU_USERQ_GANG_CTX_SZ;
> >> +   mqd->fw_work_area_base_lo = lower_32_bits(fw_ctx_gpu_addr);
> >> +   mqd->fw_work_area_base_lo = upper_32_bits(fw_ctx_gpu_addr);
> >> +
> >> +   /* Shadow and GDS objects come directly from userspace */
> >> +   mqd->shadow_base_lo = lower_32_bits(mqd_user->shadow_va);
> >> +   mqd->shadow_base_hi = upper_32_bits(mqd_user->shadow_va);
> >> +
> >> +   mqd->gds_bkup_base_lo = lower_32_bits(mqd_user->gds_va);
> >> +   mqd->gds_bkup_base_hi = upper_32_bits(mqd_user->gds_va);
> >> +

Re: [PATCH 0/5] drm/amd/display: Remove migrate-disable and move memory allocation.

2023-10-04 Thread Harry Wentland



On 2023-10-03 15:54, Harry Wentland wrote:
> On 2023-10-02 06:58, Sebastian Andrzej Siewior wrote:
>> On 2023-09-22 07:33:26 [+0200], Christian König wrote:
>>> Am 21.09.23 um 16:15 schrieb Sebastian Andrzej Siewior:
 Hi,

 I stumbled upon the amdgpu driver via a bugzilla report. The actual fix
 is #4 + #5 and the rest was made while looking at the code.
>>>
>>> Oh, yes please :)
>>>
>>> Rodrigo and I have been trying to sort those things out previously, but
>>> that's Sisyphean work.
>>>
>>> In general the DC team needs to judge, but of hand it looks good to me.
>>
>> Any way to get this merged? There was no reply from the DC team… No
>> reply from the person breaking it either. The bugzilla reporter stated
>> that it solves his trouble. He didn't report anything new ;)
>>
> 
> Apologies for the slow progress. We're feeding it through our CI and
> will let you know the verdict soon.
> 
> Do you happen to have the bugzilla link that this is fixing? It would
> be helpful to include that as a link in the patches as well, to give
> them context.
> 

CI passed.

Series is
Acked-by: Harry Wentland 

Harry

> Harry
> 
>>> Christian.
>>
>> Sebastian
> 



Re: [PATCH 1/5] drm/amd/display: Remove migrate_en/dis from dc_fpu_begin().

2023-10-04 Thread Hamza Mahfooz

On 10/3/23 15:53, Harry Wentland wrote:

On 2023-09-21 10:15, Sebastian Andrzej Siewior wrote:

This is a revert of the commit mentioned below. While it is not wrong, as
in the kernel will not explode, having migrate_disable() here is a
complete waste of resources.

Additionally, the commit message is plain wrong; the review tag does not make


Not sure I follow what's unhelpful about the review tag with
0c316556d1249 ("drm/amd/display: Disable migration to ensure consistency of per-CPU 
variable")

I do wish the original patch showed the splat it's attempting
to fix. It apparently made a difference for something, whether
inadvertently or not. I wish I knew what that "something" was.


I did some digging, and it seems like the intention of that patch was to
fix the following splat:

WARNING: CPU: 5 PID: 1062 at 
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/dc_fpu.c:71 
dc_assert_fp_enabled+0x1a/0x30 [amdgpu]

[...]
CPU: 5 PID: 1062 Comm: Xorg Tainted: G   OE 
5.15.0-56-generic #62-Ubuntu
Hardware name: ASUS System Product Name/ROG STRIX Z590-F GAMING WIFI, 
BIOS 1202 10/27/2021

RIP: 0010:dc_assert_fp_enabled+0x1a/0x30 [amdgpu]
Code: ff 45 31 f6 0f 0b e9 ca fe ff ff e8 90 1c 1f f7 48 c7 c0 00 30 03 
00 65 48 03 05 b1 aa 86 3f 8b 00 85 c0 7e 05 c3 cc cc cc cc <0f> 0b c3 
cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00

RSP: :b89b82a8f118 EFLAGS: 00010246
RAX:  RBX: 8c271cd0 RCX: 
RDX: 8c2708025000 RSI: 8c270e8c RDI: 8c271cd0
RBP: b89b82a8f1d0 R08:  R09: 7f6a
R10: b89b82a8f240 R11:  R12: 0002
R13: 8c271cd0 R14: 8c270e8c R15: 8c2708025000
FS:  7f0570019a80() GS:8c2e3fb4() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 5594858a0058 CR3: 00010e204001 CR4: 00770ee0
PKRU: 5554
Call Trace:
 
 ? dcn20_populate_dml_pipes_from_context+0x47/0x1730 [amdgpu]
 ? __kmalloc_node+0x2cb/0x3a0
 dcn32_populate_dml_pipes_from_context+0x2b/0x450 [amdgpu]
 dcn32_internal_validate_bw+0x15f9/0x2670 [amdgpu]
 dcn32_find_dummy_latency_index_for_fw_based_mclk_switch+0xd0/0x310 
[amdgpu]

 dcn32_calculate_wm_and_dlg_fpu+0xe6/0x1e50 [amdgpu]
 dcn32_calculate_wm_and_dlg+0x46/0x70 [amdgpu]
 dcn32_validate_bandwidth+0x1b7/0x3e0 [amdgpu]
 dc_validate_global_state+0x32c/0x560 [amdgpu]
 dc_validate_with_context+0x6e6/0xd80 [amdgpu]
 dc_commit_streams+0x21b/0x500 [amdgpu]
 dc_commit_state+0xf3/0x150 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0x60d/0x3120 [amdgpu]
 ? dcn32_internal_validate_bw+0xcf8/0x2670 [amdgpu]
 ? fill_plane_buffer_attributes+0x1e5/0x560 [amdgpu]
 ? dcn32_validate_bandwidth+0x1e0/0x3e0 [amdgpu]
 ? kfree+0x1f7/0x250
 ? dcn32_validate_bandwidth+0x1e0/0x3e0 [amdgpu]
 ? dc_validate_global_state+0x32c/0x560 [amdgpu]
 ? __cond_resched+0x1a/0x50
 ? __wait_for_common+0x3e/0x150
 ? fill_plane_buffer_attributes+0x1e5/0x560 [amdgpu]
 ? usleep_range_state+0x90/0x90
 ? wait_for_completion_timeout+0x1d/0x30
 commit_tail+0xc2/0x170 [drm_kms_helper]
 ? drm_atomic_helper_swap_state+0x20f/0x370 [drm_kms_helper]
 drm_atomic_helper_commit+0x12b/0x150 [drm_kms_helper]
 amdgpu_dm_atomic_commit+0x11/0x20 [amdgpu]
 drm_atomic_commit+0x47/0x60 [drm]
 drm_mode_obj_set_property_ioctl+0x16b/0x420 [drm]
 ? mutex_lock+0x13/0x50
 ? drm_mode_createblob_ioctl+0xf6/0x130 [drm]
 ? drm_mode_obj_find_prop_id+0x90/0x90 [drm]
 drm_ioctl_kernel+0xb0/0x100 [drm]
 drm_ioctl+0x268/0x4b0 [drm]
 ? drm_mode_obj_find_prop_id+0x90/0x90 [drm]
 ? ktime_get_mono_fast_ns+0x4a/0xa0
 amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
 __x64_sys_ioctl+0x92/0xd0
 do_syscall_64+0x59/0xc0
 ? do_user_addr_fault+0x1e7/0x670
 ? do_syscall_64+0x69/0xc0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? irqentry_exit_to_user_mode+0x9/0x20
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f05704a2aff
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 
44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 
3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00

RSP: 002b:7ffc8c45a3f0 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 7ffc8c45a480 RCX: 7f05704a2aff
RDX: 7ffc8c45a480 RSI: c01864ba RDI: 000e
RBP: c01864ba R08: 0077 R09: 
R10: 7f05705a22f0 R11: 0246 R12: 0004
R13: 000e R14: 000f R15: 7ffc8c45a8a8
 
---[ end trace 4deab30bb69df00f ]---



Harry


it any better. The migrate_disable() interface has a fat comment
describing it and it includes the word "undesired" in the headline which
should tickle people to read it before using it.
Initially I assumed it was worded too harshly, but now I beg to differ.

The reviewer of the original commit, even without understanding what
migrate_disable() does, should ask the following:

- migrate_disable()

[PATCH 16/16] drm/amd/display: 3.2.255

2023-10-04 Thread Tom Chung
From: Aric Cyr 

This version brings along following fixes:
- Refactor DPG test pattern logic for ODM cases
- Refactor HWSS into component folder
- Revert "drm/amd/display: Add a check for idle power optimization"
- Revert "drm/amd/display: remove duplicated edp relink to fastboot
- Update cursor limits based on SW cursor fallback limits
- Update stream mask
- Update pmfw_driver_if new structure
- Modify SMU message logs
- Don't set dpms_off for seamless boot

Acked-by: Tom Chung 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 0b9384707106..9f6d8d5000bf 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -47,7 +47,7 @@ struct aux_payload;
 struct set_config_cmd_payload;
 struct dmub_notification;
 
-#define DC_VER "3.2.254"
+#define DC_VER "3.2.255"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1



[PATCH 15/16] drm/amd/display: Disable SubVP if test pattern is enabled

2023-10-04 Thread Tom Chung
From: George Shen 

[Why]
Enabling DPG causes HUBP to stay in blank constantly. If DPG is enabled
while an MCLK switch is taking place with SubVP, the MCLK switch will
never complete. This is because the SubVP MCLK switch relies on a HUBP
VLine interrupt, which will never occur while HUBP is constantly in
blank.

[How]
Disable SubVP when test pattern is enabled.

Reviewed-by: Alvin Lee 
Reviewed-by: Nevenko Stupar 
Acked-by: Tom Chung 
Signed-off-by: George Shen 
---
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 7179c2b3b1b7..4c2c0e252867 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -1383,6 +1383,19 @@ static void try_odm_power_optimization_and_revalidate(
}
 }
 
+static bool is_test_pattern_enabled(
+   struct dc_state *context)
+{
+   int i;
+
+   for (i = 0; i < context->stream_count; i++) {
+   if (context->streams[i]->test_pattern.type != 
DP_TEST_PATTERN_VIDEO_MODE)
+   return true;
+   }
+
+   return false;
+}
+
 static void dcn32_full_validate_bw_helper(struct dc *dc,
   struct dc_state *context,
   display_e2e_pipe_params_st *pipes,
@@ -1426,7 +1439,7 @@ static void dcn32_full_validate_bw_helper(struct dc *dc,
 * 5. (Config doesn't support MCLK in VACTIVE/VBLANK || 
dc->debug.force_subvp_mclk_switch)
 */
if (!dc->debug.force_disable_subvp && !dc->caps.dmub_caps.gecc_enable 
&& dcn32_all_pipes_have_stream_and_plane(dc, context) &&
-   !dcn32_mpo_in_use(context) && !dcn32_any_surfaces_rotated(dc, 
context) &&
+   !dcn32_mpo_in_use(context) && !dcn32_any_surfaces_rotated(dc, 
context) && !is_test_pattern_enabled(context) &&
(*vlevel == context->bw_ctx.dml.soc.num_states ||
vba->DRAMClockChangeSupport[*vlevel][vba->maxMpcComb] == 
dm_dram_clock_change_unsupported ||
dc->debug.force_subvp_mclk_switch)) {
-- 
2.25.1



[PATCH 14/16] drm/amd/display: Refactor DPG test pattern logic for ODM cases

2023-10-04 Thread Tom Chung
From: George Shen 

[Why]
Current DPG test pattern logic does not account for ODM configuration
changes after test pattern has already been programmed. For example, if
ODM2:1 is enabled after test pattern is already being output, the second
pipe is not programmed to output test pattern, causing half the screen
to be black.

[How]
Move the DPG test pattern parameter calculations into a separate function.
Whenever the ODM pipe configuration changes, re-calculate the DPG test
pattern parameters and re-program DPG if a test pattern is currently enabled.
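
For illustration only (this is not the driver code, just a standalone
sketch of the idea): the per-slice parameters essentially split the
addressable width across the ODM slices, each OPP head drawing the
pattern at its own horizontal offset, with the last slice taking any
remainder. The 3840-wide timing and ODM 2:1 values below are invented.

#include <stdio.h>

int main(void)
{
        int h_active = 3840;    /* example addressable width */
        int odm_cnt = 2;        /* e.g. ODM 2:1 */
        int slice_width = h_active / odm_cnt;
        int offset = 0;
        int i;

        /* Each OPP head only draws the pattern for its own slice. */
        for (i = 0; i < odm_cnt; i++) {
                int width = (i == odm_cnt - 1) ? h_active - offset : slice_width;

                printf("slice %d: offset %d, width %d\n", i, offset, width);
                offset += width;
        }

        return 0;
}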

Reviewed-by: Wenjing Liu 
Acked-by: Tom Chung 
Signed-off-by: George Shen 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   8 ++
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 104 
 .../amd/display/dc/hwss/dcn20/dcn20_hwseq.c   |  22 
 .../gpu/drm/amd/display/dc/inc/core_types.h   |  13 ++
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   4 +
 .../display/dc/link/accessories/link_dp_cts.c | 117 --
 6 files changed, 175 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 88d41bf6d53a..544c915469f9 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3145,6 +3145,14 @@ static bool update_planes_and_stream_state(struct dc *dc,
BREAK_TO_DEBUGGER();
goto fail;
}
+
+   for (i = 0; i < context->stream_count; i++) {
+   struct pipe_ctx *otg_master = 
resource_get_otg_master_for_stream(&context->res_ctx,
+   context->streams[i]);
+
+   if (otg_master->stream->test_pattern.type != 
DP_TEST_PATTERN_VIDEO_MODE)
+   
resource_build_test_pattern_params(&context->res_ctx, otg_master);
+   }
}
 
*new_context = context;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 3549a9b852a2..22e05f3d01e0 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1360,6 +1360,110 @@ static bool is_subvp_high_refresh_candidate(struct 
dc_stream_state *stream)
return false;
 }
 
+static enum controller_dp_test_pattern convert_dp_to_controller_test_pattern(
+   enum dp_test_pattern test_pattern)
+{
+   enum controller_dp_test_pattern controller_test_pattern;
+
+   switch (test_pattern) {
+   case DP_TEST_PATTERN_COLOR_SQUARES:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_COLORSQUARES;
+   break;
+   case DP_TEST_PATTERN_COLOR_SQUARES_CEA:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_COLORSQUARES_CEA;
+   break;
+   case DP_TEST_PATTERN_VERTICAL_BARS:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_VERTICALBARS;
+   break;
+   case DP_TEST_PATTERN_HORIZONTAL_BARS:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_HORIZONTALBARS;
+   break;
+   case DP_TEST_PATTERN_COLOR_RAMP:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_COLORRAMP;
+   break;
+   default:
+   controller_test_pattern =
+   CONTROLLER_DP_TEST_PATTERN_VIDEOMODE;
+   break;
+   }
+
+   return controller_test_pattern;
+}
+
+static enum controller_dp_color_space convert_dp_to_controller_color_space(
+   enum dp_test_pattern_color_space color_space)
+{
+   enum controller_dp_color_space controller_color_space;
+
+   switch (color_space) {
+   case DP_TEST_PATTERN_COLOR_SPACE_RGB:
+   controller_color_space = CONTROLLER_DP_COLOR_SPACE_RGB;
+   break;
+   case DP_TEST_PATTERN_COLOR_SPACE_YCBCR601:
+   controller_color_space = CONTROLLER_DP_COLOR_SPACE_YCBCR601;
+   break;
+   case DP_TEST_PATTERN_COLOR_SPACE_YCBCR709:
+   controller_color_space = CONTROLLER_DP_COLOR_SPACE_YCBCR709;
+   break;
+   case DP_TEST_PATTERN_COLOR_SPACE_UNDEFINED:
+   default:
+   controller_color_space = CONTROLLER_DP_COLOR_SPACE_UDEFINED;
+   break;
+   }
+
+   return controller_color_space;
+}
+
+void resource_build_test_pattern_params(struct resource_context *res_ctx,
+   struct pipe_ctx *otg_master)
+{
+   int odm_slice_width, last_odm_slice_width, offset = 0;
+   struct pipe_ctx *opp_heads[MAX_PIPES];
+   struct test_pattern_params *params;
+   int odm_cnt = 1;
+   enum controller_dp_test_pattern controller_test_pattern;
+   enum controller_dp_color_space cont

[PATCH 13/16] drm/amd/display: Don't set dpms_off for seamless boot

2023-10-04 Thread Tom Chung
From: Daniel Miess 

[Why]
eDPs fail to light up with seamless boot enabled

[How]
When seamless boot is enabled, don't configure dpms_off
in disable_vbios_mode_if_required.

Reviewed-by: Charlene Liu 
Cc: Mario Limonciello 
Cc: Alex Deucher 
Cc: sta...@vger.kernel.org
Acked-by: Tom Chung 
Signed-off-by: Daniel Miess 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index bd4834f921c1..88d41bf6d53a 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1230,6 +1230,9 @@ static void disable_vbios_mode_if_required(
if (stream == NULL)
continue;
 
+   if (stream->apply_seamless_boot_optimization)
+   continue;
+
// only looking for first odm pipe
if (pipe->prev_odm_pipe)
continue;
-- 
2.25.1



[PATCH 12/16] drm/amd/display: Refactor HWSS into component folder

2023-10-04 Thread Tom Chung
From: Mounika Adhuri 

[why]
Rename hw_sequencer to hwseq and move all hwseq
files into a dedicated folder, hwss.

[how]
Create an hwss directory in dc, move the dcnxx_hwseq.c
and .h files into corresponding new folders inside hwss,
and fix the resulting link errors by adding relative paths
in Makefile.template.

Reviewed-by: Martin Leung 
Acked-by: Tom Chung 
Signed-off-by: Mounika Adhuri 
---
 drivers/gpu/drm/amd/display/Makefile  |   1 +
 drivers/gpu/drm/amd/display/dc/Makefile   |   2 +-
 .../dc/clk_mgr/dce120/dce120_clk_mgr.c|   2 +-
 drivers/gpu/drm/amd/display/dc/dc.h   |   2 +-
 drivers/gpu/drm/amd/display/dc/dce/Makefile   |   2 +-
 .../gpu/drm/amd/display/dc/dce100/Makefile|   2 +-
 .../amd/display/dc/dce100/dce100_resource.c   |   4 +-
 .../gpu/drm/amd/display/dc/dce110/Makefile|   2 +-
 .../amd/display/dc/dce110/dce110_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dce112/Makefile|   2 +-
 .../amd/display/dc/dce112/dce112_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dce120/Makefile|   1 -
 .../amd/display/dc/dce120/dce120_resource.c   |   6 +-
 .../amd/display/dc/dce60/dce60_hw_sequencer.c |   4 +-
 drivers/gpu/drm/amd/display/dc/dce80/Makefile |   2 +-
 .../drm/amd/display/dc/dce80/dce80_resource.c |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn10/Makefile |   2 +-
 .../dc/dcn10/dcn10_hw_sequencer_debug.c   |   4 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_init.c |   4 +-
 .../drm/amd/display/dc/dcn10/dcn10_resource.c |   4 +-
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn20/dcn20_init.c |   6 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |   4 +-
 .../gpu/drm/amd/display/dc/dcn201/Makefile|   2 +-
 .../drm/amd/display/dc/dcn201/dcn201_init.c   |   6 +-
 .../amd/display/dc/dcn201/dcn201_resource.c   |   4 +-
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn21/dcn21_init.c |   6 +-
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn30/Makefile |   1 -
 .../gpu/drm/amd/display/dc/dcn30/dcn30_init.c |   6 +-
 .../drm/amd/display/dc/dcn30/dcn30_resource.c |   2 +-
 .../gpu/drm/amd/display/dc/dcn301/Makefile|   2 +-
 .../drm/amd/display/dc/dcn301/dcn301_init.c   |   6 +-
 .../amd/display/dc/dcn301/dcn301_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dcn302/Makefile|   2 +-
 .../drm/amd/display/dc/dcn302/dcn302_init.c   |   2 +-
 .../gpu/drm/amd/display/dc/dcn303/Makefile|   2 +-
 .../drm/amd/display/dc/dcn303/dcn303_init.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_init.c |   4 +-
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |   2 +-
 .../gpu/drm/amd/display/dc/dcn314/Makefile|   2 +-
 .../drm/amd/display/dc/dcn314/dcn314_init.c   |   4 +-
 .../amd/display/dc/dcn314/dcn314_resource.c   |   2 +-
 .../amd/display/dc/dcn315/dcn315_resource.c   |   2 +-
 .../amd/display/dc/dcn316/dcn316_resource.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn32/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   6 +-
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |   2 +-
 .../amd/display/dc/dcn321/dcn321_resource.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn35/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn35/dcn35_init.c |   4 +-
 .../drm/amd/display/dc/dcn35/dcn35_resource.c |   4 +-
 drivers/gpu/drm/amd/display/dc/hwss/Makefile  | 183 ++
 .../amd/display/dc/{ => hwss}/dce/dce_hwseq.c |   0
 .../amd/display/dc/{ => hwss}/dce/dce_hwseq.h |   0
 .../dce100/dce100_hwseq.c}|   4 +-
 .../dce100/dce100_hwseq.h}|   0
 .../dce110/dce110_hwseq.c}|   8 +-
 .../dce110/dce110_hwseq.h}|   0
 .../dce112/dce112_hwseq.c}|   4 +-
 .../dce112/dce112_hwseq.h}|   0
 .../dce120/dce120_hwseq.c}|   4 +-
 .../dce120/dce120_hwseq.h}|   0
 .../dce80/dce80_hwseq.c}  |   6 +-
 .../dce80/dce80_hwseq.h}  |   0
 .../dcn10/dcn10_hwseq.c}  |  16 +-
 .../dcn10/dcn10_hwseq.h}  |   0
 .../display/dc/{ => hwss}/dcn20/dcn20_hwseq.c |   6 +-
 .../display/dc/{ => hwss}/dcn20/dcn20_hwseq.h |   0
 .../dc/{ => hwss}/dcn201/dcn201_hwseq.c   |   2 +-
 .../dc/{ => hwss}/dcn201/dcn201_hwseq.h   |   0
 .../display/dc/{ => hwss}/dcn21/dcn21_hwseq.c |   2 +-
 .../display/dc/{ => hwss}/dcn21/dcn21_hwseq.h |   0
 .../display/dc/{ => hwss}/dcn30/dcn30_hwseq.c |  10 +-
 .../display/dc/{ => hwss}/dcn30/dcn30_hwseq.h |   0
 .../dc/{ => hwss}/dcn301/dcn301_hwseq.c   |   0
 .../dc/{ => hwss}/dcn301/dcn301_hwseq.h   |   0
 .../dc/{ => hwss}/dcn302/dcn302_hwseq.c   |   0
 .../dc/{ => hwss}/dcn302/dcn302_hwseq.h   |   0
 .../dc/{ => hwss}/dcn303/dcn303_hwseq.c   |   0
 .../dc/{ => hwss}/dcn303/dcn303_hwseq.h   |   0
 .

[PATCH 10/16] drm/amd/display: Make DCN3x use older FPO sequence

2023-10-04 Thread Tom Chung
From: Alvin Lee 

[Why]
The latest FPO sequence is causing intermittent hangs.

[How]
Fall back to the older FPO sequence for DCN3x.

Reviewed-by: Saaem Rizvi 
Acked-by: Tom Chung 
Signed-off-by: Alvin Lee 
---
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 21 ---
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 12 +++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|  3 +++
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |  2 +-
 4 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index af9a9fc7db48..b6d88266e8ab 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -997,14 +997,6 @@ void dcn30_set_disp_pattern_generator(const struct dc *dc,
 void dcn30_prepare_bandwidth(struct dc *dc,
struct dc_state *context)
 {
-   bool p_state_change_support = 
context->bw_ctx.bw.dcn.clk.p_state_change_support;
-   /* Any transition into an FPO config should disable MCLK switching 
first to avoid
-* driver and FW P-State synchronization issues.
-*/
-   if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching || 
dc->clk_mgr->clks.fw_based_mclk_switching) {
-   dc->optimized_required = true;
-   context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
-   }
 
if (dc->clk_mgr->dc_mode_softmax_enabled)
if (dc->clk_mgr->clks.dramclk_khz <= 
dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000 &&
@@ -1012,20 +1004,7 @@ void dcn30_prepare_bandwidth(struct dc *dc,
dc->clk_mgr->funcs->set_max_memclk(dc->clk_mgr, 
dc->clk_mgr->bw_params->clk_table.entries[dc->clk_mgr->bw_params->clk_table.num_entries
 - 1].memclk_mhz);
 
dcn20_prepare_bandwidth(dc, context);
-   /*
-* enabled -> enabled: do not disable
-* enabled -> disabled: disable
-* disabled -> enabled: don't care
-* disabled -> disabled: don't care
-*/
-   if (!context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching)
-   dc_dmub_srv_p_state_delegate(dc, false, context);
 
-   if (context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching || 
dc->clk_mgr->clks.fw_based_mclk_switching) {
-   /* After disabling P-State, restore the original value to 
ensure we get the correct P-State
-* on the next optimize. */
-   context->bw_ctx.bw.dcn.clk.p_state_change_support = 
p_state_change_support;
-   }
 }
 
 void dcn30_set_static_screen_control(struct pipe_ctx **pipe_ctx,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index 45b557d8e089..67687e45f031 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -50,6 +50,7 @@
 #include "dce/dmub_hw_lock_mgr.h"
 #include "dcn32_resource.h"
 #include "link.h"
+#include "../dcn20/dcn20_hwseq.h"
 
 #define DC_LOGGER_INIT(logger)
 
@@ -1676,3 +1677,14 @@ bool dcn32_is_pipe_topology_transition_seamless(struct 
dc *dc,
 
return is_seamless;
 }
+
+void dcn32_prepare_bandwidth(struct dc *dc,
+   struct dc_state *context)
+{
+   if (dc->clk_mgr->dc_mode_softmax_enabled)
+   if (dc->clk_mgr->clks.dramclk_khz <= 
dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000 &&
+   context->bw_ctx.bw.dcn.clk.dramclk_khz > 
dc->clk_mgr->bw_params->dc_mode_softmax_memclk * 1000)
+   dc->clk_mgr->funcs->set_max_memclk(dc->clk_mgr, 
dc->clk_mgr->bw_params->clk_table.entries[dc->clk_mgr->bw_params->clk_table.num_entries
 - 1].memclk_mhz);
+
+   dcn20_prepare_bandwidth(dc, context);
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
index 9992e40acd21..cecf7f0f5671 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h
@@ -124,4 +124,7 @@ bool dcn32_is_pipe_topology_transition_seamless(struct dc 
*dc,
const struct dc_state *cur_ctx,
const struct dc_state *new_ctx);
 
+void dcn32_prepare_bandwidth(struct dc *dc,
+   struct dc_state *context);
+
 #endif /* __DC_HWSS_DCN32_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c
index 6e7f6df1d423..04309412b087 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c
@@ -60,7 +60,7 @@ static const struct hw_sequencer_funcs dcn32_funcs = {
.pipe_control_lock = dcn20_pipe_control_lock,
.interdependent_update_lock = dcn10_lock_all_pipes,
.cursor_lock = dcn10_cursor_lock,
-   .prepare_bandwidth = dcn30_prepare_bandwidth,
+   .prepare_bandwidth = dcn32_prepare_bandwidth,
.optimize

[PATCH 11/16] drm/amd/display: Revert "drm/amd/display: Add a check for idle power optimization"

2023-10-04 Thread Tom Chung
From: Sung Joon Kim 

Revert commit 0ca0151b9902 ("drm/amd/display: Add a check for idle power 
optimization")
because it causes Freesync and S4 regressions.

Reviewed-by: Aric Cyr 
Acked-by: Tom Chung 
Signed-off-by: Sung Joon Kim 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 20 +--
 drivers/gpu/drm/amd/display/dc/dc.h   |  1 -
 .../gpu/drm/amd/display/dmub/src/dmub_srv.c   |  1 -
 3 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 623f4ac0bf42..bd4834f921c1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4872,8 +4872,7 @@ bool dc_set_psr_allow_active(struct dc *dc, bool enable)
 
 void dc_allow_idle_optimizations(struct dc *dc, bool allow)
 {
-   if (dc->debug.disable_idle_power_optimizations ||
-   (dc->caps.ips_support && dc->config.disable_ips))
+   if (dc->debug.disable_idle_power_optimizations)
return;
 
if (dc->clk_mgr != NULL && dc->clk_mgr->funcs->is_smu_present)
@@ -4887,23 +4886,6 @@ void dc_allow_idle_optimizations(struct dc *dc, bool 
allow)
dc->idle_optimizations_allowed = allow;
 }
 
-bool dc_is_idle_power_optimized(struct dc *dc)
-{
-   uint32_t idle_state = 0;
-
-   if (dc->debug.disable_idle_power_optimizations)
-   return false;
-
-   if (dc->hwss.get_idle_state)
-   idle_state = dc->hwss.get_idle_state(dc);
-
-   if ((idle_state & DMUB_IPS1_ALLOW_MASK) ||
-   (idle_state & DMUB_IPS2_ALLOW_MASK))
-   return true;
-
-   return false;
-}
-
 /* set min and max memory clock to lowest and highest DPM level, respectively 
*/
 void dc_unlock_memory_clock_frequency(struct dc *dc)
 {
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 6c51ebf5bbad..41c77910f046 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -2359,7 +2359,6 @@ bool dc_is_plane_eligible_for_idle_optimizations(struct 
dc *dc, struct dc_plane_
struct dc_cursor_attributes *cursor_attr);
 
 void dc_allow_idle_optimizations(struct dc *dc, bool allow);
-bool dc_is_idle_power_optimized(struct dc *dc);
 
 /* set min and max memory clock to lowest and highest DPM level, respectively 
*/
 void dc_unlock_memory_clock_frequency(struct dc *dc);
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c 
b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
index e43e8d4bfe37..b99db771e071 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
@@ -352,7 +352,6 @@ static bool dmub_srv_hw_setup(struct dmub_srv *dmub, enum 
dmub_asic asic)
funcs->init_reg_offsets = dmub_srv_dcn35_regs_init;
 
funcs->is_hw_powered_up = dmub_dcn35_is_hw_powered_up;
-   funcs->should_detect = dmub_dcn35_should_detect;
break;
 
default:
-- 
2.25.1



[PATCH 09/16] drm/amd/display: Don't use fsleep for PSR exit waits

2023-10-04 Thread Tom Chung
From: Nicholas Kazlauskas 

[Why]
These functions can be called from high IRQ levels and the OS will hang
if it tries to use a usleep_highres or a msleep.

[How]
Replace the fsleep with a udelay.

Reviewed-by: Aric Cyr 
Acked-by: Tom Chung 
Signed-off-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c | 3 ++-
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
index b87bfecb7755..a8e79104b684 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
@@ -586,7 +586,8 @@ static void dcn10_dmcu_set_psr_enable(struct dmcu *dmcu, 
bool enable, bool wait)
if (state == PSR_STATE0)
break;
}
-   fsleep(500);
+   /* must *not* be fsleep - this can be called from high 
irq levels */
+   udelay(500);
}
 
/* assert if max retry hit */
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
index f27cc8f9d0aa..9d4170a356a2 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
@@ -217,7 +217,8 @@ static void dmub_psr_enable(struct dmub_psr *dmub, bool 
enable, bool wait, uint8
break;
}
 
-   fsleep(500);
+   /* must *not* be fsleep - this can be called from high 
irq levels */
+   udelay(500);
}
 
/* assert if max retry hit */
-- 
2.25.1



[PATCH 08/16] drm/amd/display: Update cursor limits based on SW cursor fallback limits

2023-10-04 Thread Tom Chung
From: Alvin Lee 

[Why&How]
For determining the cursor size limit, use the same checks that
are used for determining SW cursor fallback, instead of only
checking for SubVP.
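
As a rough illustration of the refresh-rate check behind the fallback
decision (a standalone sketch, not the driver code; the sample timing
values below are invented):

#include <stdint.h>
#include <stdio.h>

/*
 * Round-up integer refresh rate from a pixel clock given in 100 Hz units
 * and the total horizontal/vertical timing, mirroring the
 * "(pix_clk * 100 + total - 1) / total" style used in the patch.
 */
static uint32_t refresh_rate_hz(uint64_t pix_clk_100hz, uint32_t h_total,
                                uint32_t v_total)
{
        uint64_t total = (uint64_t)h_total * v_total;

        return (uint32_t)((pix_clk_100hz * 100 + total - 1) / total);
}

int main(void)
{
        /* Example: ~533.25 MHz pixel clock with a 4000 x 2222 total timing. */
        uint32_t hz = refresh_rate_hz(5332500ULL, 4000, 2222);

        /* The result is what gets compared against the 120 Hz thresholds. */
        printf("refresh rate: %u Hz\n", hz);
        return 0;
}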

Reviewed-by: Aric Cyr 
Acked-by: Tom Chung 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  8 ++--
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 39 +++
 .../gpu/drm/amd/display/dc/core/dc_stream.c   | 34 +---
 drivers/gpu/drm/amd/display/dc/inc/resource.h |  3 ++
 4 files changed, 47 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 825f275ea9eb..623f4ac0bf42 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -5436,15 +5436,15 @@ bool dc_abm_save_restore(
 void dc_query_current_properties(struct dc *dc, struct dc_current_properties 
*properties)
 {
unsigned int i;
-   bool subvp_in_use = false;
+   bool subvp_sw_cursor_req = false;
 
for (i = 0; i < dc->current_state->stream_count; i++) {
-   if (dc->current_state->streams[i]->mall_stream_config.type != 
SUBVP_NONE) {
-   subvp_in_use = true;
+   if (check_subvp_sw_cursor_fallback_req(dc, 
dc->current_state->streams[i])) {
+   subvp_sw_cursor_req = true;
break;
}
}
-   properties->cursor_size_limit = subvp_in_use ? 64 : 
dc->caps.max_cursor_size;
+   properties->cursor_size_limit = subvp_sw_cursor_req ? 64 : 
dc->caps.max_cursor_size;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index aa7b5db83644..3549a9b852a2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1334,6 +1334,32 @@ static void calculate_inits_and_viewports(struct 
pipe_ctx *pipe_ctx)
data->viewport_c.y += src.y / vpc_div;
 }
 
+static bool is_subvp_high_refresh_candidate(struct dc_stream_state *stream)
+{
+   uint32_t refresh_rate;
+   struct dc *dc = stream->ctx->dc;
+
+   refresh_rate = (stream->timing.pix_clk_100hz * (uint64_t)100 +
+   stream->timing.v_total * stream->timing.h_total - (uint64_t)1);
+   refresh_rate = div_u64(refresh_rate, stream->timing.v_total);
+   refresh_rate = div_u64(refresh_rate, stream->timing.h_total);
+
+   /* If there's any stream that fits the SubVP high refresh criteria,
+* we must return true. This is because cursor updates are asynchronous
+* with full updates, so we could transition into a SubVP config and
+* remain in HW cursor mode if there's no cursor update which will
+* then cause corruption.
+*/
+   if ((refresh_rate >= 120 && refresh_rate <= 175 &&
+   stream->timing.v_addressable >= 1080 &&
+   stream->timing.v_addressable <= 2160) &&
+   (dc->current_state->stream_count > 1 ||
+   (dc->current_state->stream_count == 1 && 
!stream->allow_freesync)))
+   return true;
+
+   return false;
+}
+
 bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
 {
const struct dc_plane_state *plane_state = pipe_ctx->plane_state;
@@ -5101,3 +5127,16 @@ enum dc_status 
update_dp_encoder_resources_for_test_harness(const struct dc *dc,
return DC_OK;
 }
 
+bool check_subvp_sw_cursor_fallback_req(const struct dc *dc, struct 
dc_stream_state *stream)
+{
+   if (!dc->debug.disable_subvp_high_refresh && 
is_subvp_high_refresh_candidate(stream))
+   return true;
+   if (dc->current_state->stream_count == 1 && 
stream->timing.v_addressable >= 2880 &&
+   ((stream->timing.pix_clk_100hz * 100) / 
stream->timing.v_total / stream->timing.h_total) < 120)
+   return true;
+   else if (dc->current_state->stream_count > 1 && 
stream->timing.v_addressable >= 2160 &&
+   ((stream->timing.pix_clk_100hz * 100) / 
stream->timing.v_total / stream->timing.h_total) < 120)
+   return true;
+
+   return false;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index ac493dd7fa68..8a6a2881be41 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -288,32 +288,6 @@ static void program_cursor_attributes(
}
 }
 
-static bool is_subvp_high_refresh_candidate(struct dc_stream_state *stream)
-{
-   uint32_t refresh_rate;
-   struct dc *dc = stream->ctx->dc;
-
-   refresh_rate = (stream->timing.pix_clk_100hz * (uint64_t)100 +
-   stream->timing.v_total * stream->timing.h_total - (uint64_t)1);
-   refresh_rate = div_u64(refresh_rate, stream->timing.v_total);
-   refresh_rate = div_u64(refresh

[PATCH 07/16] drm/amd/display: Update dml ssb from pmfw clock table

2023-10-04 Thread Tom Chung
From: Muhammad Ahmed 

[why]
Need to use the real PMFW clock table instead of the default one.

[How]
Update the clock table handling accordingly.

Reviewed-by: Charlene Liu 
Acked-by: Tom Chung 
Signed-off-by: Muhammad Ahmed 
---
 .../drm/amd/display/dc/dcn35/dcn35_resource.c |  3 ++-
 .../drm/amd/display/dc/dml/dcn35/dcn35_fpu.c  | 24 +--
 .../drm/amd/display/dc/dml/dcn35/dcn35_fpu.h  |  2 --
 3 files changed, 3 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index 24b455f3ac3c..d3cc8f4a82d1 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -698,7 +698,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.underflow_assert_delay_us = 0x,
.dwb_fi_phase = -1, // -1 = disable,
.dmub_command_table = true,
-   .pstate_enabled = false,
+   .pstate_enabled = true,
.use_max_lb = true,
.enable_mem_low_power = {
.bits = {
@@ -1841,6 +1841,7 @@ static bool dcn35_resource_construct(
 
/* Use pipe context based otg sync logic */
dc->config.use_pipe_ctx_sync_logic = true;
+   dc->config.use_default_clock_table = false;
/* read VBIOS LTTPR caps */
{
if (ctx->dc_bios->funcs->get_lttpr_caps) {
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
index 4d5ee2aad9e4..be345f470b25 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
@@ -205,29 +205,7 @@ void dcn35_build_wm_range_table_fpu(struct clk_mgr 
*clk_mgr)
//TODO
 }
 
-void dcn35_patch_dpm_table(struct clk_bw_params *bw_params)
-{
-   int i;
-   unsigned int max_dcfclk_mhz = 0, max_dispclk_mhz = 0, max_dppclk_mhz = 
0,
-   max_phyclk_mhz = 0, max_dtbclk_mhz = 0, max_fclk_mhz = 
0, max_uclk_mhz = 0;
-
-   for (i = 0; i < MAX_NUM_DPM_LVL; i++) {
-   if (bw_params->clk_table.entries[i].dcfclk_mhz > max_dcfclk_mhz)
-   max_dcfclk_mhz = 
bw_params->clk_table.entries[i].dcfclk_mhz;
-   if (bw_params->clk_table.entries[i].fclk_mhz > max_fclk_mhz)
-   max_fclk_mhz = bw_params->clk_table.entries[i].fclk_mhz;
-   if (bw_params->clk_table.entries[i].memclk_mhz > max_uclk_mhz)
-   max_uclk_mhz = 
bw_params->clk_table.entries[i].memclk_mhz;
-   if (bw_params->clk_table.entries[i].dispclk_mhz > 
max_dispclk_mhz)
-   max_dispclk_mhz = 
bw_params->clk_table.entries[i].dispclk_mhz;
-   if (bw_params->clk_table.entries[i].dppclk_mhz > max_dppclk_mhz)
-   max_dppclk_mhz = 
bw_params->clk_table.entries[i].dppclk_mhz;
-   if (bw_params->clk_table.entries[i].phyclk_mhz > max_phyclk_mhz)
-   max_phyclk_mhz = 
bw_params->clk_table.entries[i].phyclk_mhz;
-   if (bw_params->clk_table.entries[i].dtbclk_mhz > max_dtbclk_mhz)
-   max_dtbclk_mhz = 
bw_params->clk_table.entries[i].dtbclk_mhz;
-   }
-}
+
 /*
  * dcn35_update_bw_bounding_box
  *
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.h 
b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.h
index b122ffdcc30a..e8d5a170893e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.h
@@ -34,8 +34,6 @@ void dcn35_build_wm_range_table_fpu(struct clk_mgr *clk_mgr);
 void dcn35_update_bw_bounding_box_fpu(struct dc *dc,
  struct clk_bw_params *bw_params);
 
-void dcn35_patch_dpm_table(struct clk_bw_params *bw_params);
-
 int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
  struct dc_state *context,
  display_e2e_pipe_params_st *pipes,
-- 
2.25.1



[PATCH 06/16] drm/amd/display: Update stream mask

2023-10-04 Thread Tom Chung
From: Duncan Ma 

[Why]
Whenever the stream arrangement changes because of new
pipe configurations such as ODM, the new
stream mask is not reflected in DMCUB.

The mismatch in the stream mask blocks IPS
entry in some scenarios.

[How]
Whenever the stream arrangement changes,
update the stream mask and notify DMCUB.
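
Conceptually, the stream mask is a bitfield with one bit per active
stream. A minimal standalone sketch of that idea (assuming, purely for
illustration, one bit per stream index; the real get_stream_mask()
helper may derive it differently):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        int stream_count = 3;   /* example: three active streams */
        uint8_t mask = 0;
        int i;

        for (i = 0; i < stream_count; i++)
                mask |= 1u << i;

        /*
         * DMCUB only needs to be re-notified when this differs from the
         * previously programmed mask.
         */
        printf("stream mask: 0x%02x\n", mask);
        return 0;
}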

Reviewed-by: Charlene Liu 
Acked-by: Tom Chung 
Signed-off-by: Duncan Ma 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 17a36953d3a9..825f275ea9eb 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3561,6 +3561,7 @@ static void commit_planes_for_stream(struct dc *dc,
bool should_lock_all_pipes = (update_type != UPDATE_TYPE_FAST);
bool subvp_prev_use = false;
bool subvp_curr_use = false;
+   uint8_t current_stream_mask = 0;
 
// Once we apply the new subvp context to hardware it won't be in the
// dc->current_state anymore, so we have to cache it before we apply
@@ -3910,6 +3911,12 @@ static void commit_planes_for_stream(struct dc *dc,
if (pipe_ctx->stream_res.tg->funcs->program_manual_trigger)

pipe_ctx->stream_res.tg->funcs->program_manual_trigger(pipe_ctx->stream_res.tg);
}
+
+   current_stream_mask = get_stream_mask(dc, context);
+   if (current_stream_mask != context->stream_mask) {
+   context->stream_mask = current_stream_mask;
+   dc_dmub_srv_notify_stream_mask(dc->ctx->dmub_srv, 
current_stream_mask);
+   }
 }
 
 /**
-- 
2.25.1



[PATCH 05/16] drm/amd/display: Revert "drm/amd/display: remove duplicated edp relink to fastboot"

2023-10-04 Thread Tom Chung
From: Aric Cyr 

Revert commit a0b8a2c85d1b ("drm/amd/display: remove duplicated edp relink to 
fastboot")

It causes a 4k eDP panel to not light up on boot.

Reviewed-by: Tom Chung 
Cc: Mario Limonciello 
Cc: Alex Deucher 
Cc: sta...@vger.kernel.org
Acked-by: Tom Chung 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 59 
 1 file changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 63e97fb0a478..17a36953d3a9 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1213,6 +1213,64 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
dc_release_state(current_ctx);
 }
 
+static void disable_vbios_mode_if_required(
+   struct dc *dc,
+   struct dc_state *context)
+{
+   unsigned int i, j;
+
+   /* check if timing_changed, disable stream*/
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct dc_stream_state *stream = NULL;
+   struct dc_link *link = NULL;
+   struct pipe_ctx *pipe = NULL;
+
+   pipe = &context->res_ctx.pipe_ctx[i];
+   stream = pipe->stream;
+   if (stream == NULL)
+   continue;
+
+   // only looking for first odm pipe
+   if (pipe->prev_odm_pipe)
+   continue;
+
+   if (stream->link->local_sink &&
+   stream->link->local_sink->sink_signal == 
SIGNAL_TYPE_EDP) {
+   link = stream->link;
+   }
+
+   if (link != NULL && 
link->link_enc->funcs->is_dig_enabled(link->link_enc)) {
+   unsigned int enc_inst, tg_inst = 0;
+   unsigned int pix_clk_100hz;
+
+   enc_inst = 
link->link_enc->funcs->get_dig_frontend(link->link_enc);
+   if (enc_inst != ENGINE_ID_UNKNOWN) {
+   for (j = 0; j < dc->res_pool->stream_enc_count; 
j++) {
+   if (dc->res_pool->stream_enc[j]->id == 
enc_inst) {
+   tg_inst = 
dc->res_pool->stream_enc[j]->funcs->dig_source_otg(
+   
dc->res_pool->stream_enc[j]);
+   break;
+   }
+   }
+
+   
dc->res_pool->dp_clock_source->funcs->get_pixel_clk_frequency_100hz(
+   dc->res_pool->dp_clock_source,
+   tg_inst, &pix_clk_100hz);
+
+   if (link->link_status.link_active) {
+   uint32_t requested_pix_clk_100hz =
+   
pipe->stream_res.pix_clk_params.requested_pix_clk_100hz;
+
+   if (pix_clk_100hz != 
requested_pix_clk_100hz) {
+   
dc->link_srv->set_dpms_off(pipe);
+   pipe->stream->dpms_off = false;
+   }
+   }
+   }
+   }
+   }
+}
+
 static void wait_for_no_pipes_pending(struct dc *dc, struct dc_state *context)
 {
int i;
@@ -1782,6 +1840,7 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
dc_streams[i] =  context->streams[i];
 
if (!dcb->funcs->is_accelerated_mode(dcb)) {
+   disable_vbios_mode_if_required(dc, context);
dc->hwss.enable_accelerated_mode(dc, context);
}
 
-- 
2.25.1



[PATCH 04/16] drm/amd/display: Modify Vmin default value

2023-10-04 Thread Tom Chung
From: Max Tseng 

Fine tune the Vmin clock value

Reviewed-by: Robin Chen 
Acked-by: Tom Chung 
Signed-off-by: Max Tseng 
---
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c   | 8 +++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
index d8fa229d78ce..64a2692fd4f6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
@@ -1914,7 +1914,7 @@ static bool dcn314_resource_construct(
dc->caps.color.mpc.ogam_rom_caps.hlg = 0;
dc->caps.color.mpc.ocsc = 1;
 
-   dc->caps.max_disp_clock_khz_at_vmin = 694000;
+   dc->caps.max_disp_clock_khz_at_vmin = 65;
 
/* Use pipe context based otg sync logic */
dc->config.use_pipe_ctx_sync_logic = true;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
index 693c7ba4b34d..24b455f3ac3c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_resource.c
@@ -1831,7 +1831,13 @@ static bool dcn35_resource_construct(
dc->caps.color.mpc.ogam_rom_caps.hlg = 0;
dc->caps.color.mpc.ocsc = 1;
 
-   dc->caps.max_disp_clock_khz_at_vmin = 669154;
+   /* max_disp_clock_khz_at_vmin is slightly lower than the STA value in order
+* to provide some margin.
+* It's expected for future ASICs to have an equal or higher value, in order
+* to have a deterministic power improvement from generation to generation.
+* (i.e., we should not expect a new ASIC generation with a lower vmin rate)
+*/
+   dc->caps.max_disp_clock_khz_at_vmin = 65;
 
/* Use pipe context based otg sync logic */
dc->config.use_pipe_ctx_sync_logic = true;
-- 
2.25.1



[PATCH 03/16] drm/amd/display: Update pmfw_driver_if new structure

2023-10-04 Thread Tom Chung
From: Charlene Liu 

[why]
The PMFW header file was updated; the driver needs to align
with the new data structures.

[How]
Update the data structures accordingly.

Reviewed-by: Sung joon Kim 
Acked-by: Tom Chung 
Signed-off-by: Charlene Liu 
---
 .../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c  | 214 --
 .../amd/display/dc/clk_mgr/dcn35/dcn35_smu.h  |  36 ++-
 2 files changed, 174 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
index 21dfe3faf08c..f80917f6153b 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
@@ -507,7 +507,7 @@ static struct wm_table lpddr5_wm_table = {
}
 };
 
-static DpmClocks_t dummy_clocks;
+static DpmClocks_t_dcn35 dummy_clocks;
 
 static struct dcn35_watermarks dummy_wms = { 0 };
 
@@ -597,7 +597,7 @@ static void dcn35_notify_wm_ranges(struct clk_mgr 
*clk_mgr_base)
 static void dcn35_get_dpm_table_from_smu(struct clk_mgr_internal *clk_mgr,
struct dcn35_smu_dpm_clks *smu_dpm_clks)
 {
-   DpmClocks_t *table = smu_dpm_clks->dpm_clks;
+   DpmClocks_t_dcn35 *table = smu_dpm_clks->dpm_clks;
 
if (!clk_mgr->smu_ver)
return;
@@ -627,88 +627,158 @@ static uint32_t find_max_clk_value(const uint32_t 
clocks[], uint32_t num_clocks)
return max;
 }
 
-static unsigned int find_clk_for_voltage(
-   const DpmClocks_t *clock_table,
-   const uint32_t clocks[],
-   unsigned int voltage)
+static inline bool is_valid_clock_value(uint32_t clock_value)
 {
-   int i;
-   int max_voltage = 0;
-   int clock = 0;
-
-   for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++) {
-   if (clock_table->SocVoltage[i] == voltage) {
-   return clocks[i];
-   } else if (clock_table->SocVoltage[i] >= max_voltage &&
-   clock_table->SocVoltage[i] < voltage) {
-   max_voltage = clock_table->SocVoltage[i];
-   clock = clocks[i];
-   }
+   return clock_value > 1 && clock_value < 10;
+}
+
+static unsigned int convert_wck_ratio(uint8_t wck_ratio)
+{
+   switch (wck_ratio) {
+   case WCK_RATIO_1_2:
+   return 2;
+
+   case WCK_RATIO_1_4:
+   return 4;
+   /* Find lowest DPM, FCLK is filled in reverse order*/
+
+   default:
+   break;
}
 
-   ASSERT(clock);
-   return clock;
+   return 1;
 }
 
 static void dcn35_clk_mgr_helper_populate_bw_params(struct clk_mgr_internal 
*clk_mgr,
struct integrated_info 
*bios_info,
-   const DpmClocks_t 
*clock_table)
+   DpmClocks_t_dcn35 
*clock_table)
 {
-   int i, j;
struct clk_bw_params *bw_params = clk_mgr->base.bw_params;
-   uint32_t max_dispclk = 0, max_dppclk = 0;
-
-   j = -1;
-
-   ASSERT(NUM_DF_PSTATE_LEVELS <= MAX_NUM_DPM_LVL);
-
-   /* Find lowest DPM, FCLK is filled in reverse order*/
+   struct clk_limit_table_entry def_max = 
bw_params->clk_table.entries[bw_params->clk_table.num_entries - 1];
+   uint32_t max_pstate = 0,  max_uclk = 0, max_fclk = 0;
+   uint32_t min_pstate = 0, max_dispclk = 0, max_dppclk = 0;
+   int i;
 
-   for (i = NUM_DF_PSTATE_LEVELS - 1; i >= 0; i--) {
-   if (clock_table->DfPstateTable[i].FClk != 0) {
-   j = i;
-   break;
+   for (i = 0; i < clock_table->NumMemPstatesEnabled; i++) {
+   if (is_valid_clock_value(clock_table->MemPstateTable[i].UClk) &&
+   clock_table->MemPstateTable[i].UClk > max_uclk) {
+   max_uclk = clock_table->MemPstateTable[i].UClk;
+   max_pstate = i;
}
}
 
-   if (j == -1) {
-   /* clock table is all 0s, just use our own hardcode */
-   ASSERT(0);
-   return;
-   }
+   /* We expect the table to contain at least one valid Uclk entry. */
+   ASSERT(is_valid_clock_value(max_uclk));
 
-   bw_params->clk_table.num_entries = j + 1;
 
/* dispclk and dppclk can be max at any voltage, same number of levels 
for both */
if (clock_table->NumDispClkLevelsEnabled <= NUM_DISPCLK_DPM_LEVELS &&
clock_table->NumDispClkLevelsEnabled <= NUM_DPPCLK_DPM_LEVELS) {
-   max_dispclk = find_max_clk_value(clock_table->DispClocks, 
clock_table->NumDispClkLevelsEnabled);
-   max_dppclk = find_max_clk_value(clock_table->DppClocks, 
clock_table->NumDispClkLevelsEnabled);
+   max_dispclk = find_max_clk_value(clock_table->DispClocks,
+   clock_table->NumDispClkLevelsEnabled);
+  

[PATCH 02/16] drm/amd/display: VSIF v3 set Max Refresh Rate

2023-10-04 Thread Tom Chung
From: Muhammad Ansari 

[WHY]
The FreeSync spec requires PB8 and PB12 to be set to the nominal
refresh rate regardless of fixed or variable rate.

[HOW]
Remove the condition that checks and overwrites the max refresh rate,
and always set PB8/PB12 to the max refresh rate.
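
For context, PB8 carries the low 8 bits of the maximum refresh rate and
PB12 carries bits 9:8, so values above 255 Hz still fit. A standalone
sketch of that packing (the 360 Hz value is only an example):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint32_t max_refresh = 360;       /* nominal max refresh rate in Hz */
        uint8_t pb8, pb12;

        pb8  = max_refresh & 0xFF;        /* PB8: low byte */
        pb12 = (max_refresh >> 8) & 0x03; /* PB12: bits 9:8 */

        /* 360 Hz -> PB8 = 0x68, PB12 = 0x01 */
        printf("PB8 = 0x%02x, PB12 = 0x%02x\n", pb8, pb12);
        return 0;
}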

Reviewed-by: Anthony Koo 
Acked-by: Tom Chung 
Signed-off-by: Muhammad Ansari 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index ef3a67409021..ccecddafeb05 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -626,7 +626,6 @@ static void build_vrr_infopacket_data_v3(const struct 
mod_vrr_params *vrr,
unsigned int max_refresh;
unsigned int fixed_refresh;
unsigned int min_programmed;
-   unsigned int max_programmed;
 
/* PB1 = 0x1A (24bit AMD IEEE OUI (0x1A) - Byte 0) */
infopacket->sb[1] = 0x1A;
@@ -672,21 +671,17 @@ static void build_vrr_infopacket_data_v3(const struct 
mod_vrr_params *vrr,
(vrr->state == VRR_STATE_INACTIVE) ? min_refresh :
max_refresh; // Non-fs case, program nominal range
 
-   max_programmed = (vrr->state == VRR_STATE_ACTIVE_FIXED) ? fixed_refresh 
:
-   (vrr->state == VRR_STATE_ACTIVE_VARIABLE) ? max_refresh 
:
-   max_refresh;// Non-fs case, program nominal range
-
/* PB7 = FreeSync Minimum refresh rate (Hz) */
infopacket->sb[7] = min_programmed & 0xFF;
 
/* PB8 = FreeSync Maximum refresh rate (Hz) */
-   infopacket->sb[8] = max_programmed & 0xFF;
+   infopacket->sb[8] = max_refresh & 0xFF;
 
/* PB11 : MSB FreeSync Minimum refresh rate [Hz] - bits 9:8 */
infopacket->sb[11] = (min_programmed >> 8) & 0x03;
 
/* PB12 : MSB FreeSync Maximum refresh rate [Hz] - bits 9:8 */
-   infopacket->sb[12] = (max_programmed >> 8) & 0x03;
+   infopacket->sb[12] = (max_refresh >> 8) & 0x03;
 
/* PB16 : Reserved bits 7:1, FixedRate bit 0 */
infopacket->sb[16] = (vrr->state == VRR_STATE_ACTIVE_FIXED) ? 1 : 0;
-- 
2.25.1



[PATCH 01/16] drm/amd/display: Modify SMU message logs

2023-10-04 Thread Tom Chung
From: Sung Joon Kim 

[why]
It's important to make sure SMU messages
are logged by default to improve debugging for
power optimization use cases.

[how]
Change logs to warnings when an SMU message
returns a non-success id.

Reviewed-by: Charlene Liu 
Acked-by: Tom Chung 
Signed-off-by: Sung Joon Kim 
---
 .../drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c |  1 +
 .../gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c | 12 ++--
 drivers/gpu/drm/amd/display/dc/dc.h  |  1 +
 drivers/gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c |  1 +
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
index b5acd7b01e40..21dfe3faf08c 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c
@@ -1046,6 +1046,7 @@ void dcn35_clk_mgr_construct(
ctx->dc->debug.disable_dpp_power_gate = false;
ctx->dc->debug.disable_hubp_power_gate = false;
ctx->dc->debug.disable_dsc_power_gate = false;
+   ctx->dc->debug.disable_hpo_power_gate = false;
} else {
/*let's reset the config control flag*/
ctx->dc->config.disable_ips = 1; /*pmfw not support it, 
disable it all*/
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
index cf74e69cb2a1..b6b8c3ca1572 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
@@ -130,11 +130,11 @@ static int dcn35_smu_send_msg_with_param(struct 
clk_mgr_internal *clk_mgr,
result = dcn35_smu_wait_for_response(clk_mgr, 10, 200);
ASSERT(result == VBIOSSMC_Result_OK);
 
+   if (result != VBIOSSMC_Result_OK) {
+   DC_LOG_WARNING("SMU response after wait: %d, msg id = %d\n", 
result, msg_id);
 
-
-   if (result == VBIOSSMC_Status_BUSY) {
-   smu_print("SMU response after wait: %d\n", result);
-   return -1;
+   if (result == VBIOSSMC_Status_BUSY)
+   return -1;
}
 
/* First clear response register */
@@ -155,7 +155,7 @@ static int dcn35_smu_send_msg_with_param(struct 
clk_mgr_internal *clk_mgr,
else
ASSERT(0);
REG_WRITE(MP1_SMN_C2PMSG_91, VBIOSSMC_Result_OK);
-   smu_print("SMU response after wait: %d\n", result);
+   DC_LOG_WARNING("SMU response after wait: %d, msg id = %d\n", 
result, msg_id);
return -1;
}
 
@@ -163,7 +163,7 @@ static int dcn35_smu_send_msg_with_param(struct 
clk_mgr_internal *clk_mgr,
ASSERT(0);
result = dcn35_smu_wait_for_response(clk_mgr, 10, 200);
//dm_helpers_smu_timeout(CTX, msg_id, param, 10 * 20);
-   smu_print("SMU response after wait: %d\n", result);
+   DC_LOG_WARNING("SMU response after wait: %d, msg id = %d\n", 
result, msg_id);
}
 
return REG_READ(MP1_SMN_C2PMSG_83);
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 7d1ce58d493b..6c51ebf5bbad 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -830,6 +830,7 @@ struct dc_debug_options {
bool disable_hubp_power_gate;
bool disable_dsc_power_gate;
bool disable_optc_power_gate;
+   bool disable_hpo_power_gate;
int dsc_min_slice_height_override;
int dsc_bpp_increment_div;
bool disable_pplib_wm_range;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c 
b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c
index ccfd3102e5a0..e62a192c595e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c
@@ -262,6 +262,7 @@ void pg_cntl35_hpo_pg_control(struct pg_cntl *pg_cntl, bool 
power_on)
bool block_enabled;
 
if (pg_cntl->ctx->dc->debug.ignore_pg ||
+   pg_cntl->ctx->dc->debug.disable_hpo_power_gate ||
pg_cntl->ctx->dc->idle_optimizations_allowed)
return;
 
-- 
2.25.1



[PATCH 00/16] DC Patches Oct 06 2023

2023-10-04 Thread Tom Chung
This DC patchset brings improvements in multiple areas. In summary, we have:

- Refactor DPG test pattern logic for ODM cases
- Refactor HWSS into component folder
- Revert "drm/amd/display: Add a check for idle power optimization"
- Revert "drm/amd/display: remove duplicated edp relink to fastboot
- Update cursor limits based on SW cursor fallback limits
- Update stream mask
- Update pmfw_driver_if new structure
- Modify SMU message logs
- Don't set dpms_off for seamless boot

Cc: Daniel Wheeler 

Alvin Lee (2):
  drm/amd/display: Update cursor limits based on SW cursor fallback
limits
  drm/amd/display: Make DCN3x use older FPO sequence

Aric Cyr (2):
  drm/amd/display: Revert "drm/amd/display: remove duplicated edp relink
to fastboot"
  drm/amd/display: 3.2.255

Charlene Liu (1):
  drm/amd/display: Update pmfw_driver_if new structure

Daniel Miess (1):
  drm/amd/display: Don't set dpms_off for seamless boot

Duncan Ma (1):
  drm/amd/display: Update stream mask

George Shen (2):
  drm/amd/display: Refactor DPG test pattern logic for ODM cases
  drm/amd/display: Disable SubVP if test pattern is enabled

Max Tseng (1):
  drm/amd/display: Modify Vmin default value

Mounika Adhuri (1):
  drm/amd/display: Refactor HWSS into component folder

Muhammad Ahmed (1):
  drm/amd/display: Update dml ssb from pmfw clock table

Muhammad Ansari (1):
  drm/amd/display: VSIF v3 set Max Refresh Rate

Nicholas Kazlauskas (1):
  drm/amd/display: Don't use fsleep for PSR exit waits

Sung Joon Kim (2):
  drm/amd/display: Modify SMU message logs
  drm/amd/display: Revert "drm/amd/display: Add a check for idle power
optimization"

 drivers/gpu/drm/amd/display/Makefile  |   1 +
 drivers/gpu/drm/amd/display/dc/Makefile   |   2 +-
 .../dc/clk_mgr/dce120/dce120_clk_mgr.c|   2 +-
 .../display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c  | 215 --
 .../amd/display/dc/clk_mgr/dcn35/dcn35_smu.c  |  12 +-
 .../amd/display/dc/clk_mgr/dcn35/dcn35_smu.h  |  36 ++-
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 105 +++--
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 143 
 .../gpu/drm/amd/display/dc/core/dc_stream.c   |  34 +--
 drivers/gpu/drm/amd/display/dc/dc.h   |   6 +-
 drivers/gpu/drm/amd/display/dc/dce/Makefile   |   2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c |   3 +-
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |   3 +-
 .../gpu/drm/amd/display/dc/dce100/Makefile|   2 +-
 .../amd/display/dc/dce100/dce100_resource.c   |   4 +-
 .../gpu/drm/amd/display/dc/dce110/Makefile|   2 +-
 .../amd/display/dc/dce110/dce110_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dce112/Makefile|   2 +-
 .../amd/display/dc/dce112/dce112_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dce120/Makefile|   1 -
 .../amd/display/dc/dce120/dce120_resource.c   |   6 +-
 .../amd/display/dc/dce60/dce60_hw_sequencer.c |   4 +-
 drivers/gpu/drm/amd/display/dc/dce80/Makefile |   2 +-
 .../drm/amd/display/dc/dce80/dce80_resource.c |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn10/Makefile |   2 +-
 .../dc/dcn10/dcn10_hw_sequencer_debug.c   |   4 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_init.c |   4 +-
 .../drm/amd/display/dc/dcn10/dcn10_resource.c |   4 +-
 drivers/gpu/drm/amd/display/dc/dcn20/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn20/dcn20_init.c |   6 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |   4 +-
 .../gpu/drm/amd/display/dc/dcn201/Makefile|   2 +-
 .../drm/amd/display/dc/dcn201/dcn201_init.c   |   6 +-
 .../amd/display/dc/dcn201/dcn201_resource.c   |   4 +-
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn21/dcn21_init.c |   6 +-
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn30/Makefile |   1 -
 .../gpu/drm/amd/display/dc/dcn30/dcn30_init.c |   6 +-
 .../drm/amd/display/dc/dcn30/dcn30_resource.c |   2 +-
 .../gpu/drm/amd/display/dc/dcn301/Makefile|   2 +-
 .../drm/amd/display/dc/dcn301/dcn301_init.c   |   6 +-
 .../amd/display/dc/dcn301/dcn301_resource.c   |   2 +-
 .../gpu/drm/amd/display/dc/dcn302/Makefile|   2 +-
 .../drm/amd/display/dc/dcn302/dcn302_init.c   |   2 +-
 .../gpu/drm/amd/display/dc/dcn303/Makefile|   2 +-
 .../drm/amd/display/dc/dcn303/dcn303_init.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_init.c |   4 +-
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |   2 +-
 .../gpu/drm/amd/display/dc/dcn314/Makefile|   2 +-
 .../drm/amd/display/dc/dcn314/dcn314_init.c   |   4 +-
 .../amd/display/dc/dcn314/dcn314_resource.c   |   4 +-
 .../amd/display/dc/dcn315/dcn315_resource.c   |   2 +-
 .../amd/display/dc/dcn316/dcn316_resource.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/dcn32/Makefile |   2 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   8 +-
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |   2 +-
 .../amd/display/dc/dcn321/dcn321_resource.c   |   2 +-
 drivers/gpu/drm

Re: [Patch v2 1/2] drm/amdgpu: Rework KFD memory max limits

2023-10-04 Thread Christian König

Am 02.10.23 um 22:21 schrieb Rajneesh Bhardwaj:

To allow bigger allocations specially on systems such as GFXIP 9.4.3
that use GTT memory for VRAM allocations, relax the limits to
maximize ROCm allocations.

Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 


Acked-by: Christian König  for the series.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index b5b940485059..c55907ff7dcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -42,6 +42,7 @@
   * changes to accumulate
   */
  #define AMDGPU_USERPTR_RESTORE_DELAY_MS 1
+#define AMDGPU_RESERVE_MEM_LIMIT   (3UL << 29)
  
  /*

   * Align VRAM availability to 2MB to avoid fragmentation caused by 4K 
allocations in the tail 2MB
@@ -115,11 +116,16 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
return;
  
  	si_meminfo(&si);

-   mem = si.freeram - si.freehigh;
+   mem = si.totalram - si.totalhigh;
mem *= si.mem_unit;
  
  	spin_lock_init(&kfd_mem_limit.mem_limit_lock);

-   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 4);
+   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 6);
+   if (kfd_mem_limit.max_system_mem_limit < 2 * AMDGPU_RESERVE_MEM_LIMIT)
+   kfd_mem_limit.max_system_mem_limit >>= 1;
+   else
+   kfd_mem_limit.max_system_mem_limit -= AMDGPU_RESERVE_MEM_LIMIT;
+
kfd_mem_limit.max_ttm_mem_limit = ttm_tt_pages_limit() << PAGE_SHIFT;
pr_debug("Kernel memory limit %lluM, TTM limit %lluM\n",
(kfd_mem_limit.max_system_mem_limit >> 20),


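For reference, here is a rough, self-contained sketch of the reworked limit
calculation, assuming a hypothetical system with 128 GiB of low memory; the
reserve constant is copied from the hunk above, everything else (including
the helper-free layout and the numbers) is illustrative rather than the
in-kernel code:

/* Hypothetical worked example of the reworked KFD system memory limit;
 * buildable with any C compiler, numbers are illustrative only. */
#include <stdio.h>

#define AMDGPU_RESERVE_MEM_LIMIT (3ULL << 29)	/* 1.5 GiB, as in the patch */

int main(void)
{
	unsigned long long mem = 128ULL << 30;		/* assume 128 GiB of low memory */
	unsigned long long limit = mem - (mem >> 6);	/* leave 1/64th for the kernel */

	if (limit < 2 * AMDGPU_RESERVE_MEM_LIMIT)
		limit >>= 1;				/* small systems: cap at half */
	else
		limit -= AMDGPU_RESERVE_MEM_LIMIT;	/* otherwise a fixed 1.5 GiB reserve */

	/* prints "max_system_mem_limit = 127488M", i.e. 124.5 GiB for KFD */
	printf("max_system_mem_limit = %lluM\n", limit >> 20);
	return 0;
}

On a system of this size the old formula (free memory minus 1/16th) would
have left at most 120 GiB even with all memory free, so basing the limit on
total memory with a 1/64th plus fixed 1.5 GiB reserve both relaxes the cap
and makes it independent of how much memory happens to be free at init time.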


[PATCH v2 5/5] Documentation/amdgpu: Add board info details

2023-10-04 Thread Lijo Lazar
Add documentation for the board_info sysfs attribute.

Signed-off-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
---
 Documentation/gpu/amdgpu/driver-misc.rst   |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 
 2 files changed, 22 insertions(+)

diff --git a/Documentation/gpu/amdgpu/driver-misc.rst 
b/Documentation/gpu/amdgpu/driver-misc.rst
index 4321c38fef21..82b47f1818ac 100644
--- a/Documentation/gpu/amdgpu/driver-misc.rst
+++ b/Documentation/gpu/amdgpu/driver-misc.rst
@@ -32,6 +32,12 @@ unique_id
 .. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: unique_id
 
+board_info
+----------
+
+.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+   :doc: board_info
+
 Accelerated Processing Units (APU) Info
 ---
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 10f1641aede9..27c95bb02411 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -162,6 +162,22 @@ static ssize_t amdgpu_device_get_pcie_replay_count(struct 
device *dev,
 static DEVICE_ATTR(pcie_replay_count, 0444,
amdgpu_device_get_pcie_replay_count, NULL);
 
+/**
+ * DOC: board_info
+ *
+ * The amdgpu driver provides a sysfs API for reporting board-related information.
+ * It provides the form factor information in the format
+ *
+ *   type : form factor
+ *
+ * Possible form factor values
+ *
+ * - "cem" - PCIE CEM card
+ * - "oam" - Open Compute Accelerator Module
+ * - "unknown" - Not known
+ *
+ */
+
 static ssize_t amdgpu_device_get_board_info(struct device *dev,
struct device_attribute *attr,
char *buf)
-- 
2.25.1
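
As a usage illustration only (not part of the patch), the attribute
documented above could be read from userspace along these lines; the card0
path is an assumption that depends on the system, and the file is not
exposed on APUs:

/* Hypothetical reader for the board_info sysfs attribute; the card0
 * path is an assumption and varies per system. */
#include <stdio.h>

int main(void)
{
	char line[64];
	FILE *f = fopen("/sys/class/drm/card0/device/board_info", "r");

	if (!f) {
		perror("board_info");	/* e.g. attribute hidden on APUs */
		return 1;
	}
	if (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* expected form: "type : oam" or "type : cem" */
	fclose(f);
	return 0;
}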



[PATCH v2 4/5] drm/amdgpu: Add sysfs attribute to get board info

2023-10-04 Thread Lijo Lazar
Add a sysfs attribute which shows the board form factor like OAM or
CEM.

Signed-off-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 ++
 1 file changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bad2b5577e96..10f1641aede9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -162,6 +162,58 @@ static ssize_t amdgpu_device_get_pcie_replay_count(struct 
device *dev,
 static DEVICE_ATTR(pcie_replay_count, 0444,
amdgpu_device_get_pcie_replay_count, NULL);
 
+static ssize_t amdgpu_device_get_board_info(struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+{
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+   enum amdgpu_pkg_type pkg_type = AMDGPU_PKG_TYPE_CEM;
+   const char *pkg;
+
+   if (adev->smuio.funcs && adev->smuio.funcs->get_pkg_type)
+   pkg_type = adev->smuio.funcs->get_pkg_type(adev);
+
+   switch (pkg_type) {
+   case AMDGPU_PKG_TYPE_CEM:
+   pkg = "cem";
+   break;
+   case AMDGPU_PKG_TYPE_OAM:
+   pkg = "oam";
+   break;
+   default:
+   pkg = "unknown";
+   break;
+   }
+
+   return sysfs_emit(buf, "%s : %s\n", "type", pkg);
+}
+
+static DEVICE_ATTR(board_info, 0444, amdgpu_device_get_board_info, NULL);
+
+static struct attribute *amdgpu_board_attrs[] = {
+   &dev_attr_board_info.attr,
+   NULL,
+};
+
+static umode_t amdgpu_board_attrs_is_visible(struct kobject *kobj,
+struct attribute *attr, int n)
+{
+   struct device *dev = kobj_to_dev(kobj);
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+
+   if (adev->flags & AMD_IS_APU)
+   return 0;
+
+   return attr->mode;
+}
+
+static const struct attribute_group amdgpu_board_attrs_group = {
+   .attrs = amdgpu_board_attrs,
+   .is_visible = amdgpu_board_attrs_is_visible
+};
+
 static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev);
 
 
@@ -4038,6 +4090,11 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (r)
dev_err(adev->dev, "Could not create amdgpu device attr\n");
 
+   r = devm_device_add_group(adev->dev, &amdgpu_board_attrs_group);
+   if (r)
+   dev_err(adev->dev,
+   "Could not create amdgpu board attributes\n");
+
amdgpu_fru_sysfs_init(adev);
 
if (IS_ENABLED(CONFIG_PERF_EVENTS))
-- 
2.25.1



[PATCH v2 3/5] drm/amdgpu: Get package types for smuio v13.0

2023-10-04 Thread Lijo Lazar
Add support to query package types supported in smuio v13.0 ASICs.

Signed-off-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c 
b/drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c
index 13e905c22592..bf8b8e5ddf5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c
@@ -128,6 +128,27 @@ static bool smuio_v13_0_is_host_gpu_xgmi_supported(struct 
amdgpu_device *adev)
return data ? true : false;
 }
 
+static enum amdgpu_pkg_type smuio_v13_0_get_pkg_type(struct amdgpu_device 
*adev)
+{
+   enum amdgpu_pkg_type pkg_type;
+   u32 data;
+
+   data = RREG32_SOC15(SMUIO, 0, regSMUIO_MCM_CONFIG);
+   data = REG_GET_FIELD(data, SMUIO_MCM_CONFIG, TOPOLOGY_ID);
+
+   switch (data) {
+   case 0x4:
+   case 0xC:
+   pkg_type = AMDGPU_PKG_TYPE_CEM;
+   break;
+   default:
+   pkg_type = AMDGPU_PKG_TYPE_OAM;
+   break;
+   }
+
+   return pkg_type;
+}
+
 const struct amdgpu_smuio_funcs smuio_v13_0_funcs = {
.get_rom_index_offset = smuio_v13_0_get_rom_index_offset,
.get_rom_data_offset = smuio_v13_0_get_rom_data_offset,
@@ -136,4 +157,5 @@ const struct amdgpu_smuio_funcs smuio_v13_0_funcs = {
.is_host_gpu_xgmi_supported = smuio_v13_0_is_host_gpu_xgmi_supported,
.update_rom_clock_gating = smuio_v13_0_update_rom_clock_gating,
.get_clock_gating_state = smuio_v13_0_get_clock_gating_state,
+   .get_pkg_type = smuio_v13_0_get_pkg_type,
 };
-- 
2.25.1



[PATCH v2 2/5] drm/amdgpu: Add more smuio v13.0.3 package types

2023-10-04 Thread Lijo Lazar
Expand support to get other board types like OAM or CEM.

Signed-off-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/smuio_v13_0_3.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/smuio_v13_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/smuio_v13_0_3.c
index 4368a5891eeb..5461b5289793 100644
--- a/drivers/gpu/drm/amd/amdgpu/smuio_v13_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/smuio_v13_0_3.c
@@ -84,6 +84,12 @@ static enum amdgpu_pkg_type 
smuio_v13_0_3_get_pkg_type(struct amdgpu_device *ade
 * b0100 - b - Reserved
 */
switch (data & PKG_TYPE_MASK) {
+   case 0x0:
+   pkg_type = AMDGPU_PKG_TYPE_CEM;
+   break;
+   case 0x1:
+   pkg_type = AMDGPU_PKG_TYPE_OAM;
+   break;
case 0x2:
pkg_type = AMDGPU_PKG_TYPE_APU;
break;
-- 
2.25.1



[PATCH v2 1/5] drm/amdgpu: Move package type enum to amdgpu_smuio

2023-10-04 Thread Lijo Lazar
Move definition of package type to amdgpu_smuio header and add new
package types for CEM and OAM.

Signed-off-by: Lijo Lazar 
---

v2: Move definition to amdgpu_smuio.h instead of amdgpu.h (Christian/Hawking) 

 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h | 7 +++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 42ac6d1bf9ca..7088c5015675 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -69,11 +69,6 @@ enum amdgpu_gfx_partition {
 
 #define NUM_XCC(x) hweight16(x)
 
-enum amdgpu_pkg_type {
-   AMDGPU_PKG_TYPE_APU = 2,
-   AMDGPU_PKG_TYPE_UNKNOWN,
-};
-
 enum amdgpu_gfx_ras_mem_id_type {
AMDGPU_GFX_CP_MEM = 0,
AMDGPU_GFX_GCEA_MEM,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
index 89c38d864471..5910d50ac74d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_smuio.h
@@ -23,6 +23,13 @@
 #ifndef __AMDGPU_SMUIO_H__
 #define __AMDGPU_SMUIO_H__
 
+enum amdgpu_pkg_type {
+   AMDGPU_PKG_TYPE_APU = 2,
+   AMDGPU_PKG_TYPE_CEM = 3,
+   AMDGPU_PKG_TYPE_OAM = 4,
+   AMDGPU_PKG_TYPE_UNKNOWN,
+};
+
 struct amdgpu_smuio_funcs {
u32 (*get_rom_index_offset)(struct amdgpu_device *adev);
u32 (*get_rom_data_offset)(struct amdgpu_device *adev);
-- 
2.25.1
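
To summarise the two decoders in this series side by side, here is a
standalone sketch; the enum values and case labels are copied from the
hunks above, while the helper names and the check in main() are
hypothetical:

/* Standalone sketch of the package-type decodes added in this series;
 * helper names are hypothetical, values are taken from the patches. */
#include <stdio.h>

enum amdgpu_pkg_type {
	AMDGPU_PKG_TYPE_APU = 2,
	AMDGPU_PKG_TYPE_CEM = 3,
	AMDGPU_PKG_TYPE_OAM = 4,
	AMDGPU_PKG_TYPE_UNKNOWN,
};

/* smuio v13.0.3: decode the masked package-type bits (patch 2/5) */
static enum amdgpu_pkg_type decode_v13_0_3(unsigned int pkg_bits)
{
	switch (pkg_bits) {
	case 0x0: return AMDGPU_PKG_TYPE_CEM;
	case 0x1: return AMDGPU_PKG_TYPE_OAM;
	case 0x2: return AMDGPU_PKG_TYPE_APU;
	default:  return AMDGPU_PKG_TYPE_UNKNOWN;
	}
}

/* smuio v13.0: decode SMUIO_MCM_CONFIG.TOPOLOGY_ID (patch 3/5) */
static enum amdgpu_pkg_type decode_v13_0(unsigned int topology_id)
{
	switch (topology_id) {
	case 0x4:
	case 0xC: return AMDGPU_PKG_TYPE_CEM;
	default:  return AMDGPU_PKG_TYPE_OAM;
	}
}

int main(void)
{
	/* prints "4 3": OAM for v13.0.3 code 0x1, CEM for v13.0 topology 0x4 */
	printf("%d %d\n", decode_v13_0_3(0x1), decode_v13_0(0x4));
	return 0;
}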


