from:"Quan, Evan"

RE: [V10 1/8] ACPI: Add support for AMD ACPI based Wifi band RFI mitigation feature

2023-08-27 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Simon Horman 
> Sent: Sunday, August 27, 2023 11:43 PM
> To: Quan, Evan 
> Cc: l...@kernel.org; johan...@sipsolutions.net; da...@davemloft.net;
> eduma...@google.com; k...@kernel.org; pab...@redhat.com; Deucher,
> Alexander ; raf...@kernel.org; Lazar, Lijo
> ; Limonciello, Mario ;
> linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [V10 1/8] ACPI: Add support for AMD ACPI based Wifi band RFI
> mitigation feature
>
> On Fri, Aug 25, 2023 at 04:38:39PM +0800, Evan Quan wrote:
> > Due to electrical and mechanical constraints in certain platform
> > designs there may be likely interference of relatively high-powered
> > harmonics of the (G-)DDR memory clocks with local radio module
> > frequency bands used by Wifi 6/6e/7.
> >
> > To mitigate this, AMD has introduced a mechanism that devices can use
> > to notify active use of particular frequencies so that other devices
> > can make relative internal adjustments as necessary to avoid this resonance.
> >
> > Signed-off-by: Evan Quan 
>
> ...
>
> > diff --git a/drivers/acpi/amd_wbrf.c b/drivers/acpi/amd_wbrf.c
>
> ...
>
> > +/**
> > + * acpi_amd_wbrf_add_exclusion - broadcast the frequency band the device
> > + *   is using
> > + *
> > + * @dev: device pointer
> > + * @in: input structure containing the frequency band the device is using
> > + *
> > + * Broadcast to other consumers the frequency band the device starts
> > + * to use. Underneath the surface the information is cached into an
> > + * internal buffer first. Then a notification is sent to all those
> > + * registered consumers. So then they can retrieve that buffer to
> > + * know the latest active frequency bands. The benifit with such design
>
> nit: ./checkpatch.pl --codespell suggests benifit -> benefit.
Thanks, will fix that.

Evan
>
> > + * is for those consumers which have not been registered yet, they can
> > + * still have a chance to retrieve such information later.
> > + */
> > +int acpi_amd_wbrf_add_exclusion(struct device *dev,
> > +   struct wbrf_ranges_in_out *in)
> > +{
> > +   struct acpi_device *adev = ACPI_COMPANION(dev);
> > +   int ret;
> > +
> > +   if (!adev)
> > +           return -ENODEV;
> > +
> > +   ret = wbrf_record(adev, WBRF_RECORD_ADD, in);
> > +   if (ret)
> > +           return ret;
> > +
> > +   blocking_notifier_call_chain(&wbrf_chain_head,
> > +                                WBRF_CHANGED,
> > +                                NULL);
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(acpi_amd_wbrf_add_exclusion);
>
> ...
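
For readers following the thread, the cache-then-notify flow described in the kernel-doc quoted above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every name below (struct sketch_freq_range, sketch_add_exclusion, sketch_register_consumer, sketch_retrieve, sketch_count_event) is hypothetical, and the fixed-size arrays stand in for whatever the patch does inside wbrf_record() and its blocking notifier chain.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_RANGES    8
#define SKETCH_MAX_CONSUMERS 4

/* Hypothetical stand-in for one frequency band reported by a producer. */
struct sketch_freq_range {
	uint64_t start_hz;
	uint64_t end_hz;
};

typedef void (*sketch_consumer_cb)(void *ctx);

struct sketch_consumer {
	sketch_consumer_cb cb;
	void *ctx;
};

static struct sketch_freq_range sketch_cache[SKETCH_MAX_RANGES];
static size_t sketch_cache_len;
static struct sketch_consumer sketch_consumers[SKETCH_MAX_CONSUMERS];
static size_t sketch_consumer_len;

/* A consumer asks to be told whenever the cached band list changes. */
int sketch_register_consumer(sketch_consumer_cb cb, void *ctx)
{
	if (sketch_consumer_len == SKETCH_MAX_CONSUMERS)
		return -1;
	sketch_consumers[sketch_consumer_len].cb = cb;
	sketch_consumers[sketch_consumer_len].ctx = ctx;
	sketch_consumer_len++;
	return 0;
}

/* Producer side: cache the band first, then notify registered consumers. */
int sketch_add_exclusion(struct sketch_freq_range range)
{
	size_t i;

	if (sketch_cache_len == SKETCH_MAX_RANGES)
		return -1;
	sketch_cache[sketch_cache_len++] = range;

	for (i = 0; i < sketch_consumer_len; i++)
		sketch_consumers[i].cb(sketch_consumers[i].ctx);
	return 0;
}

/* Consumer side: copy out the cached bands, whenever the consumer shows up. */
size_t sketch_retrieve(struct sketch_freq_range *out, size_t max)
{
	size_t i, n = sketch_cache_len < max ? sketch_cache_len : max;

	for (i = 0; i < n; i++)
		out[i] = sketch_cache[i];
	return n;
}

/* Trivial consumer callback: counts how many change events it has seen. */
void sketch_count_event(void *ctx)
{
	(*(int *)ctx)++;
}
```

The point of caching before notifying is the one the comment makes: a consumer that registers after a band was added receives no event for it, but can still read that band through sketch_retrieve().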

RE: [V10 7/8] drm/amd/pm: enable Wifi RFI mitigation feature support for SMU13.0.0

2023-08-27 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Lazar, Lijo 
> Sent: Friday, August 25, 2023 10:13 PM
> To: Quan, Evan ; l...@kernel.org;
> johan...@sipsolutions.net; da...@davemloft.net; eduma...@google.com;
> k...@kernel.org; pab...@redhat.com; Deucher, Alexander
> ; raf...@kernel.org; Limonciello, Mario
> 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [V10 7/8] drm/amd/pm: enable Wifi RFI mitigation feature
> support for SMU13.0.0
>
>
>
> On 8/25/2023 2:08 PM, Evan Quan wrote:
> > Fulfill the SMU13.0.0 support for Wifi RFI mitigation feature.
> >
> > Signed-off-by: Evan Quan 
> > Reviewed-by: Mario Limonciello 
> > ---
> >   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  3 +
> >   drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h  |  3 +-
> >   drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h  |  3 +
> >   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  9 +++
> >   .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  | 60 +++
> >   5 files changed, 77 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > index 60d595344c45..a081e6bb27c4 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > @@ -325,6 +325,7 @@ enum smu_table_id
> > SMU_TABLE_PACE,
> > SMU_TABLE_ECCINFO,
> > SMU_TABLE_COMBO_PPTABLE,
> > +   SMU_TABLE_WIFIBAND,
> > SMU_TABLE_COUNT,
> >   };
> >
> > @@ -1501,6 +1502,8 @@ enum smu_baco_seq {
> >  __dst_size);  \
> >   })
> >
> > +#define HZ_IN_MHZ  1000000U
> > +
> >   #if !defined(SWSMU_CODE_LAYER_L2) && !defined(SWSMU_CODE_LAYER_L3) && !defined(SWSMU_CODE_LAYER_L4)
> >   int smu_get_power_limit(void *handle,
> > uint32_t *limit,
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> > b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> > index 297b70b9388f..5bbb60289a79 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> > @@ -245,7 +245,8 @@
> > __SMU_DUMMY_MAP(AllowGpo),  \
> > __SMU_DUMMY_MAP(Mode2Reset),\
> > __SMU_DUMMY_MAP(RequestI2cTransaction), \
> > -   __SMU_DUMMY_MAP(GetMetricsTable),
> > +   __SMU_DUMMY_MAP(GetMetricsTable), \
> > +   __SMU_DUMMY_MAP(EnableUCLKShadow),
> >
> >   #undef __SMU_DUMMY_MAP
> >   #define __SMU_DUMMY_MAP(type) SMU_MSG_##type
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> > b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> > index 355c156d871a..dd70b56aa71e 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> > @@ -299,5 +299,8 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
> >  uint32_t pcie_gen_cap,
> >  uint32_t pcie_width_cap);
> >
> > +int smu_v13_0_enable_uclk_shadow(struct smu_context *smu,
> > +bool enablement);
> > +
> >   #endif
> >   #endif
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > index 9b62b45ebb7f..6a5cb582aa92 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
> > @@ -2472,3 +2472,12 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
> >
> > return 0;
> >   }
> > +
> > +int smu_v13_0_enable_uclk_shadow(struct smu_context *smu,
> > +bool enablement)
> > +{
> > +   return smu_cmn_send_smc_msg_with_param(smu,
> > +  SMU_MSG_EnableUCLKShadow,
> > +  enablement,
> > +  NULL);
> > +}
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > index 3d188616ba24..fd3ac18653ed 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > @@ -

RE: [V10 5/8] drm/amd/pm: setup the framework to support Wifi RFI mitigation feature

2023-08-27 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Lazar, Lijo 
> Sent: Friday, August 25, 2023 10:09 PM
> To: Quan, Evan ; l...@kernel.org;
> johan...@sipsolutions.net; da...@davemloft.net; eduma...@google.com;
> k...@kernel.org; pab...@redhat.com; Deucher, Alexander
> ; raf...@kernel.org; Limonciello, Mario
> 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [V10 5/8] drm/amd/pm: setup the framework to support Wifi
> RFI mitigation feature
>
>
>
> On 8/25/2023 2:08 PM, Evan Quan wrote:
> > With WBRF feature supported, as a driver responding to the
> > frequencies, amdgpu driver is able to do shadow pstate switching to
> > mitigate possible interference(between its (G-)DDR memory clocks and
> > local radio module frequency bands used by Wifi 6/6e/7).
> >
> > Signed-off-by: Evan Quan 
> > Reviewed-by: Mario Limonciello 
> > --
> > v1->v2:
> >- update the prompt for feature support(Lijo)
> > v8->v9:
> >- update parameter document for smu_wbrf_event_handler(Simon)
> > v9->v10:
> >   - correct the logics for wbrf range sorting(Lijo)
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  17 ++
> >   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 195 ++
> >   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  23 +++
> >   drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |   3 +
> >   5 files changed, 240 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index a3b86b86dc47..2bfc9111ab00 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -247,6 +247,8 @@ extern int amdgpu_sg_display;
> >
> >   extern int amdgpu_user_partt_mode;
> >
> > +extern int amdgpu_wbrf;
> > +
> >   #define AMDGPU_VM_MAX_NUM_CTX 4096
> >   #define AMDGPU_SG_THRESHOLD   (256*1024*1024)
> >   #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS3000
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 0593ef8fe0a6..1c574bd3b60d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -195,6 +195,7 @@ int amdgpu_use_xgmi_p2p = 1;
> >   int amdgpu_vcnfw_log;
> >   int amdgpu_sg_display = -1; /* auto */
> >   int amdgpu_user_partt_mode = AMDGPU_AUTO_COMPUTE_PARTITION_MODE;
> > +int amdgpu_wbrf = -1;
> >
> >   static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
> >
> > @@ -981,6 +982,22 @@ module_param_named(user_partt_mode, amdgpu_user_partt_mode, uint, 0444);
> >   module_param(enforce_isolation, bool, 0444);
> >   MODULE_PARM_DESC(enforce_isolation, "enforce process isolation between graphics and compute . enforce_isolation = on");
> >
> > +/**
> > + * DOC: wbrf (int)
> > + * Enable Wifi RFI interference mitigation feature.
> > + * Due to electrical and mechanical constraints there may be likely interference of
> > + * relatively high-powered harmonics of the (G-)DDR memory clocks with local radio
> > + * module frequency bands used by Wifi 6/6e/7. To mitigate the possible RFI interference,
> > + * with this feature enabled, PMFW will use either “shadowed P-State” or “P-State” based
> > + * on active list of frequencies in-use (to be avoided) as part of initial setting or
> > + * P-state transition. However, there may be potential performance impact with this
> > + * feature enabled.
> > + * (0 = disabled, 1 = enabled, -1 = auto (default setting, will be enabled if supported))
> > + */
> > +MODULE_PARM_DESC(wbrf,
> > +   "Enable Wifi RFI interference mitigation (0 = disabled, 1 = enabled, -1 = auto(default)");
> > +module_param_named(wbrf, amdgpu_wbrf, int, 0444);
> > +
> >   /* These devices are not supported by amdgpu.
> >* They are supported by the mach64, r128, radeon drivers
> >*/
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > index ce41a8309582..bdfd234d1558 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.

RE: [PATCH V7 4/9] wifi: mac80211: Add support for ACPI WBRF

2023-08-14 Thread Quan, Evan

[AMD Official Use Only - General]

Hi Andrew,

I sent out a new V8 series last week.
A kernel parameter `wbrf` was introduced there to decide the policy.
Please help to check whether that makes sense to you.
Please share your insights there.

BR,
Evan
> -Original Message-
> From: Andrew Lunn 
> Sent: Wednesday, July 26, 2023 4:10 AM
> To: Limonciello, Mario 
> Cc: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; mdaen...@redhat.com;
> maarten.lankho...@linux.intel.com; tzimmerm...@suse.de;
> hdego...@redhat.com; jingyuwang_...@163.com; Lazar, Lijo
> ; jim.cro...@gmail.com; bellosili...@gmail.com;
> andrealm...@igalia.com; t...@redhat.com; j...@jsg.id.au; a...@arndb.de;
> linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V7 4/9] wifi: mac80211: Add support for ACPI WBRF
> 
> > This comes back to the point that was mentioned by Johannes - you need
> > to have deep design understanding of the hardware to know whether or
> > not you will have producers that a consumer needs to react to.
> 
> Yes, this is the policy I keep referring to. I would expect that there is
> something somewhere in ACPI which says for this machine, the policy is Yes/No.
> 
> It could well be that an AMD based machine has a different ACPI extension to
> indicate this policy than what an Intel machine has. As far as I understand it,
> you have not submitted this yet for formal approval, this is all vendor
> specific, so Intel could do it completely differently. Hence I would expect a
> generic API to tell the core what the policy is, and your glue code can call
> into ACPI to find out that information, and then tell the core.
> 
> > If all producers indicate their frequency and all consumers react to
> > it you may have activated mitigations that are unnecessary. The
> > hardware designer may have added extra shielding or done the layout
> > such that they're not needed.
> 
> And the policy will indicate No, nothing needs to be done. The core can then
> tell producers and consumers not to bother telling the core anything.
> 
> > So I don't think we're ever going to be in a situation that the
> > generic implementation should be turned on by default.  It's a
> > "developer knob".
> 
> Wrong. You should have a generic core, which your AMD CPU DDR device
> plugs into. The Intel CPU DDR device can plug into it, the NVIDIA GPU can plug
> into it, your Radeon GPU can plug into it, the Intel Arc can plug into it, the
> generic WiFi core plugs into it, etc.
> 
> > If needed these can then be enabled using the AMD ACPI interface, a DT
> > one if one is developed or maybe even an allow-list of SMBIOS strings.
> 
> Notice I've not mentioned DT for a while. I just want a generic core, which
> AMD, Intel, NVIDIA, Ampere, Graviton, Qualcomm, Marvell, ..., etc can use. We
> should be solving this problem once, for everybody, not adding a solution for
> just one vendor.
> 
>   Andrew
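
One way to picture the split argued for above (a vendor-neutral core that is told the policy by platform glue code) is the small userspace C sketch below. It is not taken from any posted patch. All names (sketch_policy_provider, sketch_core_set_provider, sketch_core_mitigation_needed, sketch_glue_yes, sketch_glue_no) are hypothetical, and treating "no provider registered" as "No" is an assumption made only for this example.

```c
#include <stddef.h>

/* The Yes/No policy the core wants to learn about this machine. */
enum sketch_policy { SKETCH_POLICY_NO = 0, SKETCH_POLICY_YES = 1 };

/* Glue code (ACPI, DT, an allow-list, ...) implements this interface. */
struct sketch_policy_provider {
	const char *name;
	enum sketch_policy (*query)(void);
};

static const struct sketch_policy_provider *sketch_provider;

/* Called by vendor glue to plug its policy source into the generic core. */
void sketch_core_set_provider(const struct sketch_policy_provider *p)
{
	sketch_provider = p;
}

/* Producers and consumers ask the core, never the vendor glue directly. */
int sketch_core_mitigation_needed(void)
{
	if (!sketch_provider || !sketch_provider->query)
		return 0; /* assumption: no policy source means nothing to do */
	return sketch_provider->query() == SKETCH_POLICY_YES;
}

/* Two toy glue implementations, standing in for real firmware queries. */
enum sketch_policy sketch_glue_yes(void) { return SKETCH_POLICY_YES; }
enum sketch_policy sketch_glue_no(void) { return SKETCH_POLICY_NO; }
```

With this shape, a second vendor would add another provider without touching the producers or consumers, which is the "solve it once" property being asked for.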

RE: [PATCH V8 3/9] cfg80211: expose nl80211_chan_width_to_mhz for wide sharing

2023-08-14 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Jeff Johnson 
> Sent: Thursday, August 10, 2023 10:06 PM
> To: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; and...@lunn.ch
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V8 3/9] cfg80211: expose nl80211_chan_width_to_mhz
> for wide sharing
> 
> On 8/10/2023 12:37 AM, Evan Quan wrote:
> > The newly added WBRF feature needs this interface for channel width
> > calculation.
> >
> > Signed-off-by: Evan Quan 
> > ---
> >   include/net/cfg80211.h | 8 
> >   net/wireless/chan.c| 3 ++-
> >   2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
> > index 7c7d03aa9d06..f50508e295db 100644
> > --- a/include/net/cfg80211.h
> > +++ b/include/net/cfg80211.h
> > @@ -920,6 +920,14 @@ const struct cfg80211_chan_def *
> >   cfg80211_chandef_compatible(const struct cfg80211_chan_def *chandef1,
> >                               const struct cfg80211_chan_def *chandef2);
> >
> > +/**
> > + * nl80211_chan_width_to_mhz - get the channel width in Mhz
> > + * @chan_width: the channel width from &enum nl80211_chan_width
> > + * Return: channel width in Mhz if the chan_width from &enum nl80211_chan_width
> > + * is valid. -1 otherwise.
> 
> SI nit: s/Mhz/MHz/ in both places
Thanks, will update them accordingly.

Evan
> 
> > + */
> > +int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width);
> > +
> >   /**
> >* cfg80211_chandef_valid - check if a channel definition is valid
> >    * @chandef: the channel definition to check
> > diff --git a/net/wireless/chan.c b/net/wireless/chan.c
> > index 0b7e81db383d..227db04eac42 100644
> > --- a/net/wireless/chan.c
> > +++ b/net/wireless/chan.c
> > @@ -141,7 +141,7 @@ static bool cfg80211_edmg_chandef_valid(const struct cfg80211_chan_def *chandef)
> > return true;
> >   }
> >
> > -static int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width)
> > +int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width)
> >   {
> > int mhz;
> >
> > @@ -190,6 +190,7 @@ static int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width)
> > }
> > return mhz;
> >   }
> > +EXPORT_SYMBOL(nl80211_chan_width_to_mhz);
> >
> >   static int cfg80211_chandef_get_width(const struct cfg80211_chan_def *c)
> >   {
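
To illustrate why a WBRF producer would want a width-in-MHz helper like the one exported above, here is a rough userspace sketch that turns a center frequency plus a channel width into the start and end of the occupied band in Hz. It does not use the real nl80211 enum: enum sketch_chan_width and both functions are hypothetical stand-ins, and the centered-band arithmetic is an assumption about how such a helper might be used, not code from this series.

```c
#include <stdint.h>

#define SKETCH_HZ_IN_MHZ 1000000ULL

/* Hypothetical stand-in for a channel-width enum; not the nl80211 one. */
enum sketch_chan_width {
	SKETCH_WIDTH_20,
	SKETCH_WIDTH_40,
	SKETCH_WIDTH_80,
	SKETCH_WIDTH_160,
};

/* Width in MHz for a known enum value, -1 for anything else. */
int sketch_chan_width_to_mhz(enum sketch_chan_width width)
{
	switch (width) {
	case SKETCH_WIDTH_20:
		return 20;
	case SKETCH_WIDTH_40:
		return 40;
	case SKETCH_WIDTH_80:
		return 80;
	case SKETCH_WIDTH_160:
		return 160;
	default:
		return -1;
	}
}

/* Band occupied by a channel: center frequency plus/minus half the width. */
int sketch_chan_to_range_hz(uint32_t center_mhz, enum sketch_chan_width width,
			    uint64_t *start_hz, uint64_t *end_hz)
{
	int width_mhz = sketch_chan_width_to_mhz(width);
	uint64_t half_hz;

	if (width_mhz < 0)
		return -1;

	half_hz = (uint64_t)width_mhz * SKETCH_HZ_IN_MHZ / 2;
	*start_hz = (uint64_t)center_mhz * SKETCH_HZ_IN_MHZ - half_hz;
	*end_hz = (uint64_t)center_mhz * SKETCH_HZ_IN_MHZ + half_hz;
	return 0;
}
```

For example, an 80 MHz channel centered at 5210 MHz maps to the band from 5170 MHz to 5250 MHz.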

RE: [PATCH V8 1/9] drivers core: Add support for Wifi band RF mitigations

2023-08-14 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Randy Dunlap 
> Sent: Thursday, August 10, 2023 11:41 PM
> To: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; and...@lunn.ch
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V8 1/9] drivers core: Add support for Wifi band RF
> mitigations
> 
> 
> 
> On 8/10/23 00:37, Evan Quan wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt
> > b/Documentation/admin-guide/kernel-parameters.txt
> > index a1457995fd41..21f73a0bbd0b 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -7152,3 +7152,12 @@
> > xmon commands.
> > off xmon is disabled.
> >
> > +   wbrf=   [KNL]
> > +   Format: { on | auto | off }
> > +   Controls if WBRF features should be enabled or disabled
> > +   forcely. Default is auto.
> 
> "forcely" is not a word. "forcedly" is a word, but it's not used very much AFAIK.
> I would probably write "Controls if WBRF features should be forced on or off."
Yeah, that sounds better. Will update this as suggested.

Evan
> 
> > +   on      Force enable the WBRF features.
> > +   auto    Up to the system to do proper checks to
> > +           determine the WBRF features should be enabled
> > +           or not.
> > +   off     Force disable the WBRF features.
> 
> --
> ~Randy
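
The three-way on/auto/off semantics documented in the hunk above can be sketched in userspace C as follows. This is only an illustration: enum sketch_wbrf_policy, sketch_parse_wbrf and sketch_wbrf_enabled are hypothetical names, and the platform_supported flag is a stand-in for whatever check the "auto" setting performs in the real series.

```c
#include <string.h>

enum sketch_wbrf_policy {
	SKETCH_WBRF_OFF,
	SKETCH_WBRF_AUTO,
	SKETCH_WBRF_ON,
};

/* Parse the value of a "wbrf=" style option; returns 0 on success. */
int sketch_parse_wbrf(const char *arg, enum sketch_wbrf_policy *out)
{
	if (!arg || !out)
		return -1;

	if (!strcmp(arg, "on"))
		*out = SKETCH_WBRF_ON;
	else if (!strcmp(arg, "auto"))
		*out = SKETCH_WBRF_AUTO;
	else if (!strcmp(arg, "off"))
		*out = SKETCH_WBRF_OFF;
	else
		return -1; /* unknown value: leave *out untouched */

	return 0;
}

/* "on" and "off" force the result; "auto" defers to the platform check. */
int sketch_wbrf_enabled(enum sketch_wbrf_policy policy, int platform_supported)
{
	switch (policy) {
	case SKETCH_WBRF_ON:
		return 1;
	case SKETCH_WBRF_OFF:
		return 0;
	default:
		return platform_supported != 0;
	}
}
```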

RE: [PATCH V8 6/9] drm/amd/pm: setup the framework to support Wifi RFI mitigation feature

2023-08-14 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Simon Horman 
> Sent: Friday, August 11, 2023 5:35 PM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; and...@lunn.ch; linux-
> ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V8 6/9] drm/amd/pm: setup the framework to support
> Wifi RFI mitigation feature
> 
> On Thu, Aug 10, 2023 at 03:38:00PM +0800, Evan Quan wrote:
> > With WBRF feature supported, as a driver responding to the frequencies,
> > amdgpu driver is able to do shadow pstate switching to mitigate possible
> > interference(between its (G-)DDR memory clocks and local radio module
> > frequency bands used by Wifi 6/6e/7).
> >
> > Signed-off-by: Evan Quan 
> > Reviewed-by: Mario Limonciello 
> 
> ...
> 
> > +/**
> > + * smu_wbrf_event_handler - handle notify events
> > + *
> > + * @nb: notifier block
> > + * @action: event type
> > + * @data: event data
> 
> Hi Evan,
> 
> a minor nit from my side: although it is documented here,
> smu_wbrf_event_handler has no @data parameter, while
> it does have an undocumented _arg parameter.
Thanks for pointing this out. I will fix this.

Evan
> 
> > + *
> > + * Calls relevant amdgpu function in response to wbrf event
> > + * notification from kernel.
> > + */
> > +static int smu_wbrf_event_handler(struct notifier_block *nb,
> > + unsigned long action, void *_arg)
> > +{
> > +   struct smu_context *smu = container_of(nb, struct smu_context,
> > +  wbrf_notifier);
> > +
> > +   switch (action) {
> > +   case WBRF_CHANGED:
> > +           smu_wbrf_handle_exclusion_ranges(smu);
> > +           break;
> > +   default:
> > +           return NOTIFY_DONE;
> > +   };
> > +
> > +   return NOTIFY_OK;
> > +}
> 
> ...
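
The handler quoted above relies on the usual embedded-notifier pattern: the callback only receives a pointer to the notifier block, and container_of() recovers the enclosing driver context. A userspace sketch of that pattern, with every name hypothetical (sketch_container_of, struct sketch_notifier, struct sketch_smu, SKETCH_WBRF_CHANGED and the SKETCH_NOTIFY_* values are stand-ins, not the kernel definitions):

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), without type checks. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define SKETCH_WBRF_CHANGED 1UL

enum { SKETCH_NOTIFY_DONE = 0, SKETCH_NOTIFY_OK = 1 };

struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, unsigned long action, void *arg);
};

/* Driver context with the notifier embedded, as smu_context does. */
struct sketch_smu {
	int ranges_handled;
	struct sketch_notifier wbrf_notifier;
};

int sketch_event_handler(struct sketch_notifier *nb,
			 unsigned long action, void *arg)
{
	/* Walk back from the embedded member to the enclosing context. */
	struct sketch_smu *smu =
		sketch_container_of(nb, struct sketch_smu, wbrf_notifier);

	(void)arg; /* unused, like _arg in the quoted handler */

	switch (action) {
	case SKETCH_WBRF_CHANGED:
		smu->ranges_handled++; /* stand-in for re-reading the ranges */
		break;
	default:
		return SKETCH_NOTIFY_DONE;
	}

	return SKETCH_NOTIFY_OK;
}
```

Because the context is recovered from the notifier pointer itself, the third argument can stay unused, which matches the review remark that the handler takes an _arg parameter rather than a documented @data.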

RE: [PATCH V8 2/9] drivers core: add ACPI based WBRF mechanism introduced by AMD

2023-08-14 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Simon Horman 
> Sent: Friday, August 11, 2023 5:38 PM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; and...@lunn.ch; linux-
> ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V8 2/9] drivers core: add ACPI based WBRF mechanism
> introduced by AMD
> 
> On Thu, Aug 10, 2023 at 03:37:56PM +0800, Evan Quan wrote:
> > AMD has introduced an ACPI based mechanism to support WBRF for some
> > platforms with AMD dGPU + WLAN. This needs support from BIOS equipped
> > with necessary AML implementations and dGPU firmwares.
> >
> > For systems without the ACPI mechanism, or where solutions are still under
> > development, users can fall back to the generic WBRF solution for diagnosing
> > potential interference issues.
> >
> > On a platform that is not equipped with the necessary AMD ACPI
> > implementation but has CONFIG_WBRF_AMD_ACPI built as 'y', the kernel falls
> > back to the generic WBRF solution if `wbrf` is set to "on".
> >
> > Co-developed-by: Mario Limonciello 
> > Signed-off-by: Mario Limonciello 
> > Co-developed-by: Evan Quan 
> > Signed-off-by: Evan Quan 
> 
> ...
> 
> > diff --git a/drivers/acpi/amd_wbrf.c b/drivers/acpi/amd_wbrf.c
> 
> ...
> 
> > +static bool check_acpi_wbrf(acpi_handle handle, u64 rev, u64 funcs)
> > +{
> > +   int i;
> > +   u64 mask = 0;
> > +   union acpi_object *obj;
> > +
> > +   if (funcs == 0)
> > +           return false;
> > +
> > +   obj = acpi_evaluate_wbrf(handle, rev, 0);
> > +   if (!obj)
> > +           return false;
> > +
> > +   if (obj->type != ACPI_TYPE_BUFFER)
> > +           return false;
> > +
> > +   /*
> > +    * Bit vector providing supported functions information.
> > +    * Each bit marks support for one specific function of the WBRF method.
> > +    */
> > +   for (i = 0; i < obj->buffer.length && i < 8; i++)
> > +           mask |= (((u64)obj->buffer.pointer[i]) << (i * 8));
> > +
> > +   ACPI_FREE(obj);
> > +
> > +   if ((mask & BIT(WBRF_ENABLED)) &&
> > +(mask & funcs) == funcs)
> 
> Hi Evan,
> 
> a minor nit from my side: the indentation of the line above seems odd.
Thanks. Will update this.

Evan
> 
>   if ((mask & BIT(WBRF_ENABLED)) &&
>   (mask & funcs) == funcs)
> 
> > +   return true;
> > +
> > +   return false;
> > +}
> 
> ...
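
The bit-vector handling in check_acpi_wbrf() quoted above can be tried out in userspace with the sketch below. It is a simplified illustration, not the patch code: sketch_bytes_to_mask and sketch_funcs_supported are hypothetical names, the ACPI buffer is replaced by a plain byte array, and the position of the "enabled" bit is passed in as a parameter because the value of WBRF_ENABLED is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble up to 8 little-endian bytes into a 64-bit function mask. */
uint64_t sketch_bytes_to_mask(const uint8_t *buf, size_t len)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < len && i < 8; i++)
		mask |= (uint64_t)buf[i] << (i * 8);
	return mask;
}

/*
 * Supported only if the "enabled" bit is set and every requested
 * function bit is present in the mask.
 */
int sketch_funcs_supported(uint64_t mask, unsigned int enabled_bit,
			   uint64_t funcs)
{
	if (funcs == 0 || enabled_bit > 63)
		return 0;

	return (mask & (UINT64_C(1) << enabled_bit)) &&
	       (mask & funcs) == funcs;
}
```

Note that the comparison has to be (mask & funcs) == funcs rather than a plain non-zero test, so a platform that implements only some of the requested functions is rejected.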

RE: [PATCH -next] drm/amd/pm: Remove many unnecessary NULL values

2023-08-01 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: Ruan Jinjie 
> Sent: Tuesday, August 1, 2023 8:55 PM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; mrip...@kernel.org;
> tzimmerm...@suse.de; d...@mailo.com; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org
> Cc: ruanjin...@huawei.com
> Subject: [PATCH -next] drm/amd/pm: Remove many unnecessary NULL values
>
> There are many pointers that are assigned before first use and so need not be
> initialized; remove the NULL assignments.
>
> Signed-off-by: Ruan Jinjie 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c | 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c  | 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/fiji_smumgr.c| 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c | 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c   | 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
> index 182118e3fd5f..5794b64507bf 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
> @@ -1237,7 +1237,7 @@ static int get_vce_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
>   const VCEClockInfoArray*array)
>  {
>   unsigned long i;
> - struct phm_vce_clock_voltage_dependency_table *vce_table = NULL;
> + struct phm_vce_clock_voltage_dependency_table *vce_table;
>
>   vce_table = kzalloc(struct_size(vce_table, entries, table->numEntries),
>   GFP_KERNEL);
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> index 4bc8db1be738..9e4228232f02 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> @@ -2732,7 +2732,7 @@ static bool ci_is_dpm_running(struct pp_hwmgr *hwmgr)
>
>  static int ci_smu_init(struct pp_hwmgr *hwmgr)
>  {
> - struct ci_smumgr *ci_priv = NULL;
> + struct ci_smumgr *ci_priv;
>
>   ci_priv = kzalloc(sizeof(struct ci_smumgr), GFP_KERNEL);
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/fiji_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/fiji_smumgr.c
> index 02c094a06605..5e43ad2b2956 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/fiji_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/fiji_smumgr.c
> @@ -332,7 +332,7 @@ static bool fiji_is_hw_avfs_present(struct pp_hwmgr *hwmgr)
>
>  static int fiji_smu_init(struct pp_hwmgr *hwmgr)
>  {
> - struct fiji_smumgr *fiji_priv = NULL;
> + struct fiji_smumgr *fiji_priv;
>
>   fiji_priv = kzalloc(sizeof(struct fiji_smumgr), GFP_KERNEL);
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c
> index 060fc140c574..97d9802fe673 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c
> @@ -259,7 +259,7 @@ static int iceland_start_smu(struct pp_hwmgr *hwmgr)
>
>  static int iceland_smu_init(struct pp_hwmgr *hwmgr)
>  {
> - struct iceland_smumgr *iceland_priv = NULL;
> + struct iceland_smumgr *iceland_priv;
>
>   iceland_priv = kzalloc(sizeof(struct iceland_smumgr), GFP_KERNEL);
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c
> index acbe41174d7e..6fe6e6abb5d8 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c
> @@ -226,7 +226,7 @@ static int tonga_start_smu(struct pp_hwmgr *hwmgr)
>
>  static int tonga_smu_init(struct pp_hwmgr *hwmgr)
>  {
> - struct tonga_smumgr *tonga_priv = NULL;
> + struct tonga_smumgr *tonga_priv;
>
>   tonga_priv = kzalloc(sizeof(struct tonga_smumgr), GFP_KERNEL);
>   if (tonga_priv == NULL)
> --
> 2.34.1

RE: [PATCH] drm/amdgpu: Clean up errors in vega20_baco.c

2023-08-01 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Ran Sun
> Sent: Tuesday, August 1, 2023 4:03 PM
> To: Deucher, Alexander ; airl...@gmail.com;
> dan...@ffwll.ch
> Cc: Ran Sun ; dri-devel@lists.freedesktop.org;
> amd-...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] drm/amdgpu: Clean up errors in vega20_baco.c
>
> Fix the following errors reported by checkpatch:
>
> ERROR: that open brace { should be on the previous line
> ERROR: space required before the open parenthesis '('
>
> Signed-off-by: Ran Sun 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_baco.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_baco.c
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_baco.c
> index 8d99c7a5abf8..994c0d374bfa 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_baco.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_baco.c
> @@ -31,8 +31,7 @@
>
>  #include "amdgpu_ras.h"
>
> -static const struct soc15_baco_cmd_entry clean_baco_tbl[] =
> -{
> +static const struct soc15_baco_cmd_entry clean_baco_tbl[] = {
>   {CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_6), 0, 0, 0, 0},
>   {CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_7), 0, 0, 0, 0},
>  };
> @@ -90,11 +89,11 @@ int vega20_baco_set_state(struct pp_hwmgr *hwmgr, enum BACO_STATE state)
>   data |= 0x80000000;
>   WREG32_SOC15(THM, 0, mmTHM_BACO_CNTL, data);
>
> - if(smum_send_msg_to_smc_with_parameter(hwmgr,
> + if (smum_send_msg_to_smc_with_parameter(hwmgr,
>   PPSMC_MSG_EnterBaco, 0, NULL))
>   return -EINVAL;
>   } else {
> - if(smum_send_msg_to_smc_with_parameter(hwmgr,
> + if (smum_send_msg_to_smc_with_parameter(hwmgr,
>   PPSMC_MSG_EnterBaco, 1, NULL))
>   return -EINVAL;
>   }
> --
> 2.17.1

RE: [PATCH] drm/amd/pm: Clean up errors in vega20_hwmgr.h

2023-08-01 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Ran Sun
> Sent: Tuesday, August 1, 2023 10:39 AM
> To: Deucher, Alexander ; airl...@gmail.com;
> dan...@ffwll.ch
> Cc: Ran Sun ; dri-devel@lists.freedesktop.org;
> amd-...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] drm/amd/pm: Clean up errors in vega20_hwmgr.h
>
> Fix the following errors reported by checkpatch:
>
> ERROR: open brace '{' following enum go on the same line
>
> Signed-off-by: Ran Sun 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.h | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.h
> index 075c0094da9c..1ba9b5fe2a5d 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.h
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.h
> @@ -385,8 +385,7 @@ struct vega20_odn_data {
>   struct vega20_odn_temp_table odn_temp_table;
>  };
>
> -enum OD8_FEATURE_ID
> -{
> +enum OD8_FEATURE_ID {
>   OD8_GFXCLK_LIMITS = 1 << 0,
>   OD8_GFXCLK_CURVE = 1 << 1,
>   OD8_UCLK_MAX = 1 << 2,
> @@ -399,8 +398,7 @@ enum OD8_FEATURE_ID
>   OD8_FAN_ZERO_RPM_CONTROL = 1 << 9
>  };
>
> -enum OD8_SETTING_ID
> -{
> +enum OD8_SETTING_ID {
>   OD8_SETTING_GFXCLK_FMIN = 0,
>   OD8_SETTING_GFXCLK_FMAX,
>   OD8_SETTING_GFXCLK_FREQ1,
> --
> 2.17.1

RE: [PATCH V7 4/9] wifi: mac80211: Add support for ACPI WBRF

2023-07-25 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Limonciello, Mario 
> Sent: Monday, July 24, 2023 9:41 PM
> To: Andrew Lunn ; Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; mdaen...@redhat.com;
> maarten.lankho...@linux.intel.com; tzimmerm...@suse.de;
> hdego...@redhat.com; jingyuwang_...@163.com; Lazar, Lijo
> ; jim.cro...@gmail.com; bellosili...@gmail.com;
> andrealm...@igalia.com; t...@redhat.com; j...@jsg.id.au; a...@arndb.de;
> linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V7 4/9] wifi: mac80211: Add support for ACPI WBRF
>
> On 7/24/2023 04:22, Andrew Lunn wrote:
> >> @@ -1395,6 +1395,8 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
> >>   debugfs_hw_add(local);
> >>   rate_control_add_debugfs(local);
> >>
> >> +  ieee80211_check_wbrf_support(local);
> >> +
> >>   rtnl_lock();
> >>   wiphy_lock(hw->wiphy);
> >>
> >
> >> +void ieee80211_check_wbrf_support(struct ieee80211_local *local)
> >> +{
> >> +  struct wiphy *wiphy = local->hw.wiphy;
> >> +  struct device *dev;
> >> +
> >> +  if (!wiphy)
> >> +      return;
> >> +
> >> +  dev = wiphy->dev.parent;
> >> +  if (!dev)
> >> +      return;
> >> +
> >> +  local->wbrf_supported = wbrf_supported_producer(dev);
> >> +  dev_dbg(dev, "WBRF is %s supported\n",
> >> +      local->wbrf_supported ? "" : "not");
> >> +}
> >
> > This seems wrong. wbrf_supported_producer() is about "Should this
> > device report the frequencies it is using?" The answer to that depends
> > on a combination of: Are there consumers registered with the core, and
> > is the policy set so WBRF should take actions.
> >
> > The problem here is, you have no idea of the probe order. It could be
> > this device probes before others, so wbrf_supported_producer() reports
> > false, but a few seconds later would report true, once other devices
> > have probed.
> >
> > It should be an inexpensive call into the core, so can be made every
> > time the channel changes. All the core needs to do is check if the
> > list of consumers is empty, and if not, check a Boolean policy value.
> >
> >   Andrew
>
> No, it's not a combination of whether consumers are registered with the core.
> If a consumer probes later it needs to know the current in-use frequencies
> too.
>
> The reason is because of this sequence of events:
> 1) Producer probes.
> 2) Producer selects a frequency.
> 3) Consumer probes.
> 4) Producer stays at same frequency.
>
> If the producer doesn't notify the frequency because a consumer isn't yet
> loaded then the consumer won't be able to get the current frequency.
Yes, exactly.
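
The probe-order sequence above can be sketched in plain user-space C. This is
only an illustration of the idea, not the wbrf core: `struct range_cache`,
`cache_report` and `cache_consumer_probe` are hypothetical names, and the
frequency values in the usage note are arbitrary. The point it shows is that
the producer's report is cached unconditionally, so a consumer that probes
later can still retrieve it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 11

struct freq_range {
	uint64_t start; /* Hz */
	uint64_t end;   /* Hz */
};

/* Hypothetical stand-in for the core's cache; not the real wbrf core. */
struct range_cache {
	struct freq_range ranges[MAX_RANGES];
	size_t count;
	unsigned int consumers; /* consumers that have probed so far */
};

/* Producer side: always record the range, even if no consumer exists yet. */
static bool cache_report(struct range_cache *c, uint64_t start, uint64_t end)
{
	assert(c != NULL);
	if (c->count >= MAX_RANGES || start == 0 || end < start)
		return false;
	c->ranges[c->count].start = start;
	c->ranges[c->count].end = end;
	c->count++;
	return true;
}

/* Consumer side: a late probe still sees everything reported so far. */
static size_t cache_consumer_probe(struct range_cache *c,
				   struct freq_range *out, size_t out_len)
{
	size_t i, n;

	assert(c != NULL && out != NULL);
	c->consumers++;
	n = c->count < out_len ? c->count : out_len;
	for (i = 0; i < n; i++)
		out[i] = c->ranges[i];
	return n;
}
```

Calling `cache_report()` on a zero-initialized cache before any consumer
exists, and `cache_consumer_probe()` afterwards, returns the range reported
earlier. If the report were skipped because no consumer was registered, the
consumer would get nothing back.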

RE: [PATCH V7 0/9] Enable Wifi RFI interference mitigation feature support

2023-07-23 Thread Quan, Evan

[AMD Official Use Only - General]

Gentle ping on this series.

Hi Rafael and Andrew,

Can you help to check this latest series and share your thoughts if any?

BR,
Evan
> -Original Message-
> From: Quan, Evan 
> Sent: Wednesday, July 19, 2023 5:00 PM
> To: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; and...@lunn.ch
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org; Quan, Evan
> 
> Subject: [PATCH V7 0/9] Enable Wifi RFI interference mitigation feature
> support
>
> Due to electrical and mechanical constraints in certain platform designs
> there may be likely interference of relatively high-powered harmonics of
> the (G-)DDR memory clocks with local radio module frequency bands used by
> Wifi 6/6e/7. To mitigate possible RFI interference, producers can advertise
> the frequencies in use and consumers can use this information to avoid
> using these frequencies for sensitive features.
>
> The whole patch set is based on Linux 6.4. A brief overview:
> Patch1 - 2:  Core functionality setup for WBRF feature support
> Patch3 - 4:  Bring WBRF support to wifi subsystem.
> Patch5 - 9:  Bring WBRF support to AMD graphics driver.
>
> Evan Quan (9):
>   drivers core: Add support for Wifi band RF mitigations
>   driver core: add ACPI based WBRF mechanism introduced by AMD
>   cfg80211: expose nl80211_chan_width_to_mhz for wide sharing
>   wifi: mac80211: Add support for ACPI WBRF
>   drm/amd/pm: update driver_if and ppsmc headers for coming wbrf feature
>   drm/amd/pm: setup the framework to support Wifi RFI mitigation feature
>   drm/amd/pm: add flood detection for wbrf events
>   drm/amd/pm: enable Wifi RFI mitigation feature support for SMU13.0.0
>   drm/amd/pm: enable Wifi RFI mitigation feature support for SMU13.0.7
>
>  drivers/acpi/Makefile |   2 +
>  drivers/acpi/amd_wbrf.c   | 282 ++
>  drivers/base/Kconfig  |  37 +++
>  drivers/base/Makefile |   1 +
>  drivers/base/wbrf.c   | 256 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  19 ++
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 213 +
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  33 ++
>  .../inc/pmfw_if/smu13_driver_if_v13_0_0.h |  14 +-
>  .../inc/pmfw_if/smu13_driver_if_v13_0_7.h |  14 +-
>  .../pm/swsmu/inc/pmfw_if/smu_v13_0_0_ppsmc.h  |   3 +-
>  .../pm/swsmu/inc/pmfw_if/smu_v13_0_7_ppsmc.h  |   3 +-
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h  |   3 +-
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h  |   3 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|   9 +
>  .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  |  60 
>  .../drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c  |  59 
>  drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |   3 +
>  include/linux/acpi_amd_wbrf.h |  24 ++
>  include/linux/ieee80211.h |   1 +
>  include/linux/wbrf.h  |  72 +
>  include/net/cfg80211.h|   8 +
>  net/mac80211/Makefile |   2 +
>  net/mac80211/chan.c   |   9 +
>  net/mac80211/ieee80211_i.h|  19 ++
>  net/mac80211/main.c   |   2 +
>  net/mac80211/wbrf.c   | 103 +++
>  net/wireless/chan.c   |   3 +-
>  29 files changed, 1252 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/acpi/amd_wbrf.c
>  create mode 100644 drivers/base/wbrf.c
>  create mode 100644 include/linux/acpi_amd_wbrf.h
>  create mode 100644 include/linux/wbrf.h
>  create mode 100644 net/mac80211/wbrf.c
>
> --
> 2.34.1

RE: [PATCH V6 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-18 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Andrew Lunn 
> Sent: Tuesday, July 18, 2023 10:16 PM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V6 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> > The wbrf_supported_producer and wbrf_supported_consumer APIs seem
> > unnecessary for the generic implementation.
>
> I'm happy with these, once the description is corrected. As i said in another
> comment, 'can' should be replaced with 'should'. The device itself knows if it
> can, only the core knows if it should, based on the policy of if actions need
> to be taken, and there are both providers and consumers registered with the
> core.
Sure, will update that in V7.

Evan
>
>Andrew

RE: [PATCH V6 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-18 Thread Quan, Evan

[AMD Official Use Only - General]

Personally, I would like to treat the wbrf core as a water pool. Any stream can
flow in, and anyone who needs water can draw from it at any time.
Allowing producers to report only when a consumer exists does not work, since
the consumer might arrive after the producer.
Consider the scenario below:
Wifi core/driver started --> wifi driver reports its in-use frequency -->
proper action taken by wbrf core --> amdgpu driver (consumer) started
What would the proper action by the wbrf core be here? Stopping the producer
from reporting its in-use frequency? That might leave the consumer with no
chance to ever learn it.

The wbrf_supported_producer and wbrf_supported_consumer APIs seem unnecessary
for the generic implementation.
But they are needed to support the AMD ACPI implementation (or a future device
tree implementation). The wbrf core needs to check whether the necessary AML
code is present.
It needs that information to judge whether a producer can report (and be
accepted) or a consumer can retrieve the information it needs.

> > +struct wbrf_ranges_out {
> > +   u32 num_of_ranges;
> > +   struct exclusion_range  band_list[MAX_NUM_OF_WBRF_RANGES];
> > +} __packed;
>
> Seems odd using packed here. It is the only structure which is
> packed. I would also move the u32 after the struct so it is naturally
> aligned on 64 bit systems.
This is to align with the AMD ACPI implementation.
But I can make this AMD ACPI specific and bring a more generic version here.
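
For reference, the layout difference being discussed can be checked with a
small user-space sketch. The struct names `ranges_out_packed` and
`ranges_out_natural` are made up for this illustration, `uint32_t`/`uint64_t`
stand in for the kernel's `u32`/`u64`, and `__attribute__((packed))` is the
GCC/Clang spelling of what the kernel calls `__packed`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM_OF_WBRF_RANGES 11

struct exclusion_range {
	uint64_t start;
	uint64_t end;
};

/* Layout as quoted above: count first, packed (GCC/Clang attribute). */
struct ranges_out_packed {
	uint32_t num_of_ranges;
	struct exclusion_range band_list[MAX_NUM_OF_WBRF_RANGES];
} __attribute__((packed));

/* Reviewer's suggestion: array first, count last, no packing needed. */
struct ranges_out_natural {
	struct exclusion_range band_list[MAX_NUM_OF_WBRF_RANGES];
	uint32_t num_of_ranges;
};

/* Byte offset of the range array in the packed layout. */
static size_t packed_band_list_offset(void)
{
	/* Sanity check: two 64-bit fields, no padding between them. */
	assert(sizeof(struct exclusion_range) == 2 * sizeof(uint64_t));
	return offsetof(struct ranges_out_packed, band_list);
}

/* Byte offset of the range array in the reordered layout. */
static size_t natural_band_list_offset(void)
{
	return offsetof(struct ranges_out_natural, band_list);
}
```

In the packed layout the 64-bit fields start at byte offset 4, so they are not
naturally aligned. In the reordered layout the array starts at offset 0 and the
count follows it at offset 176, so no packing attribute is needed to avoid a
hole in front of the array.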

Evan
> -Original Message-
> From: Andrew Lunn 
> Sent: Thursday, July 13, 2023 7:12 AM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V6 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> > +/**
> > + * wbrf_supported_producer - Determine if the device can report frequencies
> > + *
> > + * @dev: device pointer
> > + *
> > + * WBRF is used to mitigate devices that cause harmonic interference.
> > + * This function will determine if this device needs to report such frequencies.
>
> How is the WBRF core supposed to answer this question? That it knows
> there is at least one device which has registered with WBRF saying it
> can change its behaviour to avoid causing interference?
>
> Rather than "Determine if the device can report frequencies" should it be
> "Determine if the device should report frequencies"
>
> A WiFi device can always report frequencies, since it knows what
> frequency it is currently using. However, it is pointless making such
> reports if there is no device which can actually make use of the
> information.
>
> > +bool wbrf_supported_producer(struct device *dev)
> > +{
> > +   return true;
> > +}
>
> I found the default implementation of true being odd. It makes me
> wonder, what is the point of this call. I would expect this to see if
> a linked list is empty or not.
>
> > +/**
> > + * wbrf_supported_consumer - Determine if the device can react to frequencies
>
> This again seems odd. A device should know if it can react to
> frequencies or not. WBRF core should not need to tell it. What makes
> more sense to me is that this call is about a device telling the WBRF
> core it is able to react to frequencies. The WBRF core then can give a
> good answer to wbrf_supported_producer(), yes, i know of some other
> device who might be able to do something to avoid causing interference
> to you, so please do tell me about frequencies you want to use.
>
> What is missing here in this API is policy information. The WBRF core
> knows it has zero or more devices which can report what frequencies
> they are using, and it has zero or more devices which maybe can do
> something. But then you need policy to say this particular board needs
> any registered devices to actually do something because of poor
> shielding. Should this policy be as simple as a bool, or should it
> actually say the board has shielding issues for a list of frequencies?

RE: [PATCH v2] drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics

2023-07-17 Thread Quan, Evan

[AMD Official Use Only - General]

Hi Wenyou,

I think you already have the green light (RB from Mario and ACK from me) to
land the change.
Please go ahead.

Evan
> -Original Message-
> From: Yang, WenYou 
> Sent: Thursday, July 13, 2023 8:56 AM
> To: Yang, WenYou ; Deucher, Alexander
> ; Limonciello, Mario
> ; Koenig, Christian
> ; Pan, Xinhui ; Quan,
> Evan 
> Cc: Yuan, Perry ; Liang, Richard qi
> ; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: RE: [PATCH v2] drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4
> to acquire gpu_metrics
>
> [AMD Official Use Only - General]
>
> Any comments? Any advice?
>
> Best Regards,
> Wenyou
>
> > -Original Message-
> > From: Wenyou Yang 
> > Sent: Wednesday, June 21, 2023 2:32 PM
> > To: Deucher, Alexander ; Limonciello, Mario
> > ; Koenig, Christian
> > ; Pan, Xinhui ; Quan,
> > Evan 
> > Cc: Yuan, Perry ; Liang, Richard qi
> > ; amd-...@lists.freedesktop.org; dri-
> > de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; Yang,
> > WenYou 
> > Subject: [PATCH v2] drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to
> > acquire gpu_metrics
> >
> > To acquire the voltage and current info from gpu_metrics interface,
> > but
> > gpu_metrics_v2_3 doesn't contain them, and to be backward compatible,
> > add new gpu_metrics_v2_4 structure.
> >
> > Reviewed-by: Mario Limonciello 
> > Acked-by: Evan Quan 
> > Signed-off-by: Wenyou Yang 
> > ---
> >  .../gpu/drm/amd/include/kgd_pp_interface.h|  69 +++
> >  .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 109
> >  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c|   3 +
> >  3 files changed, 172 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > index 9f542f6e19ed..90989405eddc 100644
> > --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > @@ -892,4 +892,73 @@ struct gpu_metrics_v2_3 {
> >   uint16_t average_temperature_core[8]; // average CPU core temperature on APUs
> >   uint16_t average_temperature_l3[2];
> >  };
> > +
> > +struct gpu_metrics_v2_4 {
> > + struct metrics_table_header common_header;
> > +
> > + /* Temperature (unit: centi-Celsius) */
> > + uint16_t temperature_gfx;
> > + uint16_t temperature_soc;
> > + uint16_t temperature_core[8];
> > + uint16_t temperature_l3[2];
> > +
> > + /* Utilization (unit: centi) */
> > + uint16_t average_gfx_activity;
> > + uint16_t average_mm_activity;
> > +
> > + /* Driver attached timestamp (in ns) */
> > + uint64_t system_clock_counter;
> > +
> > + /* Power/Energy (unit: mW) */
> > + uint16_t average_socket_power;
> > + uint16_t average_cpu_power;
> > + uint16_t average_soc_power;
> > + uint16_t average_gfx_power;
> > + uint16_t average_core_power[8];
> > +
> > + /* Average clocks (unit: MHz) */
> > + uint16_t average_gfxclk_frequency;
> > + uint16_t average_socclk_frequency;
> > + uint16_t average_uclk_frequency;
> > + uint16_t average_fclk_frequency;
> > + uint16_t average_vclk_frequency;
> > + uint16_t average_dclk_frequency;
> > +
> > + /* Current clocks (unit: MHz) */
> > + uint16_t current_gfxclk;
> > + uint16_t current_socclk;
> > + uint16_t current_uclk;
> > + uint16_t current_fclk;
> > + uint16_t current_vclk;
> > + uint16_t current_dclk;
> > + uint16_t current_coreclk[8];
> > + uint16_t current_l3clk[2];
> > +
> > + /* Throttle status (ASIC dependent) */
> > + uint32_t throttle_status;
> > +
> > + /*

RE: [PATCH V5 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-05 Thread Quan, Evan

[AMD Official Use Only - General]

Hi Andrew,

I discussed your proposal/concerns with Mario.
We believe the changes below might address your concerns.
- move the wbrf_supported_producer check inside
  acpi_amd_wbrf_add_exclusion and acpi_amd_wbrf_remove_exclusion
- move the wbrf_supported_consumer check inside
  acpi_amd_wbrf_retrieve_exclusions
That way wbrf_supported_producer and wbrf_supported_consumer can be dropped.
We built some prototypes and ran some tests, which showed that this is
technically feasible.

However, we found several issues with that approach.
- The overhead of the extra _producer/_consumer check on every call to
  wbrf_add/remove/retrieve_exclusion, especially when there might be multiple
  producers and consumers in the system at the same time, some of which might
  switch their in-use band/frequency frequently.
- Some extra costs caused by finding out only at the last minute. For
  example, to support WBRF, the amdgpu driver needs some preparation: install
  the notification handler, set up the delayed workqueue (to handle possible
  event flooding) and even notify the firmware engine to be ready. Only on
  receiving the first notification would it learn (from the
  wbrf_supported_consumer check) that the WBRF feature is actually not
  supported. All those extra costs can be avoided if we know up front that
  WBRF is not supported.
  This could happen to other consumers and producers too.

After careful consideration, we think the changes do not benefit us much and
are not worth the extra effort.
Thus we would like to stick with the original implementation, that is, to keep
the wbrf_supported_producer and wbrf_supported_consumer interfaces exposed.
Other drivers/subsystems can then do the necessary WBRF support check in
advance and coordinate their actions accordingly.
Please let us know your thoughts.

BR,
Evan
> -Original Message-
> From: Andrew Lunn 
> Sent: Tuesday, July 4, 2023 9:07 PM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V5 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> > > What is the purpose of this stage? Why would it not be supported for
> > > this device?
> > This is needed for wbrf support via ACPI mechanism. If BIOS(AML code)
> > does not support the wbrf adding/removing for some device, it should
> > speak that out so that the device can be aware of that.
>
> How much overhead is this adding? How deep do you need to go to find the
> BIOS does not support it? And how often is this called?
>
> Where do we want to add complexity? In the generic API? Or maybe a little
> deeper in the ACPI specific code?
>
>Andrew

RE: [PATCH V5 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Simon Horman 
> Sent: Friday, June 30, 2023 9:39 PM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V5 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> On Fri, Jun 30, 2023 at 06:32:32PM +0800, Evan Quan wrote:
>
> ...
>
> > diff --git a/include/linux/wbrf.h b/include/linux/wbrf.h
> > new file mode 100644
> > index ..3ca95786cef5
> > --- /dev/null
> > +++ b/include/linux/wbrf.h
> > @@ -0,0 +1,65 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Wifi Band Exclusion Interface
> > + * Copyright (C) 2023 Advanced Micro Devices
> > + */
> > +
> > +#ifndef _LINUX_WBRF_H
> > +#define _LINUX_WBRF_H
> > +
> > +#include 
> > +
> > +/* Maximum number of wbrf ranges */
> > +#define MAX_NUM_OF_WBRF_RANGES 11
> > +
> > +struct exclusion_range {
> > +   /* start and end point of the frequency range in Hz */
> > +   uint64_t start;
> > +   uint64_t end;
> > +};
> > +
> > +struct exclusion_range_pool {
> > +   struct exclusion_range  band_list[MAX_NUM_OF_WBRF_RANGES];
> > +   uint64_t ref_counter[MAX_NUM_OF_WBRF_RANGES];
> > +};
> > +
> > +struct wbrf_ranges_in {
> > +   /* valid entry: `start` and `end` filled with non-zero values */
> > +   struct exclusion_range  band_list[MAX_NUM_OF_WBRF_RANGES];
> > +};
> > +
> > +struct wbrf_ranges_out {
> > +   uint32_t num_of_ranges;
> > +   struct exclusion_range  band_list[MAX_NUM_OF_WBRF_RANGES];
> > +} __packed;
> > +
> > +enum wbrf_notifier_actions {
> > +   WBRF_CHANGED,
> > +};
>
> Hi Evan,
>
> checkpatch suggests that u64 and u32 might be more appropriate types here,
> as they are Kernel types, whereas the ones used are user-space types.
Thanks for pointing this out. Will update them accordingly.

Evan
>
> ...

RE: [PATCH V5 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Limonciello, Mario 
> Sent: Saturday, July 1, 2023 12:41 AM
> To: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; mdaen...@redhat.com;
> maarten.lankho...@linux.intel.com; tzimmerm...@suse.de;
> hdego...@redhat.com; jingyuwang_...@163.com; Lazar, Lijo
> ; jim.cro...@gmail.com; bellosili...@gmail.com;
> andrealm...@igalia.com; t...@redhat.com; j...@jsg.id.au; a...@arndb.de
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V5 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> On 6/30/2023 05:32, Evan Quan wrote:
> > Due to electrical and mechanical constraints in certain platform
> > designs there may be likely interference of relatively high-powered
> > harmonics of the (G-)DDR memory clocks with local radio module
> > frequency bands used by Wifi 6/6e/7.
> >
> > To mitigate this, AMD has introduced a mechanism that devices can use
> > to notify active use of particular frequencies so that other devices
> > can make relative internal adjustments as necessary to avoid this resonance.
> >
> > In order for a device to support this, the expected flow for device
> > driver or subsystems:
> >
> > Drivers/subsystems contributing frequencies:
> >
> > 1) During probe, check `wbrf_supported_producer` to see if WBRF is
> > supported for the device.
> > 2) If adding frequencies, then call `wbrf_add_exclusion` with the
> > start and end ranges of the frequencies.
> > 3) If removing frequencies, then call `wbrf_remove_exclusion` with
> > start and end ranges of the frequencies.
> >
> > Drivers/subsystems responding to frequencies:
> >
> > 1) During probe, check `wbrf_supported_consumer` to see if WBRF is
> > supported for the device.
> > 2) Call the `wbrf_retrieve_exclusions` to retrieve the current
> > exclusions on receiving an ACPI notification for a new frequency
> > change.
> >
> > Co-developed-by: Mario Limonciello 
> > Signed-off-by: Mario Limonciello 
> > Co-developed-by: Evan Quan 
> > Signed-off-by: Evan Quan 
> > --
> > v4->v5:
> >- promote this to be a more generic solution with input argument taking
> >  `struct device` and provide better scalability to support non-ACPI
> >  scenarios(Andrew)
> >- update the APIs naming and some other minor fixes(Rafael)
> > ---
> >   drivers/base/Kconfig  |   8 ++
> >   drivers/base/Makefile |   1 +
> >   drivers/base/wbrf.c   | 227
> ++
> >   include/linux/wbrf.h  |  65 
> >   4 files changed, 301 insertions(+)
> >   create mode 100644 drivers/base/wbrf.c
> >   create mode 100644 include/linux/wbrf.h
> >
> > diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> > index 2b8fd6bb7da0..5b441017b225 100644
> > --- a/drivers/base/Kconfig
> > +++ b/drivers/base/Kconfig
> > @@ -242,4 +242,12 @@ config FW_DEVLINK_SYNC_STATE_TIMEOUT
> >   command line option on every system/board your kernel is expected to
> >
> > +config WBRF
> > +   bool "Wifi band RF mitigation mechanism"
> > +   default n
> > +   help
> > + Wifi band RF mitigation mechanism allows multiple drivers from
> > + different domains to notify the frequencies in use so that hardware
> > + can be reconfigured to avoid harmonic conflicts.
> > +
> >   endmenu
> > diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> > index 3079bfe53d04..c844f68a6830 100644
> > --- a/drivers/base/Makefile
> > +++ b/drivers/base/Makefile
> > @@ -26,6 +26,7 @@ obj-$(CONFIG_GENERIC_MSI_IRQ) += platform-msi.o
> >   obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
> >   obj-$(CONFIG_GENERIC_ARCH_NUMA) += arch_numa.o
> >   obj-$(CONFIG_ACPI) += physical_location.o
> > +obj-$(CONFIG_WBRF) += wbrf.o
> >
> >   obj-y += test/
> >
> > diff --git a/drivers/base/wbrf.c b/drivers/base/wbrf.c
> > new file mode 100644
> > index ..2163a8ec8a9a
> > --- /dev/null
> > +++ b/drivers/base/wbrf.c
> > @@ -0,0 +1,227 @@
> > +// SPDX-License-I

RE: [PATCH V5 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Andrew Lunn 
> Sent: Saturday, July 1, 2023 8:20 AM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V5 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> > Drivers/subsystems contributing frequencies:
> >
> > 1) During probe, check `wbrf_supported_producer` to see if WBRF is
> >    supported for the device.
>
> What is the purpose of this stage? Why would it not be supported for this
> device?
This is needed for WBRF support via the ACPI mechanism. If the BIOS (AML code)
does not support adding/removing WBRF ranges for some device, it should say so,
so that the device is aware of it.
>
> > +#ifdef CONFIG_WBRF
> > +bool wbrf_supported_producer(struct device *dev);
> > +int wbrf_add_exclusion(struct device *adev,
> > +                       struct wbrf_ranges_in *in);
> > +int wbrf_remove_exclusion(struct device *dev,
> > +                          struct wbrf_ranges_in *in);
> > +int wbrf_retrieve_exclusions(struct device *dev,
> > +                             struct wbrf_ranges_out *out);
> > +bool wbrf_supported_consumer(struct device *dev);
> > +
> > +int wbrf_register_notifier(struct notifier_block *nb);
> > +int wbrf_unregister_notifier(struct notifier_block *nb);
> > +#else
> > +static inline bool wbrf_supported_producer(struct device *dev) { return false; }
> > +static inline int wbrf_add_exclusion(struct device *adev,
> > +                                     struct wbrf_ranges_in *in) { return -ENODEV; }
> > +static inline int wbrf_remove_exclusion(struct device *dev,
> > +                                        struct wbrf_ranges_in *in) { return -ENODEV; }
>
> The normal aim of stubs is that so long as it is not expected to be fatal if
> the functionality is missing, the caller should not care if it is missing.
> So i would expect these to return 0, indicating everything worked as expected.
Sure, that makes sense.
>
> > +static inline int wbrf_retrieve_exclusions(struct device *dev,
> > +                                           struct wbrf_ranges_out *out) { return -ENODEV; }
>
> This is more complex. Ideally you want to return an empty set, so there is
> nothing to do. So i think the stub probably wants to do a memset and then
> return 0.
Right, will update it accordingly.
>
> > +static inline bool wbrf_supported_consumer(struct device *dev) { return false; }
> > +static inline int wbrf_register_notifier(struct notifier_block *nb) { return -ENODEV; }
> > +static inline int wbrf_unregister_notifier(struct notifier_block *nb) { return -ENODEV; }
>
> And these can just return 0.
Will update it.

Evan
>
> Andrew
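
The stub behaviour agreed on in this reply can be sketched as follows. This is
a user-space approximation under stated assumptions, not the code that went
into a later revision: `struct device` is left opaque, `uint32_t`/`uint64_t`
stand in for `u32`/`u64`, and the struct definitions are copied from the
header quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_NUM_OF_WBRF_RANGES 11

struct device; /* opaque here; the stubs never dereference it */

struct exclusion_range {
	uint64_t start;
	uint64_t end;
};

struct wbrf_ranges_in {
	struct exclusion_range band_list[MAX_NUM_OF_WBRF_RANGES];
};

struct wbrf_ranges_out {
	uint32_t num_of_ranges;
	struct exclusion_range band_list[MAX_NUM_OF_WBRF_RANGES];
};

/* Stubs for a build without WBRF: report success and do nothing. */
static inline int wbrf_add_exclusion(struct device *dev,
				     struct wbrf_ranges_in *in)
{
	(void)dev;
	(void)in;
	return 0;
}

static inline int wbrf_remove_exclusion(struct device *dev,
					struct wbrf_ranges_in *in)
{
	(void)dev;
	(void)in;
	return 0;
}

/* Hand back an empty set so the caller finds nothing to act on. */
static inline int wbrf_retrieve_exclusions(struct device *dev,
					   struct wbrf_ranges_out *out)
{
	(void)dev;
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	return 0;
}
```

A caller can then use the same code path whether or not the feature is built
in: the add/remove calls succeed, and a retrieve yields zero ranges.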

RE: [PATCH V5 1/9] drivers core: Add support for Wifi band RF mitigations

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

> -Original Message-
> From: Andrew Lunn 
> Sent: Saturday, July 1, 2023 8:25 AM
> To: Limonciello, Mario 
> Cc: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; mdaen...@redhat.com;
> maarten.lankho...@linux.intel.com; tzimmerm...@suse.de;
> hdego...@redhat.com; jingyuwang_...@163.com; Lazar, Lijo
> ; jim.cro...@gmail.com; bellosili...@gmail.com;
> andrealm...@igalia.com; t...@redhat.com; j...@jsg.id.au; a...@arndb.de;
> linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V5 1/9] drivers core: Add support for Wifi band RF
> mitigations
>
> > Right now there are stubs for non CONFIG_WBRF as well as other patches
> > are using #ifdef CONFIG_WBRF or having their own stubs.  Like mac80211
> > patch looks for #ifdef CONFIG_WBRF.
> >
> > I think we should pick one or the other.
> >
> > Having other subsystems #ifdef CONFIG_WBRF will make the series easier
> > to land through multiple trees; so I have a slight leaning in that
> > direction.
>
> #ifdef in C files is generally not liked because it makes build testing
> harder. There are more permutations to build. It is better to use
>
> if (IS_ENABLED(CONFIG_WBRF)) {
> }
>
> so that the code is compiled, and then thrown away because
> IS_ENABLED(CONFIG_WBRF) evaluates to false.
>
> However, if the stubs are done correctly, the driver should not care. I doubt
> this is used in any sort of hot path where every instruction counts.
OK, will update as suggested.

Evan
>
>   Andrew
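
A small userspace sketch of the pattern described above. WBRF_ENABLED is a plain 0/1 constant standing in for the kernel's IS_ENABLED(CONFIG_WBRF); wbrf_notify() and driver_set_channel() are hypothetical names used only for illustration.

```c
#include <assert.h>

/*
 * Stand-in for IS_ENABLED(CONFIG_WBRF). In the kernel the macro comes
 * from <linux/kconfig.h>; here it is simply a compile-time constant.
 */
#define WBRF_ENABLED 0

static int wbrf_calls;

/* Hypothetical hook that would tell WBRF about a frequency change. */
static void wbrf_notify(void)
{
	wbrf_calls++;
}

static int driver_set_channel(int channel)
{
	assert(channel > 0);

	if (WBRF_ENABLED) {
		/*
		 * This branch is still parsed and type-checked in every
		 * build, then dropped by the compiler as dead code when
		 * the constant is 0.
		 */
		wbrf_notify();
	}

	return channel;
}
```

Unlike an #ifdef block, a typo inside the branch breaks the build whether or not the option is set, which is the build-testing benefit mentioned above.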

RE: [PATCH V5 2/9] driver core: add ACPI based WBRF mechanism introduced by AMD

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

> -----Original Message-----
> From: Andrew Lunn 
> Sent: Saturday, July 1, 2023 8:51 AM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V5 2/9] driver core: add ACPI based WBRF mechanism
> introduced by AMD
>
> > +   argv4 = kzalloc(sizeof(*argv4) * (2 * num_of_ranges + 2 + 1), GFP_KERNEL);
> > +   if (!argv4)
> > +   return -ENOMEM;
> > +
> > +   argv4[arg_idx].package.type = ACPI_TYPE_PACKAGE;
> > +   argv4[arg_idx].package.count = 2 + 2 * num_of_ranges;
> > +   argv4[arg_idx++].package.elements = &argv4[1];
> > +   argv4[arg_idx].integer.type = ACPI_TYPE_INTEGER;
> > +   argv4[arg_idx++].integer.value = num_of_ranges;
> > +   argv4[arg_idx].integer.type = ACPI_TYPE_INTEGER;
> > +   argv4[arg_idx++].integer.value = action;
>
> There are a lot of magic numbers in that kzalloc. It is being used as an array,
> kcalloc() would be a good start to make it more readable.
> Can some #define's be used to explain what the other numbers mean?
Sure, will update accordingly.
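
One possible shape for that cleanup, as a userspace sketch: calloc() stands in for kcalloc(), and the three WBRF_ARG_* names are hypothetical labels for the 1, 2 and 2 in the quoted expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for the magic numbers in the quoted kzalloc(). */
#define WBRF_ARG_PACKAGE	1	/* the enclosing package object */
#define WBRF_ARG_HEADER		2	/* num_of_ranges and action */
#define WBRF_ARGS_PER_RANGE	2	/* start and end of each range */

/* Total number of array elements needed for num_of_ranges ranges. */
static size_t wbrf_num_args(size_t num_of_ranges)
{
	return WBRF_ARG_PACKAGE + WBRF_ARG_HEADER +
	       WBRF_ARGS_PER_RANGE * num_of_ranges;
}

/*
 * calloc(n, size) plays the role of kcalloc(n, size, GFP_KERNEL):
 * the element count and element size are passed separately, so the
 * allocator can check the multiplication for overflow.
 */
static void *wbrf_alloc_args(size_t num_of_ranges, size_t elem_size)
{
	assert(elem_size > 0);
	return calloc(wbrf_num_args(num_of_ranges), elem_size);
}
```

The element count matches the original 2 * num_of_ranges + 2 + 1; only the naming and the allocator call change.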
>
> > +   /*
> > +* Bit 0 indicates whether there's support for any functions other than
> > +* function 0.
> > +*/
>
> Please make use of the BIT macro to give the different bits informative names.
Sure.
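
A sketch of what named bits could look like for the check quoted below. BIT() here is a userspace stand-in for the kernel macro, and the WBRF_FUNC_* names and bit positions are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr)	(1ULL << (nr))

/* Hypothetical names; the real bit assignments come from the AMD spec. */
#define WBRF_FUNCS_SUPPORTED	BIT(0)	/* functions other than 0 exist */
#define WBRF_FUNC_RECORD	BIT(1)
#define WBRF_FUNC_RETRIEVE	BIT(2)

/* True if the firmware mask advertises every function in funcs. */
static bool wbrf_check_funcs(uint64_t mask, uint64_t funcs)
{
	assert(funcs != 0);	/* callers ask about at least one function */

	return (mask & WBRF_FUNCS_SUPPORTED) && (mask & funcs) == funcs;
}
```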
>
> > +   if ((mask & 0x1) && (mask & funcs) == funcs)
> > +   return true;
> > +
> > +   return false;
> > +}
> > +
>
> > +int acpi_amd_wbrf_retrieve_exclusions(struct device *dev,
> > +                                      struct wbrf_ranges_out *out)
> > +{
> > +   struct acpi_device *adev = ACPI_COMPANION(dev);
> > +   union acpi_object *obj;
> > +
> > +   if (!adev)
> > +   return -ENODEV;
> > +
> > +   obj = acpi_evaluate_wbrf(adev->handle,
> > +WBRF_REVISION,
> > +WBRF_RETRIEVE);
> > +   if (!obj)
> > +   return -EINVAL;
> > +
> > +   WARN(obj->buffer.length != sizeof(*out),
> > +   "Unexpected buffer length");
> > +   memcpy(out, obj->buffer.pointer, obj->buffer.length);
>
> You WARN, and then overwrite whatever is past the end of out?  Please at
> least use min(obj->buffer.length, sizeof(*out)), but better still:
>
>     if (obj->buffer.length != sizeof(*out)) {
>             dev_err(dev, "BIOS FUBAR, ignoring wrong sized WBRF information");
>             return -EINVAL;
>     }
OK. Sounds reasonable. Will update as suggested.
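
The agreed fix, sketched in userspace: reject the buffer unless its length matches exactly, and only then copy. wbrf_copy_out() and the struct layout are hypothetical; in the kernel the check would sit in acpi_amd_wbrf_retrieve_exclusions() and log with dev_err().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout, for illustration only. */
struct wbrf_ranges_out {
	uint32_t num_of_ranges;
	uint64_t band_list[22];
};

/*
 * Copy a firmware-supplied buffer into *out only if its size is exactly
 * what the caller expects. A longer buffer would overrun *out, and a
 * shorter one would leave part of *out holding stale data.
 */
static int wbrf_copy_out(struct wbrf_ranges_out *out,
			 const void *buf, size_t len)
{
	assert(out != NULL && buf != NULL);

	if (len != sizeof(*out))
		return -EINVAL;

	memcpy(out, buf, len);
	return 0;
}
```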
>
> > +#if defined(CONFIG_WBRF_GENERIC)
> >  static struct exclusion_range_pool wbrf_pool;
> >
> >  static int _wbrf_add_exclusion_ranges(struct wbrf_ranges_in *in)
> > @@ -89,6 +92,7 @@ static int _wbrf_retrieve_exclusion_ranges(struct wbrf_ranges_out *out)
> >
> > return 0;
> >  }
> > +#endif
>
> I was expecting you would keep these tables, and then call into the BIOS as
> well. Having this table in debugfs seems like a useful thing to have for
> debugging the BIOS.
I'm not sure. Since the interfaces we have designed now serve more or less as a
library, deciding when and where the debugfs should be created will be quite
tricky.
>
> > +#ifdef CONFIG_WBRF_AMD_ACPI
> > +#else
> > +static inline bool
> > +acpi_amd_wbrf_supported_consumer(struct device *dev) { return false; }
> > +static inline bool
> > +acpi_amd_wbrf_supported_producer(struct device *dev) { return false; }
> > +static inline int
> > +acpi_amd_wbrf_remove_exclusion(struct device *dev,
> > +                               struct wbrf_ranges_in *in) { return -ENODEV; }
> > +static inline int
> > +acpi_amd_wbrf_add_exclusion(struct device *dev,
> > +                            struct wbrf_ranges_in *in) { return -ENODEV; }
> > +static inline int
> > +acpi_amd_wbrf_retrieve_exclusions(struct device *dev,
> > +                                  struct wbrf_ranges_out *out) { return -ENODEV; }
>
> Do you actually need these stub versions?
Yes, these can be dropped. Let me update accordingly.

Evan
>
>   Andrew

RE: [PATCH V5 4/9] wifi: mac80211: Add support for ACPI WBRF

2023-07-03 Thread Quan, Evan

[AMD Official Use Only - General]

Hi Andrew,

> -----Original Message-----
> From: Andrew Lunn 
> Sent: Saturday, July 1, 2023 9:02 AM
> To: Quan, Evan 
> Cc: raf...@kernel.org; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; Limonciello, Mario ;
> mdaen...@redhat.com; maarten.lankho...@linux.intel.com;
> tzimmerm...@suse.de; hdego...@redhat.com; jingyuwang_...@163.com;
> Lazar, Lijo ; jim.cro...@gmail.com;
> bellosili...@gmail.com; andrealm...@igalia.com; t...@redhat.com;
> j...@jsg.id.au; a...@arndb.de; linux-ker...@vger.kernel.org; linux-
> a...@vger.kernel.org; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH V5 4/9] wifi: mac80211: Add support for ACPI WBRF
>
> > +static void get_chan_freq_boundary(u32 center_freq,
> > +  u32 bandwidth,
> > +  u64 *start,
> > +  u64 *end)
> > +{
> > +   bandwidth = MHZ_TO_KHZ(bandwidth);
> > +   center_freq = MHZ_TO_KHZ(center_freq);
> > +
> > +   *start = center_freq - bandwidth / 2;
> > +   *end = center_freq + bandwidth / 2;
> > +
> > +   /* Frequency in HZ is expected */
> > +   *start = KHZ_TO_HZ(*start);
> > +   *end = KHZ_TO_HZ(*end);
> > +}
>
> This seems pretty generic, so maybe it should be moved into the shared code?
> It can then become a NOP when the functionality is disabled.
By the shared code, do you mean some place around mac80211?
Actually, two similar variants already exist: cfg80211_get_start_freq and
cfg80211_get_end_freq.
Their outputs are what most mac80211 logic really cares about.
The new API here is unlikely to be shared by other parts of mac80211.
So I suppose placing it here (only in wbrf.c) is appropriate.
What do you think?

Evan
>
>   Andrew

RE: [PATCH V4 1/8] drivers/acpi: Add support for Wifi band RF mitigations

2023-06-30 Thread Quan, Evan

[AMD Official Use Only - General]

Hi Rafael & Andrew,

I just posted a new V5 series based on the discussions here and offline
discussions with Mario.
Please share your comments/insights there.

Thanks,
Evan
> -----Original Message-----
> From: Rafael J. Wysocki 
> Sent: Saturday, June 24, 2023 1:16 AM
> To: Limonciello, Mario 
> Cc: Rafael J. Wysocki ; Quan, Evan
> ; l...@kernel.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; johan...@sipsolutions.net;
> da...@davemloft.net; eduma...@google.com; k...@kernel.org;
> pab...@redhat.com; mdaen...@redhat.com;
> maarten.lankho...@linux.intel.com; tzimmerm...@suse.de;
> hdego...@redhat.com; jingyuwang_...@163.com; Lazar, Lijo
> ; jim.cro...@gmail.com; bellosili...@gmail.com;
> andrealm...@igalia.com; t...@redhat.com; j...@jsg.id.au; a...@arndb.de;
> linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; net...@vger.kernel.org
> Subject: Re: [PATCH V4 1/8] drivers/acpi: Add support for Wifi band RF
> mitigations
>
> On Fri, Jun 23, 2023 at 6:48 PM Limonciello, Mario
>  wrote:
> >
> >
> > On 6/23/2023 11:28 AM, Rafael J. Wysocki wrote:
> > > On Fri, Jun 23, 2023 at 5:57 PM Limonciello, Mario
> > >  wrote:
> > >>
> > >> On 6/23/2023 9:52 AM, Rafael J. Wysocki wrote:
> > >>> On Wed, Jun 21, 2023 at 7:47 AM Evan Quan 
> wrote:
> > >>>> From: Mario Limonciello 
> > >>>>
> > >>>> Due to electrical and mechanical constraints in certain platform
> > >>>> designs there may be likely interference of relatively
> > >>>> high-powered harmonics of the (G-)DDR memory clocks with local
> > >>>> radio module frequency bands used by Wifi 6/6e/7.
> > >>>>
> > >>>> To mitigate this, AMD has introduced an ACPI based mechanism that
> > >>>> devices can use to notify active use of particular frequencies so
> > >>>> that devices can make relative internal adjustments as necessary
> > >>>> to avoid this resonance.
> > >>>>
> > >>>> In order for a device to support this, the expected flow for
> > >>>> device driver or subsystems:
> > >>>>
> > >>>> Drivers/subsystems contributing frequencies:
> > >>>>
> > >>>> 1) During probe, check `wbrf_supported_producer` to see if WBRF
> > >>>> supported
> > >>> The prefix should be acpi_wbrf_ or acpi_amd_wbrf_ even, so it is
> > >>> clear that this uses ACPI and is AMD-specific.
> > >> I guess if we end up with an intermediary library approach
> > >> wbrf_supported_producer makes sense and that could call acpi_wbrf_*.
> > >>
> > >> But with no intermediate library your suggestion makes sense.
> > >>
> > >> I would prefer not to make it acpi_amd as there is no reason that
> > >> this exact same problem couldn't happen on an Wifi 6e + Intel SOC +
> > >> AMD dGPU design too and OEMs could use the same mitigation
> > >> mechanism as Wifi6e + AMD SOC + AMD dGPU too.
> > > The mitigation mechanism might be the same, but the AML interface
> > > very well may be different.
> >
> >
> > Right.  I suppose right now we should keep it prefixed as "amd", and
> > if it later is promoted as a standard it can be renamed.
> >
> >
> > >
> > > My point is that this particular interface is AMD-specific ATM and
> > > I'm not aware of any plans to make it "standard" in some way.
> >
> >
> > Yeah; this implementation is currently AMD specific AML, but I expect
> > the exact same AML would be delivered to OEMs using the dGPUs.
> >
> >
> > >
> > > Also if the given interface is specified somewhere, it would be good
> > > to have a pointer to that place.
> >
> >
> > It's a code first implementation.  I'm discussing with the owners when
> > they will release it.
> >
> >
> > >
> > >>> Whether or not there needs to be an intermediate library wrapped
> > >>> around this is a different matter.
> > > IMO individual drivers should not be expected to use this interface
> > > directly, as that would add to boilerplate code and overall bloat.
> >
> > The thing is the ACPI method is not a platform method.  It's a
> > function of the

RE: [PATCH V3 1/7] drivers/acpi: Add support for Wifi band RF mitigations

2023-06-20 Thread Quan, Evan

[AMD Official Use Only - General]

> -----Original Message-----
> From: Limonciello, Mario 
> Sent: Monday, June 19, 2023 10:04 AM
> To: Quan, Evan 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; raf...@kernel.org; l...@kernel.org; Deucher,
> Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; kv...@kernel.org; n...@nbd.name;
> lore...@kernel.org; ryder@mediatek.com; shayne.c...@mediatek.com;
> sean.w...@mediatek.com; matthias@gmail.com;
> angelogioacchino.delre...@collabora.com; Lazar, Lijo 
> Subject: Re: [PATCH V3 1/7] drivers/acpi: Add support for Wifi band RF
> mitigations
>
> On 6/16/23 01:57, Evan Quan wrote:
> > From: Mario Limonciello 
> >
> > Due to electrical and mechanical constraints in certain platform
> > designs there may be likely interference of relatively high-powered
> > harmonics of the (G-)DDR memory clocks with local radio module
> > frequency bands used by Wifi 6/6e/7.
> >
> > To mitigate this, AMD has introduced an ACPI based mechanism that
> > devices can use to notify active use of particular frequencies so that
> > devices can make relative internal adjustments as necessary to avoid
> > this resonance.
> >
> > In order for a device to support this, the expected flow for device
> > driver or subsystems:
> >
> > Drivers/subsystems contributing frequencies:
> >
> > 1) During probe, check `wbrf_supported_producer` to see if WBRF
> supported
> > for the device.
> > 2) If adding frequencies, then call `wbrf_add_exclusion` with the
> > start and end ranges of the frequencies.
> > 3) If removing frequencies, then call `wbrf_remove_exclusion` with
> > start and end ranges of the frequencies.
> >
> > Drivers/subsystems responding to frequencies:
> >
> > 1) During probe, check `wbrf_supported_consumer` to see if WBRF is
> supported
> > for the device.
> > 2) Call the `wbrf_retrieve_exclusions` to retrieve the current
> > exclusions on receiving an ACPI notification for a new frequency
> > change.
> >
> > Signed-off-by: Mario Limonciello 
> > Co-developed-by: Evan Quan 
> > Signed-off-by: Evan Quan 
> > --
> > v1->v2:
> >- move those wlan specific implementations to net/mac80211(Mario)
> > ---
> >   drivers/acpi/Kconfig |   7 ++
> >   drivers/acpi/Makefile|   2 +
> >   drivers/acpi/acpi_wbrf.c | 215
> +++
> >   include/linux/wbrf.h |  55 ++
> >   4 files changed, 279 insertions(+)
> >   create mode 100644 drivers/acpi/acpi_wbrf.c
> >   create mode 100644 include/linux/wbrf.h
> >
> > diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index
> > ccbeab9500ec..9ee7c7dcc3e6 100644
> > --- a/drivers/acpi/Kconfig
> > +++ b/drivers/acpi/Kconfig
> > @@ -611,3 +611,10 @@ config X86_PM_TIMER
> >
> >   You should nearly always say Y here because many modern
> >   systems require this timer.
> > +
> > +config ACPI_WBRF
> > +   bool "ACPI Wifi band RF mitigation mechanism"
> > +   help
> > + Wifi band RF mitigation mechanism allows multiple drivers from
> > + different domains to notify the frequencies in use so that hardware
> > + can be reconfigured to avoid harmonic conflicts.
> > \ No newline at end of file
>
> There should be a newline at the end of the Kconfig file.
OK, will add that.

Evan
>
> > diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile index
> > feb36c0b9446..be173e76aa62 100644
> > --- a/drivers/acpi/Makefile
> > +++ b/drivers/acpi/Makefile
> > @@ -131,3 +131,5 @@ obj-y   += dptf/
> >   obj-$(CONFIG_ARM64)   += arm64/
> >
> >   obj-$(CONFIG_ACPI_VIOT)   += viot.o
> > +
> > +obj-$(CONFIG_ACPI_WBRF)+= acpi_wbrf.o
> > \ No newline at end of file
> > diff --git a/drivers/acpi/acpi_wbrf.c b/drivers/acpi/acpi_wbrf.c new
> > file mode 100644 index ..8c275998ac29
> > --- /dev/null
> > +++ b/drivers/acpi/acpi_wbrf.c
> > @@ -0,0 +1,215 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * AMD Wifi Band Exclusion Interface
> > + * Copyright (C) 2023 Advanced Micro Devices
> > + *
> > + */
> > +
> > +#include 
> > +
> > +/* functions */
> > +#define WBRF_RECORD0x1
> > +#define WBRF_R

RE: [PATCH V3 2/7] wifi: mac80211: Add support for ACPI WBRF

2023-06-20 Thread Quan, Evan

[AMD Official Use Only - General]

> -----Original Message-----
> From: Limonciello, Mario 
> Sent: Monday, June 19, 2023 10:17 AM
> To: Quan, Evan 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; raf...@kernel.org; l...@kernel.org; Deucher,
> Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; kv...@kernel.org; n...@nbd.name;
> lore...@kernel.org; ryder@mediatek.com; shayne.c...@mediatek.com;
> sean.w...@mediatek.com; matthias@gmail.com;
> angelogioacchino.delre...@collabora.com; Lazar, Lijo 
> Subject: Re: [PATCH V3 2/7] wifi: mac80211: Add support for ACPI WBRF
>
> On 6/16/23 01:57, Evan Quan wrote:
> > From: Mario Limonciello 
> >
> > To support AMD's WBRF interference mitigation mechanism, Wifi adapters
> > utilized in the system must register the frequencies in use (or
> > unregister those frequencies no longer used) via the dedicated ACPI
> > calls, so that other drivers responding to the frequencies can take
> > proper actions to mitigate possible interference.
> >
> > To make WBRF feature functional, the kernel needs to be configured
> > with CONFIG_ACPI_WBRF and the platform is equipped with WBRF
> > support(from BIOS and drivers).
> >
> > Signed-off-by: Mario Limonciello 
> > Co-developed-by: Evan Quan 
> > Signed-off-by: Evan Quan 
> > --
> > v1->v2:
> >- place the new added member(`wbrf_supported`) in
> >  ieee80211_local(Johannes)
> >- handle chandefs change scenario properly(Johannes)
> >- some minor fixes around code sharing and possible invalid input
> >  checks(Johannes)
> > ---
> >   include/net/cfg80211.h |   8 +++
> >   net/mac80211/Makefile  |   2 +
> >   net/mac80211/chan.c|  11 +++
> >   net/mac80211/ieee80211_i.h |  19 +
> >   net/mac80211/main.c|   2 +
> >   net/mac80211/wbrf.c| 137
> +
> >   net/wireless/chan.c|   3 +-
> >   7 files changed, 181 insertions(+), 1 deletion(-)
> >   create mode 100644 net/mac80211/wbrf.c
> >
> > diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index
> > 9e04f69712b1..c6dc337eafce 100644
> > --- a/include/net/cfg80211.h
> > +++ b/include/net/cfg80211.h
> > @@ -920,6 +920,14 @@ const struct cfg80211_chan_def *
> >   cfg80211_chandef_compatible(const struct cfg80211_chan_def
> *chandef1,
> > const struct cfg80211_chan_def *chandef2);
> >
> > +/**
> > + * nl80211_chan_width_to_mhz - get the channel width in Mhz
> > + * @chan_width: the channel width from  nl80211_chan_width
> > + * Return: channel width in Mhz if the chan_width from nl80211_chan_width
> > + * is valid. -1 otherwise.
> > + */
> > +int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width);
> > +
>
> It's up to mac80211 maintainers, but I would think that the change that makes
> nl80211_chan_width_to_mhz exported instead of static should be separate
> from the patch that introduces WBRF support in the series.
Will do that.
>
> >   /**
> >* cfg80211_chandef_valid - check if a channel definition is valid
> >* @chandef: the channel definition to check diff --git
> > a/net/mac80211/Makefile b/net/mac80211/Makefile index
> > b8de44da1fb8..709eb678f42a 100644
> > --- a/net/mac80211/Makefile
> > +++ b/net/mac80211/Makefile
> > @@ -65,4 +65,6 @@ rc80211_minstrel-$(CONFIG_MAC80211_DEBUGFS)
> += \
> >
> >   mac80211-$(CONFIG_MAC80211_RC_MINSTREL) += $(rc80211_minstrel-
> y)
> >
> > +mac80211-$(CONFIG_ACPI_WBRF) += wbrf.o
> > +
> >   ccflags-y += -DDEBUG
> > diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c index
> > 77c90ed8f5d7..0c5289a9aa6c 100644
> > --- a/net/mac80211/chan.c
> > +++ b/net/mac80211/chan.c
> > @@ -506,11 +506,16 @@ static void _ieee80211_change_chanctx(struct
> > ieee80211_local *local,
> >
> > WARN_ON(!cfg80211_chandef_compatible(&ctx->conf.def, chandef));
> >
> > +   ieee80211_remove_wbrf(local, &ctx->conf.def);
> > +
> > ctx->conf.def = *chandef;
> >
> > /* check if min chanctx also changed */
> > changed = IEEE80211_CHANCTX_CHANGE_WIDTH |
> >   _ieee80211_recalc_chanctx_min_def(local, ctx, rsvd_for);
> > +
> > +   ieee80211_add_wbrf(local, &ctx->conf.def);
> > +
> > drv_change_chanctx(local, ctx, changed);
> >
> > if (!local->use_cha

RE: [PATCH V3 2/7] wifi: mac80211: Add support for ACPI WBRF

2023-06-20 Thread Quan, Evan

[AMD Official Use Only - General]

> -----Original Message-----
> From: Johannes Berg 
> Sent: Monday, June 19, 2023 4:24 PM
> To: Limonciello, Mario ; Quan, Evan
> 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org; raf...@kernel.org; l...@kernel.org; Deucher,
> Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; kv...@kernel.org; n...@nbd.name;
> lore...@kernel.org; ryder@mediatek.com; shayne.c...@mediatek.com;
> sean.w...@mediatek.com; matthias@gmail.com;
> angelogioacchino.delre...@collabora.com; Lazar, Lijo 
> Subject: Re: [PATCH V3 2/7] wifi: mac80211: Add support for ACPI WBRF
>
> On Sun, 2023-06-18 at 21:17 -0500, Mario Limonciello wrote:
> >
> > > --- a/include/net/cfg80211.h
> > > +++ b/include/net/cfg80211.h
> > > @@ -920,6 +920,14 @@ const struct cfg80211_chan_def *
> > >   cfg80211_chandef_compatible(const struct cfg80211_chan_def
> *chandef1,
> > >   const struct cfg80211_chan_def *chandef2);
> > >
> > > +/**
> > > + * nl80211_chan_width_to_mhz - get the channel width in Mhz
> > > + * @chan_width: the channel width from  nl80211_chan_width
> > > + * Return: channel width in Mhz if the chan_width from nl80211_chan_width
> > > + * is valid. -1 otherwise.
> > > + */
> > > +int nl80211_chan_width_to_mhz(enum nl80211_chan_width chan_width);
> > > +
> >
> > It's up to mac80211 maintainers, but I would think that the change that
> > makes nl80211_chan_width_to_mhz exported instead of static should be
> > separate from the patch that introduces WBRF support in the series.
>
> Yeah, that'd be nice :)
OK, I will move that into a new patch.
>
>
> > > +#define KHZ_TO_HZ(freq)  ((freq) * 1000ULL)
>
> Together with MHZ_TO_KHZ() for example :)
Sure.

Evan
>
> johannes

RE: [PATCH V3 4/7] drm/amd/pm: setup the framework to support Wifi RFI mitigation feature

2023-06-20 Thread Quan, Evan

[AMD Official Use Only - General]

> -----Original Message-----
> From: Lazar, Lijo 
> Sent: Monday, June 19, 2023 10:55 PM
> To: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; kv...@kernel.org; n...@nbd.name;
> lore...@kernel.org; ryder@mediatek.com; shayne.c...@mediatek.com;
> sean.w...@mediatek.com; matthias@gmail.com;
> angelogioacchino.delre...@collabora.com; Limonciello, Mario
> 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org
> Subject: Re: [PATCH V3 4/7] drm/amd/pm: setup the framework to support
> Wifi RFI mitigation feature
>
>
>
> On 6/16/2023 12:27 PM, Evan Quan wrote:
> > With WBRF feature supported, as a driver responding to the
> > frequencies, amdgpu driver is able to do shadow pstate switching to
> > mitigate possible interference(between its (G-)DDR memory clocks and
> > local radio module frequency bands used by Wifi 6/6e/7).
> >
> > To make WBRF feature functional, the kernel needs to be configured
> > with CONFIG_ACPI_WBRF and the platform is equipped with necessary ACPI
> > based mechanism to get amdgpu driver notified.
> >
> > Signed-off-by: Evan Quan 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  26 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c  |  63 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  19 ++
> >   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 184
> ++
> >   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  20 ++
> >   drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |   3 +
> >   6 files changed, 315 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 02b827785e39..2f2ec64ed1b2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -50,6 +50,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >
> >   #include 
> >   #include 
> > @@ -241,6 +242,7 @@ extern int amdgpu_num_kcq;
> >   #define AMDGPU_VCNFW_LOG_SIZE (32 * 1024)
> >   extern int amdgpu_vcnfw_log;
> >   extern int amdgpu_sg_display;
> > +extern int amdgpu_wbrf;
> >
> >   #define AMDGPU_VM_MAX_NUM_CTX 4096
> >   #define AMDGPU_SG_THRESHOLD   (256*1024*1024)
> > @@ -741,6 +743,9 @@ struct amdgpu_reset_domain;
> >*/
> >   #define AMDGPU_HAS_VRAM(_adev) ((_adev)->gmc.real_vram_size)
> >
> > +typedef
> > +void (*wbrf_notify_handler) (struct amdgpu_device *adev);
> > +
> >   struct amdgpu_device {
> > struct device   *dev;
> > struct pci_dev  *pdev;
> > @@ -1050,6 +1055,8 @@ struct amdgpu_device {
> >
> > booljob_hang;
> > booldc_enabled;
> > +
> > +   wbrf_notify_handler wbrf_event_handler;
> >   };
> >
> >   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> > @@ -1381,6 +1388,25 @@ static inline int amdgpu_acpi_smart_shift_update(struct drm_device *dev,
> >                                                  enum amdgpu_ss ss_state) { return 0; }
> >   #endif
> >
> > +#if defined(CONFIG_ACPI_WBRF)
> > +bool amdgpu_acpi_is_wbrf_supported(struct amdgpu_device *adev);
> > +int amdgpu_acpi_wbrf_retrieve_exclusions(struct amdgpu_device *adev,
> > +                                         struct wbrf_ranges_out *exclusions_out);
> > +int amdgpu_acpi_register_wbrf_notify_handler(struct amdgpu_device *adev,
> > +                                             wbrf_notify_handler handler);
> > +int amdgpu_acpi_unregister_wbrf_notify_handler(struct amdgpu_device *adev);
> > +#else
> > +static inline bool amdgpu_acpi_is_wbrf_supported(struct amdgpu_device *adev) { return false; }
> > +static inline int amdgpu_acpi_wbrf_retrieve_exclusions(struct amdgpu_device *adev,
> > +                                                       struct wbrf_ranges_out *exclusions_out) { return 0; }
> > +static inline int amdgpu_acpi_register_wbrf_notify_handler(struct amdgpu_device *adev,
> > +                                                           wbrf_notify_handler handler) { return 0; }
> > +static inline int amdgpu_acpi_unregister_wbrf_notify_handler(struct amdgpu_device *adev) { return 0; }
> > +#endif
> > +
> >

RE: [PATCH V2 2/7] wifi: mac80211: Add support for ACPI WBRF

2023-06-12 Thread Quan, Evan

[AMD Official Use Only - General]

Thanks Johannes. Comment in-line

> -----Original Message-----
> From: Johannes Berg 
> Sent: Friday, June 9, 2023 4:21 PM
> To: Quan, Evan ; raf...@kernel.org; l...@kernel.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; kv...@kernel.org; n...@nbd.name;
> lore...@kernel.org; ryder@mediatek.com; shayne.c...@mediatek.com;
> sean.w...@mediatek.com; matthias@gmail.com;
> angelogioacchino.delre...@collabora.com; Limonciello, Mario
> ; Lazar, Lijo 
> Cc: linux-ker...@vger.kernel.org; linux-a...@vger.kernel.org; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> wirel...@vger.kernel.org
> Subject: Re: [PATCH V2 2/7] wifi: mac80211: Add support for ACPI WBRF
>
> On Fri, 2023-06-09 at 15:28 +0800, Evan Quan wrote:
>
> > --- a/include/net/cfg80211.h
> > +++ b/include/net/cfg80211.h
> > @@ -5551,6 +5551,10 @@ struct wiphy {
> >
> > u16 hw_timestamp_max_peers;
> >
> > +#ifdef CONFIG_ACPI_WBRF
> > +   bool wbrf_supported;
> > +#endif
>
> This should be in some private struct in mac80211, ieee80211_local I think.
There was indeed a proposal from Mario to put this in ieee80211_local.
But I thought "wbrf_supported" stands for a device specific feature and
"struct wiphy" seemed the right place.
Anyway I can update this as suggested.
>
> > char priv[] __aligned(NETDEV_ALIGN);
> >  };
> >
> > @@ -9067,4 +9071,18 @@ static inline int cfg80211_color_change_notify(struct net_device *dev)
> >  bool cfg80211_valid_disable_subchannel_bitmap(u16 *bitmap,
> >                                                const struct cfg80211_chan_def *chandef);
> >
> > +#ifdef CONFIG_ACPI_WBRF
> > +void ieee80211_check_wbrf_support(struct wiphy *wiphy);
> > +int ieee80211_add_wbrf(struct wiphy *wiphy,
> > +                       struct cfg80211_chan_def *chandef);
> > +void ieee80211_remove_wbrf(struct wiphy *wiphy,
> > +                           struct cfg80211_chan_def *chandef);
> > +#else
> > +static inline void ieee80211_check_wbrf_support(struct wiphy *wiphy) { }
> > +static inline int ieee80211_add_wbrf(struct wiphy *wiphy,
> > +                                     struct cfg80211_chan_def *chandef) { return 0; }
> > +static inline void ieee80211_remove_wbrf(struct wiphy *wiphy,
> > +                                         struct cfg80211_chan_def *chandef) { }
> > +#endif /* CONFIG_ACPI_WBRF */
>
> Same here, not the right place. This should even be in an internal
> mac80211 header (such as net/mac80211/ieee80211_i.h, or create a new
> net/mac80211/wbrf.h or so if you prefer).
Will update this altogether.
>
>
> > --- a/net/mac80211/chan.c
> > +++ b/net/mac80211/chan.c
> > @@ -668,6 +668,10 @@ static int ieee80211_add_chanctx(struct ieee80211_local *local,
> > lockdep_assert_held(&local->mtx);
> > lockdep_assert_held(&local->chanctx_mtx);
> >
> > +   err = ieee80211_add_wbrf(local->hw.wiphy, &ctx->conf.def);
> > +   if (err)
> > +   return err;
> > +
> > if (!local->use_chanctx)
> > local->hw.conf.radar_enabled = ctx->conf.radar_enabled;
> >
> > @@ -748,6 +752,8 @@ static void ieee80211_del_chanctx(struct ieee80211_local *local,
> > }
> >
> > ieee80211_recalc_idle(local);
> > +
> > +   ieee80211_remove_wbrf(local->hw.wiphy, &ctx->conf.def);
> >  }
> >
> >  static void ieee80211_free_chanctx(struct ieee80211_local *local,
> >
>
> This is tricky, and quite likely incorrect.
>
> First of all, chandefs can actually _change_, see _ieee80211_change_chanctx().
> You'd probably have to call this add/remove (or have modify) whenever we
> call drv_change_chanctx() to change the width (not if radar/rx chains change).
Thanks for pointing this out. Unfortunately I have limited knowledge of the
networking code.
Can you help me list all those APIs? Are there any others besides the four below?
_ieee80211_change_chanctx
ieee80211_recalc_chanctx_min_def  (Just curious: why is "!local->use_chanctx"
not covered in this API like in the others?)
ieee80211_recalc_radar_chanctx  (do you mean this one can be ignored?)
ieee80211_recalc_smps_chanctx
>
> Secondly, you don't know if the driver will actually use ctx->conf.def, or
> ctx->conf.mindef. For client mode that doesn't matter, but for AP mode if the
> AP is configured to say 160 MHz, it might actually configure down to 20 MHz when
> no stations are connected (or only 20 MHz stations are). I don't know if you
> really care about taking that into account, I also don't know how dynamic

RE: [PATCH] drm/amd/pm: remove unused num_of_active_display variable

2023-04-09 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -----Original Message-----
> From: Tom Rix 
> Sent: Saturday, April 1, 2023 12:41 AM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; nat...@kernel.org;
> ndesaulni...@google.com; Zhang, Hawking ;
> Feng, Kenneth ; Lazar, Lijo
> ; Wang, Yang(Kevin) ;
> Huang, Tim ; andrealm...@igalia.com; Liu, Kun
> ; Limonciello, Mario 
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; l...@lists.linux.dev; Tom Rix 
> Subject: [PATCH] drm/amd/pm: remove unused num_of_active_display
> variable
> 
> clang with W=1 reports
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:1700:6: error: variable
>   'num_of_active_display' set but not used [-Werror,-Wunused-but-set-variable]
> int num_of_active_display = 0;
> ^
> This variable is not used so remove it.
> 
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index b5d64749990e..f93f7a9ed631 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -1696,8 +1696,6 @@ static int smu_display_configuration_change(void
> *handle,
>   const struct
> amd_pp_display_configuration *display_config)  {
>   struct smu_context *smu = handle;
> - int index = 0;
> - int num_of_active_display = 0;
> 
>   if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
>   return -EOPNOTSUPP;
> @@ -1708,11 +1706,6 @@ static int smu_display_configuration_change(void
> *handle,
>   smu_set_min_dcef_deep_sleep(smu,
>   display_config-
> >min_dcef_deep_sleep_set_clk / 100);
> 
> - for (index = 0; index < display_config-
> >num_path_including_non_display; index++) {
> - if (display_config->displays[index].controller_id != 0)
> - num_of_active_display++;
> - }
> -
>   return 0;
>  }
> 
> --
> 2.27.0

RE: [PATCH v3 2/2] drm/probe_helper: warning on poll_enabled for issue catching

2023-03-09 Thread Quan, Evan

[AMD Official Use Only - General]

Series is acked-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of
> Guchun Chen
> Sent: Friday, March 10, 2023 9:02 AM
> To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> Deucher, Alexander ; Zhang, Hawking
> ; dmitry.barysh...@linaro.org;
> spassw...@web.de; m...@fireburn.co.uk
> Cc: Chen, Guchun 
> Subject: [PATCH v3 2/2] drm/probe_helper: warning on poll_enabled for
> issue catching
> 
> In order to catch issues in other drivers to ensure proper call sequence of
> polling function.
> 
> v2: drop Fixes tag in commit message
> 
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
> Reported-by: Bert Karwatzki 
> Suggested-by: Dmitry Baryshkov 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/drm_probe_helper.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_probe_helper.c
> b/drivers/gpu/drm/drm_probe_helper.c
> index 8127be134c39..85e0e80d4a52 100644
> --- a/drivers/gpu/drm/drm_probe_helper.c
> +++ b/drivers/gpu/drm/drm_probe_helper.c
> @@ -852,6 +852,8 @@ EXPORT_SYMBOL(drm_kms_helper_is_poll_worker);
>   */
>  void drm_kms_helper_poll_disable(struct drm_device *dev)  {
> + WARN_ON(!dev->mode_config.poll_enabled);
> +
>   if (dev->mode_config.poll_running)
>   drm_kms_helper_disable_hpd(dev);
> 
> --
> 2.25.1

RE: [PATCH] gpu: amd/pm: mark symbols static where possible for smu11

2023-03-02 Thread Quan, Evan

[AMD Official Use Only - General]

Thanks, but I think there is already a patch from Kun Liu that addresses this issue.
https://lists.freedesktop.org/archives/amd-gfx/2023-March/090029.html

BR
Evan
> -Original Message-
> From: Jeff Pang 
> Sent: Thursday, March 2, 2023 5:16 PM
> To: Quan, Evan 
> Cc: amd-...@lists.freedesktop.org; linux-ker...@vger.kernel.org; dri-
> de...@lists.freedesktop.org; Jeff Pang 
> Subject: [PATCH] gpu: amd/pm: mark symbols static where possible for
> smu11
> 
> I get one warning when building kernel with -Werror=missing-prototypes :
> 
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1600:5:
> error: no previous prototype for ‘vangogh_set_apu_thermal_limit’
> [-Werror=missing-prototypes]
> int vangogh_set_apu_thermal_limit(struct smu_context *smu, uint32_t limit)
> 
> In fact, this function doesn't need a declaration because it is only used in the
> file where it is defined.
> So this patch marks the function with 'static'.
> 
> Signed-off-by: Jeff Pang 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> index 016d5621e0b3..24046af60933 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> @@ -1597,7 +1597,7 @@ static int vangogh_get_apu_thermal_limit(struct
> smu_context *smu, uint32_t *limi
> 0, limit);
>  }
> 
> -int vangogh_set_apu_thermal_limit(struct smu_context *smu, uint32_t
> limit)
> +static int vangogh_set_apu_thermal_limit(struct smu_context *smu,
> +uint32_t limit)
>  {
>   return smu_cmn_send_smc_msg_with_param(smu,
> 
> SMU_MSG_SetReducedThermalLimit,
> --
> 2.34.1

RE: [PATCH 0/3] drm/amd/pm/powerplay: use bitwise or for bitmasks addition

2023-01-15 Thread Quan, Evan

[AMD Official Use Only - General]

Series is reviewed-by: Evan Quan 

> -Original Message-
> From: Deepak R Varma 
> Sent: Sunday, January 15, 2023 3:16 PM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Cc: Saurabh Singh Sengar ; Praveen Kumar
> 
> Subject: [PATCH 0/3] drm/amd/pm/powerplay: use bitwise or for bitmasks
> addition
> 
> The patch series proposes usage of bitwise or "|" operator for addition of
> bitmasks instead of using numerical additions. The former is quicker and
> cleaner.
> 
> The proposed change is compile tested.
> 
> Deepak R Varma (3):
>   drm/amd/pm/powerplay/smumgr: use bitwise or for addition
>   drm/amd/pm/powerplay/hwmgr: use bitwise or for bitmasks addition
>   drm/amd/pm/powerplay/smumgr/ci: use bitwise or for bitmasks addition
> 
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c  | 8 ---
> -
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c  | 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/iceland_smumgr.c | 2 +-
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/tonga_smumgr.c   | 2 +-
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> --
> 2.34.1
> 
>
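
For readers of the archive, a minimal sketch of why "|" is the safer way to combine bitmasks. The flag names and helper functions are hypothetical and are not taken from the powerplay code; the point is only that OR is idempotent while addition carries into the neighbouring bit when a flag is already set.

```c
#include <stdint.h>

/* Hypothetical single-bit flags, for illustration only. */
#define FLAG_A (1u << 0)
#define FLAG_B (1u << 1)

/* Setting a flag with OR: setting the same flag twice changes nothing. */
static uint32_t combine_or(uint32_t mask, uint32_t flag)
{
	return mask | flag;
}

/* Setting a flag with '+': only correct if the flag is not already set. */
static uint32_t combine_add(uint32_t mask, uint32_t flag)
{
	return mask + flag;
}
```

With disjoint flags both helpers give the same result, which is why the existing code worked. If FLAG_A is already present in the mask, the additive version turns it into FLAG_B, while the OR version leaves the mask unchanged.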

RE: [PATCH] drm/amdgpu: add mb for si

2022-11-24 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Thursday, November 24, 2022 6:49 PM
> To: Quan, Evan ; 李真能 ;
> Michel Dänzer ; Koenig, Christian
> ; Deucher, Alexander
> 
> Cc: amd-...@lists.freedesktop.org; Pan, Xinhui ;
> linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> 
> 
> 
> On 11/24/2022 4:11 PM, Lazar, Lijo wrote:
> >
> >
> > On 11/24/2022 3:34 PM, Quan, Evan wrote:
> >> [AMD Official Use Only - General]
> >>
> >> Could the attached patch help?
> >>
> >> Evan
> >>> -Original Message-
> >>> From: amd-gfx  On Behalf
> Of ???
> >>> Sent: Friday, November 18, 2022 5:25 PM
> >>> To: Michel Dänzer ; Koenig, Christian
> >>> ; Deucher, Alexander
> >>> 
> >>> Cc: amd-...@lists.freedesktop.org; Pan, Xinhui ;
> >>> linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org
> >>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> >>>
> >>>
> >>> On 2022/11/18 17:18, Michel Dänzer wrote:
> >>>> On 11/18/22 09:01, Christian König wrote:
> >>>>> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>>>>> During reboot test on arm64 platform, it may failure on boot, so
> >>>>>> add this mb in smc.
> >>>>>>
> >>>>>> The error message are as follows:
> >>>>>> [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init
> >>>>>> [amdgpu]] *ERROR*
> >>>>>>   late_init of IP block  failed -22 [
> >>>>>> 7.006919][ 7] [  T295] amdgpu :04:00.0:
> >
> > The issue is happening in late_init() which eventually does
> >
> >  ret = si_thermal_enable_alert(adev, false);
> >
> > Just before this, si_thermal_start_thermal_controller is called in
> > hw_init and that enables thermal alert.
> >
> > Maybe the issue is with enable/disable of thermal alerts in quick
> > succession. Adding a delay inside si_thermal_start_thermal_controller
> > might help.
> >
> 
> On a second look, temperature range is already set as part of
> si_thermal_start_thermal_controller in hw_init
> https://elixir.bootlin.com/linux/v6.1-
> rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780
> 
> There is no need to set it again here -
> 
> https://elixir.bootlin.com/linux/v6.1-
> rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635
> 
> I think it is safe to remove the call from late_init altogether. Alex/Evan?
> 
[Quan, Evan] Yes, that makes sense to me. But I'm not sure whether it is related
to the issue here.
As I understand it, if the issue were caused by enabling the thermal alert
twice, it would fail every time.
That would not explain why adding a delay or an mb() call helps.

BR
Evan
> Thanks,
> Lijo
> 
> > Thanks,
> > Lijo
> >
> >>>>>> amdgpu_device_ip_late_init failed [    7.014224][ 7] [  T295] amdgpu
> >>>>>> :04:00.0: Fatal error during GPU init
> >>>>> Memory barriers are not supposed to be sprinkled around like this, you
> >>>>> need to give a detailed explanation why this is necessary.
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>> Signed-off-by: Zhenneng Li 
> >>>>>> ---
> >>>>>>     drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>>>>     1 file changed, 2 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>>> index 8f994ffa9cd1..c7656f22278d 100644
> >>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct
> >>>>>> amdgpu_device *adev)
> >>>>>>     u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>>>>     u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>>>>>     +    mb();
> >>>>>> +
> >>>>>>     if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>>>>     return true;
> >>>> In particular, it makes no sense in this specific place, since it
> >>>> cannot directly affect the values of rst & clk.
> >>>
> >>> I thinks so too.
> >>>
> >>> But when I do reboot test using nine desktop machines,  there maybe
> >>> report
> >>> this error on one or two machines after Hundreds of times or
> >>> Thousands of
> >>> times reboot test, at the beginning, I use msleep() instead of mb(),
> >>> these
> >>> two methods are all works, but I don't know what is the root case.
> >>>
> >>> I use this method on other verdor's oland card, this error message are
> >>> reported again.
> >>>
> >>> What could be the root reason?
> >>>
> >>> test environmen:
> >>>
> >>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
> >>>
> >>> driver: amdgpu
> >>>
> >>> os: ubuntu 2004
> >>>
> >>> platform: arm64
> >>>
> >>> kernel: 5.4.18
> >>>
> >>>>

RE: [PATCH] drm/amdgpu: add mb for si

2022-11-24 Thread Quan, Evan

[AMD Official Use Only - General]

Did you see it? It is a patch that I generated with git format-patch.
Anyway, I will paste the changes below. I suspect we may need to wait
until the SMU is running.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 49c398ec0aaf..9f308a021b2d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -6814,6 +6814,7 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	struct si_power_info *si_pi = si_get_pi(adev);
 	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
 	int ret;
+	int i;

 	if (amdgpu_si_is_smc_running(adev))
 		return -EINVAL;
@@ -6909,6 +6910,17 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	si_program_response_times(adev);
 	si_program_ds_registers(adev);
 	si_dpm_start_smc(adev);
+	/* Waiting for smc alive */
+	for (i = 0; i < adev->usec_timeout; i++) {
+		if (amdgpu_si_is_smc_running(adev))
+			break;
+		udelay(1);
+	}
+	if (i >= adev->usec_timeout) {
+		DRM_ERROR("Timedout on waiting for smu running\n");
+		return -EINVAL;
+	}
+
 	ret = si_notify_smc_display_change(adev, false);
 	if (ret) {
 		DRM_ERROR("si_notify_smc_display_change failed\n");
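
The idea in the change above is a bounded poll: keep checking a status predicate until it turns true or a budget runs out. A minimal user-space sketch of that pattern follows. All names here (ready_fn, poll_until_ready, fake_dev) are hypothetical stand-ins, not kernel APIs, and the delay between polls is only indicated by a comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a status check such as "is the SMC running?". */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll ready() at most max_polls times.
 * Returns 0 as soon as ready() reports true, or -1 if the budget runs out.
 */
static int poll_until_ready(ready_fn ready, void *ctx, uint32_t max_polls)
{
	uint32_t i;

	for (i = 0; i < max_polls; i++) {
		if (ready(ctx))
			return 0;
		/* A driver would insert a short delay here, e.g. udelay(1). */
	}

	return -1;
}

/* Fake device that reports ready after a fixed number of failed polls. */
struct fake_dev {
	uint32_t polls_until_ready;
};

static bool fake_dev_ready(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->polls_until_ready == 0)
		return true;

	dev->polls_until_ready--;
	return false;
}
```

A device that needs 3 polls succeeds within a budget of 10, while one that needs 100 polls times out. Whether a timeout of this kind is the right fix for the reboot failure is still an open question in this thread.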


BR
Evan
> -Original Message-
> From: Christian König 
> Sent: Thursday, November 24, 2022 6:06 PM
> To: Quan, Evan ; 李真能 ;
> Michel Dänzer ; Koenig, Christian
> ; Deucher, Alexander
> 
> Cc: dri-devel@lists.freedesktop.org; Pan, Xinhui ;
> linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> 
> That's not a patch but some binary file?
> 
> Christian.
> 
> On 24.11.22 at 11:04, Quan, Evan wrote:
> > [AMD Official Use Only - General]
> >
> > Could the attached patch help?
> >
> > Evan
> >> -Original Message-
> >> From: amd-gfx  On Behalf
> Of ???
> >> Sent: Friday, November 18, 2022 5:25 PM
> >> To: Michel Dänzer ; Koenig, Christian
> >> ; Deucher, Alexander
> >> 
> >> Cc: amd-...@lists.freedesktop.org; Pan, Xinhui ;
> >> linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org
> >> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> >>
> >>
> >> On 2022/11/18 17:18, Michel Dänzer wrote:
> >>> On 11/18/22 09:01, Christian König wrote:
> >>>> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>>>> During reboot test on arm64 platform, it may failure on boot, so
> >>>>> add this mb in smc.
> >>>>>
> >>>>> The error message are as follows:
> >>>>> [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init
> >>>>> [amdgpu]] *ERROR*
> >>>>>       late_init of IP block  failed -22 [
> >>>>> 7.006919][ 7] [  T295] amdgpu :04:00.0:
> >>>>> amdgpu_device_ip_late_init failed [    7.014224][ 7] [  T295]
> >>>>> amdgpu
> >>>>> :04:00.0: Fatal error during GPU init
> >>>> Memory barriers are not supposed to be sprinkled around like this, you
> >>>> need to give a detailed explanation why this is necessary.
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Signed-off-by: Zhenneng Li 
> >>>>> ---
> >>>>>     drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>>>     1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> index 8f994ffa9cd1..c7656f22278d 100644
> >>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct
> >>>>> amdgpu_device *adev)
> >>>>>     u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>>>     u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>>>>     +    mb();
> >>>>> +
> >>>>>     if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>>>     return true;
> >>> In particular, it makes no sense in this specific place, since it
> >>> cannot directly affect the values of rst & clk.
> >>
> >> I thinks so too.
> >>
> >> But when I do reboot test using nine desktop machines,  there maybe
> >> report this error on one or two machines after Hundreds of times or
> >> Thousands of times reboot test, at the beginning, I use msleep()
> >> instead of mb(), these two methods are all works, but I don't know what
> is the root case.
> >>
> >> I use this method on other verdor's oland card, this error message
> >> are reported again.
> >>
> >> What could be the root reason?
> >>
> >> test environmen:
> >>
> >> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
> >>
> >> driver: amdgpu
> >>
> >> os: ubuntu 2004
> >>
> >> platform: arm64
> >>
> >> kernel: 5.4.18
> >>

RE: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition

2022-11-24 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Yu
> Songping
> Sent: Thursday, November 24, 2022 9:53 AM
> To: airl...@gmail.com; dan...@ffwll.ch
> Cc: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
> 
> With the logical operator '&&', evaluating
> smu->ppt_funcs->set_gfx_power_up_by_imu causes a segmentation fault when
> smu->ppt_funcs is NULL.
> 
> Signed-off-by: Yu Songping 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index b880f4d7d67e..1cb728b0b7ee 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context
> *smu,
> 
>  int smu_set_gfx_power_up_by_imu(struct smu_context *smu)  {
> - if (!smu->ppt_funcs && !smu->ppt_funcs-
> >set_gfx_power_up_by_imu)
> + if (!smu->ppt_funcs || !smu->ppt_funcs-
> >set_gfx_power_up_by_imu)
>   return -EOPNOTSUPP;
> 
>   return smu->ppt_funcs->set_gfx_power_up_by_imu(smu);
> --
> 2.17.1
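
To spell out why "||" is the right operator here, a small self-contained sketch follows. The struct and function names are hypothetical and only mirror the shape of the swsmu code. With "||", the second test is evaluated only when the first pointer is non-NULL, so the NULL table is never dereferenced.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback table, similar in shape to a ppt_funcs table. */
struct ops {
	int (*power_up)(void);
};

struct ctx {
	const struct ops *ops;
};

static int call_power_up(const struct ctx *c)
{
	/*
	 * '||' short-circuits: c->ops->power_up is read only if c->ops is
	 * non-NULL. With '&&' the right-hand side would be evaluated exactly
	 * when c->ops is NULL, which dereferences a NULL pointer.
	 */
	if (!c->ops || !c->ops->power_up)
		return -EOPNOTSUPP;

	return c->ops->power_up();
}

/* Trivial callback used to exercise the success path. */
static int power_up_ok(void)
{
	return 0;
}
```

The check rejects both a missing table and a table without the callback, and only then makes the call.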

RE: [PATCH] drm/amdgpu: add mb for si

2022-11-24 Thread Quan, Evan

[AMD Official Use Only - General]

Could the attached patch help?

Evan
> -Original Message-
> From: amd-gfx  On Behalf Of ???
> Sent: Friday, November 18, 2022 5:25 PM
> To: Michel Dänzer ; Koenig, Christian
> ; Deucher, Alexander
> 
> Cc: amd-...@lists.freedesktop.org; Pan, Xinhui ;
> linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> 
> 
> On 2022/11/18 17:18, Michel Dänzer wrote:
> > On 11/18/22 09:01, Christian König wrote:
> >> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>> During reboot test on arm64 platform, it may failure on boot, so add
> >>> this mb in smc.
> >>>
> >>> The error message are as follows:
> >>> [    6.996395][ 7] [  T295] [drm:amdgpu_device_ip_late_init
> >>> [amdgpu]] *ERROR*
> >>>      late_init of IP block  failed -22 [
> >>> 7.006919][ 7] [  T295] amdgpu :04:00.0:
> >>> amdgpu_device_ip_late_init failed [    7.014224][ 7] [  T295] amdgpu
> >>> :04:00.0: Fatal error during GPU init
> >> Memory barriers are not supposed to be sprinkled around like this, you
> >> need to give a detailed explanation why this is necessary.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: Zhenneng Li 
> >>> ---
> >>>    drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>    1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> index 8f994ffa9cd1..c7656f22278d 100644
> >>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct
> >>> amdgpu_device *adev)
> >>>    u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>    u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>>    +    mb();
> >>> +
> >>>    if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>    return true;
> > In particular, it makes no sense in this specific place, since it cannot
> > directly affect the values of rst & clk.
> 
> I thinks so too.
> 
> But when I do reboot test using nine desktop machines,  there maybe report
> this error on one or two machines after Hundreds of times or Thousands of
> times reboot test, at the beginning, I use msleep() instead of mb(), these
> two methods are all works, but I don't know what is the root case.
> 
> I use this method on other verdor's oland card, this error message are
> reported again.
> 
> What could be the root reason?
> 
> test environmen:
> 
> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
> 
> driver: amdgpu
> 
> os: ubuntu 2004
> 
> platform: arm64
> 
> kernel: 5.4.18
> 
> >

RE: [PATCH] drm/amdgpu: don't call drm_fb_helper_lastclose in lastclose()

2022-10-24 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Thursday, October 20, 2022 10:36 PM
> To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Deucher, Alexander ; Thomas
> Zimmermann 
> Subject: [PATCH] drm/amdgpu: don't call drm_fb_helper_lastclose in
> lastclose()
> 
> It's used to restore the fbdev console, but as amdgpu uses
> generic fbdev emulation, the console is being restored by the
> DRM client helpers already. See the call to drm_client_dev_restore()
> in drm_lastclose().
> 
> Fixes: 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of
> setting up AMD own's.")
> Cc: Thomas Zimmermann 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index fe23e09eec98..474b9f40f792 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -1106,7 +1106,6 @@ int amdgpu_info_ioctl(struct drm_device *dev,
> void *data, struct drm_file *filp)
>   */
>  void amdgpu_driver_lastclose_kms(struct drm_device *dev)
>  {
> - drm_fb_helper_lastclose(dev);
>   vga_switcheroo_process_delayed_switch();
>  }
> 
> --
> 2.37.3

RE: [PATCH] drm/amdgpu/powerplay/psm: Fix memory leak in power state init

2022-10-17 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: Rafael Mendonca 
> Sent: Tuesday, October 18, 2022 8:54 AM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter 
> Cc: Rafael Mendonca ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH] drm/amdgpu/powerplay/psm: Fix memory leak in power
> state init
> 
> Commit 902bc65de0b3 ("drm/amdgpu/powerplay/psm: return an error in
> power state init") made the power state init function return early in case of
> failure to get an entry from the powerplay table, but it failed to clean up the
> allocated memory for the current power state before returning.
> 
> Fixes: 902bc65de0b3 ("drm/amdgpu/powerplay/psm: return an error in
> power state init")
> Signed-off-by: Rafael Mendonca 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/pp_psm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pp_psm.c
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pp_psm.c
> index 67d7da0b6fed..1d829402cd2e 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pp_psm.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pp_psm.c
> @@ -75,8 +75,10 @@ int psm_init_power_state_table(struct pp_hwmgr
> *hwmgr)
>   for (i = 0; i < table_entries; i++) {
>   result = hwmgr->hwmgr_func->get_pp_table_entry(hwmgr,
> i, state);
>   if (result) {
> + kfree(hwmgr->current_ps);
>   kfree(hwmgr->request_ps);
>   kfree(hwmgr->ps);
> + hwmgr->current_ps = NULL;
>   hwmgr->request_ps = NULL;
>   hwmgr->ps = NULL;
>   return -EINVAL;
> --
> 2.34.1
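
The pattern behind this fix, sketched in plain C under stated assumptions: the struct and helpers below are hypothetical, and calloc()/free() stand in for kzalloc()/kfree(). Every buffer allocated before the failing step is released on the early-return path, and the pointers are reset so that a later teardown cannot free them twice.

```c
#include <stdlib.h>

/* Hypothetical container with three separately allocated tables. */
struct tables {
	void *ps;
	void *request_ps;
	void *current_ps;
};

/* Free everything and reset the pointers so a second call is harmless. */
static void tables_free(struct tables *t)
{
	free(t->current_ps);
	free(t->request_ps);
	free(t->ps);
	t->current_ps = NULL;
	t->request_ps = NULL;
	t->ps = NULL;
}

/*
 * Allocate all three tables, then run a fill step that may fail
 * (simulated with fail_fill). On any failure, release all three
 * tables, not just some of them.
 */
static int tables_init(struct tables *t, size_t size, int fail_fill)
{
	t->ps = calloc(1, size);
	t->request_ps = calloc(1, size);
	t->current_ps = calloc(1, size);

	if (!t->ps || !t->request_ps || !t->current_ps || fail_fill) {
		tables_free(t);
		return -1;
	}

	return 0;
}
```

Routing the error path through a single cleanup helper makes it harder to forget one of the buffers, which is what happened with current_ps.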

RE: [PATCH v2] drivers/amd/pm: check the return value of amdgpu_bo_kmap

2022-09-21 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: Li Zhong 
> Sent: Thursday, September 22, 2022 12:18 PM
> To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
> Cc: jiapeng.ch...@linux.alibaba.com; Powell, Darren
> ; Chen, Guchun ;
> Limonciello, Mario ; Quan, Evan
> ; Lazar, Lijo ; dan...@ffwll.ch;
> airl...@linux.ie; Pan, Xinhui ; Koenig, Christian
> ; Deucher, Alexander
> ; Li Zhong 
> Subject: [PATCH v2] drivers/amd/pm: check the return value of
> amdgpu_bo_kmap
> 
> amdgpu_bo_kmap() returns an error when it fails to map the buffer object. Add the
> error check and propagate the error.
> 
> Signed-off-by: Li Zhong 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> index 1eb4e613b27a..ec055858eb95 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> @@ -1485,6 +1485,7 @@ static int pp_get_prv_buffer_details(void *handle,
> void **addr, size_t *size)
>  {
>   struct pp_hwmgr *hwmgr = handle;
>   struct amdgpu_device *adev = hwmgr->adev;
> + int err;
> 
>   if (!addr || !size)
>   return -EINVAL;
> @@ -1492,7 +1493,9 @@ static int pp_get_prv_buffer_details(void *handle,
> void **addr, size_t *size)
>   *addr = NULL;
>   *size = 0;
>   if (adev->pm.smu_prv_buffer) {
> - amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
> + err = amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
> + if (err)
> + return err;
>   *size = adev->pm.smu_prv_buffer_size;
>   }
> 
> --
> 2.25.1
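
A minimal sketch of the error propagation this patch adds, with hypothetical names throughout (buffer, buffer_map, get_buffer_details); it is not the powerplay code. The point is that the size is reported only after the mapping succeeded, so a caller never sees a non-zero size paired with a NULL address.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical buffer object; map_error simulates a mapping failure. */
struct buffer {
	void *data;
	size_t size;
	int map_error;
};

/* Hypothetical mapper: returns 0 and sets *addr, or a negative errno. */
static int buffer_map(struct buffer *b, void **addr)
{
	if (b->map_error)
		return b->map_error;

	*addr = b->data;
	return 0;
}

static int get_buffer_details(struct buffer *b, void **addr, size_t *size)
{
	int err;

	if (!addr || !size)
		return -EINVAL;

	*addr = NULL;
	*size = 0;
	if (b) {
		/* Propagate the mapping error instead of ignoring it. */
		err = buffer_map(b, addr);
		if (err)
			return err;
		*size = b->size;
	}

	return 0;
}
```

On failure the outputs stay at their initial NULL and 0 values and the caller receives the original error code.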

RE: [PATCH 2/2] drm/amd/pm: Fix a potential gpu_metrics_table memory leak

2022-08-03 Thread Quan, Evan

[AMD Official Use Only - General]

Thanks for the fixes! The series is reviewed-by: Evan Quan 

Evan
> -Original Message-
> From: Zhen Ni 
> Sent: Wednesday, August 3, 2022 5:20 PM
> To: airl...@linux.ie; dan...@ffwll.ch; Quan, Evan ;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui 
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; Zhen Ni 
> Subject: [PATCH 2/2] drm/amd/pm: Fix a potential gpu_metrics_table
> memory leak
> 
> Memory is allocated for gpu_metrics_table in
> smu_v13_0_5_init_smc_tables(), but not freed in
> smu_v13_0_5_fini_smc_tables(). This may cause memory leaks, fix it.
> 
> Signed-off-by: Zhen Ni 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c
> index b81711c4ff33..267c9c43a010 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c
> @@ -167,6 +167,9 @@ static int smu_v13_0_5_fini_smc_tables(struct
> smu_context *smu)
>   kfree(smu_table->watermarks_table);
>   smu_table->watermarks_table = NULL;
> 
> + kfree(smu_table->gpu_metrics_table);
> + smu_table->gpu_metrics_table = NULL;
> +
>   return 0;
>  }
> 
> --
> 2.20.1
> 
>

RE: [PATCH 1/4] drm/amd: Add detailed GFXOFF stats to debugfs

2022-07-25 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: André Almeida 
> Sent: Tuesday, July 26, 2022 12:15 AM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; Zhang, Hawking
> ; Zhou1, Tao ; Kuehling,
> Felix ; Xiao, Jack ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; StDenis, Tom ; Siqueira,
> Rodrigo 
> Cc: kernel-...@igalia.com
> Subject: Re: [PATCH 1/4] drm/amd: Add detailed GFXOFF stats to debugfs
> 
> On 25/07/22 at 10:04, André Almeida wrote:
> > On 25/07/22 at 07:27, Quan, Evan wrote:
> >> [AMD Official Use Only - General]
> >>
> >> Using "uint64_t" instead of "uint32_t" for entry counter may be better.
> >>
> >
> > Indeed, it's a good idea. I'll send a v2 with that change, thanks.
> >
> 
> However, SMU messaging reads a 32-bit register to get the entry count from
> the pwfw, so we would still have the risk of overflow anyway, right?
[Quan, Evan] Yes, that makes sense. It would be better to document the risk of
overflow.
Anyway, the series looks fine to me.
Series is acked-by: Evan Quan 
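
For reference only, and not something this series does: one common way to live with a 32-bit hardware counter is to accumulate deltas into a 64-bit software total. The sketch below uses hypothetical names (counter64, counter64_update) and assumes the register is sampled at least once per wrap; if it wraps more than once between reads, the extra wraps are lost.

```c
#include <stdint.h>

/* Hypothetical software accumulator for a wrapping 32-bit counter. */
struct counter64 {
	uint32_t last_raw;	/* last value read from the 32-bit register */
	uint64_t total;		/* accumulated 64-bit count */
};

/* Fold a new raw reading into the 64-bit total and return the total. */
static uint64_t counter64_update(struct counter64 *c, uint32_t raw)
{
	/* Unsigned subtraction is modulo 2^32, so a single wrap is handled. */
	uint32_t delta = raw - c->last_raw;

	c->last_raw = raw;
	c->total += delta;

	return c->total;
}
```

Reading 0xFFFFFFF0 and then 0x10 yields a delta of 0x20 across the wrap, so the total continues past 2^32 instead of dropping back to a small value.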
> 
> >> BR
> >> Evan
> >>> -Original Message-
> >>> From: amd-gfx  On Behalf Of
> >>> André Almeida
> >>> Sent: Saturday, July 23, 2022 4:34 AM
> >>> To: Deucher, Alexander ; Koenig,
> >>> Christian ; Pan, Xinhui
> >>> ; David Airlie ; Daniel Vetter
> >>> ; Zhang, Hawking ; Zhou1,
> >>> Tao ; Kuehling, Felix
> ;
> >>> Xiao, Jack ; amd- g...@lists.freedesktop.org;
> >>> dri-devel@lists.freedesktop.org; linux- ker...@vger.kernel.org;
> >>> StDenis, Tom ; Siqueira, Rodrigo
> >>> 
> >>> Cc: André Almeida ; kernel-...@igalia.com
> >>> Subject: [PATCH 1/4] drm/amd: Add detailed GFXOFF stats to debugfs
> >>>
> >>> Add debugfs interface to log GFXOFF statistics:
> >>>
> >>> - Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
> >>>   time of query since system power-up
> >>>
> >>> - Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
> >>>   Read it to get average GFXOFF residency % multiplied by 100
> >>>   during the last logging interval.
> >>>
> >>> Both features are designed to keep the values persistent between
> >>> suspends.
> >>>
> >>> Signed-off-by: André Almeida 
> >>> ---
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   | 168
> >>> ++
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   2 +
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  39 
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +
> >>>  drivers/gpu/drm/amd/pm/amdgpu_dpm.c   |  45 +
> >>>  drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h   |   3 +
> >>>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  34 +++-
> >>>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  22 +++
> >>>  drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |   3 +
> >>>  9 files changed, 321 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> >>> index e2eec985adb3..edf90a9ba980 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> >>> @@ -1042,6 +1042,157 @@ static ssize_t
> >>> amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
> >>>   return r;
> >>>  }
> >>>
> >>> +/**
> >>> + * amdgpu_debugfs_gfxoff_residency_read - Read GFXOFF residency
> >>> + *
> >>> + * @f: open file handle
> >>> + * @buf: User buffer to store read data in
> >>> + * @size: Number of bytes to read
> >>> + * @pos:  Offset to seek to
> >>> + *
> >>> + * Read the last residency value logged. It doesn't auto update, one needs to
> >>> + * stop logging before getting the current value.
> >>> + */
> >>> +static ssize_t amdgpu_debugfs_gfxoff_residency_read(struct file *f, char __user *buf,
> >>> + size_t size, loff_t *pos)
> >>> +{
RE: [PATCH 1/4] drm/amd: Add detailed GFXOFF stats to debugfs

2022-07-25 Thread Quan, Evan

[AMD Official Use Only - General]

Using "uint64_t" instead of "uint32_t" for the entry counter may be better.

BR
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> André Almeida
> Sent: Saturday, July 23, 2022 4:34 AM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; Zhang, Hawking
> ; Zhou1, Tao ; Kuehling,
> Felix ; Xiao, Jack ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; StDenis, Tom ; Siqueira,
> Rodrigo 
> Cc: André Almeida ; kernel-...@igalia.com
> Subject: [PATCH 1/4] drm/amd: Add detailed GFXOFF stats to debugfs
> 
> Add debugfs interface to log GFXOFF statistics:
> 
> - Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
>   time of query since system power-up
> 
> - Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
>   Read it to get average GFXOFF residency % multiplied by 100
>   during the last logging interval.
> 
> Both features are designed to keep the values persistent between
> suspends.
> 
> Signed-off-by: André Almeida 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   | 168
> ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  39 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +
>  drivers/gpu/drm/amd/pm/amdgpu_dpm.c   |  45 +
>  drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h   |   3 +
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  34 +++-
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  22 +++
>  drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |   3 +
>  9 files changed, 321 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index e2eec985adb3..edf90a9ba980 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1042,6 +1042,157 @@ static ssize_t amdgpu_debugfs_gpr_read(struct
> file *f, char __user *buf,
>   return r;
>  }
> 
> +/**
> + * amdgpu_debugfs_gfxoff_residency_read - Read GFXOFF residency
> + *
> + * @f: open file handle
> + * @buf: User buffer to store read data in
> + * @size: Number of bytes to read
> + * @pos:  Offset to seek to
> + *
> + * Read the last residency value logged. It doesn't auto update, one needs
> to
> + * stop logging before getting the current value.
> + */
> +static ssize_t amdgpu_debugfs_gfxoff_residency_read(struct file *f, char
> __user *buf,
> + size_t size, loff_t *pos)
> +{
> + struct amdgpu_device *adev = file_inode(f)->i_private;
> + ssize_t result = 0;
> + int r;
> +
> + if (size & 0x3 || *pos & 0x3)
> + return -EINVAL;
> +
> + r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
> + if (r < 0) {
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> + return r;
> + }
> +
> + while (size) {
> + uint32_t value;
> +
> + r = amdgpu_get_gfx_off_residency(adev, &value);
> + if (r)
> + goto out;
> +
> + r = put_user(value, (uint32_t *)buf);
> + if (r)
> + goto out;
> +
> + result += 4;
> + buf += 4;
> + *pos += 4;
> + size -= 4;
> + }
> +
> + r = result;
> +out:
> + pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> +
> + return r;
> +}
> +
> +/**
> + * amdgpu_debugfs_gfxoff_residency_write - Log GFXOFF Residency
> + *
> + * @f: open file handle
> + * @buf: User buffer to write data from
> + * @size: Number of bytes to write
> + * @pos:  Offset to seek to
> + *
> + * Write a 32-bit non-zero to start logging; write a 32-bit zero to stop
> + */
> +static ssize_t amdgpu_debugfs_gfxoff_residency_write(struct file *f, const
> char __user *buf,
> +  size_t size, loff_t *pos)
> +{
> + struct amdgpu_device *adev = file_inode(f)->i_private;
> + ssize_t result = 0;
> + int r;
> +
> + if (size & 0x3 || *pos & 0x3)
> + return -EINVAL;
> +
> + r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
> + if (r < 0) {
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> + return r;
> + }
> +
> + while (size) {
> + u32 value;
> +
> + r = get_user(value, (uint32_t *)buf);
> + if (r)
> + goto out;
> +
> + amdgpu_set_gfx_off_residency(adev, value ? true : false);
> +
> + result += 4;
> + buf += 4;
> + *pos += 4;
> + size -= 4;
> + }
> +
> + r = result;
> +out:
> + pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
> + pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
> +
> + return
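
As a rough illustration of the interface quoted above: the commit message says the residency file reports the average GFXOFF residency percentage multiplied by 100, and both handlers reject any request whose size or offset is not a multiple of 4 bytes. A minimal userspace sketch of those two rules follows. The helper names are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the "size & 0x3 || *pos & 0x3" check in the
 * quoted handlers, which accept only 4-byte-aligned sizes and offsets. */
static bool gfxoff_request_is_aligned(size_t size, long long pos)
{
	return (size & 0x3) == 0 && (pos & 0x3) == 0;
}

/* Hypothetical helper: the file reports percent * 100, so split the raw
 * value into whole percent and hundredths of a percent. */
static void gfxoff_residency_split(uint32_t raw, uint32_t *whole,
				   uint32_t *frac)
{
	*whole = raw / 100;
	*frac = raw % 100;
}
```

Under that reading, a raw value of 4250 would stand for 42.50% residency during the last logging interval.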

RE: [PATCH v2 1/1] drm/amd/pm: Implement get GFXOFF status for vangogh

2022-07-11 Thread Quan, Evan

[AMD Official Use Only - General]

Acked-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of
> André Almeida
> Sent: Tuesday, July 12, 2022 3:35 AM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; Zhang, Hawking
> ; Zhou1, Tao ; Kuehling,
> Felix ; Xiao, Jack ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Cc: André Almeida ; kernel-...@igalia.com
> Subject: [PATCH v2 1/1] drm/amd/pm: Implement get GFXOFF status for
> vangogh
> 
> Implement function to get current GFXOFF status for vangogh.
> 
> Signed-off-by: André Almeida 
> ---
> Changes from v1:
> - Squash commits in a single one
> 
>  .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 38
> +++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> index e2d8ac90cf36..89504ff8e9ed 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> @@ -46,6 +46,18 @@
>  #undef pr_info
>  #undef pr_debug
> 
> +// Registers related to GFXOFF
> +// addressBlock: smuio_smuio_SmuSmuioDec
> +// base address: 0x5a000
> +#define mmSMUIO_GFX_MISC_CNTL 0x00c5
> +#define mmSMUIO_GFX_MISC_CNTL_BASE_IDX   0
> +
> +//SMUIO_GFX_MISC_CNTL
> +#define SMUIO_GFX_MISC_CNTL__SMU_GFX_cold_vs_gfxoff__SHIFT 0x0
> +#define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS__SHIFT 0x1
> +#define SMUIO_GFX_MISC_CNTL__SMU_GFX_cold_vs_gfxoff_MASK 0x0001L
> +#define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS_MASK 0x0006L
> +
>  #define FEATURE_MASK(feature) (1ULL << feature)
>  #define SMC_DPM_FEATURE ( \
>   FEATURE_MASK(FEATURE_CCLK_DPM_BIT) | \
> @@ -2045,6 +2057,31 @@ static int vangogh_mode2_reset(struct
> smu_context *smu)
>   return vangogh_mode_reset(smu, SMU_RESET_MODE_2);
>  }
> 
> +/**
> + * vangogh_get_gfxoff_status - Get gfxoff status
> + *
> + * @smu: smu_context pointer
> + *
> + * Get current gfxoff status
> + *
> + * Return:
> + * * 0   - GFXOFF (default if enabled).
> + * * 1   - Transition out of GFX State.
> + * * 2   - Not in GFXOFF.
> + * * 3   - Transition into GFXOFF.
> + */
> +static u32 vangogh_get_gfxoff_status(struct smu_context *smu)
> +{
> + struct amdgpu_device *adev = smu->adev;
> + u32 reg, gfxoff_status;
> +
> + reg = RREG32_SOC15(SMUIO, 0, mmSMUIO_GFX_MISC_CNTL);
> + gfxoff_status = (reg &
> SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS_MASK)
> + >>
> SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS__SHIFT;
> +
> + return gfxoff_status;
> +}
> +
>  static int vangogh_get_power_limit(struct smu_context *smu,
>  uint32_t *current_power_limit,
>  uint32_t *default_power_limit,
> @@ -2199,6 +2236,7 @@ static const struct pptable_funcs
> vangogh_ppt_funcs = {
>   .post_init = vangogh_post_smu_init,
>   .mode2_reset = vangogh_mode2_reset,
>   .gfx_off_control = smu_v11_0_gfx_off_control,
> + .get_gfx_off_status = vangogh_get_gfxoff_status,
>   .get_ppt_limit = vangogh_get_ppt_limit,
>   .get_power_limit = vangogh_get_power_limit,
>   .set_power_limit = vangogh_set_power_limit,
> --
> 2.37.0
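
The mask-and-shift step in vangogh_get_gfxoff_status() can be tried outside the kernel. The sketch below reuses the PWR_GFXOFF_STATUS mask and shift values as they appear in the quoted patch and the status meanings from its kernel-doc comment. The function names are hypothetical, and the register value is passed in as a plain integer instead of being read with RREG32_SOC15().

```c
#include <stdint.h>

/* Values as they appear in the quoted patch. */
#define DEMO_PWR_GFXOFF_STATUS_SHIFT 0x1
#define DEMO_PWR_GFXOFF_STATUS_MASK  0x0006L

/* Hypothetical helper: extract the 2-bit status field (bits 2:1). */
static uint32_t demo_decode_gfxoff_status(uint32_t reg)
{
	return (uint32_t)((reg & DEMO_PWR_GFXOFF_STATUS_MASK) >>
			  DEMO_PWR_GFXOFF_STATUS_SHIFT);
}

/* Hypothetical helper: names taken from the patch's "Return:" list. */
static const char *demo_gfxoff_status_name(uint32_t status)
{
	switch (status) {
	case 0:
		return "GFXOFF";
	case 1:
		return "Transition out of GFX State";
	case 2:
		return "Not in GFXOFF";
	case 3:
		return "Transition into GFXOFF";
	default:
		return "invalid";
	}
}
```

Bit 0 of the register belongs to a different field (SMU_GFX_cold_vs_gfxoff), so it does not affect the decoded status.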

RE: [PATCH] gpu/amd: vega10_hwmgr: fix inappropriate private variable name

2022-02-25 Thread Quan, Evan

[AMD Official Use Only]

Thanks!
The patch is reviewed-by: Evan Quan 

> -Original Message-
> From: Meng Tang 
> Sent: Friday, February 25, 2022 5:47 PM
> To: airl...@linux.ie; dan...@ffwll.ch
> Cc: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; Meng Tang 
> Subject: [PATCH] gpu/amd: vega10_hwmgr: fix inappropriate private variable
> name
> 
> In file vega10_hwmgr.c, the variable names for struct vega10_power_state *
> and struct pp_power_state * are used inconsistently, which may lead
> to confusion.
> 
> Status quo is that variables of type struct vega10_power_state *
> are named "vega10_ps", "ps", "vega10_power_state". A more
> appropriate usage is to reserve the name "ps" for
> variables of type struct pp_power_state *.
> 
> So rename the struct vega10_power_state * variables named "ps" and
> "vega10_power_state" to "vega10_ps". Also rename "psa" to
> "vega10_psa" and "psb" to "vega10_psb" to make it clearer.
> 
> Some of the lines involved are longer than 100 columns.
> 
> Signed-off-by: Meng Tang 
> ---
>  .../drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 68 +++---
> -
>  1 file changed, 38 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> index 3f040be0d158..37324f2009ca 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> @@ -3095,7 +3095,7 @@ static int
> vega10_get_pp_table_entry_callback_func(struct pp_hwmgr *hwmgr,
>   void *pp_table, uint32_t classification_flag)
>  {
>   ATOM_Vega10_GFXCLK_Dependency_Record_V2
> *patom_record_V2;
> - struct vega10_power_state *vega10_power_state =
> + struct vega10_power_state *vega10_ps =
>   cast_phw_vega10_power_state(&(power_state-
> >hardware));
>   struct vega10_performance_level *performance_level;
>   ATOM_Vega10_State *state_entry = (ATOM_Vega10_State *)state;
> @@ -3145,17 +3145,17 @@ static int
> vega10_get_pp_table_entry_callback_func(struct pp_hwmgr *hwmgr,
>   power_state->temperatures.min = 0;
>   power_state->temperatures.max = 0;
> 
> - performance_level = &(vega10_power_state->performance_levels
> - [vega10_power_state-
> >performance_level_count++]);
> + performance_level = &(vega10_ps->performance_levels
> + [vega10_ps->performance_level_count++]);
> 
>   PP_ASSERT_WITH_CODE(
> - (vega10_power_state->performance_level_count <
> + (vega10_ps->performance_level_count <
>   NUM_GFXCLK_DPM_LEVELS),
>   "Performance levels exceeds SMC limit!",
>   return -1);
> 
>   PP_ASSERT_WITH_CODE(
> - (vega10_power_state->performance_level_count
> <=
> + (vega10_ps->performance_level_count <=
>   hwmgr->platform_descriptor.
>   hardwareActivityPerformanceLevels),
>   "Performance levels exceeds Driver limit!",
> @@ -3169,8 +3169,8 @@ static int
> vega10_get_pp_table_entry_callback_func(struct pp_hwmgr *hwmgr,
>   performance_level->mem_clock = mclk_dep_table->entries
>   [state_entry->ucMemClockIndexLow].ulMemClk;
> 
> - performance_level = &(vega10_power_state->performance_levels
> - [vega10_power_state-
> >performance_level_count++]);
> + performance_level = &(vega10_ps->performance_levels
> + [vega10_ps->performance_level_count++]);
>   performance_level->soc_clock = socclk_dep_table->entries
>   [state_entry->ucSocClockIndexHigh].ulClk;
>   if (gfxclk_dep_table->ucRevId == 0) {
> @@ -3201,11 +3201,11 @@ static int vega10_get_pp_table_entry(struct
> pp_hwmgr *hwmgr,
>   unsigned long entry_index, struct pp_power_state *state)
>  {
>   int result;
> - struct vega10_power_state *ps;
> + struct vega10_power_state *vega10_ps;
> 
>   state->hardware.magic = PhwVega10_Magic;
> 
> - ps = cast_phw_vega10_power_state(&state->hardware);
> + vega10_ps = cast_phw_vega10_power_state(&state->hardwar

RE: Regression from 3c196f056666 ("drm/amdgpu: always reset the asic in suspend (v2)") on suspend?

2022-02-14 Thread Quan, Evan

[AMD Official Use Only]



> -Original Message-
> From: Salvatore Bonaccorso  On Behalf
> Of Salvatore Bonaccorso
> Sent: Sunday, February 13, 2022 2:24 AM
> To: Deucher, Alexander 
> Cc: Dominique Dumont ; 1005...@bugs.debian.org;
> Tuikov, Luben ; Quan, Evan
> ; Sasha Levin ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Subject: Regression from 3c196f056666 ("drm/amdgpu: always reset the asic
> in suspend (v2)") on suspend?
> 
> Hi Alex, hi all
> 
> In Debian we got a regression report from Dominique Dumont, CC'ed, in
> https://bugs.debian.org/1005005 that after an update to a 5.15.15 based
> kernel, his machine no longer suspends correctly: after the screen goes
> black as usual, it comes back. The Debian bug above contains a trace.
> 
> Dominique confirmed that this issue persisted after updating to 5.16.7
> furthermore he bisected the issue and found
> 
>   3c196f0510912645c7c5d9107706003f67c3 is the first bad commit
>   commit 3c196f0510912645c7c5d9107706003f67c3
>   Author: Alex Deucher 
>   Date:   Fri Nov 12 11:25:30 2021 -0500
> 
>   drm/amdgpu: always reset the asic in suspend (v2)
> 
>   [ Upstream commit daf8de0874ab5b74b38a38726fdd3d07ef98a7ee ]
> 
>   If the platform suspend happens to fail and the power rail
>   is not turned off, the GPU will be in an unknown state on
>   resume, so reset the asic so that it will be in a known
>   good state on resume even if the platform suspend failed.
> 
>   v2: handle s0ix
> 
>   Acked-by: Luben Tuikov 
>   Acked-by: Evan Quan 
>   Signed-off-by: Alex Deucher 
>   Signed-off-by: Sasha Levin 
> 
>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 -
>1 file changed, 4 insertions(+), 1 deletion(-)
> 
> to be the first bad commit, see
> https://bugs.debian.org/1005005#34 .
I checked the back trace posted there (below). It seems the error occurred
during amdgpu_device_suspend().
That means Alex's patch should not be related (as it only affects the logic
after amdgpu_device_suspend()).
So we might have got the wrong regression point here.
[  257.842851]  ? vi_common_set_clockgating_state+0x229/0x2f0 [amdgpu]
[  257.843356]  amdgpu_device_ip_suspend_phase1+0x5e/0xc0 [amdgpu]
[  257.843771]  amdgpu_device_suspend+0x62/0xc0 [amdgpu]
[  257.844184]  amdgpu_pmops_suspend+0x36/0x70 [amdgpu]
[  257.844631]  pci_pm_suspend+0x71/0x160
[  257.844643]  ? pci_pm_freeze+0xb0/0xb0

BR
Evan
> 
> Does this ring any bell? Any idea on the problem?
> 
> Regards,
> Salvatore

RE: [PATCH] drm/amd/pm: add missing prototypes to amdgpu_dpm_internal

2022-02-06 Thread Quan, Evan

[AMD Official Use Only]

Thanks for the fix!
Reviewed-by: Evan Quan 

> -Original Message-
> From: Maíra Canal 
> Sent: Thursday, February 3, 2022 8:40 AM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@linux.ie; dan...@ffwll.ch; nat...@kernel.org;
> ndesaulni...@google.com; Lazar, Lijo ; Tuikov, Luben
> ; Chen, Guchun ;
> Zhang, Hawking ;
> jiapeng.ch...@linux.alibaba.com
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH] drm/amd/pm: add missing prototypes to
> amdgpu_dpm_internal
> 
> Include the header with the prototype to silence the following clang
> warnings:
> 
> drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:29:6:
> warning: no
> previous prototype for function 'amdgpu_dpm_get_active_displays'
> [-Wmissing-prototypes]
> void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev)
>  ^
> drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:29:1: note:
> declare
> 'static' if the function is not intended to be used outside of this
> translation unit
> void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev)
> ^
> static
> drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:76:5:
> warning: no
> previous prototype for function 'amdgpu_dpm_get_vrefresh'
> [-Wmissing-prototypes]
> u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev)
> ^
> drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:76:1: note:
> declare
> 'static' if the function is not intended to be used outside of this
> translation unit
> u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev)
> ^
> static
> 2 warnings generated.
> 
> Besides that, remove the duplicated prototype of the function
> amdgpu_dpm_get_vblank_time in order to keep the consistency of the
> headers.
> 
> fixes: 6ddbd37f ("drm/amd/pm: optimize the amdgpu_pm_compute_clocks()
> implementations")
> 
> Signed-off-by: Maíra Canal 
> ---
>  drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c | 1 +
>  drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h  | 1 -
>  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c   | 1 +
>  3 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
> b/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
> index ba5f6413412d..42efe838fa85 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm_internal.c
> @@ -25,6 +25,7 @@
>  #include "amdgpu_display.h"
>  #include "hwmgr.h"
>  #include "amdgpu_smu.h"
> +#include "amdgpu_dpm_internal.h"
> 
>  void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev)
>  {
> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> index 5cc05110cdae..09790413cbc4 100644
> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> @@ -343,7 +343,6 @@ struct amdgpu_pm {
>   struct amdgpu_ctx   *stable_pstate_ctx;
>  };
> 
> -u32 amdgpu_dpm_get_vblank_time(struct amdgpu_device *adev);
>  int amdgpu_dpm_read_sensor(struct amdgpu_device *adev, enum
> amd_pp_sensors sensor,
>  void *data, uint32_t *size);
> 
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 7427c50409d4..caae54487f9c 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -28,6 +28,7 @@
>  #include "amdgpu_pm.h"
>  #include "amdgpu_dpm.h"
>  #include "amdgpu_atombios.h"
> +#include "amdgpu_dpm_internal.h"
>  #include "amd_pcie.h"
>  #include "sid.h"
>  #include "r600_dpm.h"
> --
> 2.34.1
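
For readers unfamiliar with -Wmissing-prototypes, the two fixes the clang notes point at can be sketched in a single translation unit. Everything below is hypothetical (names, parameters, and the refresh-rate arithmetic are for illustration only) and is not taken from amdgpu_dpm_internal.c.

```c
/* demo_internal.h (hypothetical): the shared prototype lives here. */
unsigned int demo_get_vrefresh(unsigned int clock_khz, unsigned int htotal,
			       unsigned int vtotal);

/* demo_internal.c (hypothetical): the prototype above is visible before
 * the definition, so -Wmissing-prototypes stays quiet. */
unsigned int demo_get_vrefresh(unsigned int clock_khz, unsigned int htotal,
			       unsigned int vtotal)
{
	unsigned int pixels = htotal * vtotal;

	if (pixels == 0)
		return 0;
	return (unsigned int)(((unsigned long long)clock_khz * 1000u) / pixels);
}

/* A helper used only inside this file is marked static instead, which is
 * the other fix suggested by the clang note. */
static unsigned int demo_pick_higher(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}
```

The quoted patch takes the first route: it includes amdgpu_dpm_internal.h in the files that define or use the functions, since they are called from other translation units and cannot be made static.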

RE: [PATCH] drm/amd/pm: fix error handling

2022-02-06 Thread Quan, Evan

[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: t...@redhat.com 
> Sent: Saturday, February 5, 2022 11:00 PM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@linux.ie; dan...@ffwll.ch; nat...@kernel.org;
> ndesaulni...@google.com; Lazar, Lijo ; Powell, Darren
> ; Chen, Guchun ;
> Grodzovsky, Andrey 
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; l...@lists.linux.dev; Tom Rix 
> Subject: [PATCH] drm/amd/pm: fix error handling
> 
> From: Tom Rix 
> 
> clang static analysis reports this error
> amdgpu_smu.c:2289:9: warning: Called function pointer
>   is null (null dereference)
> return smu->ppt_funcs->emit_clk_levels(
>^~~~
> 
> There is a logic error in the earlier check of
> emit_clk_levels.  The error value is set to
> the ret variable but ret is never used.  Return
> directly and remove the unneeded ret variable.
> 
> Fixes: 5d64f9bbb628 ("amdgpu/pm: Implement new API function "emit" that
> accepts buffer base and write offset")
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index af368aa1fd0ae..5f3b3745a9b7a 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -2274,7 +2274,6 @@ static int smu_emit_ppclk_levels(void *handle,
> enum pp_clock_type type, char *bu
>  {
>   struct smu_context *smu = handle;
>   enum smu_clk_type clk_type;
> - int ret = 0;
> 
>   clk_type = smu_convert_to_smuclk(type);
>   if (clk_type == SMU_CLK_COUNT)
> @@ -2284,7 +2283,7 @@ static int smu_emit_ppclk_levels(void *handle,
> enum pp_clock_type type, char *bu
>   return -EOPNOTSUPP;
> 
>   if (!smu->ppt_funcs->emit_clk_levels)
> - ret = -ENOENT;
> + return -ENOENT;
> 
>   return smu->ppt_funcs->emit_clk_levels(smu, clk_type, buf, offset);
> 
> --
> 2.26.3

RE: [PATCH 1/2] MAINTAINERS: fix up entry for AMD Powerplay

2021-09-17 Thread Quan, Evan

[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Saturday, September 18, 2021 12:16 AM
> To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH 1/2] MAINTAINERS: fix up entry for AMD Powerplay
> 
> Fix the path to cover both the older powerplay infrastructure
> and the newer SwSMU infrastructure.
> 
> Signed-off-by: Alex Deucher 
> ---
>  MAINTAINERS | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 851255b71ccc..379092f34fff 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -972,12 +972,12 @@ L:  platform-driver-...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/platform/x86/amd-pmc.*
> 
> -AMD POWERPLAY
> +AMD POWERPLAY AND SWSMU
>  M:   Evan Quan 
>  L:   amd-...@lists.freedesktop.org
>  S:   Supported
>  T:   git
> https://gitlab.freedesktop.org/agd5f/linux.git
> -F:   drivers/gpu/drm/amd/pm/powerplay/
> +F:   drivers/gpu/drm/amd/pm/
> 
>  AMD SEATTLE DEVICE TREE SUPPORT
>  M:   Brijesh Singh 
> --
> 2.31.1

RE: [RFC][PATCH] drm/amdgpu/powerplay/smu10: Add custom profile

2021-09-13 Thread Quan, Evan

[AMD Official Use Only]

The driver can exchange the custom profile settings with the SMU FW using
the following table:
TABLE_CUSTOM_DPM

And the related data structure is CustomDpmSettings_t.

BR
Evan
> -Original Message-
> From: Alex Deucher 
> Sent: Monday, September 13, 2021 11:11 PM
> To: Daniel Gomez ; Huang, Ray ;
> Quan, Evan ; Zhu, Changfeng
> 
> Cc: amd-gfx list ; Maling list - DRI
> developers ; Daniel Gomez
> ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui 
> Subject: Re: [RFC][PATCH] drm/amdgpu/powerplay/smu10: Add custom
> profile
> 
> On Wed, Sep 8, 2021 at 3:23 AM Daniel Gomez  wrote:
> >
> > On Tue, 7 Sept 2021 at 19:23, Alex Deucher 
> wrote:
> > >
> > > On Tue, Sep 7, 2021 at 4:53 AM Daniel Gomez  wrote:
> > > >
> > > > Add custom power profile mode support on smu10.
> > > > Update workload bit list.
> > > > ---
> > > >
> > > > Hi,
> > > >
> > > > I'm trying to add custom profile for the Raven Ridge but not sure
> > > > if I'd need a different parameter than PPSMC_MSG_SetCustomPolicy
> > > > to configure the custom values. The code seemed to support CUSTOM
> > > > for workload types but it didn't show up in the menu or accept any
> > > > user input parameter. So far, I've added that part but it is a bit
> > > > confusing to me which policy I need for setting these
> > > > parameters or if it's maybe not possible at all.
> > > >
> > > > After applying the changes I'd configure the CUSTOM mode as follows:
> > > >
> > > > echo manual > /sys/class/drm/card0/device/hwmon/hwmon1/device/power_dpm_force_performance_level
> > > > echo "6 70 90 0 0" > /sys/class/drm/card0/device/hwmon/hwmon1/device/pp_power_profile_mode
> > > >
> > > > Then, using Darren Powell script for testing modes I get the
> > > > following
> > > > output:
> > > >
> > > > 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
> > > > Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega
> > > > Mobile Series] [1002:15dd] (rev 83)
> > > > === pp_dpm_sclk ===
> > > > 0: 200Mhz
> > > > 1: 400Mhz *
> > > > 2: 1100Mhz
> > > > === pp_dpm_mclk ===
> > > > 0: 400Mhz
> > > > 1: 933Mhz *
> > > > 2: 1067Mhz
> > > > 3: 1200Mhz
> > > > === pp_power_profile_mode ===
> > > > NUM      MODE_NAME   BUSY_SET_POINT FPS USE_RLC_BUSY MIN_ACTIVE_LEVEL
> > > >   0 BOOTUP_DEFAULT :             70  60            0                0
> > > >   1 3D_FULL_SCREEN :             70  60            1                3
> > > >   2   POWER_SAVING :             90  60            0                0
> > > >   3          VIDEO :             70  60            0                0
> > > >   4             VR :             70  90            0                0
> > > >   5        COMPUTE :             30  60            0                6
> > > >   6         CUSTOM*:             70  90            0                0
> > > >
> > > > As you can also see in my changes, I've also updated the workload
> > > > bit table but I'm not completely sure about that change. With the
> > > > tests I've done, using bit 5 for the WORKLOAD_PPLIB_CUSTOM_BIT
> > > > makes the gpu sclk locked around ~36%. So, maybe I'm missing a
> > > > clock limit configuration table somewhere. Would you give me some
> > > > hints to proceed with this?
> > >
> > > I don't think APUs support customizing the workloads the same way
> > > dGPUs do.  I think they just support predefined profiles.
> > >
> > > Alex
> >
> >
> > Thanks Alex for the quick response. Would it make sense then to remove
> > the custom workload code (PP_SMC_POWER_PROFILE_CUSTOM) from the
> smu10?
> > That workload was added in this commit:
> > f6f75ebdc06c04d3cfcd100f1b10256a9cdca407 [1] and is not used at all in the
> > code as it's limited to the PP_SMC_POWER_PROFILE_COMPUTE index. The
> > smu10.h also includes the custom workload bit definition and that made it
> > a bit confusing for me to understand if it was half-supported or not
> > possible to use at all as I understood from your comment.
> >
> > Perhaps could also be mentioned (if that's kind of standard) in the
> > documentation[2] so, the custom pp_power_profile_mode is only
> > supported in

RE: [PATCH] drm/amdgpu: Make use of the helper macro SET_RUNTIME_PM_OPS()

2021-08-29 Thread Quan, Evan

[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Cai
> Huoqing
> Sent: Saturday, August 28, 2021 4:41 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@linux.ie; dan...@ffwll.ch
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Cai
> Huoqing 
> Subject: [PATCH] drm/amdgpu: Make use of the helper macro
> SET_RUNTIME_PM_OPS()
> 
> Use the helper macro SET_RUNTIME_PM_OPS() instead of the verbose
> operators ".runtime_suspend/.runtime_resume/.runtime_idle", because
> the SET_RUNTIME_PM_OPS() is a nice helper macro that could be brought
> in to make code a little clearer, a little more concise.
> 
> Signed-off-by: Cai Huoqing 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index b6640291f980..9e5fb8d2e0e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1699,6 +1699,8 @@ long amdgpu_drm_ioctl(struct file *filp,
>  }
> 
>  static const struct dev_pm_ops amdgpu_pm_ops = {
> + SET_RUNTIME_PM_OPS(amdgpu_pmops_runtime_suspend,
> +amdgpu_pmops_runtime_resume,
> amdgpu_pmops_runtime_idle)
>   .prepare = amdgpu_pmops_prepare,
>   .complete = amdgpu_pmops_complete,
>   .suspend = amdgpu_pmops_suspend,
> @@ -1707,9 +1709,6 @@ static const struct dev_pm_ops amdgpu_pm_ops =
> {
>   .thaw = amdgpu_pmops_thaw,
>   .poweroff = amdgpu_pmops_poweroff,
>   .restore = amdgpu_pmops_restore,
> - .runtime_suspend = amdgpu_pmops_runtime_suspend,
> - .runtime_resume = amdgpu_pmops_runtime_resume,
> - .runtime_idle = amdgpu_pmops_runtime_idle,
>  };
> 
>  static int amdgpu_flush(struct file *f, fl_owner_t id)
> --
> 2.25.1

RE: [PATCH 1/4] drm/amdgpu: Move flush VCE idle_work during HW fini

2021-08-24 Thread Quan, Evan

[AMD Official Use Only]

Just landed.

Thanks,
Evan
> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: Wednesday, August 25, 2021 11:20 AM
> To: Quan, Evan ; dri-devel@lists.freedesktop.org;
> amd-...@lists.freedesktop.org
> Cc: ckoenig.leichtzumer...@gmail.com
> Subject: Re: [PATCH 1/4] drm/amdgpu: Move flush VCE idle_work during HW
> fini
> 
> Right, they will cover my use case, when are they landing ? I rebased today
> and haven't seen them.
> 
> Andrey
> 
> On 2021-08-24 9:41 p.m., Quan, Evan wrote:
> > [AMD Official Use Only]
> >
> > Hi Andrey,
> >
> > I sent out a similar patch set to address the S3 issue, and I believe it
> > should be able to address the issue here too.
> > https://lists.freedesktop.org/archives/amd-gfx/2021-August/067972.html
> > https://lists.freedesktop.org/archives/amd-gfx/2021-August/067967.html
> >
> > BR
> > Evan
> >> -Original Message-
> >> From: amd-gfx  On Behalf Of
> >> Andrey Grodzovsky
> >> Sent: Wednesday, August 25, 2021 5:01 AM
> >> To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
> >> Cc: ckoenig.leichtzumer...@gmail.com; Grodzovsky, Andrey
> >> 
> >> Subject: [PATCH 1/4] drm/amdgpu: Move flush VCE idle_work during HW
> >> fini
> >>
> >> Attempts to powergate after device is removed lead to crash.
> >>
> >> Signed-off-by: Andrey Grodzovsky 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 1 -
> >>   drivers/gpu/drm/amd/amdgpu/vce_v2_0.c   | 4 
> >>   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c   | 5 -
> >>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 2 ++
> >>   4 files changed, 10 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> >> index 1ae7f824adc7..8e8dee9fac9f 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> >> @@ -218,7 +218,6 @@ int amdgpu_vce_sw_fini(struct amdgpu_device
> >> *adev)
> >>if (adev->vce.vcpu_bo == NULL)
> >>return 0;
> >>
> >> -  cancel_delayed_work_sync(&adev->vce.idle_work);
> >>drm_sched_entity_destroy(&adev->vce.entity);
> >>
> >>amdgpu_bo_free_kernel(&adev->vce.vcpu_bo, &adev->vce.gpu_addr,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> >> index c7d28c169be5..716dfdd020b4 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> >> @@ -477,6 +477,10 @@ static int vce_v2_0_hw_init(void *handle)
> >>
> >>   static int vce_v2_0_hw_fini(void *handle)
> >>   {
> >> +  struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> >> +
> >> +  cancel_delayed_work_sync(&adev->vce.idle_work);
> >> +
> >>return 0;
> >>   }
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> >> index 3b82fb289ef6..49581c6e0cea 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> >> @@ -495,7 +495,10 @@ static int vce_v3_0_hw_fini(void *handle)
> >>return r;
> >>
> >>vce_v3_0_stop(adev);
> >> -  return vce_v3_0_set_clockgating_state(adev,
> >> AMD_CG_STATE_GATE);
> >> +  r =  vce_v3_0_set_clockgating_state(adev, AMD_CG_STATE_GATE);
> >> +  cancel_delayed_work_sync(&adev->vce.idle_work);
> >> +
> >> +  return r;
> >>   }
> >>
> >>   static int vce_v3_0_suspend(void *handle) diff --git
> >> a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> >> index 90910d19db12..3297405fd32d 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> >> @@ -550,6 +550,8 @@ static int vce_v4_0_hw_fini(void *handle)
> >>DRM_DEBUG("For SRIOV client, shouldn't do anything.\n");
> >>}
> >>
> >> +  cancel_delayed_work_sync(&adev->vce.idle_work);
> >> +
> >>return 0;
> >>   }
> >>
> >> --
> >> 2.25.1

RE: [PATCH 1/4] drm/amdgpu: Move flush VCE idle_work during HW fini

2021-08-24 Thread Quan, Evan

[AMD Official Use Only]

Hi Andrey,

I sent out a similar patch set to address the S3 issue, and I believe it should
be able to address the issue here too.
https://lists.freedesktop.org/archives/amd-gfx/2021-August/067972.html
https://lists.freedesktop.org/archives/amd-gfx/2021-August/067967.html

BR
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> Andrey Grodzovsky
> Sent: Wednesday, August 25, 2021 5:01 AM
> To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
> Cc: ckoenig.leichtzumer...@gmail.com; Grodzovsky, Andrey
> 
> Subject: [PATCH 1/4] drm/amdgpu: Move flush VCE idle_work during HW fini
> 
> Attempts to powergate after device is removed lead to crash.
> 
> Signed-off-by: Andrey Grodzovsky 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 1 -
>  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c   | 4 
>  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c   | 5 -
>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 2 ++
>  4 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index 1ae7f824adc7..8e8dee9fac9f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -218,7 +218,6 @@ int amdgpu_vce_sw_fini(struct amdgpu_device
> *adev)
>   if (adev->vce.vcpu_bo == NULL)
>   return 0;
> 
> - cancel_delayed_work_sync(&adev->vce.idle_work);
>   drm_sched_entity_destroy(&adev->vce.entity);
> 
>   amdgpu_bo_free_kernel(&adev->vce.vcpu_bo, &adev->vce.gpu_addr,
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> index c7d28c169be5..716dfdd020b4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
> @@ -477,6 +477,10 @@ static int vce_v2_0_hw_init(void *handle)
> 
>  static int vce_v2_0_hw_fini(void *handle)
>  {
> + struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> + cancel_delayed_work_sync(&adev->vce.idle_work);
> +
>   return 0;
>  }
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index 3b82fb289ef6..49581c6e0cea 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -495,7 +495,10 @@ static int vce_v3_0_hw_fini(void *handle)
>   return r;
> 
>   vce_v3_0_stop(adev);
> - return vce_v3_0_set_clockgating_state(adev,
> AMD_CG_STATE_GATE);
> + r =  vce_v3_0_set_clockgating_state(adev, AMD_CG_STATE_GATE);
> + cancel_delayed_work_sync(&adev->vce.idle_work);
> +
> + return r;
>  }
> 
>  static int vce_v3_0_suspend(void *handle)
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> index 90910d19db12..3297405fd32d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -550,6 +550,8 @@ static int vce_v4_0_hw_fini(void *handle)
>   DRM_DEBUG("For SRIOV client, shouldn't do anything.\n");
>   }
> 
> + cancel_delayed_work_sync(&adev->vce.idle_work);
> +
>   return 0;
>  }
> 
> --
> 2.25.1

RE: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Quan, Evan

[AMD Official Use Only]

Thanks! This seems fine to me.
Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Tuesday, August 17, 2021 4:23 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was
> disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41 release
> on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it
> again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when the disable count transitions
> from 0 to 1, and only schedule the delayed work on the reverse transition,
> not if the disable count was already 0. This makes sure the delayed work
> doesn't run at unexpected times, and allows it to be lock-free.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> v3:
> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> v4:
> * Fix race condition between amdgpu_gfx_off_ctrl incrementing
>   adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
>   checking for it to be 0 (Evan Quan)
> 
> Cc: sta...@vger.kernel.org
> Reviewed-by: Lijo Lazar  # v3
> Acked-by: Christian König  # v3
> Signed-off-by: Michel Dänzer 
> ---
> 
> Alex, probably best to wait a bit longer before picking this up. :)
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 36 +++
> ---
>  2 files changed, 30 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..f944ed858f3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,12 +2777,11 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> - mutex_lock(&adev->gfx.gfx_off_mutex);
> - if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> - adev->gfx.gfx_off_state = true;
> - }
> - mutex_unlock(&adev->gfx.gfx_off_mutex);
> + WARN_ON_ONCE(adev->gfx.gfx_off_state);
> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> +
> + if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> + adev->gfx.gfx_off_state = true;
>  }
> 
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..b4ced45301be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -563,24 +563,38 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> 
>   mutex_lock(&adev->gfx.gfx_off_mutex);
> 
> - if (!enable)
> - adev->gfx.gfx_off_req_count++;
> - else if (adev->gfx.gfx_off_req_count > 0)
> + if (enable) {
> + /* If the count is already 0, it means there's an imbalance bug
> somewhere.
> +  * Note that the bug may be in a different caller than the one
> which triggers the
> +  * WARN_ON_ONCE.
> +  */
> + if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
> + goto unlock;
> +
>   adev->gfx.gfx_off_req_count--;
> 
> - if (enable && !adev->gfx.gfx_off_state && !adev-
> >gfx.gfx_off_req_count) {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> - adev->gfx.gfx_off_state = false;
> + if (adev->gfx.gfx_off_req_count == 0 && !adev-
> >gfx.gfx_off_state)
> + schedule_delayed_work(
> &adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
> + } else {
> + if (adev->gfx.gfx_off_req_count == 0) {
> + cancel_delayed_work_sync(
> &adev->gfx.gfx_off_delay_work);
> +
> + if (adev->gfx.gfx_off_state &&
> + !amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + adev->gfx.gfx_off_state = false;
> 
> - if (adev->gfx.funcs->init_spm_golden) {
> - dev_dbg(adev->dev,

RE: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-17 Thread Quan, Evan

[AMD Official Use Only]



> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Monday, August 16, 2021 6:35 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: [PATCH v3] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when the disable count
> transitions from 0 to 1, and only schedule the delayed work on the
> reverse transition, not if the disable count was already 0. This makes
> sure the delayed work doesn't run at unexpected times, and allows it to
> be lock-free.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> v3:
> * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 22 +-
> 
>  2 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..f944ed858f3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,12 +2777,11 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> - mutex_lock(&adev->gfx.gfx_off_mutex);
> - if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> - adev->gfx.gfx_off_state = true;
> - }
> - mutex_unlock(&adev->gfx.gfx_off_mutex);
> + WARN_ON_ONCE(adev->gfx.gfx_off_state);
> + WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
> +
> + if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
> + adev->gfx.gfx_off_state = true;
>  }
> 
>  /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..ca91aafcb32b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -563,15 +563,26 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> 
>   mutex_lock(&adev->gfx.gfx_off_mutex);
> 
> - if (!enable)
> - adev->gfx.gfx_off_req_count++;
> - else if (adev->gfx.gfx_off_req_count > 0)
> + if (enable) {
> + /* If the count is already 0, it means there's an imbalance bug
> somewhere.
> +  * Note that the bug may be in a different caller than the one
> which triggers the
> +  * WARN_ON_ONCE.
> +  */
> + if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
> + goto unlock;
> +
>   adev->gfx.gfx_off_req_count--;
> + } else {
> + adev->gfx.gfx_off_req_count++;
> + }
> 
>   if (enable && !adev->gfx.gfx_off_state && !adev-
> >gfx.gfx_off_req_count) {
>   schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + } else if (!enable && adev->gfx.gfx_off_req_count == 1) {
[Quan, Evan] This seems to leave a small time window for a race condition. 
If amdgpu_device_delay_enable_gfx_off() happens to run here, it will trigger 
"WARN_ON_ONCE(adev->gfx.gfx_off_req_count);". How about something like the below?
@@ -573,13 +573,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
goto unlock;

adev->gfx.gfx_off_req_count--;
-   } else {
-   adev->gfx.gfx_off_req_count++;
}

if (enable && !adev->gfx.gfx_of
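
The counting scheme discussed in this thread can be sketched as a small, single-threaded user-space model. This is only an illustration under stated assumptions: the names toy_gfx, toy_gfx_off_ctrl and toy_work_fires are hypothetical, the boolean work_pending stands in for the real delayed work item, and the powergating call is assumed to always succeed. The model cannot reproduce the race itself; it only shows the ordering in which the request count is bumped after the pending work has been cancelled, so that the work handler never observes a non-zero count.

```c
#include <stdbool.h>

/* Hypothetical toy state; field names loosely mirror the driver's. */
struct toy_gfx {
    unsigned int req_count; /* outstanding "disable GFXOFF" requests */
    bool gfx_off_state;     /* true while GFXOFF is enabled in hardware */
    bool work_pending;      /* stand-in for the queued delayed work */
};

static void toy_gfx_off_ctrl(struct toy_gfx *g, bool enable)
{
    if (enable) {
        /* Unbalanced enable: the driver would WARN and bail out here. */
        if (g->req_count == 0)
            return;

        g->req_count--;

        /* Only the 1 -> 0 transition queues the delayed enable. */
        if (g->req_count == 0 && !g->gfx_off_state)
            g->work_pending = true;
    } else {
        if (g->req_count == 0) {
            /* 0 -> 1 transition: cancel the pending work first ... */
            g->work_pending = false;
            /* ... then leave GFXOFF (assumed to succeed in this model). */
            g->gfx_off_state = false;
        }
        /* The count is incremented only after the cancel. */
        g->req_count++;
    }
}

/* Stand-in for the delayed work handler running after its timeout. */
static void toy_work_fires(struct toy_gfx *g)
{
    if (g->work_pending) {
        g->work_pending = false;
        g->gfx_off_state = true;
    }
}
```

In this model a disable request that arrives while the work is still pending simply drops it, so GFXOFF is only entered after the last outstanding request has been released and the delay has elapsed.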

RE: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled

2021-08-16 Thread Quan, Evan

[AMD Official Use Only]

Hi Michel,

The patch seems reasonable to me (especially the cancel_delayed_work_sync() 
part).
However, can you explain more about the code below?
What exactly is the race issue here?

+   /* mutex_lock could deadlock with cancel_delayed_work_sync in 
amdgpu_gfx_off_ctrl. */
+   if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
+   /* If there's a bug which causes amdgpu_gfx_off_ctrl to be 
called with enable=true
+* when adev->gfx.gfx_off_req_count is already 0, we might race 
with that.
+* Re-schedule to make sure gfx off will be re-enabled in the 
HW eventually.
+*/
+   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
AMDGPU_GFX_OFF_DELAY_ENABLE);
+   return;
+   }

BR
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Friday, August 13, 2021 6:29 PM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is
> disabled
> 
> From: Michel Dänzer 
> 
> schedule_delayed_work does not push back the work if it was already
> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
> was disabled and re-enabled again during those 100 ms.
> 
> This resulted in frame drops / stutter with the upcoming mutter 41
> release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from
> enabled to disabled. This makes sure the delayed work will be scheduled
> as intended in the reverse case.
> 
> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs
> to use mutex_trylock instead of mutex_lock.
> 
> v2:
> * Use cancel_delayed_work_sync & mutex_trylock instead of
>   mod_delayed_work.
> 
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 13 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h|  3 +++
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f3fd5ec710b6..8b025f70706c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2777,7 +2777,16 @@ static void
> amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
>   struct amdgpu_device *adev =
>   container_of(work, struct amdgpu_device,
> gfx.gfx_off_delay_work.work);
> 
> - mutex_lock(&adev->gfx.gfx_off_mutex);
> + /* mutex_lock could deadlock with cancel_delayed_work_sync in
> amdgpu_gfx_off_ctrl. */
> + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
> + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be
> called with enable=true
> +  * when adev->gfx.gfx_off_req_count is already 0, we might
> race with that.
> +  * Re-schedule to make sure gfx off will be re-enabled in the
> HW eventually.
> +  */
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> AMDGPU_GFX_OFF_DELAY_ENABLE);
> + return;
> + }
> +
>   if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
>   if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, true))
>   adev->gfx.gfx_off_state = true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..da4c46db3093 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -28,9 +28,6 @@
>  #include "amdgpu_rlc.h"
>  #include "amdgpu_ras.h"
> 
> -/* delay 0.1 second to enable gfx off feature */
> -#define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
> -
>  /*
>   * GPU GFX IP block helpers function.
>   */
> @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
>   adev->gfx.gfx_off_req_count--;
> 
>   if (enable && !adev->gfx.gfx_off_state && !adev-
> >gfx.gfx_off_req_count) {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> - } else if (!enable && adev->gfx.gfx_off_state) {
> - if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> AMDGPU_GFX_OFF_DELAY_ENABLE);
> + } else if (!enable) {
> + if (adev->gfx.gfx_off_req_count == 1 && !adev-
> >gfx.gfx_off_state)
> + cancel_delayed_work_sync(
> &adev->gfx.gfx_off_delay_work);
> +
> + if (adev->gfx.gfx_off_state &&
> + !amdgpu_dpm_set_powergating_by_smu(adev,
>

RE: [PATCH 1/2] drm/amdgpu: Use mod_delayed_work in amdgpu_gfx_off_ctrl

2021-08-11 Thread Quan, Evan

[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Thursday, August 12, 2021 12:52 AM
> To: Deucher, Alexander ; Koenig, Christian
> 
> Cc: Liu, Leo ; Zhu, James ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: [PATCH 1/2] drm/amdgpu: Use mod_delayed_work in
> amdgpu_gfx_off_ctrl
> 
> From: Michel Dänzer 
> 
> In contrast to schedule_delayed_work, this pushes back the work if it
> was already scheduled before. Specific behaviour change:
> 
> Before:
> 
> amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time
> GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and
> re-enabled again during those 100 ms.
> 
> After:
> 
> amdgpu_device_delay_enable_gfx_off runs ~100 ms after the last time
> GFXOFF is disabled and re-enabled.
> 
> The former resulted in frame drops / stutter with the upcoming mutter
> 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and
> disabling it again (for getting the GPU clock counter).
> 
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a0be0772c8b3..9cfef56b2aee 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -569,7 +569,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
>   adev->gfx.gfx_off_req_count--;
> 
>   if (enable && !adev->gfx.gfx_off_state && !adev-
> >gfx.gfx_off_req_count) {
> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> + mod_delayed_work(system_wq, 
> &adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
>   } else if (!enable && adev->gfx.gfx_off_state) {
>   if (!amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GFX, false)) {
>   adev->gfx.gfx_off_state = false;
> --
> 2.32.0
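
The behaviour difference described in the commit message above can be illustrated with a toy user-space model driven by a virtual clock. This is a sketch under assumptions, not kernel code: toy_dwork, toy_schedule, toy_mod and toy_replay are hypothetical names, and the model only captures the documented rule that schedule_delayed_work() leaves an already pending item untouched while mod_delayed_work() re-arms its timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy delayed-work item; times are virtual milliseconds. */
struct toy_dwork {
    bool pending;
    long expires; /* virtual time at which the work would run */
};

/* Models schedule_delayed_work(): a no-op if the item is already pending. */
static bool toy_schedule(struct toy_dwork *w, long now, long delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->expires = now + delay;
    return true;
}

/* Models mod_delayed_work(): always (re)arms the timer.
 * Returns whether the item was already pending. */
static bool toy_mod(struct toy_dwork *w, long now, long delay)
{
    bool was_pending = w->pending;

    w->pending = true;
    w->expires = now + delay;
    return was_pending;
}

/* Replays three "re-enable GFXOFF" requests at t = 0, 40 and 80 ms with a
 * 100 ms delay and returns the time at which the work would finally run. */
static long toy_replay(bool use_mod)
{
    static const long times[] = { 0, 40, 80 };
    struct toy_dwork w = { false, 0 };

    for (size_t i = 0; i < sizeof(times) / sizeof(times[0]); i++) {
        if (use_mod)
            toy_mod(&w, times[i], 100);
        else
            toy_schedule(&w, times[i], 100);
    }
    return w.expires;
}
```

With the schedule-style call the expiry stays anchored to the first request, whereas the mod-style call pushes it back to 100 ms after the last request, which corresponds to the "Before" and "After" cases in the patch description.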

RE: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-11 Thread Quan, Evan

[AMD Official Use Only]

Unlike the 1st patch of the series (for amdgpu_gfx_off_ctrl), here 
"cancel_delayed_work_sync(&adev->uvd.idle_work)" is called in, for example, 
amdgpu_uvd_ring_begin_use().  In that case, does this make any difference 
compared with the previous "schedule_delayed_work" implementation?
Suppose the sequence is as below:

  *   Ring begin use
  *   Ring end use -->  mod_delayed_work(): queues a new delayed work, right?
  *   Ring begin use (within 1s) --> cancel_delayed_work_sync() cancels the 
work queued above, right?
  *   Ring end use  --> mod_delayed_work(): queues another new delayed work, 
same as the previous "schedule_delayed_work"?
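
The sequence above can be replayed in a toy user-space model. This is only a sketch under assumptions: toy_dwork, toy_cancel, toy_queue and toy_ring_sequence are hypothetical names, time is a virtual millisecond counter, and every ring end use is assumed to be preceded by a ring begin use that cancels the pending work.

```c
#include <stdbool.h>

/* Hypothetical toy delayed-work item; times are virtual milliseconds. */
struct toy_dwork {
    bool pending;
    long expires;
};

/* Models cancel_delayed_work_sync() as called from ring begin use. */
static void toy_cancel(struct toy_dwork *w)
{
    w->pending = false;
}

/* Models ring end use: schedule-style queuing leaves a pending item
 * untouched, mod-style queuing always re-arms the timer. */
static void toy_queue(struct toy_dwork *w, bool use_mod, long now, long delay)
{
    if (!use_mod && w->pending)
        return;
    w->pending = true;
    w->expires = now + delay;
}

/* begin use (t=0), end use (t=10), begin use (t=500), end use (t=510),
 * with a 1000 ms idle timeout. Returns the final expiry time. */
static long toy_ring_sequence(bool use_mod)
{
    struct toy_dwork w = { false, 0 };

    toy_cancel(&w);                    /* ring begin use */
    toy_queue(&w, use_mod, 10, 1000);  /* ring end use */
    toy_cancel(&w);                    /* ring begin use, within 1 s */
    toy_queue(&w, use_mod, 510, 1000); /* ring end use */
    return w.expires;
}
```

Under that assumption both variants end up with the same expiry, because the item is never pending when it is queued again.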

BR
Evan
From: amd-gfx  On Behalf Of Koenig, 
Christian
Sent: Thursday, August 12, 2021 5:34 AM
To: Michel Dänzer ; Deucher, Alexander 

Cc: Liu, Leo ; Zhu, James ; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: RE: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN 
ring_end_use hooks

NAK to at least this patch.

Since activating power management while submitting work is problematic, 
cancel_delayed_work() must have been called during begin use; otherwise we 
have a serious coding problem in the first place.

So this change shouldn't make a difference, and I suggest we really stick with 
schedule_delayed_work().

Maybe add a comment explaining how this works?

Need to take a closer look at the first patch when I'm back from vacation, but 
it could be that this applies there as well.

Regards,
Christian.


From: Michel Dänzer <mic...@daenzer.net>
Sent: Wednesday, August 11, 2021 18:52
To: Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian 
<christian.koe...@amd.com>
Cc: Liu, Leo <leo@amd.com>; Zhu, James <james@amd.com>; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN 
ring_end_use hooks

From: Michel Dänzer <mdaen...@redhat.com>

In contrast to schedule_delayed_work, this pushes back the work if it
was already scheduled before. Specific behaviour change:

Before:

The scheduled work ran ~1 second after the first time ring_end_use was
called, even if the ring was used again during that second.

After:

The scheduled work runs ~1 second after the last time ring_end_use is
called.

Inspired by the corresponding change in amdgpu_gfx_off_ctrl. While I
haven't run into specific issues in this case, the new behaviour makes
more sense to me.

Signed-off-by: Michel Dänzer <mdaen...@redhat.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c| 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
index 8996cb4ed57a..2c0040153f6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
@@ -110,7 +110,7 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring *ring)
 void amdgpu_jpeg_ring_end_use(struct amdgpu_ring *ring)
 {
 atomic_dec(&ring->adev->jpeg.total_submission_cnt);
-   schedule_delayed_work(&ring->adev->jpeg.idle_work, JPEG_IDLE_TIMEOUT);
+   mod_delayed_work(system_wq, &ring->adev->jpeg.idle_work, 
JPEG_IDLE_TIMEOUT);
 }

 int amdgpu_jpeg_dec_ring_test_ring(struct amdgpu_ring *ring)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 0f576f294d8a..b6b1d7eeb8e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1283,7 +1283,7 @@ void amdgpu_uvd_ring_begin_use(struct amdgpu_ring *ring)
 void amdgpu_uvd_ring_end_use(struct amdgpu_ring *ring)
 {
 if (!amdgpu_sriov_vf(ring->adev))
-   schedule_delayed_work(&ring->adev->uvd.idle_work, 
UVD_IDLE_TIMEOUT);
+   mod_delayed_work(system_wq, &ring->adev->uvd.idle_work, 
UVD_IDLE_TIMEOUT);
 }

 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 1ae7f824adc7..2253c18a6688 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -401,7 +401,7 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring *ring)
 void amdgpu_vce_ring_end_use(struct amdgpu_ring *ring)
 {
 if (!amdgpu_sriov_vf(ring->adev))
-   schedule_delayed_work(&ring->adev->vce.idle_work, 
VCE_IDLE_TIMEOUT);
+   mod_delayed_work(system_wq, &ring->adev->vce.idle_work, 
VCE_IDLE_TIMEOUT);
 }

 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 284bb42d6c86..d5937ab5ac80 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c

RE: [PATCH] gpu: drm: swsmu: fix error return code of smu_v11_0_set_allowed_mask()

2021-03-04 Thread Quan, Evan

[AMD Public Use]

Thanks. Reviewed-by: Evan Quan 

-Original Message-
From: Jia-Ju Bai  
Sent: Friday, March 5, 2021 11:54 AM
To: Deucher, Alexander ; Koenig, Christian 
; airl...@linux.ie; dan...@ffwll.ch; Quan, Evan 
; Zhang, Hawking ; Wang, Kevin(Yang) 
; Gao, Likun 
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org; Jia-Ju Bai 
Subject: [PATCH] gpu: drm: swsmu: fix error return code of 
smu_v11_0_set_allowed_mask()

When bitmap_empty() or feature->feature_num triggers an error, no error return 
code of smu_v11_0_set_allowed_mask() is assigned.
To fix this bug, ret is assigned with -EINVAL as error return code.

Reported-by: TOTE Robot 
Signed-off-by: Jia-Ju Bai 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
index 90585461a56e..82731a932308 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
@@ -747,8 +747,10 @@ int smu_v11_0_set_allowed_mask(struct smu_context *smu)
int ret = 0;
uint32_t feature_mask[2];
 
-   if (bitmap_empty(feature->allowed, SMU_FEATURE_MAX) || 
feature->feature_num < 64)
+   if (bitmap_empty(feature->allowed, SMU_FEATURE_MAX) || 
feature->feature_num < 64) {
+   ret = -EINVAL;
goto failed;
+   }
 
bitmap_copy((unsigned long *)feature_mask, feature->allowed, 64);
 
--
2.17.1
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
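
The bug pattern fixed by the patch above can be shown in a minimal user-space sketch. The function toy_set_allowed_mask and its parameters are hypothetical stand-ins rather than the driver's API; the "fixed" flag only exists to put the old and new behaviour side by side.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the validation at the top of
 * smu_v11_0_set_allowed_mask(); not the driver's real signature. */
static int toy_set_allowed_mask(bool allowed_empty, unsigned int feature_num,
                                bool fixed)
{
    int ret = 0;

    if (allowed_empty || feature_num < 64) {
        /* Before the fix, ret was still 0 here, so the early exit
         * was reported to the caller as success. */
        if (fixed)
            ret = -EINVAL;
        goto failed;
    }

    /* ... the real function would go on to program the mask here ... */

failed:
    return ret;
}
```

Without the assignment, a caller checking the return value cannot tell that validation failed; with it, the invalid-input path returns -EINVAL while the valid path still returns 0.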

RE: [PATCH v2] drm/amdgpu/swsmu/navi1x: Remove unnecessary conversion to bool

2021-02-21 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Evan Quan 

-Original Message-
From: amd-gfx  On Behalf Of Jiapeng Chong
Sent: Saturday, February 20, 2021 10:55 AM
To: Deucher, Alexander 
Cc: Jiapeng Chong ; airl...@linux.ie; 
linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; dan...@ffwll.ch; Koenig, Christian 

Subject: [PATCH v2] drm/amdgpu/swsmu/navi1x: Remove unnecessary conversion to 
bool

Fix the following coccicheck warnings:

./drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c:900:47-52: WARNING:
conversion to bool not needed here.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index cd7efa9..58028a7 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -897,7 +897,7 @@ static bool navi10_is_support_fine_grained_dpm(struct 
smu_context *smu, enum smu
dpm_desc = >DpmDescriptor[clk_index];
 
/* 0 - Fine grained DPM, 1 - Discrete DPM */
-   return dpm_desc->SnapToDiscrete == 0 ? true : false;
+   return dpm_desc->SnapToDiscrete == 0;
 }
 
 static inline bool navi10_od_feature_is_supported(struct 
smu_11_0_overdrive_table *od_table, enum SMU_11_0_ODFEATURE_CAP cap)
-- 
1.8.3.1


RE: [PATCH] drm/amd/powerplay: fix spelling mistake "smu_state_memroy_block" -> "smu_state_memory_block"

2020-11-23 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Evan Quan 

-Original Message-
From: Colin King 
Sent: Monday, November 23, 2020 6:54 PM
To: Deucher, Alexander ; Koenig, Christian 
; David Airlie ; Daniel Vetter 
; Quan, Evan ; Wang, Kevin(Yang) 
; Gui, Jack ; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH] drm/amd/powerplay: fix spelling mistake 
"smu_state_memroy_block" -> "smu_state_memory_block"

From: Colin Ian King 

The struct name smu_state_memroy_block contains a spelling mistake, rename it 
to smu_state_memory_block

Fixes: 8554e67d6e22 ("drm/amd/powerplay: implement power_dpm_state sys 
interface for SMU11")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
index 7550757cc059..a559ea2204c1 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
@@ -99,7 +99,7 @@ struct smu_state_display_block {
 bool  enable_vari_bright;
 };

-struct smu_state_memroy_block {
+struct smu_state_memory_block {
 bool  dll_off;
 uint8_t m3arb;
 uint8_t unused[3];
@@ -146,7 +146,7 @@ struct smu_power_state {
 struct smu_state_validation_block validation;
 struct smu_state_pcie_block   pcie;
 struct smu_state_display_blockdisplay;
-struct smu_state_memroy_block memory;
+struct smu_state_memory_block memory;
 struct smu_state_software_algorithm_block software;
 struct smu_uvd_clocks uvd_clocks;
 struct smu_hw_power_state hardware;
--
2.28.0


RE: [PATCH -next] drm/amdgpu/swsmu: Remove unused static struct 'navi10_i2c_algo'

2020-10-29 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

The other APIs it used should also be dropped along with it:
navi10_i2c_func()
navi10_i2c_xfer()
navi10_i2c_write_data()
navi10_i2c_read_data()

Regards,
Evan
-Original Message-
From: amd-gfx  On Behalf Of Zou Wei
Sent: Thursday, October 29, 2020 8:00 PM
To: Deucher, Alexander ; Koenig, Christian 
; airl...@linux.ie; dan...@ffwll.ch
Cc: Zou Wei ; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; linux-ker...@vger.kernel.org
Subject: [PATCH -next] drm/amdgpu/swsmu: Remove unused static struct 
'navi10_i2c_algo'

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/navi10_ppt.c:2527:35:
warning: ‘navi10_i2c_algo’
defined but not used [-Wunused-const-variable=]  static const struct 
i2c_algorithm navi10_i2c_algo = {
   ^~~

Reported-by: Hulk Robot 
Signed-off-by: Zou Wei 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index ef1a62e..bec63f2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2523,12 +2523,6 @@ static u32 navi10_i2c_func(struct i2c_adapter *adap)
 return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;  }

-
-static const struct i2c_algorithm navi10_i2c_algo = {
-.master_xfer = navi10_i2c_xfer,
-.functionality = navi10_i2c_func,
-};
-
 static ssize_t navi10_get_gpu_metrics(struct smu_context *smu,
   void **table)
 {
--
2.6.2


RE: [PATCH AUTOSEL 5.4 265/330] drm/amd/powerplay: try to do a graceful shutdown on SW CTF

2020-09-18 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Hi @Sasha Levin @Deucher, Alexander,

The following changes also need to be applied.
Otherwise, you may see unexpected shutdowns under heavy GPU stress load on Vega10.

drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting
drm/amd/pm: correct the thermal alert temperature limit settings
drm/amd/pm: correct Vega20 swctf limit setting
drm/amd/pm: correct Vega12 swctf limit setting
drm/amd/pm: correct Vega10 swctf limit setting

BR
Evan
-Original Message-
From: Sasha Levin 
Sent: Friday, September 18, 2020 10:00 AM
To: linux-ker...@vger.kernel.org; sta...@vger.kernel.org
Cc: Quan, Evan ; Deucher, Alexander 
; Sasha Levin ; 
dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 5.4 265/330] drm/amd/powerplay: try to do a graceful 
shutdown on SW CTF

From: Evan Quan 

[ Upstream commit 9495220577416632675959caf122e968469ffd16 ]

Normally this(SW CTF) should not happen. And by doing graceful shutdown we can 
prevent further damage.

Signed-off-by: Evan Quan 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/powerplay/hwmgr/smu_helper.c  | 21 +++
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c |  7 +++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
index d09690fca4520..414added3d02c 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
@@ -22,6 +22,7 @@
  */

 #include 
+#include 

 #include "hwmgr.h"
 #include "pp_debug.h"
@@ -593,12 +594,18 @@ int phm_irq_process(struct amdgpu_device *adev,
 uint32_t src_id = entry->src_id;

 if (client_id == AMDGPU_IRQ_CLIENTID_LEGACY) {
-if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH)
+if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH) {
 pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
 PCI_BUS_NUM(adev->pdev->devfn),
 PCI_SLOT(adev->pdev->devfn),
 PCI_FUNC(adev->pdev->devfn));
-else if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW)
+/*
+ * SW CTF just occurred.
+ * Try to do a graceful shutdown to prevent further damage.
+ */
+dev_emerg(adev->dev, "System is going to shutdown due to SW CTF!\n");
+orderly_poweroff(true);
+} else if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW)
 pr_warn("GPU under temperature range detected on PCIe %d:%d.%d!\n",
 PCI_BUS_NUM(adev->pdev->devfn),
 PCI_SLOT(adev->pdev->devfn),
@@ -609,12 +616,18 @@ int phm_irq_process(struct amdgpu_device *adev,
 PCI_SLOT(adev->pdev->devfn),
 PCI_FUNC(adev->pdev->devfn));
 } else if (client_id == SOC15_IH_CLIENTID_THM) {
-if (src_id == 0)
+if (src_id == 0) {
 pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
 PCI_BUS_NUM(adev->pdev->devfn),
 PCI_SLOT(adev->pdev->devfn),
 PCI_FUNC(adev->pdev->devfn));
-else
+/*
+ * SW CTF just occurred.
+ * Try to do a graceful shutdown to prevent further damage.
+ */
+dev_emerg(adev->dev, "System is going to shutdown due to SW CTF!\n");
+orderly_poweroff(true);
+} else
 pr_warn("GPU under temperature range detected on PCIe %d:%d.%d!\n",
 PCI_BUS_NUM(adev->pdev->devfn),
 PCI_SLOT(adev->pdev->devfn),
diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index c4d8c52c6b9ca..6c4405622c9bb 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "pp_debug.h"
 #include "amdgpu.h"
@@ -1538,6 +1539,12 @@ static int smu_v11_0_irq_process(struct amdgpu_device 
*adev,
 PCI_BUS_NUM(adev->pdev->devfn),
 PCI_SLOT(adev->pdev->devfn),
 PCI_FUNC(adev->pdev->devfn));
+/*
+ * SW CTF just occurred.
+ * Try to do a graceful shutdown to prevent further damage.
+ */
+dev_emerg(adev->dev, "System is going to shutdown due to SW CTF!\n");
+orderly_poweroff(true);
 break;
 case THM_11_0__SRCID__THM_DIG_THERM_H2L:
 pr_warn("GPU under temperature range detected on PCIe %d:%d.%d!\n",
--
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

RE: [PATCH v1] powerplay:hwmgr - modify the return value

2020-09-16 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Thanks. Reviewed-by: Evan Quan 

-----Original Message-----
From: Xiaoliang Pang 
Sent: Thursday, September 17, 2020 11:46 AM
To: Quan, Evan ; Deucher, Alexander 
; Koenig, Christian ; 
airl...@linux.ie; dan...@ffwll.ch; Feng, Kenneth ; 
zhengbi...@huawei.com; pe...@vangils.xyz; yt...@amd.com
Cc: Das, Nirmoy ; Huang, JinHuiEric 
; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; 
tianjia.zh...@linux.alibaba.com; dawning.p...@gmail.com
Subject: [PATCH v1] powerplay:hwmgr - modify the return value

Return -EINVAL instead of the positive EINVAL when updating WMTABLE fails.

Fixes: f83a9991648bb("drm/amd/powerplay: add Vega10 powerplay support (v5)")
Fixes: 2cac05dee6e30("drm/amd/powerplay: add the hw manager for vega12 (v4)")
Cc: Eric Huang 
Cc: Evan Quan 
Signed-off-by: Xiaoliang Pang 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 2 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index c378a000c934..7eada3098ffc 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -4659,7 +4659,7 @@ static int 
vega10_display_configuration_changed_task(struct pp_hwmgr *hwmgr)
 if ((data->water_marks_bitmap & WaterMarksExist) &&
 !(data->water_marks_bitmap & WaterMarksLoaded)) {
 result = smum_smc_table_manager(hwmgr, (uint8_t *)wm_table, WMTABLE, false);
-PP_ASSERT_WITH_CODE(result, "Failed to update WMTABLE!", return EINVAL);
+PP_ASSERT_WITH_CODE(result, "Failed to update WMTABLE!", return -EINVAL);
 data->water_marks_bitmap |= WaterMarksLoaded;
 }

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
index a678a67f1c0d..04da52cea824 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
@@ -2390,7 +2390,7 @@ static int 
vega12_display_configuration_changed_task(struct pp_hwmgr *hwmgr)
 !(data->water_marks_bitmap & WaterMarksLoaded)) {
 result = smum_smc_table_manager(hwmgr,
 (uint8_t *)wm_table, TABLE_WATERMARKS, false);
-PP_ASSERT_WITH_CODE(result, "Failed to update WMTABLE!", return EINVAL);
+PP_ASSERT_WITH_CODE(result, "Failed to update WMTABLE!", return -EINVAL);
 data->water_marks_bitmap |= WaterMarksLoaded;
 }

--
2.17.1
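
The sign fix in the patch above matters because many callers only test for a negative return value. The following is a minimal userspace sketch of that effect, not the driver code: upload_table(), configure_positive(), configure_negative() and caller_detects_error() are hypothetical names introduced purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the table upload: nonzero means failure. */
int upload_table(bool ok)
{
	return ok ? 0 : 1;
}

/* Pre-patch style: a positive errno value is returned on failure. */
int configure_positive(bool ok, bool *loaded)
{
	assert(loaded != NULL);
	if (upload_table(ok))
		return EINVAL;
	*loaded = true;
	return 0;
}

/* Post-patch style: the usual negative errno value is returned. */
int configure_negative(bool ok, bool *loaded)
{
	assert(loaded != NULL);
	if (upload_table(ok))
		return -EINVAL;
	*loaded = true;
	return 0;
}

/* A caller that uses the common "negative means error" test. */
bool caller_detects_error(int ret)
{
	return ret < 0;
}
```

With this sketch, caller_detects_error(configure_positive(false, &loaded)) is false, so the failure goes unnoticed, while the negative variant is caught.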

RE: [PATCH -next] amdgpu: fix Documentation builds for pm/ file movement

2020-08-23 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Thanks for fixing this. The patch is reviewed-by: Evan Quan 

BR
Evan
-----Original Message-----
From: Randy Dunlap 
Sent: Monday, August 24, 2020 6:36 AM
To: dri-devel ; LKML 
; amd-...@lists.freedesktop.org; Deucher, 
Alexander 
Cc: Quan, Evan ; Stephen Rothwell 
Subject: [PATCH -next] amdgpu: fix Documentation builds for pm/ file movement

From: Randy Dunlap 

Fix Documentation errors for amdgpu.rst due to file rename (moved to another 
subdirectory).

Error: Cannot open file ../drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -function hwmon 
../drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c' failed with return code 1

Fixes: e098bc9612c2 ("drm/amd/pm: optimize the power related source code 
layout")
Signed-off-by: Randy Dunlap 
Cc: Evan Quan 
Cc: Alex Deucher 
---
 Documentation/gpu/amdgpu.rst |   24 
 1 file changed, 12 insertions(+), 12 deletions(-)

--- linux-next-20200821.orig/Documentation/gpu/amdgpu.rst
+++ linux-next-20200821/Documentation/gpu/amdgpu.rst
@@ -153,7 +153,7 @@ This section covers hwmon and power/ther  HWMON Interfaces
 

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: hwmon

 GPU sysfs Power State Interfaces
@@ -164,52 +164,52 @@ GPU power controls are exposed via sysfs  power_dpm_state 
 ~~~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: power_dpm_state

 power_dpm_force_performance_level
 ~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: power_dpm_force_performance_level

 pp_table
 

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: pp_table

 pp_od_clk_voltage
 ~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: pp_od_clk_voltage

 pp_dpm_*
 

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: pp_dpm_sclk pp_dpm_mclk pp_dpm_socclk pp_dpm_fclk pp_dpm_dcefclk 
pp_dpm_pcie

 pp_power_profile_mode
 ~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: pp_power_profile_mode

 *_busy_percent
 ~~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: gpu_busy_percent

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: mem_busy_percent

 gpu_metrics
 ~

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: gpu_metrics

 GPU Product Information
@@ -239,7 +239,7 @@ serial_number
 unique_id
 -

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: unique_id

 GPU Memory Usage Information
@@ -289,7 +289,7 @@ PCIe Accounting Information  pcie_bw
 ---

-.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+.. kernel-doc:: drivers/gpu/drm/amd/pm/amdgpu_pm.c
:doc: pcie_bw

 pcie_replay_count

RE: [PATCH -next] drm/amd/powerplay: remove duplicate include

2020-08-19 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Evan Quan 

-----Original Message-----
From: Wang Hai 
Sent: Wednesday, August 19, 2020 7:34 PM
To: Quan, Evan ; Deucher, Alexander 
; Koenig, Christian ; 
airl...@linux.ie; dan...@ffwll.ch
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org
Subject: [PATCH -next] drm/amd/powerplay: remove duplicate include

Remove asic_reg/nbio/nbio_6_1_offset.h which is included more than once

Reported-by: Hulk Robot 
Signed-off-by: Wang Hai 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_inc.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_inc.h 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_inc.h
index e6d9e84059e1..0d08c57d3bca 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_inc.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_inc.h
@@ -34,7 +34,6 @@
 #include "asic_reg/gc/gc_9_2_1_offset.h"
 #include "asic_reg/gc/gc_9_2_1_sh_mask.h"

-#include "asic_reg/nbio/nbio_6_1_offset.h"
 #include "asic_reg/nbio/nbio_6_1_offset.h"
 #include "asic_reg/nbio/nbio_6_1_sh_mask.h"

--
2.17.1

RE: [PATCH] drm/amd/powerplay: fix a crash when overclocking Vega M

2020-07-17 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Evan Quan 

-----Original Message-----
From: Qiu Wenbo 
Sent: Friday, July 17, 2020 3:10 PM
To: Quan, Evan ; amd-...@lists.freedesktop.org
Cc: Qiu Wenbo ; Deucher, Alexander 
; Koenig, Christian ; 
Zhou, David(ChunMing) ; David Airlie ; 
Daniel Vetter ; Chen Wandun ; 
YueHaibing ; yu kuai ; Huang, 
JinHuiEric ; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org
Subject: [PATCH] drm/amd/powerplay: fix a crash when overclocking Vega M

Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and
vddci_voltage_table is empty. It has been tested on Intel Hades Canyon
(i7-8809G).

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489
Fixes: ac7822b0026f ("drm/amd/powerplay: add smumgr support for VEGAM (v2)")
Signed-off-by: Qiu Wenbo 
---
 drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
index 3da71a088b92..0ecc18b55ffb 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
@@ -644,9 +644,6 @@ static int vegam_get_dependency_volt_by_clk(struct pp_hwmgr 
*hwmgr,

 /* sclk is bigger than max sclk in the dependence table */
 *voltage |= (dep_table->entries[i - 1].vddc * VOLTAGE_SCALE) << VDDC_SHIFT;
-vddci = phm_find_closest_vddci(&(data->vddci_voltage_table),
-(dep_table->entries[i - 1].vddc -
-(uint16_t)VDDC_VDDCI_DELTA));

 if (SMU7_VOLTAGE_CONTROL_NONE == data->vddci_control)
 *voltage |= (data->vbios_boot_state.vddci_bootup_value *
@@ -654,8 +651,13 @@ static int vegam_get_dependency_volt_by_clk(struct 
pp_hwmgr *hwmgr,
 else if (dep_table->entries[i - 1].vddci)
 *voltage |= (dep_table->entries[i - 1].vddci *
 VOLTAGE_SCALE) << VDDC_SHIFT;
-else
+else {
+vddci = phm_find_closest_vddci(&(data->vddci_voltage_table),
+(dep_table->entries[i - 1].vddc -
+(uint16_t)VDDC_VDDCI_DELTA));
+
 *voltage |= (vddci * VOLTAGE_SCALE) << VDDCI_SHIFT;
+}

 if (SMU7_VOLTAGE_CONTROL_NONE == data->mvdd_control)
 *mvdd = data->vbios_boot_state.mvdd_bootup_value * VOLTAGE_SCALE;
--
2.27.0
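
The idea behind the fix above is to perform the table lookup only in the branch that actually needs its result, so an empty table is never consulted when another source for the value exists. The following is a rough userspace sketch under that assumption; struct volt_table, find_closest() and pick_vddci() are hypothetical and only loosely modelled on the patched function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct volt_table {
	size_t count;
	const uint16_t *entries;
};

enum control { CONTROL_NONE, CONTROL_BY_TABLE };

/*
 * Hypothetical lookup: first entry >= target, else the last entry.
 * It must not be called on an empty table, since entries[count - 1]
 * would then be out of bounds.
 */
uint16_t find_closest(const struct volt_table *t, uint16_t target)
{
	assert(t != NULL && t->count > 0);
	for (size_t i = 0; i < t->count; i++)
		if (t->entries[i] >= target)
			return t->entries[i];
	return t->entries[t->count - 1];
}

/* The lookup is deferred to the only branch that uses its result. */
uint16_t pick_vddci(enum control ctl, uint16_t bootup, uint16_t dep_vddci,
		    const struct volt_table *t, uint16_t target)
{
	if (ctl == CONTROL_NONE)
		return bootup;
	else if (dep_vddci)
		return dep_vddci;
	else
		return find_closest(t, target);
}
```

Calling the lookup unconditionally before the if/else chain, as the unpatched code did, would touch the empty table even when the bootup value is the one that ends up being used.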

RE: [PATCH] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-06-15 Thread Quan, Evan

[AMD Official Use Only - Internal Distribution Only]

Acked-by: Evan Quan 

-----Original Message-----
From: amd-gfx  On Behalf Of Aditya Pakki
Sent: Sunday, June 14, 2020 10:21 AM
To: pakki...@umn.edu
Cc: wu000...@umn.edu; David Airlie ; k...@umn.edu; 
linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org; Daniel Vetter ; Deucher, 
Alexander ; Koenig, Christian 

Subject: [PATCH] drm/radeon: Fix reference count leaks caused by 
pm_runtime_get_sync

On calling pm_runtime_get_sync() the reference count of the device is 
incremented. In case of failure, decrement the reference count before returning 
the error.

Signed-off-by: Aditya Pakki 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 35db79a168bf..df1a7eb73651 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -635,8 +635,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
 dev = set->crtc->dev;

 ret = pm_runtime_get_sync(dev->dev);
-if (ret < 0)
+if (ret < 0) {
+pm_runtime_put_autosuspend(dev->dev);
 return ret;
+}

 ret = drm_crtc_helper_set_config(set, ctx);

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index bbb0883e8ce6..62b5069122cc 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -549,8 +549,10 @@ long radeon_drm_ioctl(struct file *filp,
 long ret;
 dev = file_priv->minor->dev;
 ret = pm_runtime_get_sync(dev->dev);
-if (ret < 0)
+if (ret < 0) {
+pm_runtime_put_autosuspend(dev->dev);
 return ret;
+}

 ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index c5d1dc9618a4..99ee60f8b604 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
 file_priv->driver_priv = NULL;

 r = pm_runtime_get_sync(dev->dev);
-if (r < 0)
+if (r < 0) {
+pm_runtime_put_autosuspend(dev->dev);
 return r;
+}

 /* new gpu have virtual address space support */
 if (rdev->family >= CHIP_CAYMAN) {
--
2.25.1
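
The leak described in the patch above can be modelled in userspace with a mock counter. This is only a sketch: struct mock_dev, mock_get_sync() and mock_put() are hypothetical stand-ins that mimic the one property the commit message relies on, namely that the usage count is incremented even when the call fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_dev {
	int usage_count;
	bool resume_fails;
};

/* Increments the usage count first, then reports success or failure. */
int mock_get_sync(struct mock_dev *d)
{
	assert(d != NULL);
	d->usage_count++;
	return d->resume_fails ? -EIO : 0;
}

void mock_put(struct mock_dev *d)
{
	assert(d != NULL && d->usage_count > 0);
	d->usage_count--;
}

/* Unpatched pattern: the error path returns without dropping the count. */
int do_work_leaky(struct mock_dev *d)
{
	int ret = mock_get_sync(d);

	if (ret < 0)
		return ret;
	mock_put(d);
	return 0;
}

/* Patched pattern: the error path drops the count before returning. */
int do_work_balanced(struct mock_dev *d)
{
	int ret = mock_get_sync(d);

	if (ret < 0) {
		mock_put(d);
		return ret;
	}
	mock_put(d);
	return 0;
}
```

After a failed call, the leaky variant leaves usage_count at 1 while the balanced variant returns it to 0.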

RE: [PATCH] drm/amdgpu: Add missing '\n' in log messages

2020-04-12 Thread Quan, Evan

Reviewed-by: Evan Quan 

-----Original Message-----
From: Christophe JAILLET  
Sent: Saturday, April 11, 2020 10:04 PM
To: Deucher, Alexander ; Koenig, Christian 
; Zhou, David(ChunMing) ; 
airl...@linux.ie; dan...@ffwll.ch; Zhang, Hawking ; 
Quan, Evan ; Grodzovsky, Andrey ; 
Liu, Monk ; Russell, Kent ; Ma, Le 

Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org; kernel-janit...@vger.kernel.org; Christophe 
JAILLET 
Subject: [PATCH] drm/amdgpu: Add missing '\n' in log messages

Message logged by 'dev_xxx()' or 'pr_xxx()' should end with a '\n'.

While at it, split some nearby long lines.

Signed-off-by: Christophe JAILLET 
---
Most of them have been added in commit bd607166af7f ("drm/amdgpu: Enable 
reading FRU chip via I2C v3")
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 87f7c129c8ce..3d0a50e8c36b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3249,25 +3249,25 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
r = device_create_file(adev->dev, &dev_attr_pcie_replay_count);
if (r) {
-   dev_err(adev->dev, "Could not create pcie_replay_count");
+   dev_err(adev->dev, "Could not create pcie_replay_count\n");
return r;
}
 
r = device_create_file(adev->dev, &dev_attr_product_name);
if (r) {
-   dev_err(adev->dev, "Could not create product_name");
+   dev_err(adev->dev, "Could not create product_name\n");
return r;
}
 
r = device_create_file(adev->dev, &dev_attr_product_number);
if (r) {
-   dev_err(adev->dev, "Could not create product_number");
+   dev_err(adev->dev, "Could not create product_number\n");
return r;
}
 
r = device_create_file(adev->dev, &dev_attr_serial_number);
if (r) {
-   dev_err(adev->dev, "Could not create serial_number");
+   dev_err(adev->dev, "Could not create serial_number\n");
return r;
}
 
@@ -4270,7 +4270,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
job_signaled = true;
 
if (job_signaled) {
-   dev_info(adev->dev, "Guilty job already signaled, skipping HW 
reset");
+   dev_info(adev->dev, "Guilty job already signaled, skipping HW 
reset\n");
goto skip_hw_reset;
}
 
@@ -4339,10 +4339,12 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
*adev,
 
if (r) {
/* bad news, how to tell it to userspace ? */
-   dev_info(tmp_adev->dev, "GPU reset(%d) failed\n", 
atomic_read(&tmp_adev->gpu_reset_counter));
+   dev_info(tmp_adev->dev, "GPU reset(%d) failed\n",
+atomic_read(&tmp_adev->gpu_reset_counter));
amdgpu_vf_error_put(tmp_adev, 
AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
} else {
-   dev_info(tmp_adev->dev, "GPU reset(%d) succeeded!\n", 
atomic_read(&tmp_adev->gpu_reset_counter));
+   dev_info(tmp_adev->dev, "GPU reset(%d) succeeded!\n",
+atomic_read(&tmp_adev->gpu_reset_counter));
}
}
 
-- 
2.20.1

RE: [PATCH -next] drm/amd/powerplay: Use bitwise instead of arithmetic operator for flags

2020-02-23 Thread Quan, Evan

Thanks. Reviewed-by: Evan Quan 

-----Original Message-----
From: Chen Zhou  
Sent: Friday, February 21, 2020 8:22 PM
To: Quan, Evan ; Deucher, Alexander 
; Koenig, Christian ; 
Zhou, David(ChunMing) ; airl...@linux.ie; dan...@ffwll.ch
Cc: Feng, Kenneth ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; 
chenzho...@huawei.com
Subject: [PATCH -next] drm/amd/powerplay: Use bitwise instead of arithmetic 
operator for flags

This silences the following coccinelle warning:

"WARNING: sum of probable bitmasks, consider |"

Signed-off-by: Chen Zhou 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 92a65e3d..f29f95b 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -3382,7 +3382,7 @@ static int 
vega10_populate_and_upload_sclk_mclk_dpm_levels(
}
 
if (data->need_update_dpm_table &
-   (DPMTABLE_OD_UPDATE_SCLK + DPMTABLE_UPDATE_SCLK + 
DPMTABLE_UPDATE_SOCCLK)) {
+   (DPMTABLE_OD_UPDATE_SCLK | DPMTABLE_UPDATE_SCLK | 
+DPMTABLE_UPDATE_SOCCLK)) {
result = vega10_populate_all_graphic_levels(hwmgr);
PP_ASSERT_WITH_CODE((0 == result),
"Failed to populate SCLK during 
PopulateNewDPMClocksStates Function!",
@@ -3390,7 +3390,7 @@ static int 
vega10_populate_and_upload_sclk_mclk_dpm_levels(
}
 
if (data->need_update_dpm_table &
-   (DPMTABLE_OD_UPDATE_MCLK + DPMTABLE_UPDATE_MCLK)) {
+   (DPMTABLE_OD_UPDATE_MCLK | DPMTABLE_UPDATE_MCLK)) {
result = vega10_populate_all_memory_levels(hwmgr);
PP_ASSERT_WITH_CODE((0 == result),
"Failed to populate MCLK during 
PopulateNewDPMClocksStates Function!",
--
2.7.4
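
The coccinelle warning addressed above exists because adding bitmasks only matches OR-ing them while no bit is shared between the operands. A small sketch with made-up flag values (UPDATE_SCLK, UPDATE_MCLK and UPDATE_CLOCKS are hypothetical, not the driver's DPMTABLE_* definitions) shows where the two diverge:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for illustration. */
#define UPDATE_SCLK   0x1u
#define UPDATE_MCLK   0x2u
#define UPDATE_CLOCKS (UPDATE_SCLK | UPDATE_MCLK)	/* composite mask */

static_assert((UPDATE_SCLK & UPDATE_MCLK) == 0,
	      "single-bit flags must not overlap");

/* Arithmetic combination: only correct while the operands share no bits. */
uint32_t combine_by_sum(uint32_t a, uint32_t b)
{
	return a + b;
}

/* Bitwise combination: correct even when the operands overlap. */
uint32_t combine_by_or(uint32_t a, uint32_t b)
{
	return a | b;
}
```

For the disjoint pair both helpers yield 0x3. Once a composite mask is involved, combine_by_sum(UPDATE_CLOCKS, UPDATE_SCLK) carries into an unrelated bit and yields 0x4, whereas combine_by_or() still yields 0x3.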

RE: [PATCH] MAINTAINERS: Drop Rex Zhu for amdgpu powerplay

2019-11-24 Thread Quan, Evan

Reviewed-by: Evan Quan 

> -----Original Message-----
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Saturday, November 23, 2019 3:19 AM
> To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] MAINTAINERS: Drop Rex Zhu for amdgpu powerplay
> 
> No longer works on the driver.
> 
> Signed-off-by: Alex Deucher 
> ---
>  MAINTAINERS | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b63c291ad029..d518588b9879 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -856,7 +856,6 @@ S:Maintained
>  F:   drivers/i2c/busses/i2c-amd-mp2*
> 
>  AMD POWERPLAY
> -M:   Rex Zhu 
>  M:   Evan Quan 
>  L:   amd-...@lists.freedesktop.org
>  S:   Supported
> --
> 2.23.0

RE: [PATCH v2] drm/amd/powerplay: return errno code to caller when error occur

2019-11-18 Thread Quan, Evan

Reviewed-by: Evan Quan 

> -----Original Message-----
> From: Chen Wandun 
> Sent: Monday, November 18, 2019 4:04 PM
> To: Quan, Evan ; Deucher, Alexander
> ; amd-...@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; dri-devel@lists.freedesktop.org
> Cc: chenwan...@huawei.com
> Subject: [PATCH v2] drm/amd/powerplay: return errno code to caller when
> error occur
> 
> Return the errno code to the caller when an error occurs, and meanwhile
> remove the gcc '-Wunused-but-set-variable' warning.
> 
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c: In
> function vegam_populate_smc_boot_level:
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:1364:
> 6: warning: variable result set but not used [-Wunused-but-set-variable]
> 
> Signed-off-by: Chen Wandun 
> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
> b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
> index 2068eb0..50896e9 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
> @@ -1371,11 +1371,16 @@ static int vegam_populate_smc_boot_level(struct
> pp_hwmgr *hwmgr,
>   result = phm_find_boot_level(&(data->dpm_table.sclk_table),
>   data->vbios_boot_state.sclk_bootup_value,
>   (uint32_t *)&(table->GraphicsBootLevel));
> + if (result)
> + return result;
> 
>   result = phm_find_boot_level(&(data->dpm_table.mclk_table),
>   data->vbios_boot_state.mclk_bootup_value,
>   (uint32_t *)&(table->MemoryBootLevel));
> 
> + if (result)
> + return result;
> +
>   table->BootVddc  = data->vbios_boot_state.vddc_bootup_value *
>   VOLTAGE_SCALE;
>   table->BootVddci = data->vbios_boot_state.vddci_bootup_value *
> --
> 2.7.4
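
The pattern adopted in v2 above, checking each result and returning on the first failure, can be sketched outside the driver as follows. find_level(), populate_boot_levels() and struct boot_levels are hypothetical helpers written for this illustration; they are not the powerplay functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct boot_levels {
	uint32_t graphics;
	uint32_t memory;
};

/* Hypothetical lookup: index of value in table, or -EINVAL if absent. */
int find_level(const uint32_t *table, size_t count, uint32_t value,
	       uint32_t *level)
{
	assert(table != NULL && level != NULL);
	for (size_t i = 0; i < count; i++) {
		if (table[i] == value) {
			*level = (uint32_t)i;
			return 0;
		}
	}
	return -EINVAL;
}

/* Every lookup result is checked, so 'result' is never set but unused. */
int populate_boot_levels(const uint32_t *sclk, size_t nsclk, uint32_t sclk_boot,
			 const uint32_t *mclk, size_t nmclk, uint32_t mclk_boot,
			 struct boot_levels *out)
{
	int result;

	assert(out != NULL);
	out->graphics = 0;
	out->memory = 0;

	result = find_level(sclk, nsclk, sclk_boot, &out->graphics);
	if (result)
		return result;

	result = find_level(mclk, nmclk, mclk_boot, &out->memory);
	if (result)
		return result;

	return 0;
}
```

Compared with simply dropping the variable, this keeps the failure visible to the caller, which is the behaviour asked for in review of the first version.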

RE: [PATCH][next] drm/amdgpu/powerplay: fix dereference before null check of pointer hwmgr

2019-11-17 Thread Quan, Evan

Reviewed-by: Evan Quan 

-----Original Message-----
From: Colin King  
Sent: Friday, November 15, 2019 5:48 PM
To: Rex Zhu ; Quan, Evan ; Deucher, 
Alexander ; Koenig, Christian 
; Zhou, David(ChunMing) ; David 
Airlie ; Daniel Vetter ; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH][next] drm/amdgpu/powerplay: fix dereference before null check 
of pointer hwmgr

From: Colin Ian King 

The assignment of adev dereferences pointer hwmgr before hwmgr is null checked, 
hence there is a potential null pointer dereference issue. Fix this by assigning 
adev after the null check.

Addresses-Coverity: ("Dereference before null check")
Fixes: 0896d2f7ba4d ("drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index 443625c83ec9..d2909c91d65b 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -81,7 +81,7 @@ static void hwmgr_init_workload_prority(struct pp_hwmgr 
*hwmgr)
 
int hwmgr_early_init(struct pp_hwmgr *hwmgr)
{
-   struct amdgpu_device *adev = hwmgr->adev;
+   struct amdgpu_device *adev;
 
if (!hwmgr)
return -EINVAL;
@@ -96,6 +96,8 @@ int hwmgr_early_init(struct pp_hwmgr *hwmgr)
hwmgr_init_workload_prority(hwmgr);
hwmgr->gfxoff_state_changed_by_workload = false;
 
+   adev = hwmgr->adev;
+
switch (hwmgr->chip_family) {
case AMDGPU_FAMILY_CI:
adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
--
2.20.1
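
The ordering issue fixed above can be shown in a few lines of standalone C. This is a sketch with hypothetical types (struct manager, struct device_ctx) and a made-up FEATURE_GFXOFF bit, not the hwmgr code itself; the point is only that the member access happens after the NULL check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FEATURE_GFXOFF 0x1u	/* hypothetical feature bit */

struct device_ctx {
	unsigned int features;
};

struct manager {
	struct device_ctx *dev;
};

int manager_early_init(struct manager *mgr)
{
	struct device_ctx *dev;	/* declared only; mgr is not touched yet */

	if (!mgr)
		return -EINVAL;

	dev = mgr->dev;		/* safe: mgr is known to be non-NULL here */
	assert(dev != NULL);	/* this sketch assumes a device is attached */
	dev->features &= ~FEATURE_GFXOFF;
	return 0;
}
```

Initialising dev in its declaration, as the unpatched code did, would read mgr->dev before the check and make the check ineffective for a NULL argument.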

RE: [PATCH] drm/amd/powerplay: remove variable 'result' set but not used

2019-11-17 Thread Quan, Evan

Thanks. But it would be better to return 'result' to the caller when 'result != 0'.

Regards,
Evan
-----Original Message-----
From: Chen Wandun  
Sent: Saturday, November 16, 2019 11:43 AM
To: Deucher, Alexander ; Quan, Evan 
; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org
Cc: chenwan...@huawei.com
Subject: [PATCH] drm/amd/powerplay: remove variable 'result' set but not used

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c: In function 
vegam_populate_smc_boot_level:
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vegam_smumgr.c:1364:6: warning: 
variable result set but not used [-Wunused-but-set-variable]

Signed-off-by: Chen Wandun 
---
 drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
index 2068eb0..fad78bf 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/vegam_smumgr.c
@@ -1361,20 +1361,19 @@ static int vegam_populate_smc_uvd_level(struct pp_hwmgr 
*hwmgr,
 static int vegam_populate_smc_boot_level(struct pp_hwmgr *hwmgr,
struct SMU75_Discrete_DpmTable *table)
 {
-   int result = 0;
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
 
table->GraphicsBootLevel = 0;
table->MemoryBootLevel = 0;
 
/* find boot level from dpm table */
-   result = phm_find_boot_level(&(data->dpm_table.sclk_table),
-   data->vbios_boot_state.sclk_bootup_value,
-   (uint32_t *)&(table->GraphicsBootLevel));
+   phm_find_boot_level(&(data->dpm_table.sclk_table),
+   data->vbios_boot_state.sclk_bootup_value,
+   (uint32_t *)&(table->GraphicsBootLevel));
 
-   result = phm_find_boot_level(&(data->dpm_table.mclk_table),
-   data->vbios_boot_state.mclk_bootup_value,
-   (uint32_t *)&(table->MemoryBootLevel));
+   phm_find_boot_level(&(data->dpm_table.mclk_table),
+   data->vbios_boot_state.mclk_bootup_value,
+   (uint32_t *)&(table->MemoryBootLevel));
 
table->BootVddc  = data->vbios_boot_state.vddc_bootup_value *
VOLTAGE_SCALE;
-- 
2.7.4

RE: [PATCH 0/2] remove some set but not used variables in hwmgr

2019-11-10 Thread Quan, Evan

Series is reviewed-by: Evan Quan 

> -----Original Message-----
> From: zhengbin 
> Sent: Monday, November 11, 2019 11:46 AM
> To: rex....@amd.com; Quan, Evan ; Deucher,
> Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; airl...@linux.ie; dan...@ffwll.ch; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: zhengbi...@huawei.com
> Subject: [PATCH 0/2] remove some set but not used variables in hwmgr
> 
> zhengbin (2):
>   drm/amd/powerplay: remove set but not used variable
> 'vbios_version','data'
>   drm/amd/powerplay: remove set but not used variable 'data'
> 
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 4 
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 2 --
>  2 files changed, 6 deletions(-)
> 
> --
> 2.7.4

RE: [PATCH][drm-next] drm/amd/powerplay: remove redundant duplicated return check

2019-08-06 Thread Quan, Evan

Thanks! Reviewed-by: Evan Quan 

> -----Original Message-----
> From: Colin King 
> Sent: Monday, August 05, 2019 6:30 PM
> To: Rex Zhu ; Quan, Evan ;
> Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; David Airlie ; Daniel Vetter
> ; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org
> Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: [PATCH][drm-next] drm/amd/powerplay: remove redundant
> duplicated return check
> 
> From: Colin Ian King 
> 
> The check on ret is duplicated in two places, it is redundant code.
> Remove it.
> 
> Addresses-Coverity: ("Logically dead code")
> Fixes: b94afb61cdae ("drm/amd/powerplay: honor hw limit on fetching
> metrics data for navi10")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> index d62c2784b102..b272c8dc8f79 100644
> --- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> @@ -941,8 +941,6 @@ static int navi10_get_gpu_power(struct smu_context
> *smu, uint32_t *value)
>   ret = navi10_get_metrics_table(smu, &metrics);
>   if (ret)
>   return ret;
> - if (ret)
> - return ret;
> 
>   *value = metrics.AverageSocketPower << 8;
> 
> @@ -1001,8 +999,6 @@ static int navi10_get_fan_speed_rpm(struct
> smu_context *smu,
>   ret = navi10_get_metrics_table(smu, &metrics);
>   if (ret)
>   return ret;
> - if (ret)
> - return ret;
> 
>   *speed = metrics.CurrFanSpeed;
> 
> --
> 2.20.1

RE: [PATCH] drm/amd/powerplay: Zero initialize some variables

2019-08-04 Thread Quan, Evan

Thanks Nathan. The patch is reviewed-by: Evan Quan 

> -----Original Message-----
> From: Nathan Chancellor 
> Sent: Monday, August 05, 2019 4:37 AM
> To: Quan, Evan ; Deucher, Alexander
> ; Koenig, Christian
> ; Zhou, David(ChunMing)
> 
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; clang-built-li...@googlegroups.com; Nathan
> Chancellor 
> Subject: [PATCH] drm/amd/powerplay: Zero initialize some variables
> 
> Clang warns (only Navi warning shown but Arcturus warns as well):
> 
> drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:1534:4: warning:
> variable 'asic_default_power_limit' is used uninitialized whenever '?:'
> condition is false [-Wsometimes-uninitialized]
> smu_read_smc_arg(smu, &asic_default_power_limit);
> ^~~~
> drivers/gpu/drm/amd/amdgpu/../powerplay/inc/amdgpu_smu.h:588:3:
> note:
> expanded from macro 'smu_read_smc_arg'
> ((smu)->funcs->read_smc_arg? (smu)->funcs->read_smc_arg((smu),
> (arg)) : 0)
>  ^~
> drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:1550:30: note:
> uninitialized use occurs here
> smu->default_power_limit = asic_default_power_limit;
>^~~~
> drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:1534:4: note:
> remove the '?:' if its condition is always true
> smu_read_smc_arg(smu, &asic_default_power_limit);
> ^
> drivers/gpu/drm/amd/amdgpu/../powerplay/inc/amdgpu_smu.h:588:3:
> note:
> expanded from macro 'smu_read_smc_arg'
> ((smu)->funcs->read_smc_arg? (smu)->funcs->read_smc_arg((smu),
> (arg)) : 0)
>  ^
> drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:1517:35: note:
> initialize the variable 'asic_default_power_limit' to silence this warning
> uint32_t asic_default_power_limit;
>  ^
>   = 0
> 1 warning generated.
> 
> As the code is currently written, if read_smc_arg were ever NULL, arg would
> fail to be initialized but the code would continue executing as normal
> because the return value would just be zero.
> 
> There are a few different possible solutions to resolve this class of warnings
> which have appeared in these drivers before:
> 
> 1. Assume the function pointer will never be NULL and eliminate the
>wrapper macros.
> 
> 2. Have the wrapper macros initialize arg when the function pointer is
>NULL.
> 
> 3. Have the wrapper macros return an error code instead of 0 when the
>function pointer is NULL so that the callsites can properly bail out
>before arg can be used.
> 
> 4. Initialize arg at the top of its function.
> 
> Number four is the path of least resistance right now as every other change
> will be driver wide so do that here. I only make the comment now as food for
> thought.
> 
> Fixes: b4af964e75c4 ("drm/amd/powerplay: make power limit retrieval as
> asic specific")
> Link: https://github.com/ClangBuiltLinux/linux/issues/627
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 2 +-
>  drivers/gpu/drm/amd/powerplay/navi10_ppt.c   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
> b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
> index 215f7173fca8..b92eded7374f 100644
> --- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
> @@ -1326,7 +1326,7 @@ static int arcturus_get_power_limit(struct
> smu_context *smu,
>bool asic_default)
>  {
>   PPTable_t *pptable = smu->smu_table.driver_pptable;
> - uint32_t asic_default_power_limit;
> + uint32_t asic_default_power_limit = 0;
>   int ret = 0;
>   int power_src;
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> index 106352a4fb82..d844bc8411aa 100644
> --- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
> @@ -1514,7 +1514,7 @@ static int navi10_get_power_limit(struct
> smu_context *smu,
>bool asic_default)
>  {
>   PPTable_t *pptable = smu->smu_table.driver_pptable;
> - uint32_t asic_default_power_limit;
> + uint32_t asic_default_power_limit = 0;
>   int ret = 0;
>   int power_src;
> 
> --
> 2.23.0.rc1
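
The trade-off between options 3 and 4 in the quoted commit message can be sketched in isolation. This is a minimal illustration, not the driver code: every demo_* name below is hypothetical, and the structures are reduced to a single optional callback.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct demo_funcs {
	int (*read_arg)(uint32_t *arg);	/* optional hook, may be NULL */
};

struct demo_ctx {
	const struct demo_funcs *funcs;
};

/* Pattern from the warning: evaluates to 0 when the hook is missing,
 * so *arg is left untouched and the caller sees "success". */
#define demo_read_arg(ctx, arg) \
	((ctx)->funcs->read_arg ? (ctx)->funcs->read_arg(arg) : 0)

/* Option 3: report the missing hook so call sites can bail out. */
#define demo_read_arg_checked(ctx, arg) \
	((ctx)->funcs->read_arg ? (ctx)->funcs->read_arg(arg) : -EINVAL)

/* Option 4 (what the patch does): initialize at the top of the function,
 * so a missing hook yields a defined value of 0. */
static uint32_t demo_get_limit(struct demo_ctx *ctx)
{
	uint32_t limit = 0;

	(void)demo_read_arg(ctx, &limit);
	return limit;
}

/* Option 3 at a call site: the error is propagated and *out is not written. */
static int demo_get_limit_checked(struct demo_ctx *ctx, uint32_t *out)
{
	return demo_read_arg_checked(ctx, out);
}

/* Example hook used to exercise both paths. */
static int demo_read_42(uint32_t *arg)
{
	*arg = 42;
	return 0;
}
```

With a NULL hook, demo_get_limit() returns 0 while demo_get_limit_checked() returns -EINVAL and leaves the output untouched. Option 4 is a local change, while option 3 requires every call site to check the return value.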


RE: [PATCH 6/7] drm/amd/powerplay: Use proper enums in vega20_print_clk_levels

2019-07-15 Thread Quan, Evan

Thanks!  This is reviewed-by: Evan Quan 

Regards
Evan
> -Original Message-
> From: Nathan Chancellor 
> Sent: Monday, July 15, 2019 11:40 PM
> To: Arnd Bergmann 
> Cc: Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; Wentland, Harry ;
> Li, Sun peng (Leo) ; Rex Zhu ;
> Quan, Evan ; David Airlie ; Daniel
> Vetter ; amd-gfx list ; dri-devel ;
> Linux Kernel Mailing List ; clang-built-linux ;
> Wang, Kevin(Yang)
> Subject: Re: [PATCH 6/7] drm/amd/powerplay: Use proper enums in
> vega20_print_clk_levels
> 
> On Mon, Jul 15, 2019 at 11:25:29AM +0200, Arnd Bergmann wrote:
> > On Thu, Jul 4, 2019 at 7:52 AM Nathan Chancellor
> >  wrote:
> > >
> > > clang warns:
> > >
> > > drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:995:39:
> warning:
> > > implicit conversion from enumeration type 'PPCLK_e' to different
> > > enumeration type 'enum smu_clk_type' [-Wenum-conversion]
> > > ret = smu_get_current_clk_freq(smu, PPCLK_SOCCLK, );
> > >
> > > ~~^~~
> > > drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1016:39:
> warning:
> > > implicit conversion from enumeration type 'PPCLK_e' to different
> > > enumeration type 'enum smu_clk_type' [-Wenum-conversion]
> > > ret = smu_get_current_clk_freq(smu, PPCLK_FCLK, );
> > >
> > > ~~^
> > > drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1031:39:
> warning:
> > > implicit conversion from enumeration type 'PPCLK_e' to different
> > > enumeration type 'enum smu_clk_type' [-Wenum-conversion]
> > > ret = smu_get_current_clk_freq(smu, PPCLK_DCEFCLK, );
> > >
> > > ~~^~~~
> > >
> > > The values are mapped one to one in vega20_get_smu_clk_index so just
> > > use the proper enums here.
> > >
> > > Fixes: 096761014227 ("drm/amd/powerplay: support sysfs to get
> > > socclk, fclk, dcefclk")
> > > Link: https://github.com/ClangBuiltLinux/linux/issues/587
> > > Signed-off-by: Nathan Chancellor 
> > > ---
> >
> > Adding Kevin Wang for further review, as he sent a related patch in
> > d36893362d22 ("drm/amd/powerplay: fix smu clock type change miss
> > error")
> >
> > I assume this one is still required as it triggers the same warning.
> > Kevin, can you have a look?
> >
> >   Arnd
> 
> Indeed, this one and https://github.com/ClangBuiltLinux/linux/issues/586
> are still outstanding.
> 
> https://patchwork.freedesktop.org/patch/315581/
> 
> Cheers,
> Nathan
> 
> >
> > >  drivers/gpu/drm/amd/powerplay/vega20_ppt.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
> > > b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
> > > index 0f14fe14ecd8..e62dd6919b24 100644
> > > --- a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
> > > +++ b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
> > > @@ -992,7 +992,7 @@ static int vega20_print_clk_levels(struct
> smu_context *smu,
> > > break;
> > >
> > > case SMU_SOCCLK:
> > > -   ret = smu_get_current_clk_freq(smu, PPCLK_SOCCLK, );
> > > +   ret = smu_get_current_clk_freq(smu, SMU_SOCCLK,
> > > + );
> > > if (ret) {
> > > pr_err("Attempt to get current socclk Failed!");
> > > return ret;
> > > @@ -1013,7 +1013,7 @@ static int vega20_print_clk_levels(struct
> smu_context *smu,
> > > break;
> > >
> > > case SMU_FCLK:
> > > -   ret = smu_get_current_clk_freq(smu, PPCLK_FCLK, );
> > > +   ret = smu_get_current_clk_freq(smu, SMU_FCLK, );
> > > if (ret) {
> > > pr_err("Attempt to get current fclk Failed!");
> > > return ret;
> > > @@ -1028,7 +1028,7 @@ static int vega20_print_clk_levels(struct
> smu_context *smu,
> > > break;
> > >
> > > case SMU_DCEFCLK:
> > > -   ret = smu_get_current_clk_freq(smu, PPCLK_DCEFCLK, );
> > > +   ret = smu_get_current_clk_freq(smu, SMU_DCEFCLK,
> > > + );
> > > if (ret) {
> > > pr_err("Attempt to get current dcefclk Failed!");
> > > return ret;
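
Why the enum mix-up matters can be shown with a small stand-alone sketch. The enums and the mapping function below are hypothetical (the demo_* names and numeric values are assumptions, not the vega20 definitions); they only illustrate a generic clock type that is translated to an ASIC-specific index in one place.

```c
#include <errno.h>

/* Generic, driver-wide clock identifiers (hypothetical subset). */
enum demo_clk_type {
	DEMO_SOCCLK,
	DEMO_FCLK,
	DEMO_DCEFCLK,
	DEMO_CLK_COUNT
};

/* ASIC-specific firmware identifiers with a different numbering. */
enum demo_ppclk {
	DEMO_PPCLK_GFXCLK,
	DEMO_PPCLK_SOCCLK,
	DEMO_PPCLK_DCEFCLK,
	DEMO_PPCLK_FCLK
};

/* Single place where the generic type is translated to the ASIC index. */
static int demo_clk_index(enum demo_clk_type type)
{
	switch (type) {
	case DEMO_SOCCLK:
		return DEMO_PPCLK_SOCCLK;
	case DEMO_FCLK:
		return DEMO_PPCLK_FCLK;
	case DEMO_DCEFCLK:
		return DEMO_PPCLK_DCEFCLK;
	default:
		return -EINVAL;
	}
}
```

In this sketch, passing DEMO_PPCLK_SOCCLK where the generic type is expected would be read as DEMO_FCLK, because both have the value 1, so the wrong clock would be queried. Using the generic enum at the call site, as the quoted patch does, avoids relying on the two numberings happening to agree.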

RE: [PATCH] drm/amd/powerplay: remove spurious semicolon

2019-05-04 Thread Quan, Evan

Reviewed-by: Evan Quan 

> -Original Message-
> From: Andrea Righi 
> Sent: May 4, 2019 0:56
> To: Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> 
> Cc: Rex Zhu ; Wu, Hersen ;
> Quan, Evan ; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] drm/amd/powerplay: remove spurious semicolon
> 
> [CAUTION: External Email]
> 
> Remove unnecessary semicolons at the end of line.
> 
> Signed-off-by: Andrea Righi 
> ---
>  drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 8 
>  drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c | 2 +-
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> index 3f73f7cd18b9..1052f5119087 100644
> --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> @@ -1304,7 +1304,7 @@ static int pp_notify_smu_enable_pwe(void
> *handle)
> 
> if (hwmgr->hwmgr_func->smus_notify_pwe == NULL) {
> pr_info_ratelimited("%s was not implemented.\n", __func__);
> -   return -EINVAL;;
> +   return -EINVAL;
> }
> 
> mutex_lock(>smu_lock);
> @@ -1341,7 +1341,7 @@ static int pp_set_min_deep_sleep_dcefclk(void
> *handle, uint32_t clock)
> 
> if (hwmgr->hwmgr_func->set_min_deep_sleep_dcefclk == NULL) {
> pr_debug("%s was not implemented.\n", __func__);
> -   return -EINVAL;;
> +   return -EINVAL;
> }
> 
> mutex_lock(>smu_lock);
> @@ -1360,7 +1360,7 @@ static int pp_set_hard_min_dcefclk_by_freq(void
> *handle, uint32_t clock)
> 
> if (hwmgr->hwmgr_func->set_hard_min_dcefclk_by_freq == NULL) {
> pr_debug("%s was not implemented.\n", __func__);
> -   return -EINVAL;;
> +   return -EINVAL;
> }
> 
> mutex_lock(>smu_lock);
> @@ -1379,7 +1379,7 @@ static int pp_set_hard_min_fclk_by_freq(void
> *handle, uint32_t clock)
> 
> if (hwmgr->hwmgr_func->set_hard_min_fclk_by_freq == NULL) {
> pr_debug("%s was not implemented.\n", __func__);
> -   return -EINVAL;;
> +   return -EINVAL;
> }
> 
> mutex_lock(>smu_lock);
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
> index c1c51c115e57..70f7f47a2fcf 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
> @@ -76,7 +76,7 @@ int phm_set_power_state(struct pp_hwmgr *hwmgr,
> int phm_enable_dynamic_state_management(struct pp_hwmgr *hwmgr)
> {
> struct amdgpu_device *adev = NULL;
> -   int ret = -EINVAL;;
> +   int ret = -EINVAL;
> PHM_FUNC_CHECK(hwmgr);
> adev = hwmgr->adev;
> 
> --
> 2.20.1


RE: [PATCH v2] tests/amdgpu: add deadlock test for sdma

2019-03-05 Thread Quan, Evan

Reviewed-and-tested-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Cui,
> Flora
> Sent: Wednesday, March 06, 2019 2:37 PM
> To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Cui, Flora 
> Subject: [PATCH v2] tests/amdgpu: add deadlock test for sdma
> 
> The deadlock test for sdma will cause GPU recovery.
> Disable the test for now until GPU reset recovery can survive at least
> 1000 test runs.
> 
> v2: add modprobe parameter
> 
> Change-Id: I9adac63c62db22107345eddb30e7d81a1bda838c
> Signed-off-by: Flora Cui 
> ---
>  tests/amdgpu/amdgpu_test.c|   4 ++
>  tests/amdgpu/deadlock_tests.c | 103
> +-
>  2 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
> index ebf4409..38b8a68 100644
> --- a/tests/amdgpu/amdgpu_test.c
> +++ b/tests/amdgpu/amdgpu_test.c
> @@ -426,6 +426,10 @@ static void amdgpu_disable_suites()
>   "compute ring block test (set
> amdgpu.lockup_timeout=50)", CU_FALSE))
>   fprintf(stderr, "test deactivation failed - %s\n",
> CU_get_error_msg());
> 
> + if (amdgpu_set_test_active(DEADLOCK_TESTS_STR,
> + "sdma ring block test (set
> amdgpu.lockup_timeout=50)", CU_FALSE))
> + fprintf(stderr, "test deactivation failed - %s\n",
> +CU_get_error_msg());
> +
>   if (amdgpu_set_test_active(BO_TESTS_STR, "Metadata", CU_FALSE))
>   fprintf(stderr, "test deactivation failed - %s\n",
> CU_get_error_msg());
> 
> diff --git a/tests/amdgpu/deadlock_tests.c b/tests/amdgpu/deadlock_tests.c
> index a6c2635..91368c1 100644
> --- a/tests/amdgpu/deadlock_tests.c
> +++ b/tests/amdgpu/deadlock_tests.c
> @@ -96,6 +96,9 @@
> 
>  #define mmVM_CONTEXT0_PAGE_TABLE_BASE_ADDR
> 0x54f
> 
> +#define SDMA_PKT_HEADER_OP(x)(x & 0xff)
> +#define SDMA_OP_POLL_REGMEM  8
> +
>  static  amdgpu_device_handle device_handle;
>  static  uint32_t  major_version;
>  static  uint32_t  minor_version;
> @@ -110,6 +113,7 @@ static void amdgpu_deadlock_gfx(void);
>  static void amdgpu_deadlock_compute(void);
>  static void amdgpu_illegal_reg_access();
>  static void amdgpu_illegal_mem_access();
> +static void amdgpu_deadlock_sdma(void);
> 
>  CU_BOOL suite_deadlock_tests_enable(void)
>  {
> @@ -171,6 +175,7 @@ int suite_deadlock_tests_clean(void)
>  CU_TestInfo deadlock_tests[] = {
>   { "gfx ring block test (set amdgpu.lockup_timeout=50)",
> amdgpu_deadlock_gfx },
>   { "compute ring block test (set amdgpu.lockup_timeout=50)",
> amdgpu_deadlock_compute },
> + { "sdma ring block test (set amdgpu.lockup_timeout=50)",
> +amdgpu_deadlock_sdma },
>   { "illegal reg access test", amdgpu_illegal_reg_access },
>   { "illegal mem access test (set amdgpu.vm_fault_stop=2)",
> amdgpu_illegal_mem_access },
>   CU_TEST_INFO_NULL,
> @@ -260,7 +265,6 @@ static void amdgpu_deadlock_helper(unsigned
> ip_type)
>   ibs_request.ibs = _info;
>   ibs_request.resources = bo_list;
>   ibs_request.fence_info.handle = NULL;
> -
>   for (i = 0; i < 200; i++) {
>   r = amdgpu_cs_submit(context_handle, 0,_request, 1);
>   CU_ASSERT_EQUAL((r == 0 || r == -ECANCELED), 1);
> @@ -291,6 +295,103 @@ static void amdgpu_deadlock_helper(unsigned ip_type)
>   CU_ASSERT_EQUAL(r, 0);
>  }
> 
> +static void amdgpu_deadlock_sdma(void)
> +{
> + amdgpu_context_handle context_handle;
> + amdgpu_bo_handle ib_result_handle;
> + void *ib_result_cpu;
> + uint64_t ib_result_mc_address;
> + struct amdgpu_cs_request ibs_request;
> + struct amdgpu_cs_ib_info ib_info;
> + struct amdgpu_cs_fence fence_status;
> + uint32_t expired;
> + int i, r;
> + amdgpu_bo_list_handle bo_list;
> + amdgpu_va_handle va_handle;
> + struct drm_amdgpu_info_hw_ip info;
> + uint32_t ring_id;
> +
> + r = amdgpu_query_hw_ip_info(device_handle,
> AMDGPU_HW_IP_DMA, 0, );
> + CU_ASSERT_EQUAL(r, 0);
> +
> + r = amdgpu_cs_ctx_create(device_handle, _handle);
> + CU_ASSERT_EQUAL(r, 0);
> +
> + for (ring_id = 0; (1 << ring_id) & info.available_rings; ring_id++) {
> + r = pthread_create(_thread, NULL,
> write_mem_address, NULL);
> + CU_ASSERT_EQUAL(r, 0);
> +
> + r = amdgpu_bo_alloc_and_map_raw(device_handle, 4096,
> 4096,
> + AMDGPU_GEM_DOMAIN_GTT, 0,
> use_uc_mtype ? AMDGPU_VM_MTYPE_UC : 0,
> + _result_handle,
> _result_cpu,
> +
> _result_mc_address, _handle);
> + CU_ASSERT_EQUAL(r, 0);
> +
> + r = amdgpu_get_bo_list(device_handle, ib_result_handle,
> NULL,
> +_list);
> + CU_ASSERT_EQUAL(r, 0);
> +
> + ptr = ib_result_cpu;
> + i = 0;
> +
> + ptr[i++] =
> SDMA_PKT_HEADER_OP(SDMA_OP_POLL_REGMEM) |
> +
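
The ring loop in the quoted test walks a bitmask from bit 0 upward and stops at the first clear bit. A minimal sketch of that loop shape follows; demo_count_rings is a hypothetical helper, and the explicit bound of 32 is an addition here (not in the quoted patch) so that a fully set mask never causes a shift by the full width of the type.

```c
#include <stdint.h>

/* Count the rings reachable by walking contiguous set bits from bit 0.
 * The walk stops at the first clear bit, so higher set bits are ignored. */
static unsigned int demo_count_rings(uint32_t available_rings)
{
	unsigned int ring_id;

	for (ring_id = 0;
	     ring_id < 32 && ((UINT32_C(1) << ring_id) & available_rings);
	     ring_id++)
		;	/* the quoted test submits work to ring_id here */

	return ring_id;
}
```

With this loop shape a mask of 0x3 visits rings 0 and 1, while a sparse mask such as 0x5 visits only ring 0. That is fine as long as the available rings are reported as a contiguous run starting at bit 0, which this sketch assumes.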

Re: [radeon-alex:drm-next-4.18-wip 44/78] drivers/gpu/drm/amd/amdgpu/soc15.c:680:3-24: duplicated argument to & or | (fwd)

2018-05-27 Thread Quan, Evan

Thanks Julia.

It's a typo, and a patch has been sent out to fix it.

Regards,
Evan


From: Julia Lawall <julia.law...@lip6.fr>
Sent: Saturday, May 19, 2018 4:50:04 AM
To: Quan, Evan
Cc: Deucher, Alexander; dri-devel@lists.freedesktop.org; kbuild-...@01.org; 
Koenig, Christian; Huang, Ray
Subject: [radeon-alex:drm-next-4.18-wip 44/78] 
drivers/gpu/drm/amd/amdgpu/soc15.c:680:3-24: duplicated argument to & or | (fwd)

Lines 680 and 682 contain the same constant.

julia

-- Forwarded message --
Date: Sat, 19 May 2018 04:22:09 +0800
From: kbuild test robot <l...@intel.com>
To: kbu...@01.org
Cc: Julia Lawall <julia.law...@lip6.fr>
Subject: [radeon-alex:drm-next-4.18-wip 44/78]
drivers/gpu/drm/amd/amdgpu/soc15.c:680:3-24: duplicated argument to & or |

CC: kbuild-...@01.org
CC: dri-devel@lists.freedesktop.org
TO: Evan Quan <evan.q...@amd.com>
CC: Alex Deucher <alexander.deuc...@amd.com>
CC: "Christian König" <christian.koe...@amd.com>
CC: Huang Rui <ray.hu...@amd.com>

tree:   git://people.freedesktop.org/~agd5f/linux.git drm-next-4.18-wip
head:   aa1bce17d841a362d40da940487e13affe4c7b3b
commit: 27eab5ad57b2da7803bb3e5d61c666b52b57d6f6 [44/78] drm/amd/powerplay: 
update vega20 cg flags
:: branch date: 26 hours ago
:: commit date: 29 hours ago

>> drivers/gpu/drm/amd/amdgpu/soc15.c:680:3-24: duplicated argument to & or |

git remote add radeon-alex git://people.freedesktop.org/~agd5f/linux.git
git remote update radeon-alex
git checkout 27eab5ad57b2da7803bb3e5d61c666b52b57d6f6
vim +680 drivers/gpu/drm/amd/amdgpu/soc15.c

220ab9bd1 Ken Wang  2017-03-06  599
220ab9bd1 Ken Wang  2017-03-06  600  static int 
soc15_common_early_init(void *handle)
220ab9bd1 Ken Wang  2017-03-06  601  {
220ab9bd1 Ken Wang  2017-03-06  602  struct amdgpu_device *adev = 
(struct amdgpu_device *)handle;
220ab9bd1 Ken Wang  2017-03-06  603
220ab9bd1 Ken Wang  2017-03-06  604  adev->smc_rreg = NULL;
220ab9bd1 Ken Wang  2017-03-06  605  adev->smc_wreg = NULL;
220ab9bd1 Ken Wang  2017-03-06  606  adev->pcie_rreg = 
_pcie_rreg;
220ab9bd1 Ken Wang  2017-03-06  607  adev->pcie_wreg = 
_pcie_wreg;
220ab9bd1 Ken Wang  2017-03-06  608  adev->uvd_ctx_rreg = 
_uvd_ctx_rreg;
220ab9bd1 Ken Wang  2017-03-06  609  adev->uvd_ctx_wreg = 
_uvd_ctx_wreg;
220ab9bd1 Ken Wang  2017-03-06  610  adev->didt_rreg = 
_didt_rreg;
220ab9bd1 Ken Wang  2017-03-06  611  adev->didt_wreg = 
_didt_wreg;
560460f28 Evan Quan 2017-07-03  612  adev->gc_cac_rreg = 
_gc_cac_rreg;
560460f28 Evan Quan 2017-07-03  613  adev->gc_cac_wreg = 
_gc_cac_wreg;
2f11fb028 Evan Quan 2017-07-04  614  adev->se_cac_rreg = 
_se_cac_rreg;
2f11fb028 Evan Quan 2017-07-04  615  adev->se_cac_wreg = 
_se_cac_wreg;
220ab9bd1 Ken Wang  2017-03-06  616
220ab9bd1 Ken Wang  2017-03-06  617  adev->asic_funcs = 
_asic_funcs;
220ab9bd1 Ken Wang  2017-03-06  618
220ab9bd1 Ken Wang  2017-03-06  619  adev->rev_id = 
soc15_get_rev_id(adev);
220ab9bd1 Ken Wang  2017-03-06  620  adev->external_rev_id = 0xFF;
220ab9bd1 Ken Wang  2017-03-06  621  switch (adev->asic_type) {
220ab9bd1 Ken Wang  2017-03-06  622  case CHIP_VEGA10:
220ab9bd1 Ken Wang  2017-03-06  623  adev->cg_flags = 
AMD_CG_SUPPORT_GFX_MGCG |
220ab9bd1 Ken Wang  2017-03-06  624  
AMD_CG_SUPPORT_GFX_MGLS |
220ab9bd1 Ken Wang  2017-03-06  625  
AMD_CG_SUPPORT_GFX_RLC_LS |
220ab9bd1 Ken Wang  2017-03-06  626  
AMD_CG_SUPPORT_GFX_CP_LS |
220ab9bd1 Ken Wang  2017-03-06  627  
AMD_CG_SUPPORT_GFX_3D_CGCG |
220ab9bd1 Ken Wang  2017-03-06  628  
AMD_CG_SUPPORT_GFX_3D_CGLS |
220ab9bd1 Ken Wang  2017-03-06  629  
AMD_CG_SUPPORT_GFX_CGCG |
220ab9bd1 Ken Wang  2017-03-06  630  
AMD_CG_SUPPORT_GFX_CGLS |
220ab9bd1 Ken Wang  2017-03-06  631  
AMD_CG_SUPPORT_BIF_MGCG |
220ab9bd1 Ken Wang  2017-03-06  632  
AMD_CG_SUPPORT_BIF_LS |
220ab9bd1 Ken Wang  2017-03-06  633  
AMD_CG_SUPPORT_HDP_LS |
220ab9bd1 Ken Wang  2017-03-06  634  
AMD_CG_SUPPORT_DRM_MGCG |
220ab9bd1 Ken Wang  2017-03-06  635  
AMD_CG_SUPPORT_DRM_LS |
220ab9bd1 Ken Wang  2017-03-06  636  
AMD_CG_SUPPORT_ROM_MGCG |
220ab9bd1 Ken Wang  2017-03-06  637  
AMD_CG_SUPPORT_DF_MGCG |
220ab9bd1 Ken Wang  2017-03-06  638  
AMD_CG_SUPPORT_SDMA_MGCG |
220ab9bd1
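
The effect of the reported "duplicated argument to & or |" can be sketched with made-up flags. The DEMO_CG_* names and values are hypothetical, and which flag was actually intended in soc15.c is not shown in the quoted listing.

```c
#include <stdint.h>

/* Hypothetical feature flags, one bit each. */
#define DEMO_CG_A (UINT32_C(1) << 0)
#define DEMO_CG_B (UINT32_C(1) << 1)
#define DEMO_CG_C (UINT32_C(1) << 2)

/* Typo pattern: the same flag is OR-ed in twice. */
static uint32_t demo_flags_with_typo(void)
{
	return DEMO_CG_A | DEMO_CG_B | DEMO_CG_B;
}

/* What the author presumably meant in this sketch. */
static uint32_t demo_flags_intended(void)
{
	return DEMO_CG_A | DEMO_CG_B | DEMO_CG_C;
}
```

OR-ing a flag twice does not change the mask, so the code compiles and runs without complaint. The practical risk is that the repeated constant stands in for a different flag that is then silently left disabled, which is why the static checker flags it.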
