Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-30 Thread Sharma, Shashank




On 9/30/2022 11:54 AM, Lazar, Lijo wrote:



On 9/30/2022 2:52 PM, Sharma, Shashank wrote:



On 9/30/2022 11:13 AM, Lazar, Lijo wrote:



On 9/30/2022 2:07 PM, Sharma, Shashank wrote:



On 9/30/2022 7:08 AM, Lazar, Lijo wrote:



On 9/30/2022 12:02 AM, Alex Deucher wrote:
On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  
wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.
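
For illustration, a minimal standalone sketch of that selection logic
(the names, values and the fls() emulation are mine, not the
kernel's):

#include <stdio.h>

static unsigned int workload_mask;

static int fls_emul(unsigned int x)
{
    /* 1-based position of the highest set bit, like the kernel's fls() */
    return x ? 32 - __builtin_clz(x) : 0;
}

static int switch_profile(unsigned int priority_bit, unsigned int policy_max)
{
    int index;

    workload_mask |= 1u << priority_bit;  /* appended, not overwritten */
    index = fls_emul(workload_mask);      /* highest set bit wins */
    return index <= policy_max ? index - 1 : 0;
}

int main(void)
{
    printf("%d\n", switch_profile(3, 7)); /* only bit 3 set -> index 3 */
    printf("%d\n", switch_profile(1, 7)); /* bits 3 and 1 set -> still 3 */
    return 0;
}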



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.
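
Condensing that path into a standalone sketch (assumed names, not the
driver code): the generic profile id goes through the lookup table and
only a single bit ever reaches PMFW, regardless of what was
accumulated at the powerplay level.

enum pp_profile {
    PROFILE_BOOTUP_DEFAULT, /* -> WORKLOAD_PPLIB_DEFAULT_BIT (0) */
    PROFILE_FULLSCREEN3D,   /* -> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT (1) */
    PROFILE_POWERSAVING,    /* -> WORKLOAD_PPLIB_POWER_SAVING_BIT (2) */
    PROFILE_VIDEO,          /* -> WORKLOAD_PPLIB_VIDEO_BIT (3) */
    PROFILE_VR,             /* -> WORKLOAD_PPLIB_VR_BIT (4) */
    PROFILE_COMPUTE,        /* -> WORKLOAD_PPLIB_COMPUTE_BIT (5) */
    PROFILE_CUSTOM,         /* -> WORKLOAD_PPLIB_CUSTOM_BIT (6) */
    PROFILE_COUNT
};

static const unsigned int workload_map[PROFILE_COUNT] = {
    0, 1, 2, 3, 4, 5, 6
};

static unsigned int mask_sent_to_pmfw(enum pp_profile p)
{
    /* single-bit mask, as in the send_smc_msg_with_param() call above */
    return 1u << workload_map[p];
}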


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.
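
A minimal sketch of that arbitration, under assumed names (this is
not the actual amdgpu implementation): one refcount per hint, ordered
by aggressiveness, and the active profile is re-derived as the
highest-valued hint with a nonzero count.

enum workload_hint {
    HINT_DEFAULT,  /* least aggressive */
    HINT_3D,
    HINT_VIDEO,
    HINT_VR,
    HINT_COMPUTE,  /* most aggressive */
    HINT_COUNT
};

static unsigned int hint_refcount[HINT_COUNT];

static enum workload_hint arbitrate_hint(void)
{
    enum workload_hint h;

    for (h = HINT_COMPUTE; h > HINT_DEFAULT; h--)
        if (hint_refcount[h])
            return h;       /* highest requested hint wins */
    return HINT_DEFAULT;    /* all refcounts zero -> back to default */
}

static void hint_get(enum workload_hint h)
{
    hint_refcount[h]++;     /* job begins: take a reference */
}

static void hint_put(enum workload_hint h)
{
    if (hint_refcount[h])
        hint_refcount[h]--; /* job ends: drop the reference */
}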


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


No, I think there has been enough confusion around this
implementation so far; we will implement this just as Alex/Felix
suggested:

- No change will be done in 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-30 Thread Lazar, Lijo




On 9/30/2022 2:52 PM, Sharma, Shashank wrote:



On 9/30/2022 11:13 AM, Lazar, Lijo wrote:



On 9/30/2022 2:07 PM, Sharma, Shashank wrote:



On 9/30/2022 7:08 AM, Lazar, Lijo wrote:



On 9/30/2022 12:02 AM, Alex Deucher wrote:
On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  
wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


No, I think there has been enough confusion around this
implementation so far; we will implement this just as Alex/Felix
suggested:

- No change will be done in pm/SMU layer.


Well, a confusion doesn't justify 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-30 Thread Sharma, Shashank




On 9/30/2022 11:13 AM, Lazar, Lijo wrote:



On 9/30/2022 2:07 PM, Sharma, Shashank wrote:



On 9/30/2022 7:08 AM, Lazar, Lijo wrote:



On 9/30/2022 12:02 AM, Alex Deucher wrote:
On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  
wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


No, I think there has been enough confusion around this implementation
so far; we will implement this just as Alex/Felix suggested:

- No change will be done in pm/SMU layer.


Well, a confusion doesn't justify a bad implementation. You could just
keep the refcount in 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-30 Thread Lazar, Lijo




On 9/30/2022 2:07 PM, Sharma, Shashank wrote:



On 9/30/2022 7:08 AM, Lazar, Lijo wrote:



On 9/30/2022 12:02 AM, Alex Deucher wrote:

On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


No, I think there has been enough confusion around this implementation
so far; we will implement this just as Alex/Felix suggested:

- No change will be done in pm/SMU layer.


Well, a confusion doesn't justify a bad implementation. You could just
keep the refcount in workload_setting.


Another API that uses power 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-30 Thread Sharma, Shashank




On 9/30/2022 7:08 AM, Lazar, Lijo wrote:



On 9/30/2022 12:02 AM, Alex Deucher wrote:

On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


No, I think there has been enough confusion around this implementation
so far; we will implement this just as Alex/Felix suggested:

- No change will be done in pm/SMU layer.
- The amdgpu_context_workload layer will keep the ref_counting and
user_workload_hint management, and it will just call and consume the
pm_switch_workload_profile() like any other client.
- We will 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Lazar, Lijo




On 9/30/2022 12:02 AM, Alex Deucher wrote:

On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  wrote:




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.



Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.


Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.


Double-checked again based on Felix's comments on the API definition.
The driver decides the priority instead of the FW. That way we can
still keep the Get API.



3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.


Since the PM layer decides the priority, the refcount can be kept at
the powerplay and swsmu layers instead of in any higher-level API.

The user API may keep something like req_power_profile (for any
logging/debug purpose) for the job preference.


Thanks,
Lijo



Alex



Thanks,
Lijo


Now, when we have a single workload:
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for p1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job goes to reset function, ref_count = 0,
-> Power profile resets

Now, we have conflicts only when we see multiple workloads (Job1 and Job 2)
-> Job1: requests profile P1 via UAPI, ref count = 1
-> 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Lazar, Lijo




On 9/29/2022 11:37 PM, Felix Kuehling wrote:

On 2022-09-29 07:10, Lazar, Lijo wrote:



On 9/29/2022 2:18 PM, Sharma, Shashank wrote:



On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

Am 2022-09-27 um 10:58 schrieb Sharma, Shashank:

Hello Felix,

Thank for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

Am 2022-09-27 um 02:12 schrieb Christian König:

Am 26.09.22 um 23:40 schrieb Shashank Sharma:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as Job A
has not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not
zero due to compute
- Job A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute
profile at the same time. I think you'll need one refcount per
profile.

Regards,
 Felix


Thanks, this is exactly what I was looking for. I think Alex's
initial idea was around it, but I was under the assumption that there
is only one HW profile in SMU which keeps on getting overwritten.
This can solve our problems, as I can create an array of reference
counters, and will disable only the profile whose reference counter
goes to 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you
can specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one
will be selected out of the ones specified.  I think in the driver we
just set one bit at a time using the current interface.  It might be
better to change the interface and just ref count the hint types and
then when we call the set function look at the ref counts for each
hint type and set the mask as appropriate.

Alex
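
For illustration, the "ref count the hint types, then set the mask"
idea above could look like this sketch (assumed names, not the actual
interface): the mask sent to the SMU is rebuilt from whichever hint
types currently have users, and the SMU arbitrates among the set bits
internally.

#define HINT_TYPE_COUNT 7

static unsigned int hint_refcount[HINT_TYPE_COUNT];

static unsigned int build_workload_mask(void)
{
    unsigned int mask = 0;
    int i;

    for (i = 0; i < HINT_TYPE_COUNT; i++)
        if (hint_refcount[i])
            mask |= 1u << i;  /* one PPLIB bit per active hint type */
    return mask;              /* e.g. passed to SMU_MSG_SetWorkloadMask */
}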



Hey Alex,
Thanks for your comment. If that is the case, this current patch
series works straightforwardly, and no changes would be required.
Please let me know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: Requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0,
PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE

In this way, every job will be operating in the power profile of the
desired aggression or higher, and this API guarantees execution at
least in the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT        0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT          3
#define WORKLOAD_PPLIB_VR_BIT             4
#define WORKLOAD_PPLIB_COMPUTE_BIT        5
#define 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Alex Deucher
On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo  wrote:
>
>
>
> On 9/29/2022 7:30 PM, Sharma, Shashank wrote:
> >
> >
> > On 9/29/2022 3:37 PM, Lazar, Lijo wrote:
> >> To be clear about your understanding -
> >>
> >> Nothing is automatic in PMFW. PMFW picks a priority based on the
> >> actual mask sent by the driver.
> >>
> >> Assuming lower bits correspond to higher priority -
> >>
> >> If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
> >> the profile that corresponds to Bit0. If the driver sends a mask with
> >> Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
> >> that corresponds to Bit2. However, if the driver sends a mask with
> >> only a single bit set, it chooses that profile regardless of whatever
> >> the previous profile was. It doesn't check if the existing profile >
> >> the newly requested one. That is the behavior.
> >>
> >> So if a job chooses a profile that corresponds to Bit0, the driver
> >> will send that. Next time, if another job chooses a profile that
> >> corresponds to Bit1, PMFW will receive that as the new profile and
> >> switch to it. It trusts the driver to send the proper workload mask.
> >>
> >> Hope that gives the picture.
> >>
> >
> >
> > Thanks, my understanding is also similar, referring to the core power
> > switch profile function here:
> > amd_powerplay.c::pp_dpm_switch_power_profile()
> > *snip code*
> > hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
> >  index = fls(hwmgr->workload_mask);
> >  index = index <= Workload_Policy_Max ? index - 1 : 0;
> >  workload = hwmgr->workload_setting[index];
> > *snip_code*
> > hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);
> >
> > Here I can see that the new workload mask is appended into the existing
> > workload mask (not overwritten). So if we keep sending new
> > workload_modes, they would be appended into the workload flags and
> > finally the PM will pick the most aggressive one of all these flags, as
> > per its policy.
> >
>
> Actually it's misleading -
>
> The path for sienna is -
> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode
>
>
> This code here picks one based on a lookup table.
>
>   workload_type = smu_cmn_to_asic_specific_index(smu,
>
> CMN2ASIC_MAPPING_WORKLOAD,
>
> smu->power_profile_mode);
>
> This is that lookup table -
>
> static struct cmn2asic_mapping
> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
> WORKLOAD_PPLIB_DEFAULT_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
> WORKLOAD_PPLIB_POWER_SAVING_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
> WORKLOAD_PPLIB_VIDEO_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
> WORKLOAD_PPLIB_VR_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
> WORKLOAD_PPLIB_COMPUTE_BIT),
>  WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
> WORKLOAD_PPLIB_CUSTOM_BIT),
> };
>
>
> And this is the place of interaction with PMFW. (1 << workload_type) is
> the mask being sent.
>
> smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
>  1 << workload_type, NULL);
>
> In the end, driver implementation expects only one bit to be set.

Shashank and I had a discussion about this today.  I think there are a
few things we can do to handle this better:

1. Set a flag so that if the user changes the default via sysfs, that
overrides any runtime setting via an application, since presumably that
is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API.  There's no need for this, the hint is just a hint.
3. Have the driver arbitrate between the available workload profiles
based on the numeric value of the hint (e.g., default < 3D < video <
VR < compute) as the higher values are more aggressive in most cases.
If requests come in for 3D and compute at the same time, the driver
will select compute because its value is highest.  Each hint type
would be ref counted so we'd know what state to be in every time we go
to set the state.  If all of the clients requesting compute go away,
and only 3D requestors remain, we can switch to 3D.  If all refcounts
go to 0, we go back to default.  This will not require any change to
the current workload API in the SMU code.

Alex

>
> Thanks,
> Lijo
>
> > Now, when we have a single workload:
> > -> Job1: requests profile P1 via UAPI, ref count = 1
> > -> driver sends flags for p1
> > -> PM FW applies profile P1
> > -> Job executes in profile P1
> > -> Job goes to reset function, ref_count = 0,
> > -> Power profile resets
> >
> > Now, we have conflicts only when we see multiple workloads (Job1 and Job 2)
> > -> Job1: requests profile P1 via UAPI, ref count = 1
> > -> driver sends flags for p1
> > -> PM FW applies profile P1
> > -> Job executes in profile P1
> > -> Job2: requests profile P2 via UAPI, refcount = 2
> > -> 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Felix Kuehling

On 2022-09-29 07:10, Lazar, Lijo wrote:



On 9/29/2022 2:18 PM, Sharma, Shashank wrote:



On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

Am 2022-09-27 um 10:58 schrieb Sharma, Shashank:

Hello Felix,

Thank for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

Am 2022-09-27 um 02:12 schrieb Christian König:

Am 26.09.22 um 23:40 schrieb Shashank Sharma:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as Job A
has not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not
zero due to compute
- Job A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute
profile at the same time. I think you'll need one refcount per
profile.

Regards,
 Felix
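
For illustration, "one refcount per profile" could look like this
sketch (assumed names, not the actual patch): each profile is enabled
on its first user and disabled only when its own count returns to
zero, so disabling 3D cannot turn off an independently requested
compute profile.

#define PROFILE_COUNT 7

static unsigned int profile_refcount[PROFILE_COUNT];

static void switch_power_profile(int profile, int enable)
{
    /* stand-in for amdgpu_dpm_switch_power_profile(adev, profile, enable) */
}

static void profile_get(int profile)
{
    if (profile_refcount[profile]++ == 0)
        switch_power_profile(profile, 1);  /* first user enables */
}

static void profile_put(int profile)
{
    if (profile_refcount[profile] && --profile_refcount[profile] == 0)
        switch_power_profile(profile, 0);  /* last user disables */
}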


Thanks, this is exactly what I was looking for. I think Alex's
initial idea was around it, but I was under the assumption that there
is only one HW profile in SMU which keeps on getting overwritten.
This can solve our problems, as I can create an array of reference
counters, and will disable only the profile whose reference counter
goes to 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you
can specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one
will be selected out of the ones specified.  I think in the driver we
just set one bit at a time using the current interface.  It might be
better to change the interface and just ref count the hint types and
then when we call the set function look at the ref counts for each
hint type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment. If that is the case, this current patch
series works straightforwardly, and no changes would be required.
Please let me know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: Requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0,
PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE

In this way, every job will be operating in the power profile of the
desired aggression or higher, and this API guarantees execution at
least in the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT        0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT          3
#define WORKLOAD_PPLIB_VR_BIT             4
#define WORKLOAD_PPLIB_COMPUTE_BIT        5
#define WORKLOAD_PPLIB_CUSTOM_BIT         6

3D < video < VR < compute < 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Sharma, Shashank




On 9/29/2022 4:14 PM, Lazar, Lijo wrote:



On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.




Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.



Well, this seems like a bug in the core functions: the powerplay
layer is doing the right thing by appending the workload flags,
keeping in mind that a profile change can be requested while another
profile is active, but the core functions are actually ignoring those
flags.


This brings us to look into the actual PM FW expectations. If it
expects only one flag to be set in the power-mode change message, we
don't need to bother about this anymore. But if it can handle more
than one flag and the core driver implementation is blocking that, we
will have to fix it as well.


@Alex: How can we get more information on this?

- Shashank


Thanks,
Lijo


Now, when we have a single workload:
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job goes to reset function, ref count = 0
-> Power profile resets

Now, we have conflicts only when we see multiple workloads (Job1 and
Job2):
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job2: requests profile P2 via UAPI, ref count = 2
-> driver sends flags for (P1|P2)
-> PM FW picks the more aggressive of the two (say P1, stays in P1)
-> Job1 goes to reset function, ref count = 1, Job1 does not reset the
power profile
-> Job2 goes to reset function, ref count = 0, Job2 resets the power
profile
-> Power profile resets to None

So in this state machine, if there is only one job, it will be
executed in the desired mode. But if there are multiple, the most
aggressive profile will be picked, and every job will be executed in
at least the requested power profile mode or higher.
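
For illustration, a toy model of this single-refcount flow (assumed
names only, not the patch itself): profile flags accumulate while
jobs are in flight, and everything resets once the last job drops the
count to zero.

static unsigned int pp_flags;    /* OR of the requested profile bits */
static unsigned int pp_refcount;

static void job_set_profile(unsigned int profile_bit)
{
    pp_refcount++;
    pp_flags |= profile_bit;     /* PMFW would pick the most aggressive */
    /* send pp_flags to the PM layer here */
}

static void job_reset_profile(void)
{
    if (pp_refcount && --pp_refcount == 0)
        pp_flags = 0;            /* last job done: back to profile NONE */
}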


Do you find any problem so far?

- Shashank



Thanks,
Lijo


Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Lazar, Lijo




On 9/29/2022 7:30 PM, Sharma, Shashank wrote:



On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the
existing workload mask (not overwritten). So if we keep sending new
workload_modes, they would be appended into the workload flags and
finally the PM will pick the most aggressive one of all these flags,
as per its policy.




Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here picks one based on a lookup table.

	workload_type = smu_cmn_to_asic_specific_index(smu,
						       CMN2ASIC_MAPPING_WORKLOAD,
						       smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping
sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,
		     WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,
		     WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,
		     WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,
		     WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,
		     WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,
		     WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,
		     WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type)
is the mask being sent.

	smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
					1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.

Thanks,
Lijo


Now, when we have a single workload:
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job goes to reset function, ref count = 0
-> Power profile resets

Now, we have conflicts only when we see multiple workloads (Job1 and
Job2):
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job2: requests profile P2 via UAPI, ref count = 2
-> driver sends flags for (P1|P2)
-> PM FW picks the more aggressive of the two (say P1, stays in P1)
-> Job1 goes to reset function, ref count = 1, Job1 does not reset the
power profile
-> Job2 goes to reset function, ref count = 0, Job2 resets the power
profile
-> Power profile resets to None

So in this state machine, if there is only one job, it will be
executed in the desired mode. But if there are multiple, the most
aggressive profile will be picked, and every job will be executed in
at least the requested power profile mode or higher.


Do you find any problem so far?

- Shashank



Thanks,
Lijo


Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Sharma, Shashank




On 9/29/2022 3:37 PM, Lazar, Lijo wrote:

To be clear about your understanding -

Nothing is automatic in PMFW. PMFW picks a priority based on the
actual mask sent by the driver.

Assuming lower bits correspond to higher priority -

If the driver sends a mask with Bit3 and Bit0 set, PMFW will choose
the profile that corresponds to Bit0. If the driver sends a mask with
Bit4 and Bit2 set and the rest unset, PMFW will choose the profile
that corresponds to Bit2. However, if the driver sends a mask with
only a single bit set, it chooses that profile regardless of whatever
the previous profile was. It doesn't check if the existing profile >
the newly requested one. That is the behavior.

So if a job chooses a profile that corresponds to Bit0, the driver
will send that. Next time, if another job chooses a profile that
corresponds to Bit1, PMFW will receive that as the new profile and
switch to it. It trusts the driver to send the proper workload mask.

Hope that gives the picture.




Thanks, my understanding is also similar, referring to the core power
switch profile function here:
amd_powerplay.c::pp_dpm_switch_power_profile()

*snip code*
hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]);
index = fls(hwmgr->workload_mask);
index = index <= Workload_Policy_Max ? index - 1 : 0;
workload = hwmgr->workload_setting[index];
*snip_code*
hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);

Here I can see that the new workload mask is appended into the existing 
workload mask (not overwritten). So if we keep sending new 
workload_modes, they would be appended into the workload flags and 
finally the PM will pick the most aggressive one of all these flags, as 
per its policy.


Now, when we have a single workload:
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job goes to reset function, ref count = 0
-> Power profile resets

Now, we have conflicts only when we see multiple workloads (Job1 and
Job2):
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job2: requests profile P2 via UAPI, ref count = 2
-> driver sends flags for (P1|P2)
-> PM FW picks the more aggressive of the two (say P1, stays in P1)
-> Job1 goes to reset function, ref count = 1, Job1 does not reset the
power profile
-> Job2 goes to reset function, ref count = 0, Job2 resets the power
profile
-> Power profile resets to None

So in this state machine, if there is only one job, it will be
executed in the desired mode. But if there are multiple, the most
aggressive profile will be picked, and every job will be executed in
at least the requested power profile mode or higher.


Do you find any problem so far?

- Shashank



Thanks,
Lijo


Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Lazar, Lijo




On 9/29/2022 6:50 PM, Sharma, Shashank wrote:



On 9/29/2022 1:10 PM, Lazar, Lijo wrote:



On 9/29/2022 2:18 PM, Sharma, Shashank wrote:



On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take 
a look

as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple 
graphics
contexts submitting work concurrently. They would constantly 
override

or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be 
different
processes belonging to different users. Say, KFD enables the 
compute
profile first. Then the graphics context submits a job. At the 
start
of the job, the compute profile is enabled. That's a no-op 
because
KFD already enabled the compute profile. When the job 
finishes, it

disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will 
only act

if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as 
job A is

not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also 
runs

in PP 3D
- Job B finishes, but does not reset PP as reference count is 
not zero

due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute 
profile

at the same time. I think you'll need one refcount per profile.

Regards,
 Felix


Thanks, This is exactly what I was looking for, I think Alex's 
initial
idea was around it, but I was under the assumption that there is 
only
one HW profile in SMU which keeps on getting overwritten. This 
can solve
our problems, as I can create an array of reference counters, and 
will

disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you 
can

specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one 
will

be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be 
better

to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex
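A rough sketch of that suggested interface change, with one refcount per hint
type and the mask rebuilt on every change. All names are hypothetical, locking
is omitted, and the per-ASIC hint-to-bit mapping
(smu_cmn_to_asic_specific_index()) is skipped for brevity:

static int hint_ref[PP_SMC_POWER_PROFILE_COUNT];  /* one count per hint type */

static void workload_hint_get(struct smu_context *smu, int hint)
{
    u32 mask = 0;
    int i;

    hint_ref[hint]++;
    for (i = 0; i < PP_SMC_POWER_PROFILE_COUNT; i++)
        if (hint_ref[i])
            mask |= 1 << i;  /* SMU arbitrates between all set bits */
    smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, mask, NULL);
}

A matching workload_hint_put() would decrement the count and resend the
rebuilt mask the same way.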



Hey Alex,
Thanks for your comment, if that is the case, this current patch 
series
works straightforwardly, and no changes would be required. Please 
let me

know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref 
> 0,

PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref 
> 0,

PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of 
desired
aggression or higher, and this API guarantees the execution 
at least in

the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT    0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT  3
#define WORKLOAD_PPLIB_VR_BIT 4
#define WORKLOAD_PPLIB_COMPUTE_BIT    5
#define WORKLOAD_PPLIB_CUSTOM_BIT 6

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Sharma, Shashank




On 9/29/2022 1:10 PM, Lazar, Lijo wrote:



On 9/29/2022 2:18 PM, Sharma, Shashank wrote:



On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a 
look

as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple 
graphics
contexts submitting work concurrently. They would constantly 
override

or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be 
different
processes belonging to different users. Say, KFD enables the 
compute
profile first. Then the graphics context submits a job. At the 
start

of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will 
only act

if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as 
job A is

not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also 
runs

in PP 3D
- Job B finishes, but does not reset PP as reference count is 
not zero

due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute 
profile

at the same time. I think you'll need one refcount per profile.

Regards,
 Felix


Thanks, This is exactly what I was looking for, I think Alex's 
initial

idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can 
solve
our problems, as I can create an array of reference counters, and 
will

disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment, if that is the case, this current patch series
works straightforwardly, and no changes would be required. Please let me
know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref 
> 0,

PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref 
> 0,

PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of 
desired

aggression or higher, and this API guarantees the execution at least in
the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT    0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT  3
#define WORKLOAD_PPLIB_VR_BIT 4
#define WORKLOAD_PPLIB_COMPUTE_BIT    5
#define WORKLOAD_PPLIB_CUSTOM_BIT 6

3D < video < VR < compute < custom


Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Lazar, Lijo




On 9/29/2022 2:18 PM, Sharma, Shashank wrote:



On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a 
look

as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly 
override

or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be 
different
processes belonging to different users. Say, KFD enables the 
compute
profile first. Then the graphics context submits a job. At the 
start

of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only 
act

if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job 
A is

not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not 
zero

due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute 
profile

at the same time. I think you'll need one refcount per profile.

Regards,
 Felix


Thanks, This is exactly what I was looking for, I think Alex's initial
idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can 
solve

our problems, as I can create an array of reference counters, and will
disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment, if that is the case, this current patch series
works straightforwardly, and no changes would be required. Please let me
know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0,
PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of desired
aggression or higher, and this API guarantees the execution at least in
the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT    0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT  3
#define WORKLOAD_PPLIB_VR_BIT 4
#define WORKLOAD_PPLIB_COMPUTE_BIT    5
#define WORKLOAD_PPLIB_CUSTOM_BIT 6

3D < video < VR < compute < custom

VR and compute are the most aggressive.  Custom takes preference because it's user customizable.

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-29 Thread Sharma, Shashank




On 9/28/2022 11:51 PM, Alex Deucher wrote:

On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is
not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero
due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute profile
at the same time. I think you'll need one refcount per profile.

Regards,
 Felix


Thanks, This is exactly what I was looking for, I think Alex's initial
idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can solve
our problems, as I can create an array of reference counters, and will
disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment, if that is the case, this current patch series
works straightforwardly, and no changes would be required. Please let me
know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0,
PP still 3D
- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of desired
aggression or higher, and this API guarantees the execution at least in
the desired power profile.


I'm not entirely sure on the relative levels of aggression, but I
believe the SMU prioritizes them by index.  E.g.
#define WORKLOAD_PPLIB_DEFAULT_BIT0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT  3
#define WORKLOAD_PPLIB_VR_BIT 4
#define WORKLOAD_PPLIB_COMPUTE_BIT5
#define WORKLOAD_PPLIB_CUSTOM_BIT 6

3D < video < VR < compute < custom

VR and compute are the most aggressive.  Custom takes preference
because it's user customizable.

Alex
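To illustrate with these defines: if both VIDEO and COMPUTE bits were set in
one mask, an fls()-style highest-bit pick selects COMPUTE, matching the
ordering above (illustrative only):

u32 mask = BIT(WORKLOAD_PPLIB_VIDEO_BIT) | BIT(WORKLOAD_PPLIB_COMPUTE_BIT);
int sel = fls(mask) - 1;  /* = WORKLOAD_PPLIB_COMPUTE_BIT (5): compute wins */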



Thanks, so this 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-28 Thread Alex Deucher
On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank
 wrote:
>
>
>
> On 9/27/2022 10:40 PM, Alex Deucher wrote:
> > On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
> >  wrote:
> >>
> >>
> >>
> >> On 9/27/2022 5:23 PM, Felix Kuehling wrote:
> >>> On 2022-09-27 at 10:58, Sharma, Shashank wrote:
>  Hello Felix,
> 
>  Thanks for the review comments.
> 
>  On 9/27/2022 4:48 PM, Felix Kuehling wrote:
> > On 2022-09-27 at 02:12, Christian König wrote:
> >> On 26.09.22 at 23:40, Shashank Sharma wrote:
> >>> This patch switches the GPU workload mode to/from
> >>> compute mode, while submitting compute workload.
> >>>
> >>> Signed-off-by: Alex Deucher 
> >>> Signed-off-by: Shashank Sharma 
> >>
> >> Feel free to add my acked-by, but Felix should probably take a look
> >> as well.
> >
> > This looks OK purely from a compute perspective. But I'm concerned
> > about the interaction of compute with graphics or multiple graphics
> > contexts submitting work concurrently. They would constantly override
> > or disable each other's workload hints.
> >
> > For example, you have an amdgpu_ctx with
> > AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
> > process that also wants the compute profile. Those could be different
> > processes belonging to different users. Say, KFD enables the compute
> > profile first. Then the graphics context submits a job. At the start
> > of the job, the compute profile is enabled. That's a no-op because
> > KFD already enabled the compute profile. When the job finishes, it
> > disables the compute profile for everyone, including KFD. That's
> > unexpected.
> >
> 
>  In this case, it will not disable the compute profile, as the
>  reference counter will not be zero. The reset_profile() will only act
>  if the reference counter is 0.
> >>>
> >>> OK, I missed the reference counter.
> >>>
> >>>
> 
>  But I would be happy to get any inputs about a policy which can be
>  more sustainable and gets better outputs, for example:
>  - should we not allow a profile change if a PP mode is already
>  applied, and keep it on an early-bird basis?
> 
>  For example: Policy A
>  - Job A sets the profile to compute
>  - Job B tries to set profile to 3D, but we do not allow it as job A is
>  not finished yet.
> 
>  Or Policy B: Current one
>  - Job A sets the profile to compute
>  - Job B tries to set profile to 3D, and we allow it. Job A also runs
>  in PP 3D
>  - Job B finishes, but does not reset PP as reference count is not zero
>  due to compute
>  - Job  A finishes, profile reset to NONE
> >>>
> >>> I think this won't work. As I understand it, the
> >>> amdgpu_dpm_switch_power_profile enables and disables individual
> >>> profiles. Disabling the 3D profile doesn't disable the compute profile
> >>> at the same time. I think you'll need one refcount per profile.
> >>>
> >>> Regards,
> >>> Felix
> >>
> >> Thanks, This is exactly what I was looking for, I think Alex's initial
> >> idea was around it, but I was under the assumption that there is only
> >> one HW profile in SMU which keeps on getting overwritten. This can solve
> >> our problems, as I can create an array of reference counters, and will
> >> disable only the profile whose reference counter goes 0.
> >
> > It's been a while since I paged any of this code into my head, but I
> > believe the actual workload message in the SMU is a mask where you can
> > specify multiple workload types at the same time and the SMU will
> > arbitrate between them internally.  E.g., the most aggressive one will
> > be selected out of the ones specified.  I think in the driver we just
> > set one bit at a time using the current interface.  It might be better
> > to change the interface and just ref count the hint types and then
> > when we call the set function look at the ref counts for each hint
> > type and set the mask as appropriate.
> >
> > Alex
> >
>
> Hey Alex,
> Thanks for your comment, if that is the case, this current patch series
> works straightforwardly, and no changes would be required. Please let me
> know if my understanding is correct:
>
> Assumption: Order of aggression: 3D > Media > Compute
>
> - Job 1: Requests mode compute: PP changed to compute, ref count 1
> - Job 2: Requests mode media: PP changed to media, ref count 2
> - Job 3: requests mode 3D: PP changed to 3D, ref count 3
> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0,
> PP still 3D
> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
> PP still 3D
> - Job 2 finishes, downs ref count to 0, PP changed to NONE,
>
> In this way, every job will be operating in the Power profile of desired
> aggression or higher, and this API guarantees the execution at least in
> the desired power profile.

I'm not entirely sure on the relative levels of aggression, but I believe the SMU prioritizes them by index.

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-28 Thread Sharma, Shashank

Small correction,

On 9/28/2022 10:56 AM, Sharma, Shashank wrote:



On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is
not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero
due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute profile
at the same time. I think you'll need one refcount per profile.

Regards,
    Felix


Thanks, This is exactly what I was looking for, I think Alex's initial
idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can solve
our problems, as I can create an array of reference counters, and will
disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment, if that is the case, this current patch series 
works straightforwardly, and no changes would be required. 


Only one change would be required: append the new power profile 
request to the existing power profile mask, instead of overwriting it. 
This is where the current state machine pm.workload_mode would be useful.
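As a sketch of that one change (adev->pm.workload_mask is a hypothetical
field here, standing in for the current pm.workload_mode state):

/* Today (conceptually): each request overwrites the previous preference. */
adev->pm.workload_mask = BIT(hint);

/* Proposed: append, so earlier jobs' preferences stay visible to the SMU. */
adev->pm.workload_mask |= BIT(hint);
smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
                                adev->pm.workload_mask, NULL);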


- Shashank

Please let me

know if my understanding is correct:

Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, 
PP still 3D

- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of desired 
aggression or higher, and this API guarantees the execution at least in 
the desired power profile.


- Shashank





- Shashank







Or anything else?

Regards
Shashank



Or you have multiple VCN contexts. When context1 finishes a job, it
disables the VIDEO profile. But context2 still has a job on the other
VCN engine and wants the VIDEO profile to still be enabled.

Regards,
    Felix




Christian.


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-28 Thread Sharma, Shashank




On 9/27/2022 10:40 PM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is
not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero
due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute profile
at the same time. I think you'll need one refcount per profile.

Regards,
Felix


Thanks, This is exactly what I was looking for, I think Alex's initial
idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can solve
our problems, as I can create an array of reference counters, and will
disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex



Hey Alex,
Thanks for your comment, if that is the case, this current patch series 
works straightforwardly, and no changes would be required. Please let me 
know if my understanding is correct:


Assumption: Order of aggression: 3D > Media > Compute

- Job 1: Requests mode compute: PP changed to compute, ref count 1
- Job 2: Requests mode media: PP changed to media, ref count 2
- Job 3: requests mode 3D: PP changed to 3D, ref count 3
- Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, 
PP still 3D

- Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0,
PP still 3D
- Job 2 finishes, downs ref count to 0, PP changed to NONE,

In this way, every job will be operating in the Power profile of desired 
aggression or higher, and this API guarantees the execution at least in 
the desired power profile.


- Shashank





- Shashank







Or anything else?

Regards
Shashank



Or you have multiple VCN contexts. When context1 finishes a job, it
disables the VIDEO profile. But context2 still has a job on the other
VCN engine and wants the VIDEO profile to still be enabled.

Regards,
Felix




Christian.


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
   1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-28 Thread Lazar, Lijo




On 9/28/2022 2:10 AM, Alex Deucher wrote:

On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look
as well.


This looks OK purely from a compute perspective. But I'm concerned
about the interaction of compute with graphics or multiple graphics
contexts submitting work concurrently. They would constantly override
or disable each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start
of the job, the compute profile is enabled. That's a no-op because
KFD already enabled the compute profile. When the job finishes, it
disables the compute profile for everyone, including KFD. That's
unexpected.



In this case, it will not disable the compute profile, as the
reference counter will not be zero. The reset_profile() will only act
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is
not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero
due to compute
- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the
amdgpu_dpm_switch_power_profile enables and disables individual
profiles. Disabling the 3D profile doesn't disable the compute profile
at the same time. I think you'll need one refcount per profile.

Regards,
Felix


Thanks, This is exactly what I was looking for, I think Alex's initial
idea was around it, but I was under the assumption that there is only
one HW profile in SMU which keeps on getting overwritten. This can solve
our problems, as I can create an array of reference counters, and will
disable only the profile whose reference counter goes 0.


It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.


Yes, this is how it works today. Only one profile is set at a time and 
so setting another one will overwrite the current driver preference.


I think the current expectation of usage is from a system settings 
perspective like Gaming Mode (Full screen 3D) or Cinematic mode (Video) 
etc. This is also set through sysfs and there is also a Custom mode. 
It's not used in the sense of a per-job setting.


  It might be better

to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

This means a pm subsystem level change, and the ref counts need to be kept 
in pm layer to account for changes through sysfs or APIs.


Thanks,
Lijo


Alex




- Shashank







Or anything else?

Regards
Shashank



Or you have multiple VCN contexts. When context1 finishes a job, it
disables the VIDEO profile. But context2 still has a job on the other
VCN engine and wants the VIDEO profile to still be enabled.

Regards,
Felix




Christian.


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
   1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
   #include "amdgpu_ras.h"
   #include "amdgpu_umc.h"
   #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"
 /* Total memory size in system memory and all GPU VRAM. Used to
* estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Alex Deucher
On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
 wrote:
>
>
>
> On 9/27/2022 5:23 PM, Felix Kuehling wrote:
> > On 2022-09-27 at 10:58, Sharma, Shashank wrote:
> >> Hello Felix,
> >>
> >> Thanks for the review comments.
> >>
> >> On 9/27/2022 4:48 PM, Felix Kuehling wrote:
> >>> On 2022-09-27 at 02:12, Christian König wrote:
>  On 26.09.22 at 23:40, Shashank Sharma wrote:
> > This patch switches the GPU workload mode to/from
> > compute mode, while submitting compute workload.
> >
> > Signed-off-by: Alex Deucher 
> > Signed-off-by: Shashank Sharma 
> 
>  Feel free to add my acked-by, but Felix should probably take a look
>  as well.
> >>>
> >>> This looks OK purely from a compute perspective. But I'm concerned
> >>> about the interaction of compute with graphics or multiple graphics
> >>> contexts submitting work concurrently. They would constantly override
> >>> or disable each other's workload hints.
> >>>
> >>> For example, you have an amdgpu_ctx with
> >>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
> >>> process that also wants the compute profile. Those could be different
> >>> processes belonging to different users. Say, KFD enables the compute
> >>> profile first. Then the graphics context submits a job. At the start
> >>> of the job, the compute profile is enabled. That's a no-op because
> >>> KFD already enabled the compute profile. When the job finishes, it
> >>> disables the compute profile for everyone, including KFD. That's
> >>> unexpected.
> >>>
> >>
> >> In this case, it will not disable the compute profile, as the
> >> reference counter will not be zero. The reset_profile() will only act
> >> if the reference counter is 0.
> >
> > OK, I missed the reference counter.
> >
> >
> >>
> >> But I would be happy to get any inputs about a policy which can be
> >> more sustainable and gets better outputs, for example:
> >> - should we not allow a profile change if a PP mode is already
> >> applied, and keep it on an early-bird basis?
> >>
> >> For example: Policy A
> >> - Job A sets the profile to compute
> >> - Job B tries to set profile to 3D, but we do not allow it as job A is
> >> not finished yet.
> >>
> >> Or Policy B: Current one
> >> - Job A sets the profile to compute
> >> - Job B tries to set profile to 3D, and we allow it. Job A also runs
> >> in PP 3D
> >> - Job B finishes, but does not reset PP as reference count is not zero
> >> due to compute
> >> - Job  A finishes, profile reset to NONE
> >
> > I think this won't work. As I understand it, the
> > amdgpu_dpm_switch_power_profile enables and disables individual
> > profiles. Disabling the 3D profile doesn't disable the compute profile
> > at the same time. I think you'll need one refcount per profile.
> >
> > Regards,
> >Felix
>
> Thanks, This is exactly what I was looking for, I think Alex's initial
> idea was around it, but I was under the assumption that there is only
> one HW profile in SMU which keeps on getting overwritten. This can solve
> our problems, as I can create an array of reference counters, and will
> disable only the profile whose reference counter goes 0.

It's been a while since I paged any of this code into my head, but I
believe the actual workload message in the SMU is a mask where you can
specify multiple workload types at the same time and the SMU will
arbitrate between them internally.  E.g., the most aggressive one will
be selected out of the ones specified.  I think in the driver we just
set one bit at a time using the current interface.  It might be better
to change the interface and just ref count the hint types and then
when we call the set function look at the ref counts for each hint
type and set the mask as appropriate.

Alex


>
> - Shashank
>
> >
> >
> >>
> >>
> >> Or anything else?
> >>
> >> Regards
> >> Shashank
> >>
> >>
> >>> Or you have multiple VCN contexts. When context1 finishes a job, it
> >>> disables the VIDEO profile. But context2 still has a job on the other
> >>> VCN engine and wants the VIDEO profile to still be enabled.
> >>>
> >>> Regards,
> >>>Felix
> >>>
> >>>
> 
>  Christian.
> 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
> >   1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index 5e53a5293935..1caed319a448 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -34,6 +34,7 @@
> >   #include "amdgpu_ras.h"
> >   #include "amdgpu_umc.h"
> >   #include "amdgpu_reset.h"
> > +#include "amdgpu_ctx_workload.h"
> > /* Total memory size in system memory and all GPU VRAM. Used to
> >* estimate worst case amount of memory to reserve for page tables
> > @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct
> > 

Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Sharma, Shashank




On 9/27/2022 5:23 PM, Felix Kuehling wrote:

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look 
as well.


This looks OK purely from a compute perspective. But I'm concerned 
about the interaction of compute with graphics or multiple graphics 
contexts submitting work concurrently. They would constantly override 
or disable each other's workload hints.


For example, you have an amdgpu_ctx with 
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD 
process that also wants the compute profile. Those could be different 
processes belonging to different users. Say, KFD enables the compute 
profile first. Then the graphics context submits a job. At the start 
of the job, the compute profile is enabled. That's a no-op because 
KFD already enabled the compute profile. When the job finishes, it 
disables the compute profile for everyone, including KFD. That's 
unexpected.




In this case, it will not disable the compute profile, as the 
reference counter will not be zero. The reset_profile() will only act 
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be 
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?


For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is 
not finished yet.


Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs 
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero 
due to compute

- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the 
amdgpu_dpm_switch_power_profile enables and disables individual 
profiles. Disabling the 3D profile doesn't disable the compute profile 
at the same time. I think you'll need one refcount per profile.


Regards,
   Felix


Thanks, This is exactly what I was looking for, I think Alex's initial 
idea was around it, but I was under the assumption that there is only 
one HW profile in SMU which keeps on getting overwritten. This can solve 
our problems, as I can create an array of reference counters, and will 
disable only the profile whose reference counter goes 0.
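A minimal sketch of that refcount array, using hypothetical wrapper names
around the existing amdgpu_dpm_switch_power_profile() (locking omitted for
brevity):

static int profile_ref[PP_SMC_POWER_PROFILE_COUNT];  /* one count per profile */

static void profile_get(struct amdgpu_device *adev,
                        enum PP_SMC_POWER_PROFILE p)
{
    if (profile_ref[p]++ == 0)  /* first user enables the profile */
        amdgpu_dpm_switch_power_profile(adev, p, true);
}

static void profile_put(struct amdgpu_device *adev,
                        enum PP_SMC_POWER_PROFILE p)
{
    if (--profile_ref[p] == 0)  /* last user disables it */
        amdgpu_dpm_switch_power_profile(adev, p, false);
}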


- Shashank







Or anything else?

Regards
Shashank


Or you have multiple VCN contexts. When context1 finishes a job, it 
disables the VIDEO profile. But context2 still has a job on the other 
VCN engine and wants the VIDEO profile to still be enabled.


Regards,
   Felix




Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c

index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
  #include "amdgpu_ras.h"
  #include "amdgpu_umc.h"
  #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"
    /* Total memory size in system memory and all GPU VRAM. Used to
   * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct 
amdgpu_device *adev,
    void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, 
bool idle)

  {
-    amdgpu_dpm_switch_power_profile(adev,
-    PP_SMC_POWER_PROFILE_COMPUTE,
-    !idle);
+    int ret;
+
+    if (idle)
+    ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+    else
+    ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+    if (ret)
+    drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+ idle ? "reset" : "set");
  }
    bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 
vmid)
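For context, with the refcounted helpers introduced earlier in this series, a
compute job's lifetime maps onto a set/clear pair around the work
(illustrative usage only):

amdgpu_amdkfd_set_compute_idle(adev, false);  /* job starts: hint refcount++ */
/* ... compute work executes in the COMPUTE profile (or higher) ... */
amdgpu_amdkfd_set_compute_idle(adev, true);   /* job ends: refcount--; the
                                                 profile resets only at zero */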




Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Felix Kuehling

On 2022-09-27 at 10:58, Sharma, Shashank wrote:

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look 
as well.


This looks OK purely from a compute perspective. But I'm concerned 
about the interaction of compute with graphics or multiple graphics 
contexts submitting work concurrently. They would constantly override 
or disable each other's workload hints.


For example, you have an amdgpu_ctx with 
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD 
process that also wants the compute profile. Those could be different 
processes belonging to different users. Say, KFD enables the compute 
profile first. Then the graphics context submits a job. At the start 
of the job, the compute profile is enabled. That's a no-op because 
KFD already enabled the compute profile. When the job finishes, it 
disables the compute profile for everyone, including KFD. That's 
unexpected.




In this case, it will not disable the compute profile, as the 
reference counter will not be zero. The reset_profile() will only act 
if the reference counter is 0.


OK, I missed the reference counter.




But I would be happy to get any inputs about a policy which can be 
more sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already
applied, and keep it on an early-bird basis?


For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is 
not finished yet.


Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs 
in PP 3D
- Job B finishes, but does not reset PP as reference count is not zero 
due to compute

- Job  A finishes, profile reset to NONE


I think this won't work. As I understand it, the 
amdgpu_dpm_switch_power_profile enables and disables individual 
profiles. Disabling the 3D profile doesn't disable the compute profile 
at the same time. I think you'll need one refcount per profile.


Regards,
  Felix





Or anything else?

Regards
Shashank


Or you have multiple VCN contexts. When context1 finishes a job, it 
disables the VIDEO profile. But context2 still has a job on the other 
VCN engine and wants the VIDEO profile to still be enabled.


Regards,
   Felix




Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c

index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
  #include "amdgpu_ras.h"
  #include "amdgpu_umc.h"
  #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"
    /* Total memory size in system memory and all GPU VRAM. Used to
   * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct 
amdgpu_device *adev,
    void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, 
bool idle)

  {
-    amdgpu_dpm_switch_power_profile(adev,
-    PP_SMC_POWER_PROFILE_COMPUTE,
-    !idle);
+    int ret;
+
+    if (idle)
+    ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+    else
+    ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+    if (ret)
+    drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+ idle ? "reset" : "set");
  }
    bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 
vmid)




Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Sharma, Shashank

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from
compute mode, while submitting compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look as 
well.


This looks OK purely from a compute perspective. But I'm concerned about 
the interaction of compute with graphics or multiple graphics contexts 
submitting work concurrently. They would constantly override or disable 
each other's workload hints.


For example, you have an amdgpu_ctx with 
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD 
process that also wants the compute profile. Those could be different 
processes belonging to different users. Say, KFD enables the compute 
profile first. Then the graphics context submits a job. At the start of 
the job, the compute profile is enabled. That's a no-op because KFD 
already enabled the compute profile. When the job finishes, it disables 
the compute profile for everyone, including KFD. That's unexpected.




In this case, it will not disable the compute profile, as the reference 
counter will not be zero. The reset_profile() will only act if the 
reference counter is 0.


But I would be happy to get any inputs about a policy which can be more 
sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already applied,
and keep it on an early-bird basis?


For example: Policy A
- Job A sets the profile to compute
- Job B tries to set profile to 3D, but we do not allow it as job A is 
not finished yet.


Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set profile to 3D, and we allow it. Job A also runs in 
PP 3D
- Job B finishes, but does not reset PP as reference count is not zero 
due to compute

- Job  A finishes, profile reset to NONE


Or anything else?

Regards
Shashank


Or you have multiple VCN contexts. When context1 finishes a job, it 
disables the VIDEO profile. But context2 still has a job on the other 
VCN engine and wants the VIDEO profile to still be enabled.


Regards,
   Felix




Christian.


---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_umc.h"
 #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"

 /* Total memory size in system memory and all GPU VRAM. Used to
  * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,

 void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 {
-	amdgpu_dpm_switch_power_profile(adev,
-					PP_SMC_POWER_PROFILE_COMPUTE,
-					!idle);
+	int ret;
+
+	if (idle)
+		ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+	else
+		ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+	if (ret)
+		drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+			 idle ? "reset" : "set");
 }

 bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)




Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Felix Kuehling

On 2022-09-27 at 02:12, Christian König wrote:

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from compute mode while
submitting a compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look as 
well.


This looks OK purely from a compute perspective. But I'm concerned about 
the interaction of compute with graphics or multiple graphics contexts 
submitting work concurrently. They would constantly override or disable 
each other's workload hints.


For example, you have an amdgpu_ctx with 
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD 
process that also wants the compute profile. Those could be different 
processes belonging to different users. Say, KFD enables the compute 
profile first. Then the graphics context submits a job. At the start of 
the job, the compute profile is enabled. That's a no-op because KFD 
already enabled the compute profile. When the job finishes, it disables 
the compute profile for everyone, including KFD. That's unexpected.


Or you have multiple VCN contexts. When context1 finishes a job, it 
disables the VIDEO profile. But context2 still has a job on the other 
VCN engine and wants the VIDEO profile to still be enabled.
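
To make the concern concrete, here is a minimal compilable sketch of why
a plain on/off profile switch misbehaves with two concurrent users. All
names here are invented for illustration; none of this is driver code.

#include <stdbool.h>
#include <stdio.h>

/* A plain on/off profile switch shared by all contexts */
static bool video_profile_enabled;

static void job_start(void) { video_profile_enabled = true;  }
static void job_end(void)   { video_profile_enabled = false; }

int main(void)
{
	job_start(); /* context1 starts a job on VCN engine 0 */
	job_start(); /* context2 starts a job on VCN engine 1 */
	job_end();   /* context1 finishes first ... */

	/* ... and the VIDEO profile is now off although context2 still runs */
	printf("video profile enabled: %s\n",
	       video_profile_enabled ? "yes" : "no"); /* prints "no" */
	return 0;
}

Reference counting per profile avoids this: context1's job_end would only
drop a count, and the profile would stay on until context2 also finishes.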


Regards,
  Felix




Christian.


---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_umc.h"
 #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"

 /* Total memory size in system memory and all GPU VRAM. Used to
  * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,

 void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 {
-	amdgpu_dpm_switch_power_profile(adev,
-					PP_SMC_POWER_PROFILE_COMPUTE,
-					!idle);
+	int ret;
+
+	if (idle)
+		ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+	else
+		ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+	if (ret)
+		drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+			 idle ? "reset" : "set");
 }

 bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)




Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-27 Thread Christian König

On 26.09.22 at 23:40, Shashank Sharma wrote:

This patch switches the GPU workload mode to/from compute mode while
submitting a compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 


Feel free to add my acked-by, but Felix should probably take a look as well.

Christian.


---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_umc.h"
 #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"

 /* Total memory size in system memory and all GPU VRAM. Used to
  * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,

 void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 {
-	amdgpu_dpm_switch_power_profile(adev,
-					PP_SMC_POWER_PROFILE_COMPUTE,
-					!idle);
+	int ret;
+
+	if (idle)
+		ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+	else
+		ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+	if (ret)
+		drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+			 idle ? "reset" : "set");
 }

 bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)




[PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

2022-09-26 Thread Shashank Sharma
This patch switches the GPU workload mode to/from compute mode while
submitting a compute workload.

Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5e53a5293935..1caed319a448 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -34,6 +34,7 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_umc.h"
 #include "amdgpu_reset.h"
+#include "amdgpu_ctx_workload.h"

 /* Total memory size in system memory and all GPU VRAM. Used to
  * estimate worst case amount of memory to reserve for page tables
@@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,

 void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 {
-	amdgpu_dpm_switch_power_profile(adev,
-					PP_SMC_POWER_PROFILE_COMPUTE,
-					!idle);
+	int ret;
+
+	if (idle)
+		ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+	else
+		ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
+
+	if (ret)
+		drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
+			 idle ? "reset" : "set");
 }

 bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
-- 
2.34.1
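
For readers following the diff: below is a compilable userspace sketch of
the control flow this patch gives amdgpu_amdkfd_set_compute_idle(). The
two helpers are stand-in stubs here; the real, refcounted
amdgpu_set_workload_profile()/amdgpu_clear_workload_profile() are
introduced earlier in this series, and the hint value is illustrative.

#include <stdio.h>

#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE 1 /* value is illustrative */

struct amdgpu_device { const char *name; };

/* Stand-in stubs for the series' refcounted helpers (not driver code) */
static int amdgpu_set_workload_profile(struct amdgpu_device *adev, int hint)
{
	printf("%s: set workload hint %d\n", adev->name, hint);
	return 0;
}

static int amdgpu_clear_workload_profile(struct amdgpu_device *adev, int hint)
{
	printf("%s: clear workload hint %d\n", adev->name, hint);
	return 0;
}

/* Mirrors the patched function: idle clears the hint, busy sets it */
static void set_compute_idle(struct amdgpu_device *adev, int idle)
{
	int ret;

	if (idle)
		ret = amdgpu_clear_workload_profile(adev,
				AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
	else
		ret = amdgpu_set_workload_profile(adev,
				AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);

	if (ret)
		fprintf(stderr, "Failed to %s power profile to compute mode\n",
			idle ? "reset" : "set");
}

int main(void)
{
	struct amdgpu_device dev = { .name = "gpu0" };

	set_compute_idle(&dev, 0); /* compute work submitted: set the hint */
	set_compute_idle(&dev, 1); /* device idle again: drop the hint */
	return 0;
}

Pairing every set with exactly one clear is what keeps the reference
count balanced across KFD's busy/idle transitions.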