Re: [PATCH 4/4] drm/amdgpu: prevent gpu access during reset recovery

2024-05-24 Thread Christian König

Am 23.05.24 um 17:35 schrieb Li, Yunxiang (Teddy):

[Public]


Here is taking a different lock than the reset_domain->sem. It is a seperate 
reset_domain->gpu_sem that is only locked when we will actuall do reset, it is not 
taken in the skip_hw_reset path.

Exactly that is what you should *not* do. Please don't add any new lock to the 
code. This is already complicated enough.

If you think that adding wrappers for reset lock makes sense then we can 
probably do that, bot don't add any lock for hw access.

The two lock protects very different things though. The first case is we need 
to block two resets running in parallel,


No, that's not correct. Two parallel resets are prevent by using a 
worker queue for the resets.


The reset lock is here exactly to provide external thread the 
opportunity to make their operation mutual exclusive with the reset.



  this does not only cover GPU reset itself but also any teardown that happens 
before GPU reset. The second case is we need to ensure exclusive access to the 
GPU between GPU reset and GPU init, concurrent access is fine before GPU is 
reset.


If that is true you could in theory lower the locked area of the 
existing lock, but adding a new one is strict no-go from my side.


Regards,
Christian.



Theoretically, the second case happens within the first case, so locking the first 
case would protect against both. But with the current implementation this is 
infeasible, all the generic functions called between 
amdgpu_device_lock/unlock_reset_domain would need to be swapped out for special 
versions so the reset thread does not dead lock itself. It is much simpler to have a 
second, much narrower lock that only covers GPU reset<->GPU init because all 
the accesses there are very low level anyway.

Teddy




Re: [PATCH] drm/amdgpu: Adjust logic in amdgpu_device_partner_bandwidth()

2024-05-24 Thread Christian König

Am 16.05.24 um 17:05 schrieb Alex Deucher:

Use current speed/width on devices which don't support
dynamic PCIe switching.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289
Signed-off-by: Alex Deucher 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 ---
  1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e72e774d17e6a..f0011dac589d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5946,13 +5946,18 @@ static void amdgpu_device_partner_bandwidth(struct 
amdgpu_device *adev,
*speed = PCI_SPEED_UNKNOWN;
*width = PCIE_LNK_WIDTH_UNKNOWN;
  
-	while ((parent = pci_upstream_bridge(parent))) {

-   /* skip upstream/downstream switches internal to dGPU*/
-   if (parent->vendor == PCI_VENDOR_ID_ATI)
-   continue;
-   *speed = pcie_get_speed_cap(parent);
-   *width = pcie_get_width_cap(parent);
-   break;
+   if (amdgpu_device_pcie_dynamic_switching_supported(adev)) {
+   while ((parent = pci_upstream_bridge(parent))) {
+   /* skip upstream/downstream switches internal to dGPU*/
+   if (parent->vendor == PCI_VENDOR_ID_ATI)
+   continue;
+   *speed = pcie_get_speed_cap(parent);
+   *width = pcie_get_width_cap(parent);
+   break;
+   }
+   } else {
+   /* use the current speeds rather than max if switching is not 
supported */
+   pcie_bandwidth_available(adev->pdev, NULL, speed, width);
}
  }
  




Re: [PATCH] drm/amdgpu: silence UBSAN warning

2024-05-24 Thread Christian König

Am 16.05.24 um 15:55 schrieb Alex Deucher:

Convert a variable sized array from [1] to [].

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/include/atomfirmware.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h 
b/drivers/gpu/drm/amd/include/atomfirmware.h
index af3eebb4c9bcb..f732182218330 100644
--- a/drivers/gpu/drm/amd/include/atomfirmware.h
+++ b/drivers/gpu/drm/amd/include/atomfirmware.h
@@ -3540,7 +3540,7 @@ struct atom_gpio_voltage_object_v4
 uint8_t  phase_delay_us;  // phase delay in unit of 
micro second
 uint8_t  reserved;
 uint32_t gpio_mask_val; // GPIO Mask value
-   struct atom_voltage_gpio_map_lut voltage_gpio_lut[1];
+   struct atom_voltage_gpio_map_lut voltage_gpio_lut[] 
__counted_by(gpio_entry_num);
  };
  
  struct  atom_svid2_voltage_object_v4




Re: Kernel 5.15.150 black screen with AMD Raven/Picasso GPU

2024-05-24 Thread Greg KH
On Thu, May 23, 2024 at 05:59:39PM +0200, Armin Wolf wrote:
> Am 23.05.24 um 15:13 schrieb Barry Kauler:
> 
> > On Wed, May 22, 2024 at 12:58 AM Armin Wolf  wrote:
> > > Am 20.05.24 um 18:22 schrieb Alex Deucher:
> > > 
> > > > On Sat, May 18, 2024 at 8:17 PM Armin Wolf  wrote:
> > > > > Am 17.05.24 um 03:30 schrieb Barry Kauler:
> > > > > 
> > > > > > Armin, Yifan, Prike,
> > > > > > I will top-post, so you don't have to scroll down.
> > > > > > After identifying the commit that causes black screen with my gpu, I
> > > > > > posted the result to you guys, on May 9.
> > > > > > It is now May 17 and no reply.
> > > > > > OK, I have now created a patch that reverts Yifan's commit, compiled
> > > > > > 5.15.158, and my gpu now works.
> > > > > > Note, the radeon module is not loaded, so it is not a factor.
> > > > > > I'm not a kernel developer. I have identified the culprit and it is 
> > > > > > up
> > > > > > to you guys to fix it, Yifan especially, as you are the person who 
> > > > > > has
> > > > > > created the regression.
> > > > > > I will attach my patch.
> > > > > > Regards,
> > > > > > Barry Kauler
> > > > > Hi,
> > > > > 
> > > > > sorry for not responding to your findings. I normally do not work 
> > > > > with GPU drivers,
> > > > > so i hoped one of the amdgpu developers would handle this.
> > > > > 
> > > > > I cceddri-de...@lists.freedesktop.org  and 
> > > > > amd-gfx@lists.freedesktop.org so that other
> > > > > amdgpu developers hear from this issue.
> > > > > 
> > > > > Thanks you for you persistence in finding the offending commit.
> > > > Likely this patch should not have been ported to 5.15 in the first
> > > > place.  The IOMMU requirements have been dropped from the driver for
> > > > the last few kernel versions so it is no longer relevant on newer
> > > > kernels.
> > > > 
> > > > Alex
> > > Barry, can you verify that the latest upstream kernel works on you device?
> > > If yes, then the commit itself is ok and just the backporting itself was 
> > > wrong.
> > > 
> > > Thanks,
> > > Armin Wolf
> > Armin,
> > The unmodified 6.8.1 kernel works ok.
> > I presume that patch was applied long before 6.8.1 got released and
> > only got backported to 5.15.x recently.
> > 
> > Regards,
> > Barry
> > 
> Great to hear, that means we only have to revert commit 56b522f46681 
> ("drm/amdgpu: init iommu after amdkfd device init")
> from the 5.15.y series.
> 
> I CCed the stable mailing list so that they can revert the offending commit.

Please submit the patch/revert that you wish to have applied to the tree
so we can have the correct information in it.  I have no idea what to do
here with this deep response thread as-is, sorry.

thanks,

greg k-h


[PATCH] drm/amd/display: clean up some inconsistent indenting

2024-05-24 Thread Jiapeng Chong
No functional modification involved.

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:5200 dc_power_down_on_boot() 
warn: inconsistent indenting.

Reported-by: Abaci Robot 
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9166
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 3a2101b052ea..4612c60edebd 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -5194,9 +5194,7 @@ void dc_power_down_on_boot(struct dc *dc)
}
 }
 
-void dc_set_power_state(
-   struct dc *dc,
-   enum dc_acpi_cm_power_state power_state)
+void dc_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state power_state)
 {
if (!dc->current_state)
return;
-- 
2.20.1.7.g153144c



[PATCH] Revert "drm/amdgpu: init iommu after amdkfd device init"

2024-05-24 Thread Armin Wolf
This reverts commit 56b522f4668167096a50c39446d6263c96219f5f.

A user reported that this commit breaks the integrated gpu of his
notebook, causing a black screen. He was able to bisect the problematic
commit and verified that by reverting it the notebook works again.
He also confirmed that kernel 6.8.1 also works on his device, so the
upstream commit itself seems to be ok.

An amdgpu developer (Alex Deucher) confirmed that this patch should
have never been ported to 5.15 in the first place, so revert this
commit from the 5.15 stable series.

Reported-by: Barry Kauler 
Signed-off-by: Armin Wolf 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 222a1d9ecf16..5f6c32ec674d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2487,6 +2487,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
if (r)
goto init_failed;

+   r = amdgpu_amdkfd_resume_iommu(adev);
+   if (r)
+   goto init_failed;
+
r = amdgpu_device_ip_hw_init_phase1(adev);
if (r)
goto init_failed;
@@ -2525,10 +2529,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
if (!adev->gmc.xgmi.pending_reset)
amdgpu_amdkfd_device_init(adev);

-   r = amdgpu_amdkfd_resume_iommu(adev);
-   if (r)
-   goto init_failed;
-
amdgpu_fru_get_product_info(adev);

 init_failed:
--
2.39.2



Re: [PATCH v4 13/15] drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT

2024-05-24 Thread Guenter Roeck
Hi,

On Fri, Mar 29, 2024 at 12:18:28AM -0700, Samuel Holland wrote:
> Now that all previously-supported architectures select
> ARCH_HAS_KERNEL_FPU_SUPPORT, this code can depend on that symbol instead
> of the existing list of architectures. It can also take advantage of the
> common kernel-mode FPU API and method of adjusting CFLAGS.
> 
> Acked-by: Alex Deucher 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Samuel Holland 

With this patch in the mainline kernel, I get the following build error
when trying to build powerpc:ppc32_allmodconfig.

powerpc64-linux-ld: 
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard 
float, drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.o uses 
soft float
powerpc64-linux-ld: failed to merge target specific data of file 
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o

[ repeated many times ]

The problem is no longer seen after reverting this patch.

Guenter


Re: [PATCH 1/4] drm/amdgpu: fix -Wformat-truncation warning in amdgpu_gfx_kiq_init_ring()

2024-05-24 Thread Jani Nikula
On Thu, 23 May 2024, Alex Deucher  wrote:
> Already fixed with this patch:
> https://patchwork.freedesktop.org/patch/594864/

Great, thanks!

BR,
Jani.



-- 
Jani Nikula, Intel


Re: [PATCH 4/4] drm: enable -Wformat-truncation across the subsystem

2024-05-24 Thread Jani Nikula
On Thu, 23 May 2024, Sam Ravnborg  wrote:
> Hi Jani,
>
> On Thu, May 23, 2024 at 06:51:09PM +0300, Jani Nikula wrote:
>> With the -Wformat-truncation warnings fixed, finish the job started in
>> commit a61ddb4393ad ("drm: enable (most) W=1 warnings by default across
>> the subsystem"), and enable that warning too.
>> 
>> Signed-off-by: Jani Nikula 
>
> When it is enabled for all of drm then the explicit assignments here
> could be dropped I think:
>
> drivers/gpu/drm/i915/Makefile:subdir-ccflags-y += $(call cc-option, 
> -Wformat-truncation)
> drivers/gpu/drm/xe/Makefile:subdir-ccflags-y += $(call cc-option, 
> -Wformat-truncation)
>
> Just a drive-by comment, I know this patch was mostly for the bots.

Additionally, I didn't want to create any conflicts with [1]. There's no
harm in having the duplication.

BR,
Jani.

[1] https://lore.kernel.org/r/cover.1716471145.git.jani.nik...@intel.com


>
>   Sam
>
>> 
>> ---
>> 
>> Gut feeling says there are more issues, and my configs just don't catch
>> them all, but let's see what the build bots have to say. ;)
>> ---
>>  drivers/gpu/drm/Makefile | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>> index 68cc9258ffc4..644613dbedda 100644
>> --- a/drivers/gpu/drm/Makefile
>> +++ b/drivers/gpu/drm/Makefile
>> @@ -16,8 +16,7 @@ subdir-ccflags-y += $(call cc-option, 
>> -Wunused-but-set-variable)
>>  subdir-ccflags-y += $(call cc-option, -Wunused-const-variable)
>>  subdir-ccflags-y += $(call cc-option, -Wpacked-not-aligned)
>>  subdir-ccflags-y += $(call cc-option, -Wformat-overflow)
>> -# FIXME: fix -Wformat-truncation warnings and uncomment
>> -#subdir-ccflags-y += $(call cc-option, -Wformat-truncation)
>> +subdir-ccflags-y += $(call cc-option, -Wformat-truncation)
>>  subdir-ccflags-y += $(call cc-option, -Wstringop-truncation)
>>  # The following turn off the warnings enabled by -Wextra
>>  ifeq ($(findstring 2, $(KBUILD_EXTRA_WARN)),)
>> -- 
>> 2.39.2

-- 
Jani Nikula, Intel


Re: [PATCH 3/4] drm/imx: fix -Wformat-truncation warning in imx_ldb_probe()

2024-05-24 Thread Jani Nikula
On Thu, 23 May 2024, Ville Syrjälä  wrote:
> On Thu, May 23, 2024 at 06:51:08PM +0300, Jani Nikula wrote:
>> Enabling -Wformat-truncation yields the following warning:
>> 
>> ../drivers/gpu/drm/imx/ipuv3/imx-ldb.c: In function ‘imx_ldb_probe’:
>> ../drivers/gpu/drm/imx/ipuv3/imx-ldb.c:658:57: error: ‘_sel’ directive 
>> output may be truncated writing 4 bytes into a region of size between 3 and 
>> 13 [-Werror=format-truncation=]
>>   658 | snprintf(clkname, sizeof(clkname), "di%d_sel", i);
>>   | ^~~~
>> ../drivers/gpu/drm/imx/ipuv3/imx-ldb.c:658:17: note: ‘snprintf’ output 
>> between 8 and 18 bytes into a destination of size 16
>>   658 | snprintf(clkname, sizeof(clkname), "di%d_sel", i);
>>   | ^
>
> If only the compiler could count to three...

I did not try, but apparently using %hhd would hide the issue too:

snprintf(clkname, sizeof(clkname), "di%hhd_sel", i);

BR,
Jani.


>
>> 
>> Silence the warning by checking the snprintf() return value.
>> 
>> Signed-off-by: Jani Nikula 
>> 
>> ---
>> 
>> Cc: Philipp Zabel 
>> Cc: Shawn Guo 
>> Cc: Sascha Hauer 
>> Cc: Pengutronix Kernel Team 
>> Cc: Fabio Estevam 
>> Cc: dri-de...@lists.freedesktop.org
>> Cc: i...@lists.linux.dev
>> ---
>>  drivers/gpu/drm/imx/ipuv3/imx-ldb.c | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/imx/ipuv3/imx-ldb.c 
>> b/drivers/gpu/drm/imx/ipuv3/imx-ldb.c
>> index 71d70194fcbd..46f779fe60ee 100644
>> --- a/drivers/gpu/drm/imx/ipuv3/imx-ldb.c
>> +++ b/drivers/gpu/drm/imx/ipuv3/imx-ldb.c
>> @@ -654,8 +654,12 @@ static int imx_ldb_probe(struct platform_device *pdev)
>>   */
>>  for (i = 0; i < 4; i++) {
>>  char clkname[16];
>> +int len;
>> +
>> +len = snprintf(clkname, sizeof(clkname), "di%d_sel", i);
>> +if (len >= sizeof(clkname))
>> +dev_err(dev, "clkname truncated\n");
>>  
>> -snprintf(clkname, sizeof(clkname), "di%d_sel", i);
>>  imx_ldb->clk_sel[i] = devm_clk_get(imx_ldb->dev, clkname);
>>  if (IS_ERR(imx_ldb->clk_sel[i])) {
>>  ret = PTR_ERR(imx_ldb->clk_sel[i]);
>> -- 
>> 2.39.2

-- 
Jani Nikula, Intel


[bug report] drm/amd/display: Find max flickerless instant vtotal delta

2024-05-24 Thread Dan Carpenter
Hello Ethan Bitnun,

Commit bd051aa2fcfb ("drm/amd/display: Find max flickerless instant
vtotal delta") from Apr 1, 2024 (linux-next), leads to the following
Smatch static checker warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c:1006 
dc_stream_get_max_flickerless_instant_vtotal_delta()
warn: always true condition '((stream->timing.v_total - safe_refresh_v_total) 
>= 0) => (0-u32max >= 0)'

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c:1008 
dc_stream_get_max_flickerless_instant_vtotal_delta()
warn: always true condition '((safe_refresh_v_total - stream->timing.v_total) 
>= 0) => (0-u32max >= 0)'

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c
990 static unsigned int 
dc_stream_get_max_flickerless_instant_vtotal_delta(struct dc_stream_state 
*stream, bool is_gaming, bool increase)
991 {
992 if (stream->timing.v_total * stream->timing.h_total == 0)
993 return 0;
994 
995 int current_refresh_hz = (int)div64_s64((long 
long)stream->timing.pix_clk_100hz*100, 
stream->timing.v_total*stream->timing.h_total);
996 
997 int safe_refresh_hz = 
dc_stream_calculate_flickerless_refresh_rate(stream,
998  
dc_stream_get_brightness_millinits_from_refresh(stream, current_refresh_hz),
999  
current_refresh_hz,
1000  is_gaming,
1001  increase);
1002 
1003 int safe_refresh_v_total = (int)div64_s64((long 
long)stream->timing.pix_clk_100hz*100, safe_refresh_hz*stream->timing.h_total);
1004 
1005 if (increase)
--> 1006 return ((stream->timing.v_total - 
safe_refresh_v_total) >= 0) ? (stream->timing.v_total - safe_refresh_v_total) : 
0;
  ^
stream->timing.v_total is u32 so it makes the subtract u32 thus it's
always >= 0.

1007 
1008 return ((safe_refresh_v_total - stream->timing.v_total) >= 0) 
? (safe_refresh_v_total - stream->timing.v_total) : 0;
  ^^^
Same.

1009 }

regards,
dan carpenter


Re: [PATCH v4 13/15] drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT

2024-05-24 Thread Alex Deucher
On Fri, May 24, 2024 at 5:16 AM Guenter Roeck  wrote:
>
> Hi,
>
> On Fri, Mar 29, 2024 at 12:18:28AM -0700, Samuel Holland wrote:
> > Now that all previously-supported architectures select
> > ARCH_HAS_KERNEL_FPU_SUPPORT, this code can depend on that symbol instead
> > of the existing list of architectures. It can also take advantage of the
> > common kernel-mode FPU API and method of adjusting CFLAGS.
> >
> > Acked-by: Alex Deucher 
> > Reviewed-by: Christoph Hellwig 
> > Signed-off-by: Samuel Holland 
>
> With this patch in the mainline kernel, I get the following build error
> when trying to build powerpc:ppc32_allmodconfig.
>
> powerpc64-linux-ld: 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard 
> float, drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.o 
> uses soft float
> powerpc64-linux-ld: failed to merge target specific data of file 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
>
> [ repeated many times ]
>
> The problem is no longer seen after reverting this patch.

Should be fixed with this patch:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/5f56be33f33dd1d50b9433f842c879a20dc00f5b
Will pull it into my -fixes branch.

Alex

>
> Guenter


Re: [PATCH v4 13/15] drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT

2024-05-24 Thread Alex Deucher
On Fri, May 24, 2024 at 9:17 AM Alex Deucher  wrote:
>
> On Fri, May 24, 2024 at 5:16 AM Guenter Roeck  wrote:
> >
> > Hi,
> >
> > On Fri, Mar 29, 2024 at 12:18:28AM -0700, Samuel Holland wrote:
> > > Now that all previously-supported architectures select
> > > ARCH_HAS_KERNEL_FPU_SUPPORT, this code can depend on that symbol instead
> > > of the existing list of architectures. It can also take advantage of the
> > > common kernel-mode FPU API and method of adjusting CFLAGS.
> > >
> > > Acked-by: Alex Deucher 
> > > Reviewed-by: Christoph Hellwig 
> > > Signed-off-by: Samuel Holland 
> >
> > With this patch in the mainline kernel, I get the following build error
> > when trying to build powerpc:ppc32_allmodconfig.
> >
> > powerpc64-linux-ld: 
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard 
> > float, drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.o 
> > uses soft float
> > powerpc64-linux-ld: failed to merge target specific data of file 
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >
> > [ repeated many times ]
> >
> > The problem is no longer seen after reverting this patch.
>
> Should be fixed with this patch:
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/5f56be33f33dd1d50b9433f842c879a20dc00f5b
> Will pull it into my -fixes branch.
>

Nevermind, this is something else.

Alex


> Alex
>
> >
> > Guenter


Re: [PATCH] drm/amdgpu: drop MES 10.1 support v3

2024-05-24 Thread Alex Deucher
On Thu, May 23, 2024 at 7:22 PM Kasiviswanathan, Harish
 wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> I had one more comment. With that fixed this patch is Reviewed-by: Harish 
> Kasiviswanathan 
>
> static gfx_v10_0_ring_invalidate_tlbs() function can be removed since it is 
> no longer used.

It's still used by gfx10_kiq_invalidate_tlbs().

Alex

>
> Best Regards,
> Harish
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of Alex 
> Deucher
> Sent: Thursday, May 23, 2024 3:48 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu: drop MES 10.1 support v3
>
> It was an enablement vehicle for MES 11 and was never
> productized.  Remove it.
>
> v2: drop additional checks in the GFX10 code.
> v3: drop mes_api_def.h
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |   20 -
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|  281 +---
>  drivers/gpu/drm/amd/amdgpu/mes_v10_1.c| 1190 -
>  drivers/gpu/drm/amd/amdgpu/mes_v10_1.h|   29 -
>  drivers/gpu/drm/amd/amdgpu/nv.c   |1 -
>  .../drm/amd/amdkfd/kfd_device_queue_manager.c |2 +-
>  drivers/gpu/drm/amd/include/mes_api_def.h |  570 
>  8 files changed, 72 insertions(+), 2022 deletions(-)
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v10_1.h
>  delete mode 100644 drivers/gpu/drm/amd/include/mes_api_def.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 6e3d7f51616f..eddbb69a179f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -187,7 +187,6 @@ amdgpu-y += \
>  # add MES block
>  amdgpu-y += \
> amdgpu_mes.o \
> -   mes_v10_1.o \
> mes_v11_0.o \
> mes_v12_0.o
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index c5f23e1a1362..510916e28d37 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -94,7 +94,6 @@
>  #include "vcn_v4_0_5.h"
>  #include "jpeg_v4_0_5.h"
>  #include "amdgpu_vkms.h"
> -#include "mes_v10_1.h"
>  #include "mes_v11_0.h"
>  #include "mes_v12_0.h"
>  #include "smuio_v11_0.h"
> @@ -2319,25 +2318,6 @@ static int amdgpu_discovery_set_mm_ip_blocks(struct 
> amdgpu_device *adev)
>  static int amdgpu_discovery_set_mes_ip_blocks(struct amdgpu_device *adev)
>  {
> switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {
> -   case IP_VERSION(10, 1, 10):
> -   case IP_VERSION(10, 1, 1):
> -   case IP_VERSION(10, 1, 2):
> -   case IP_VERSION(10, 1, 3):
> -   case IP_VERSION(10, 1, 4):
> -   case IP_VERSION(10, 3, 0):
> -   case IP_VERSION(10, 3, 1):
> -   case IP_VERSION(10, 3, 2):
> -   case IP_VERSION(10, 3, 3):
> -   case IP_VERSION(10, 3, 4):
> -   case IP_VERSION(10, 3, 5):
> -   case IP_VERSION(10, 3, 6):
> -   if (amdgpu_mes) {
> -   amdgpu_device_ip_block_add(adev, &mes_v10_1_ip_block);
> -   adev->enable_mes = true;
> -   if (amdgpu_mes_kiq)
> -   adev->enable_mes_kiq = true;
> -   }
> -   break;
> case IP_VERSION(11, 0, 0):
> case IP_VERSION(11, 0, 1):
> case IP_VERSION(11, 0, 2):
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 8ceb26a5575a..2a808029a47c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -3734,14 +3734,8 @@ static void gfx10_kiq_unmap_queues(struct amdgpu_ring 
> *kiq_ring,
>enum amdgpu_unmap_queues_action action,
>u64 gpu_addr, u64 seq)
>  {
> -   struct amdgpu_device *adev = kiq_ring->adev;
> uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
>
> -   if (adev->enable_mes && !adev->gfx.kiq[0].ring.sched.ready) {
> -   amdgpu_mes_unmap_legacy_queue(adev, ring, action, gpu_addr, 
> seq);
> -   return;
> -   }
> -
> amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_UNMAP_QUEUES, 4));
> amdgpu_ring_write(kiq_ring, /* Q_sel: 0, vmid: 0, engine: 0, num_Q: 1 
> */
>   PACKET3_UNMAP_QUEUES_ACTION(action) |
> @@ -3999,33 +3993,18 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring 
> *ring, long timeout)
>
> memset(&ib, 0, sizeof(ib));
>
> -   if (ring->is_mes_queue) {
> -   uint32_t padding, offset;
> -
> -   offset = amdgpu_mes_ctx_get_offs(ring, 
> AMDGPU_MES_CTX_IB_OFFS);
> -   padding = amdgpu_mes_ctx_get_offs(ring,
> - 

RE: [PATCH 4/4] drm/amdgpu: prevent gpu access during reset recovery

2024-05-24 Thread Li, Yunxiang (Teddy)
[AMD Official Use Only - AMD Internal Distribution Only]

> If that is true you could in theory lower the locked area of the existing 
> lock, but adding a new one is strict no-go from my side.

I'll try this, right now I see two places where this would be problematic 
because they are trying to suspend either gfx or kfd scheduler and later resume 
it, which would race with the same in reset. It is exactly this two functions 
I'm trying to avoid putting in the scope of the second lock, because they are 
common to both in and outside reset.

Teddy


Re: [PATCH] drm/amd/display: clean up some inconsistent indenting

2024-05-24 Thread Alex Deucher
Applied.  Thanks!

On Thu, May 23, 2024 at 10:37 PM Jiapeng Chong
 wrote:
>
> No functional modification involved.
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:5200 
> dc_power_down_on_boot() warn: inconsistent indenting.
>
> Reported-by: Abaci Robot 
> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9166
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index 3a2101b052ea..4612c60edebd 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -5194,9 +5194,7 @@ void dc_power_down_on_boot(struct dc *dc)
> }
>  }
>
> -void dc_set_power_state(
> -   struct dc *dc,
> -   enum dc_acpi_cm_power_state power_state)
> +void dc_set_power_state(struct dc *dc, enum dc_acpi_cm_power_state 
> power_state)
>  {
> if (!dc->current_state)
> return;
> --
> 2.20.1.7.g153144c
>


RE: [PATCH] drm/amdgpu: drop MES 10.1 support v3

2024-05-24 Thread Kasiviswanathan, Harish
Ah yes. Missed it.
Reviewed-by: Harish Kasiviswanathan 

-Original Message-
From: Alex Deucher  
Sent: Friday, May 24, 2024 9:22 AM
To: Kasiviswanathan, Harish 
Cc: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: drop MES 10.1 support v3

On Thu, May 23, 2024 at 7:22 PM Kasiviswanathan, Harish
 wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> I had one more comment. With that fixed this patch is Reviewed-by: Harish 
> Kasiviswanathan 
>
> static gfx_v10_0_ring_invalidate_tlbs() function can be removed since it is 
> no longer used.

It's still used by gfx10_kiq_invalidate_tlbs().

Alex

>
> Best Regards,
> Harish
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of Alex 
> Deucher
> Sent: Thursday, May 23, 2024 3:48 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu: drop MES 10.1 support v3
>
> It was an enablement vehicle for MES 11 and was never
> productized.  Remove it.
>
> v2: drop additional checks in the GFX10 code.
> v3: drop mes_api_def.h
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |   20 -
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|  281 +---
>  drivers/gpu/drm/amd/amdgpu/mes_v10_1.c| 1190 -
>  drivers/gpu/drm/amd/amdgpu/mes_v10_1.h|   29 -
>  drivers/gpu/drm/amd/amdgpu/nv.c   |1 -
>  .../drm/amd/amdkfd/kfd_device_queue_manager.c |2 +-
>  drivers/gpu/drm/amd/include/mes_api_def.h |  570 
>  8 files changed, 72 insertions(+), 2022 deletions(-)
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v10_1.h
>  delete mode 100644 drivers/gpu/drm/amd/include/mes_api_def.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 6e3d7f51616f..eddbb69a179f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -187,7 +187,6 @@ amdgpu-y += \
>  # add MES block
>  amdgpu-y += \
> amdgpu_mes.o \
> -   mes_v10_1.o \
> mes_v11_0.o \
> mes_v12_0.o
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index c5f23e1a1362..510916e28d37 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -94,7 +94,6 @@
>  #include "vcn_v4_0_5.h"
>  #include "jpeg_v4_0_5.h"
>  #include "amdgpu_vkms.h"
> -#include "mes_v10_1.h"
>  #include "mes_v11_0.h"
>  #include "mes_v12_0.h"
>  #include "smuio_v11_0.h"
> @@ -2319,25 +2318,6 @@ static int amdgpu_discovery_set_mm_ip_blocks(struct 
> amdgpu_device *adev)
>  static int amdgpu_discovery_set_mes_ip_blocks(struct amdgpu_device *adev)
>  {
> switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {
> -   case IP_VERSION(10, 1, 10):
> -   case IP_VERSION(10, 1, 1):
> -   case IP_VERSION(10, 1, 2):
> -   case IP_VERSION(10, 1, 3):
> -   case IP_VERSION(10, 1, 4):
> -   case IP_VERSION(10, 3, 0):
> -   case IP_VERSION(10, 3, 1):
> -   case IP_VERSION(10, 3, 2):
> -   case IP_VERSION(10, 3, 3):
> -   case IP_VERSION(10, 3, 4):
> -   case IP_VERSION(10, 3, 5):
> -   case IP_VERSION(10, 3, 6):
> -   if (amdgpu_mes) {
> -   amdgpu_device_ip_block_add(adev, &mes_v10_1_ip_block);
> -   adev->enable_mes = true;
> -   if (amdgpu_mes_kiq)
> -   adev->enable_mes_kiq = true;
> -   }
> -   break;
> case IP_VERSION(11, 0, 0):
> case IP_VERSION(11, 0, 1):
> case IP_VERSION(11, 0, 2):
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 8ceb26a5575a..2a808029a47c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -3734,14 +3734,8 @@ static void gfx10_kiq_unmap_queues(struct amdgpu_ring 
> *kiq_ring,
>enum amdgpu_unmap_queues_action action,
>u64 gpu_addr, u64 seq)
>  {
> -   struct amdgpu_device *adev = kiq_ring->adev;
> uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
>
> -   if (adev->enable_mes && !adev->gfx.kiq[0].ring.sched.ready) {
> -   amdgpu_mes_unmap_legacy_queue(adev, ring, action, gpu_addr, 
> seq);
> -   return;
> -   }
> -
> amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_UNMAP_QUEUES, 4));
> amdgpu_ring_write(kiq_ring, /* Q_sel: 0, vmid: 0, engine: 0, num_Q: 1 
> */
>   PACKET3_UNMAP_QUEUES_ACTION(action) |
> @@ -3999,33 +3993,18 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring 
> *ring, long timeout)
>
> memset(&ib, 0

[PATCH] drm/amdkfd: simplify APU VRAM handling

2024-05-24 Thread Alex Deucher
With commit 89773b85599a
("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs")
big and small APU "VRAM" handling in KFD was unified.  Since AMD_IS_APU
is set for both big and small APUs, we can simplify the checks in
the code.

v2: clean up a few more places (Lang)

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c |  6 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h |  1 -
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 336eb51c4839..3af00b57cd8a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -196,7 +196,7 @@ int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device 
*adev,
return -EINVAL;
 
vram_size = KFD_XCP_MEMORY_SIZE(adev, xcp_id);
-   if (adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) {
+   if (adev->flags & AMD_IS_APU) {
system_mem_needed = size;
ttm_mem_needed = size;
}
@@ -233,7 +233,7 @@ int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device 
*adev,
if (adev && xcp_id >= 0) {
adev->kfd.vram_used[xcp_id] += vram_needed;
adev->kfd.vram_used_aligned[xcp_id] +=
-   (adev->gmc.is_app_apu || adev->flags & 
AMD_IS_APU) ?
+   (adev->flags & AMD_IS_APU) ?
vram_needed :
ALIGN(vram_needed, VRAM_AVAILABLITY_ALIGN);
}
@@ -261,7 +261,7 @@ void amdgpu_amdkfd_unreserve_mem_limit(struct amdgpu_device 
*adev,
 
if (adev) {
adev->kfd.vram_used[xcp_id] -= size;
-   if (adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) {
+   if (adev->flags & AMD_IS_APU) {
adev->kfd.vram_used_aligned[xcp_id] -= size;
kfd_mem_limit.system_mem_used -= size;
kfd_mem_limit.ttm_mem_used -= size;
@@ -894,7 +894,7 @@ static int kfd_mem_attach(struct amdgpu_device *adev, 
struct kgd_mem *mem,
 * if peer device has large BAR. In contrast, access over xGMI is
 * allowed for both small and large BAR configurations of peer device
 */
-   if ((adev != bo_adev && !(adev->gmc.is_app_apu || adev->flags & 
AMD_IS_APU)) &&
+   if ((adev != bo_adev && !(adev->flags & AMD_IS_APU)) &&
((mem->domain == AMDGPU_GEM_DOMAIN_VRAM) ||
 (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL) ||
 (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))) {
@@ -1682,7 +1682,7 @@ size_t amdgpu_amdkfd_get_available_memory(struct 
amdgpu_device *adev,
- atomic64_read(&adev->vram_pin_size)
- reserved_for_pt;
 
-   if (adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) {
+   if (adev->flags & AMD_IS_APU) {
system_mem_available = no_system_mem_limit ?
kfd_mem_limit.max_system_mem_limit :
kfd_mem_limit.max_system_mem_limit -
@@ -1730,7 +1730,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
if (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
domain = alloc_domain = AMDGPU_GEM_DOMAIN_VRAM;
 
-   if (adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) {
+   if (adev->flags & AMD_IS_APU) {
domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_flags = 0;
@@ -1981,7 +1981,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
if (size) {
if (!is_imported &&
   (mem->bo->preferred_domains == AMDGPU_GEM_DOMAIN_VRAM ||
-  ((adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) &&
+  ((adev->flags & AMD_IS_APU) &&
mem->bo->preferred_domains == AMDGPU_GEM_DOMAIN_GTT)))
*size = bo_size;
else
@@ -2404,7 +2404,7 @@ static int import_obj_create(struct amdgpu_device *adev,
(*mem)->bo = bo;
(*mem)->va = va;
(*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) &&
-!(adev->gmc.is_app_apu || adev->flags & AMD_IS_APU) ?
+!(adev->flags & AMD_IS_APU) ?
 AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT;
 
(*mem)->mapped_to_gpu_memory = 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 4816fcb9803a..8ee3d07ffbdf 100644
--- a/drivers/gp

Re: [PATCH 4/4] drm/amdgpu: prevent gpu access during reset recovery

2024-05-24 Thread Christian König

Am 24.05.24 um 15:35 schrieb Li, Yunxiang (Teddy):

[AMD Official Use Only - AMD Internal Distribution Only]


If that is true you could in theory lower the locked area of the existing lock, 
but adding a new one is strict no-go from my side.

I'll try this, right now I see two places where this would be problematic 
because they are trying to suspend either gfx or kfd scheduler and later resume 
it, which would race with the same in reset. It is exactly this two functions 
I'm trying to avoid putting in the scope of the second lock, because they are 
common to both in and outside reset.


I don't think that avoiding the suspending and resuming the schedulers 
is worth moving the locks.


That is something so rarely done that it shouldn't matter.

Christian.



Teddy




RE: [PATCH 4/4] drm/amdgpu: prevent gpu access during reset recovery

2024-05-24 Thread Li, Yunxiang (Teddy)
[AMD Official Use Only - AMD Internal Distribution Only]

Yes, the two places are 1. In debugfs and 2. In MI100's en/disable_debug_trap, 
and evidently someone is testing the debugfs interface because there's a bug 
fix for a race condition of it.

Teddy


Re: [RFC PATCH] drm/amdgpu: Refactor sysfs attr functions in AMDGPU for reusability

2024-05-24 Thread Christian König

Am 23.05.24 um 11:18 schrieb Srinivasan Shanmugam:

This commit refactors the sysfs attribute management functions
(`amdgpu_device_attr_create`, `amdgpu_device_attr_remove`,
`amdgpu_device_attr_create_groups`, `amdgpu_device_attr_remove_groups`)
into `amdgpu_sysfs.c`, which were originally in `amdgpu_pm.c`. This
change allows these functions to be reused by other modules like gfx,
pm, etc.

Additionally, the attribute update logic is now encapsulated in the
`pm_update_sysfs_attr` function, which is located in amdgpu_pm.c. This
function is specific to the pm module and is invoked by
amdgpu_device_attr_create for each attribute before the attribute is
created.

The `amdgpu_device_attr_create_groups` function has also been updated to
use `pm_update_syfs_attr`. This ensures that the attribute update logic
is consistently applied to all attributes.

This refactoring enhances the modularity and maintainability of the
code. It also increases the reusability of the attribute management
functions, allowing them to be used by multiple modules.

Cc: Lijo Lazar 
Cc: Alex Deucher 
Cc: Christian König 
Suggested-by: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 


While at it you could fix some minor coding style nits. See below.

With that fixed the patch is Reviewed-by: Christian König 




---
  drivers/gpu/drm/amd/amdgpu/Makefile   |   3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.c | 112 
  drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.h |  99 ++
  drivers/gpu/drm/amd/pm/amdgpu_pm.c| 150 +++---
  drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h|  59 -
  5 files changed, 256 insertions(+), 167 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 1f6b56ec99f6..8c782e26dfcb 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -81,7 +81,8 @@ amdgpu-y += amdgpu_device.o amdgpu_doorbell_mgr.o 
amdgpu_kms.o \
amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
amdgpu_fw_attestation.o amdgpu_securedisplay.o \
amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
-   amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o 
amdgpu_dev_coredump.o
+   amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o 
amdgpu_dev_coredump.o \
+   amdgpu_sysfs.o
  
  amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.c

new file mode 100644
index ..bbdf3e8966d5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sysfs.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "amdgpu.h"
+#include "amdgpu_dpm.h"
+#include "amdgpu_drv.h"
+#include "amdgpu_reset.h"
+#include "amdgpu_sysfs.h"
+#include "atom.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+int amdgpu_sysfs_attr_create(struct amdgpu_device *adev,
+struct amdgpu_device_attr *attr,
+u32 mask, struct list_head *attr_list)
+{
+   int ret;
+   struct amdgpu_device_attr_entry *attr_entry;
+   struct device_attribute *dev_attr;
+   const char *name;


Please declare ret last here.


+
+   if (!attr)
+   return -EINVAL;
+
+   dev_attr = &attr->dev_attr;
+   name = dev_attr->attr.name;
+
+   ret = device_create_file(adev->dev, dev_attr);
+   if (ret) {
+   dev_err(adev->dev, "failed to create device file %s, ret = 
%d\n",
+   name, ret);
+   }
+
+   attr_entry = kmalloc(sizeof(*attr_entry), GFP_KERNEL);
+   if (!attr_entry)
+   

[PATCH] drm/amdgpu: Fix type mismatch in amdgpu_gfx_kiq_init_ring

2024-05-24 Thread Srinivasan Shanmugam
This commit fixes a type mismatch in the amdgpu_gfx_kiq_init_ring
function triggered by the snprintf function expecting unsigned char
arguments due to the '%hhu' format specifier, but receiving int and u32
arguments.

The issue occurred because the arguments xcc_id, ring->me, ring->pipe,
and ring->queue were of type int and u32, not unsigned char. This led to
a type mismatch when these arguments were passed to snprintf.

To resolve this, the snprintf line was modified to cast these arguments
to unsigned char. This ensures that the arguments are of the correct
type for the '%hhu' format specifier and resolves the warning.

Fixes the below:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:4: warning: format
>> specifies type 'unsigned char' but the argument has type 'int'
>> [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:12: warning: format
>> specifies type 'unsigned char' but the argument has type 'u32' (aka
>> 'unsigned int') [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:22: warning: format specifies 
type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') 
[-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
  ^~
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:34: warning: format specifies 
type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') 
[-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
  ^~~
   4 warnings generated.

Fixes: 0eb430076172 ("drm/amdgpu: Fix snprintf usage in 
amdgpu_gfx_kiq_init_ring")
Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202405250446.xeawe66u-...@intel.com/
Cc: Lijo Lazar 
Cc: Alex Deucher 
Cc: Christian König 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 68505eaa92f9..19b1817b55d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -330,7 +330,8 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev, 
int xcc_id)
ring->eop_gpu_addr = kiq->eop_gpu_addr;
ring->no_scheduler = true;
snprintf(ring->name, sizeof(ring->name), "kiq_%hhu.%hhu.%hhu.%hhu",
-xcc_id, ring->me, ring->pipe, ring->queue);
+(unsigned char)xcc_id, (unsigned char)ring->me,
+(unsigned char)ring->pipe, (unsigned char)ring->queue);
r = amdgpu_ring_init(adev, ring, 1024, irq, AMDGPU_CP_KIQ_IRQ_DRIVER0,
 AMDGPU_RING_PRIO_DEFAULT, NULL);
if (r)
-- 
2.34.1