RE: [PATCH] gpu: drm: fix an improper check of amdgpu_bo_create_kernel
> -----Original Message-----
> From: Kangjie Lu [mailto:k...@umn.edu]
> Sent: Wednesday, December 26, 2018 2:24 PM
> To: k...@umn.edu
> Cc: pakki...@umn.edu; Deucher, Alexander; Koenig, Christian;
> Zhou, David(ChunMing); David Airlie; Daniel Vetter; Zhu, Rex;
> Huang, Ray; Zhang, Hawking; Xu, Feifei; Gao, Likun; Francis, David;
> amd-...@lists.freedesktop.org; dri-de...@lists.freedesktop.org;
> linux-kernel@vger.kernel.org
> Subject: [PATCH] gpu: drm: fix an improper check of
> amdgpu_bo_create_kernel
>
> adev->firmware.fw_buf being not NULL may not indicate kernel buffer is
> created successful. A better way is to check the status (return value)
> of it. The fix does so.

Actually, it is the same. If bo is created successfully, the amdgpu_bo object will be created. But using "ret" to align with other function should be better as the return status.

Thanks.
Reviewed-by: Huang Rui

>
> Signed-off-by: Kangjie Lu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> index 7b33867036e7..ba3c1cfb2c35 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> @@ -422,13 +422,19 @@ static int amdgpu_ucode_patch_jt(struct
> amdgpu_firmware_info *ucode,
>
>  int amdgpu_ucode_create_bo(struct amdgpu_device *adev)
>  {
> +	int ret;
> +
>  	if (adev->firmware.load_type != AMDGPU_FW_LOAD_DIRECT) {
> -		amdgpu_bo_create_kernel(adev, adev->firmware.fw_size,
> PAGE_SIZE,
> -			amdgpu_sriov_vf(adev) ?
> AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
> -			&adev->firmware.fw_buf,
> -			&adev->firmware.fw_buf_mc,
> -			&adev->firmware.fw_buf_ptr);
> -		if (!adev->firmware.fw_buf) {
> +		ret =
> +		  amdgpu_bo_create_kernel(adev,
> +				adev->firmware.fw_size,
> +				PAGE_SIZE,
> +				amdgpu_sriov_vf(adev) ?
> AMDGPU_GEM_DOMAIN_VRAM :
> +				AMDGPU_GEM_DOMAIN_GTT,
> +				&adev->firmware.fw_buf,
> +				&adev->firmware.fw_buf_mc,
> +				&adev->firmware.fw_buf_ptr);
> +		if (ret) {
>  			dev_err(adev->dev, "failed to create kernel buffer
> for firmware.fw_buf\n");
>  			return -ENOMEM;
>  		} else if (amdgpu_sriov_vf(adev)) {
> --
> 2.17.2 (Apple Git-113)
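
The difference between testing an output pointer and testing the return status can be shown outside the kernel. What follows is a minimal userspace sketch, not the amdgpu code: create_buffer(), init_by_pointer(), init_by_status() and demo() are hypothetical names, and the assumption that a failing call leaves the output pointer untouched is made only for illustration. As the review reply notes, for amdgpu_bo_create_kernel() itself the two checks are believed to be equivalent.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a kernel allocation helper: returns 0 on
 * success or a negative errno and reports the buffer through *out.
 * Assumption for this sketch: *out is left untouched on failure.
 */
int create_buffer(size_t size, void **out)
{
	void *buf;

	if (size == 0)
		return -EINVAL;

	buf = malloc(size);
	if (!buf)
		return -ENOMEM;

	*out = buf;
	return 0;
}

/* Pointer-based check, in the style of the code before the patch. */
int init_by_pointer(size_t size, void **buf)
{
	create_buffer(size, buf);
	if (!*buf)
		return -ENOMEM;
	return 0;
}

/* Status-based check, in the style of the patch. */
int init_by_status(size_t size, void **buf)
{
	int ret = create_buffer(size, buf);

	if (ret)
		return -ENOMEM;
	return 0;
}

/* Shows where the two checks diverge: a stale non-NULL pointer. */
void demo(void)
{
	int dummy;
	void *buf = &dummy;	/* stale value left over from earlier use */

	/* The failing call goes unnoticed by the pointer test ... */
	assert(init_by_pointer(0, &buf) == 0);
	/* ... but is reported when the return status is checked. */
	assert(init_by_status(0, &buf) == -ENOMEM);
}
```

Calling demo() from a small main() exercises both paths. The two checks only disagree when the pointer can hold a stale value, which is why the status check is the more uniform convention even where both happen to work.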
RE: linux-next: build failure after merge of the amdgpu tree
[AMD Public Use]

Could you please help to check whether this patch can fix the issue?

Thanks,
Ray

-----Original Message-----
From: Huang, Ray
Sent: Friday, January 15, 2021 1:57 PM
To: Stephen Rothwell
Cc: Alex Deucher; Linux Kernel Mailing List; Linux Next Mailing List
Subject: Re: linux-next: build failure after merge of the amdgpu tree

On Fri, Jan 15, 2021 at 01:35:05PM +0800, Stephen Rothwell wrote:
> Hi all,
>
> After merging the amdgpu tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
>
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function
> 'vangogh_get_smu_metrics_data':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:300:10: error:
> 'boot_cpu_data' undeclared (first use in this function); did you mean
> 'boot_cpuid'?

Ah, vangogh is an x86 cpu, let me look at this issue. Could you share me the config file you tested?

Thanks,
Ray

>   300 |     boot_cpu_data.x86_max_cores * sizeof(uint16_t));
>       |     ^
>       |     boot_cpuid
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function
> 'vangogh_read_sensor':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1320:11: error:
> 'boot_cpu_data' undeclared (first use in this function); did you mean
> 'boot_cpuid'?
>  1320 |   *size = boot_cpu_data.x86_max_cores * sizeof(uint16_t);
>       |           ^
>       |           boot_cpuid
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function
> 'vangogh_od_edit_dpm_table':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1460:19: error:
> 'boot_cpu_data' undeclared (first use in this function); did you mean
> 'boot_cpuid'?
>  1460 |   if (input[0] >= boot_cpu_data.x86_max_cores) {
>       |                   ^
>       |                   boot_cpuid
>
> Caused by commits
>
>   517cb957c43b ("drm/amd/pm: implement the processor clocks which read by
> metric")
>   0d90d0ddd10e ("drm/amd/pm: implement processor fine grain feature for
> vangogh (v3)")
>
> The only thing I could do easily is to disable CONFIG_DRM_AMDGPU for today.
>
> --
> Cheers,
> Stephen Rothwell

0001-drm-amdgpu-fix-build-error-without-x86-kconfig.patch
Description: 0001-drm-amdgpu-fix-build-error-without-x86-kconfig.patch
RE: linux-next: build warning after merge of the amdgpu tree
[AMD Public Use]

Thanks Stephen. Please check attached V2 patch. (If you can share the config file to me, I can do the test in my side)

Best Regards,
Ray

-----Original Message-----
From: Stephen Rothwell
Sent: Monday, January 18, 2021 1:29 PM
To: Alex Deucher
Cc: Huang, Ray; Linux Kernel Mailing List; Linux Next Mailing List
Subject: linux-next: build warning after merge of the amdgpu tree

Hi all,

After merging the amdgpu tree, today's linux-next build (powerpc
allyesconfig) gave this warning:

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function 'vangogh_init_smc_tables':
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:338:5: warning: "CONFIG_X86" is not defined, evaluates to 0 [-Wundef]
  338 | #if CONFIG_X86
      |     ^~

Caused by commit

  9dd19d5232a6 ("drm/amdgpu: fix build error without x86 kconfig")

--
Cheers,
Stephen Rothwell

0001-drm-amdgpu-fix-build-error-without-x86-kconfig-v2.patch
Description: 0001-drm-amdgpu-fix-build-error-without-x86-kconfig-v2.patch
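
The -Wundef warning quoted above comes from testing a configuration symbol with `#if` on a build where the symbol is not defined at all: the preprocessor then silently evaluates it as 0. A minimal sketch of the usual alternatives follows. It is a userspace illustration, not the contents of the attached V2 patch (which is not shown here); CONFIG_EXAMPLE_ARCH, ARCH_CORE_COUNT, HAVE_EXAMPLE_ARCH and the helper functions are hypothetical names. In kernel code, IS_ENABLED(CONFIG_X86) is another common way to write such a test.

```c
#include <assert.h>

/*
 * CONFIG_EXAMPLE_ARCH is a hypothetical option that is deliberately
 * left undefined, mirroring a Kconfig symbol on a build where the
 * option is disabled.
 *
 * Writing "#if CONFIG_EXAMPLE_ARCH" here would still compile, but
 * with -Wundef the compiler warns that the symbol is not defined
 * and evaluates to 0.
 */

/* Form 1: #ifdef tests only whether the symbol is defined. */
#ifdef CONFIG_EXAMPLE_ARCH
#define ARCH_CORE_COUNT 8	/* arch-specific path (illustrative value) */
#else
#define ARCH_CORE_COUNT 0	/* fallback when the option is off */
#endif

/* Form 2: defined() is equivalent and composes with && and ||. */
#if defined(CONFIG_EXAMPLE_ARCH)
#define HAVE_EXAMPLE_ARCH 1
#else
#define HAVE_EXAMPLE_ARCH 0
#endif

int arch_core_count(void)
{
	return ARCH_CORE_COUNT;
}

int have_example_arch(void)
{
	return HAVE_EXAMPLE_ARCH;
}

/* Both forms must agree on whether the option is enabled. */
void check_consistency(void)
{
	assert((arch_core_count() > 0) == (have_example_arch() != 0));
}
```

Compiling this with -Wundef gives no warning, whereas replacing either guard with a bare `#if CONFIG_EXAMPLE_ARCH` reproduces the kind of diagnostic reported in the message.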
RE: linux-next: build failure after merge of the amdgpu tree
> -----Original Message-----
> From: Stephen Rothwell
> Sent: Tuesday, August 13, 2019 4:11 PM
> To: Alex Deucher
> Cc: Linux Next Mailing List; Linux Kernel Mailing List; Liu, Aaron;
> Huang, Ray
> Subject: linux-next: build failure after merge of the amdgpu tree
>
> Hi all,
>
> After merging the amdgpu tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
>
> drivers/gpu/drm/amd/amdgpu/psp_v12_0.c:39:17: error: expected
> declaration specifiers or '...' before string constant
> MODULE_FIRMWARE("amdgpu/renoir_asd.bin");
>                 ^~~
>
> Caused by commit
>
>   6a7a0bdbfa0c ("drm/amdgpu: add psp_v12_0 for renoir (v2)")
>
> I have applied the following patch for today:
>
> From: Stephen Rothwell
> Date: Tue, 13 Aug 2019 18:03:16 +1000
> Subject: [PATCH] drm/amdgpu: MODULE_FIRMWARE requires
> linux/module.h
>
> Fixes: 6a7a0bdbfa0c ("drm/amdgpu: add psp_v12_0 for renoir (v2)")
> Signed-off-by: Stephen Rothwell

Thanks!
Reviewed-by: Huang Rui

> ---
> drivers/gpu/drm/amd/amdgpu/psp_v12_0.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> index f37b8af4b986..b474dfb79375 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
> @@ -21,6 +21,7 @@
>   */
>
>  #include
> +#include <linux/module.h>
>  #include "amdgpu.h"
>  #include "amdgpu_psp.h"
>  #include "amdgpu_ucode.h"
> --
> 2.20.1
>
> --
> Cheers,
> Stephen Rothwell
RE: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"
> -----Original Message-----
> From: Kai-Heng Feng
> Sent: Thursday, August 08, 2019 1:45 AM
> To: Huang, Ray
> Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing);
> amd-gfx list; dri-de...@lists.freedesktop.org; LKML; Anthony Wong
> Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> (v2)"
>
> Hi Ray,
>
> at 00:03, Huang, Ray wrote:
>
> > May I know the all firmware version in your system?

Seems to the issue we encountered with IOMMU enabled. Could you please disable iommu in SBIOS or GRUB?

Thanks,
Ray

>
> # cat amdgpu_firmware_info
> VCE feature version: 0, firmware version: 0x
> UVD feature version: 0, firmware version: 0x
> MC feature version: 0, firmware version: 0x
> ME feature version: 40, firmware version: 0x0099
> PFP feature version: 40, firmware version: 0x00ae
> CE feature version: 40, firmware version: 0x004d
> RLC feature version: 1, firmware version: 0x0213
> RLC SRLC feature version: 1, firmware version: 0x0001
> RLC SRLG feature version: 1, firmware version: 0x0001
> RLC SRLS feature version: 1, firmware version: 0x0001
> MEC feature version: 40, firmware version: 0x018b
> MEC2 feature version: 40, firmware version: 0x018b
> SOS feature version: 0, firmware version: 0x
> ASD feature version: 0, firmware version: 0x001ad4d4
> TA XGMI feature version: 0, firmware version: 0x
> TA RAS feature version: 0, firmware version: 0x
> SMC feature version: 0, firmware version: 0x1e44
> SDMA0 feature version: 41, firmware version: 0x00a9
> VCN feature version: 0, firmware version: 0x0110901c
> DMCU feature version: 0, firmware version: 0x
> VBIOS version: 113-RAVEN-103
>
> Kai-Heng
>
> >
> > Thanks,
> > Ray
> >
> > From: Kai-Heng Feng
> > Sent: Wednesday, August 7, 2019 8:50 PM
> > To: Huang, Ray
> > Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx
> > list; dri-de...@lists.freedesktop.org; LKML; Anthony Wong
> > Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> > (v2)"
> >
> > Hi,
> >
> > After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
> > (v2)"), browsers on Raven Ridge systems cause serious corruption like this:
> > https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png
> >
> > Firmwares for Raven Ridge is up-to-date.
> >
> > Kai-Heng
>
Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers
> On Jun 1, 2016, at 9:23 PM, Guenter Roeck wrote:
>
>> On 06/01/2016 03:04 AM, Borislav Petkov wrote:
>> From: Borislav Petkov
>>
>> We need to read a bunch of registers on each compute unit and possibly
>> on the current CPU too. Disable preemption around it.
>
> An explanation would be helpful. Is this a bug fix ? I would like to get
> a confirmation from someone at AMD that this is really necessary.
>

This change looks good for me. But I am in office this week, I will test it on CZ platform next week. :-)

Thanks,
Rui

> Thanks,
> Guenter
>
>> Signed-off-by: Borislav Petkov
>> Cc: Rui Huang
>> Cc: Sherry Hurwitz
>> Cc: Guenter Roeck
>> ---
>> drivers/hwmon/fam15h_power.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
>> index eb97a9241d17..69bb810f528b 100644
>> --- a/drivers/hwmon/fam15h_power.c
>> +++ b/drivers/hwmon/fam15h_power.c
>> @@ -172,9 +172,9 @@ static void do_read_registers_on_cu(void *_data)
>>   */
>>  static int read_registers(struct fam15h_power_data *data)
>>  {
>> -	int this_cpu, ret, cpu;
>>  	int core, this_core;
>>  	cpumask_var_t mask;
>> +	int ret, cpu;
>>
>>  	ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
>>  	if (!ret)
>> @@ -183,7 +183,6 @@ static int read_registers(struct fam15h_power_data *data)
>>  	memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
>>
>>  	get_online_cpus();
>> -	this_cpu = smp_processor_id();
>>
>>  	/*
>>  	 * Choose the first online core of each compute unit, and then
>> @@ -205,10 +204,13 @@ static int read_registers(struct fam15h_power_data
>> *data)
>>  		cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), mask);
>>  	}
>>
>> -	if (cpumask_test_cpu(this_cpu, mask))
>> +	preempt_disable();
>> +	smp_call_function_many(mask, do_read_registers_on_cu, data, true);
>> +
>> +	if (cpumask_test_cpu(smp_processor_id(), mask))
>>  		do_read_registers_on_cu(data);
>>
>> -	smp_call_function_many(mask, do_read_registers_on_cu, data, true);
>> +	preempt_enable();
>>  	put_online_cpus();
>>
>>  	free_cpumask_var(mask);
>
Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers
Sent from my iPad

> On Jun 1, 2016, at 11:27 PM, Huang, Ray wrote:
>
>
>
>>> On Jun 1, 2016, at 9:23 PM, Guenter Roeck wrote:
>>>
>>> On 06/01/2016 03:04 AM, Borislav Petkov wrote:
>>> From: Borislav Petkov
>>>
>>> We need to read a bunch of registers on each compute unit and possibly
>>> on the current CPU too. Disable preemption around it.
>>
>> An explanation would be helpful. Is this a bug fix ? I would like to get
>> a confirmation from someone at AMD that this is really necessary.
>>
>
> This change looks good for me. But I am in office this week, I will test it
> on CZ platform next week. :-)

Sorry, fix typo. I am *not* in office this week.

Rui