Re: Radeon regression in 6.6 kernel

2023-11-27 Thread Phillip Susi
Alex Deucher writes: >> In that case those are the already known problems with the scheduler >> changes, aren't they? > > Yes. Those changes went into 6.7 though, not 6.6 AFAIK. Maybe I'm > misunderstanding what the original report was actually testing. If it > was 6.7, then try reverting: > 5

RE: [PATCH v2] drm/amdgpu: Fix uninitialized return value

2023-11-27 Thread Zhang, Hawking
[AMD Official Use Only - General] Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From: Lazar, Lijo Sent: Tuesday, November 28, 2023 02:56 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking; Deucher, Alexander Subject: [PATCH v2] drm/amdgpu: Fix uninitialized return

[PATCH 1/2] drm/amdgpu/debugfs: fix error code when smc register accessors are NULL

2023-11-27 Thread Alex Deucher
Should be -EOPNOTSUPP. Fixes: 5104fdf50d32 ("drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/a
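
A minimal userspace sketch of the pattern this patch describes: when an optional register accessor is NULL, the read path reports -EOPNOTSUPP instead of dereferencing the pointer. Every name here (fake_dev, fake_smc_rreg, fake_debugfs_smc_read) is hypothetical and stands in for the driver code, which the excerpt above does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device whose register accessor is optional. */
struct fake_dev {
    uint32_t (*smc_rreg)(struct fake_dev *dev, uint32_t reg);
    uint32_t regs[4];
};

/* Example accessor for hardware that does provide the callback. */
static uint32_t fake_smc_rreg(struct fake_dev *dev, uint32_t reg)
{
    return dev->regs[reg % 4];
}

/*
 * Read one register through the optional callback.
 * Returns 0 on success, or -EOPNOTSUPP when this device
 * does not provide the accessor at all.
 */
static int fake_debugfs_smc_read(struct fake_dev *dev, uint32_t reg,
                                 uint32_t *out)
{
    assert(dev != NULL && out != NULL);

    if (!dev->smc_rreg)
        return -EOPNOTSUPP;

    *out = dev->smc_rreg(dev, reg);
    return 0;
}
```

A caller can then tell "this hardware has no such registers" apart from other failures by checking for -EOPNOTSUPP.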

[PATCH 2/2] drm/amdgpu/debugfs: check if pcie register callbacks are valid

2023-11-27 Thread Alex Deucher
Before trying to use them in the debugfs register access functions. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c ind

Re: [PATCH v2] drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer

2023-11-27 Thread Alex Deucher
Applied. Thanks! Alex On Thu, Nov 23, 2023 at 3:22 AM Christian König wrote: > > On 23.11.23 at 02:22, Lu Yao wrote: > > For 'AMDGPU_FAMILY_SI' family cards, in 'si_common_early_init' func, init > > 'didt_rreg' and 'didt_wreg' to 'NULL'. But in func > > 'amdgpu_debugfs_regs_didt_read/write', u

Re: [PATCH] drm/amdgpu: Fix uninitialized return value

2023-11-27 Thread Alex Deucher
On Mon, Nov 27, 2023 at 2:22 PM Christian König wrote: > > On 27.11.23 at 19:29, Lijo Lazar wrote: > > The return value is uninitialized if ras context is NULL. > > > > Fixes: 0f4c8faa043c (drm/amdgpu: Move mca debug mode decision to ras) > > > > Signed-off-by: Lijo Lazar > > --- > > drivers/

Re: [PATCH 07/14] drm/radeon: Do not include

2023-11-27 Thread Alex Deucher
On Wed, Nov 22, 2023 at 7:25 AM Thomas Zimmermann wrote: > > Including is not required by radeon. > > Signed-off-by: Thomas Zimmermann > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: amd-gfx@lists.freedesktop.org Acked-by: Alex Deucher > --- > drivers/gpu/drm/radeon/r

Re: [PATCH] drm/amd/pm: fix a memleak in aldebaran_tables_init

2023-11-27 Thread Alex Deucher
Applied. Thanks! On Thu, Nov 23, 2023 at 3:08 AM Dinghao Liu wrote: > > When kzalloc() for smu_table->ecc_table fails, we should free > the previously allocated resources to prevent memleak. > > Fixes: edd794208555 ("drm/amd/pm: add message smu to get ecc_table v2") > Signed-off-by: Dinghao Liu
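
The fix described here follows the usual unwind-on-failure idiom: if a later allocation fails, everything allocated earlier in the same init function is released before returning. A minimal userspace sketch, assuming three tables and using malloc-style allocation in place of kzalloc(); the struct, field, and function names are illustrative and not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical table set; the field names are illustrative only. */
struct fake_tables {
    void *metrics_table;
    void *gpu_metrics_table;
    void *ecc_table;
};

/* Allocator hook so that a failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

/* Test allocator: fails on call number fail_at (1-based); 0 means never. */
static int fail_at;
static int alloc_calls;

static void *counting_alloc(size_t size)
{
    if (++alloc_calls == fail_at)
        return NULL;
    return calloc(1, size);
}

/* Allocate all tables, or free whatever was allocated and return -ENOMEM. */
static int fake_tables_init(struct fake_tables *t, alloc_fn alloc)
{
    t->metrics_table = alloc(64);
    if (!t->metrics_table)
        goto err_out;

    t->gpu_metrics_table = alloc(64);
    if (!t->gpu_metrics_table)
        goto err_free_metrics;

    t->ecc_table = alloc(64);
    if (!t->ecc_table)
        goto err_free_gpu_metrics;

    return 0;

err_free_gpu_metrics:
    free(t->gpu_metrics_table);
    t->gpu_metrics_table = NULL;
err_free_metrics:
    free(t->metrics_table);
    t->metrics_table = NULL;
err_out:
    return -ENOMEM;
}
```

Resetting the pointers to NULL after freeing keeps a later teardown path from freeing them a second time.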

RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version and metrics table

2023-11-27 Thread Deucher, Alexander
[Public] > -----Original Message----- > From: Ma, Li > Sent: Thursday, November 23, 2023 5:07 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander; Koenig, Christian; Zhang, Yifan; Yu, Lang; Ma, Li > Subject: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version and > me

Re: [PATCH] drm/amd: Enable PCIe PME from D3

2023-11-27 Thread Alex Deucher
On Mon, Nov 27, 2023 at 2:17 PM Mario Limonciello wrote: > > When dGPU is put into BOCO it may be in D3cold but still able to send > PME on display hotplug event. For this to work it must be enabled > as wake source from D3. > > When runpm is enabled use pci_wake_from_d3() to mark wakeup as > enabled

Re: [PATCH 01/24] drm/amdkfd/kfd_ioctl: add pc sampling support

2023-11-27 Thread James Zhu
On 2023-11-27 14:11, Alex Deucher wrote: On Fri, Nov 3, 2023 at 9:22 AM James Zhu wrote: From: David Yat Sin Add pc sampling support in kfd_ioctl. Co-developed-by: James Zhu Signed-off-by: James Zhu Signed-off-by: David Yat Sin For any new IOCTL interfaces, please provide a link to the user

Re: [PATCH 01/24] drm/amdkfd/kfd_ioctl: add pc sampling support

2023-11-27 Thread Alex Deucher
On Fri, Nov 3, 2023 at 9:22 AM James Zhu wrote: > > From: David Yat Sin > > Add pc sampling support in kfd_ioctl. > > Co-developed-by: James Zhu > Signed-off-by: James Zhu > Signed-off-by: David Yat Sin For any new IOCTL interfaces, please provide a link to the user mode code branch which use

Re: [PATCH 1/2] drm/amdgpu/gmc: check if AGP is disabled in amdgpu_gmc_agp_addr()

2023-11-27 Thread Christian König
On 21.11.23 at 16:05, Alex Deucher wrote: Return AMDGPU_BO_INVALID_OFFSET if the AGP aperture is disabled. There is no reason to check further if the aperture is disabled. Yeah, but there shouldn't be a reason to check it earlier either. The "if (bo->ttm->dma_address[0] + PAGE_SIZE >= adev->g

Re: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Christian König
On 27.11.23 at 17:47, Bhardwaj, Rajneesh wrote: [AMD Official Use Only - General] -----Original Message----- From: amd-gfx On Behalf Of Hamza Mahfooz Sent: Monday, November 27, 2023 10:53 AM To: Christian König; jani.nik...@linux.intel.com; kher...@redhat.com; d...@redhat.com; za...@vmware.c

[PATCH v2] drm/amdgpu: Fix uninitialized return value

2023-11-27 Thread Lijo Lazar
The return value is uninitialized if ras context is NULL. Fixes: 0f4c8faa043c (drm/amdgpu: Move mca debug mode decision to ras) Signed-off-by: Lijo Lazar --- v2: Avoid variable initialization (Christian) drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 ++ 1 file changed, 2 insertions(+) diff --g
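
A minimal sketch of the bug class and of one way to close it without initializing the variable at its declaration, as the v2 note asks. All names are hypothetical, and the value returned when there is no context (0 here) is an assumption, since the diff is cut off in this excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; backend_ok lets a backend failure be simulated. */
struct fake_ras {
    bool debug_mode;
    bool backend_ok;
};

struct fake_dev {
    struct fake_ras *ras; /* may be NULL when the feature is absent */
};

/* Stand-in for the call that programs the mode into the hardware. */
static int fake_apply_mode(struct fake_ras *ras, bool enable)
{
    (void)enable;
    return ras->backend_ok ? 0 : -EIO;
}

static int fake_set_debug_mode(struct fake_dev *dev, bool enable)
{
    int ret; /* deliberately not initialized at the declaration */

    /*
     * Handle the no-context case with its own return, so that ret is
     * never read on a path where it was not assigned.
     * Assumption: having nothing to configure is not an error.
     */
    if (!dev->ras)
        return 0;

    ret = fake_apply_mode(dev->ras, enable);
    if (!ret)
        dev->ras->debug_mode = enable;

    return ret;
}
```

Leaving the declaration uninitialized keeps the compiler able to warn if a later change adds another path that reads ret before assigning it.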

Re: [PATCH] drm/amdgpu: Fix uninitialized return value

2023-11-27 Thread Christian König
On 27.11.23 at 19:29, Lijo Lazar wrote: The return value is uninitialized if ras context is NULL. Fixes: 0f4c8faa043c (drm/amdgpu: Move mca debug mode decision to ras) Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(

[PATCH] drm/amd: Enable PCIe PME from D3

2023-11-27 Thread Mario Limonciello
When dGPU is put into BOCO it may be in D3cold but still able to send PME on display hotplug event. For this to work it must be enabled as wake source from D3. When runpm is enabled use pci_wake_from_d3() to mark wakeup as enabled by default. Cc: sta...@vger.kernel.org # 6.1+ Signed-off-by: Mario Li

[PATCH] drm/amdgpu: Fix uninitialized return value

2023-11-27 Thread Lijo Lazar
The return value is uninitialized if ras context is NULL. Fixes: 0f4c8faa043c (drm/amdgpu: Move mca debug mode decision to ras) Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/a

Re: [PATCH 1/2] drm/amdgpu/gmc: check if AGP is disabled in amdgpu_gmc_agp_addr()

2023-11-27 Thread Alex Deucher
On Wed, Nov 22, 2023 at 2:32 AM Alex Deucher wrote: > > Return AMDGPU_BO_INVALID_OFFSET if the AGP aperture is disabled. > There is no reason to check further if the aperture is disabled. > > Signed-off-by: Alex Deucher Ping? > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++ > 1 file ch

Re: regression/bisected/6.7rc1: Instead of desktop I see a horizontal flashing bar with a picture of the desktop background on white screen

2023-11-27 Thread Alex Deucher
On Wed, Nov 15, 2023 at 1:52 PM Lee, Alvin wrote: > > [AMD Official Use Only - General] > > This change has a DMCUB dependency - are you able to update your DMCUB > version as well? > > This version mismatch issue is something I'll need to fix in driver for Linux. @Mahfooz, Hamza @Alvin Lee any

RE: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General] -----Original Message----- From: amd-gfx On Behalf Of Hamza Mahfooz Sent: Monday, November 27, 2023 10:53 AM To: Christian König; jani.nik...@linux.intel.com; kher...@redhat.com; d...@redhat.com; za...@vmware.com; Olsak, Marek; linux-graphics-maintai...@vmwa

Re: [PATCH] drm/amdkfd: Use partial migrations/mapping for GPU/CPU page faults in SVM

2023-11-27 Thread Chen, Xiaogang
On 11/22/2023 2:12 PM, Felix Kuehling wrote: On 2023-11-14 16:01, Xiaogang.Chen wrote: From: Xiaogang Chen This patch implements partial migration/mapping for gpu/cpu page faults in SVM according to migration granularity(default 2MB). A svm range may include pages from both system ram and

Re: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Hamza Mahfooz
On 11/27/23 09:54, Christian König wrote: Try to fill up VRAM as well by setting the busy flag on GTT allocations. This fixes the issue that when VRAM was evacuated for suspend it's never filled up again unless the application is restarted. Link: https://gitlab.freedesktop.org/drm/amd/-/issue

Re: [PATCH 06/20] x86/mce/amd: Use helper for GPU UMC bank type checks

2023-11-27 Thread Yazen Ghannam
On 11/27/2023 6:46 AM, Borislav Petkov wrote: On Sat, Nov 18, 2023 at 01:32:34PM -0600, Yazen Ghannam wrote: +/* GPU UMCs have MCATYPE=0x1.*/ +bool smca_gpu_umc_bank_type(u64 ipid) +{ + if (!smca_umc_bank_type(ipid)) + return false; + + return FIELD_GET(MCI_IPID_MCATYPE

Re: [PATCH 05/20] x86/mce/amd: Use helper for UMC bank type check

2023-11-27 Thread Yazen Ghannam
On 11/27/2023 6:43 AM, Borislav Petkov wrote: On Sat, Nov 18, 2023 at 01:32:33PM -0600, Yazen Ghannam wrote: @@ -714,14 +721,10 @@ static bool legacy_mce_is_memory_error(struct mce *m) */ static bool smca_mce_is_memory_error(struct mce *m) { - enum smca_bank_types bank_type; -

[PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Christian König
Try to fill up VRAM as well by setting the busy flag on GTT allocations. This fixes the issue that when VRAM was evacuated for suspend it's never filled up again unless the application is restarted. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 ++ 1 file

[PATCH 1/2] drm/ttm: replace busy placement with flags v3

2023-11-27 Thread Christian König
From: Somalapuram Amaranath Instead of a list of separate busy placements, add flags which indicate that a placement should only be used when there is room or if we need to evict. v2: add missing TTM_PL_FLAG_IDLE for i915 v3: fix auto build test ERROR on drm-tip/drm-tip Signed-off-by: Christian K

TTM improvement and amdgpu fix

2023-11-27 Thread Christian König
Hi guys, TTM has a feature which allows specifying placements for normal operation as well as for when all domains are "busy" and don't have free space. It is not very widely used since it was a bit inflexible and required making multiple placement lists. Replace the multiple lists with flags and start t
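
The idea of one placement list with per-entry flags, instead of separate "normal" and "busy" lists, can be sketched roughly as follows. This is a simplified userspace model and not TTM code: the flag names, the two-pass search, and the has_room/can_evict inputs are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { FAKE_DOMAIN_VRAM, FAKE_DOMAIN_GTT, FAKE_DOMAIN_COUNT };

/* Hypothetical flags: restrict an entry to one of the two passes. */
#define FAKE_PL_ONLY_WHEN_ROOM  (1u << 0) /* never evict for this entry */
#define FAKE_PL_ONLY_WHEN_EVICT (1u << 1) /* skip while looking for free space */

struct fake_place {
    int domain;
    unsigned int flags;
};

/*
 * Pick a domain from a single placement list.
 * has_room[d] says whether domain d has free space;
 * can_evict[d] says whether evicting from domain d would make room.
 * Returns the chosen domain, or -1 if nothing fits.
 */
static int fake_pick_domain(const struct fake_place *places, size_t n,
                            const int *has_room, const int *can_evict)
{
    size_t i;

    /* Pass 1: first allowed entry that already has free space. */
    for (i = 0; i < n; i++) {
        if (places[i].flags & FAKE_PL_ONLY_WHEN_EVICT)
            continue;
        if (has_room[places[i].domain])
            return places[i].domain;
    }

    /* Pass 2: everything is busy, so try eviction in list order. */
    for (i = 0; i < n; i++) {
        if (places[i].flags & FAKE_PL_ONLY_WHEN_ROOM)
            continue;
        if (can_evict[places[i].domain])
            return places[i].domain;
    }

    return -1;
}
```

With a list of VRAM followed by GTT marked FAKE_PL_ONLY_WHEN_EVICT, a full VRAM is evicted and refilled before GTT is considered, which matches the "GTT only as fallback" behaviour the second patch in this series is after.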

Re: [PATCH 03/20] x86/mce: Use mce_setup() helpers for apei_smca_report_x86_error()

2023-11-27 Thread Yazen Ghannam
On 11/22/2023 1:28 PM, Borislav Petkov wrote: On Sat, Nov 18, 2023 at 01:32:31PM -0600, Yazen Ghannam wrote: Current AMD systems may report MCA errors using the ACPI Boot Error Record Table (BERT). The BERT entries for MCA errors will be an x86 Common Platform Error Record (CPER) with an MSR reg

Re: [PATCH 02/20] x86/mce: Define mce_setup() helpers for global and per-CPU fields

2023-11-27 Thread Yazen Ghannam
On 11/22/2023 1:24 PM, Borislav Petkov wrote: On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote: +void mce_setup_global(struct mce *m) We usually call those things "common": mce_setup_common(). +{ + memset(m, 0, sizeof(struct mce)); + + m->cpuid = cpuid_eax(1); +

Re: [PATCH 06/20] x86/mce/amd: Use helper for GPU UMC bank type checks

2023-11-27 Thread Borislav Petkov
On Sat, Nov 18, 2023 at 01:32:34PM -0600, Yazen Ghannam wrote: > +/* GPU UMCs have MCATYPE=0x1.*/ > +bool smca_gpu_umc_bank_type(u64 ipid) > +{ > + if (!smca_umc_bank_type(ipid)) > + return false; > + > + return FIELD_GET(MCI_IPID_MCATYPE, ipid) == 0x1; > +} And now this tells
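
The quoted helper combines two bit-field checks on the IPID value. A userspace sketch of the same shape, with a hand-written replacement for FIELD_GET(): the mask positions and the UMC hardware ID below are assumptions for illustration, and the body of the first helper is a guess, since smca_umc_bank_type() is not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define FAKE_IPID_MCATYPE UINT64_C(0xFFFF000000000000) /* bits 63:48 */
#define FAKE_IPID_HWID    UINT64_C(0x00000FFF00000000) /* bits 43:32 */
#define FAKE_HWID_UMC     UINT64_C(0x96)

/* Extract the field selected by mask and shift it down to bit 0. */
static uint64_t fake_field_get(uint64_t mask, uint64_t val)
{
    assert(mask != 0);
    /* (mask & -mask) isolates the lowest set bit; dividing by it shifts right. */
    return (val & mask) / (mask & (~mask + 1));
}

/* Hypothetical: a bank is a UMC bank when its hardware ID matches. */
static bool fake_umc_bank_type(uint64_t ipid)
{
    return fake_field_get(FAKE_IPID_HWID, ipid) == FAKE_HWID_UMC;
}

/* Same shape as the quoted helper: a UMC bank whose type field is 0x1. */
static bool fake_gpu_umc_bank_type(uint64_t ipid)
{
    if (!fake_umc_bank_type(ipid))
        return false;

    return fake_field_get(FAKE_IPID_MCATYPE, ipid) == 0x1;
}
```

Checking the hardware ID first matters because the type field is only meaningful within one hardware ID; the same type value on a different block means something else.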

Re: [PATCH 05/20] x86/mce/amd: Use helper for UMC bank type check

2023-11-27 Thread Borislav Petkov
On Sat, Nov 18, 2023 at 01:32:33PM -0600, Yazen Ghannam wrote: > @@ -714,14 +721,10 @@ static bool legacy_mce_is_memory_error(struct mce *m) > */ > static bool smca_mce_is_memory_error(struct mce *m) > { > - enum smca_bank_types bank_type; > - > if (XEC(m->status, 0x3f)) >

Re: regression/bisected/6.7rc1: Instead of desktop I see a horizontal flashing bar with a picture of the desktop background on white screen

2023-11-27 Thread Linux regression tracking (Thorsten Leemhuis)
On 16.11.23 11:46, Christian König wrote: > On 15.11.23 at 21:08, Mikhail Gavrilov wrote: >> On Wed, Nov 15, 2023 at 11:39 PM Lee, Alvin wrote: >>> This change has a DMCUB dependency - are you able to update your >>> DMCUB version as well? >>> >> I can confirm this issue was gone after updating f