RE: [PATCH] mm: Remove double faults once write a device pfn
[AMD Official Use Only - General] > The vmf_insert_pfn_prot could cause unnecessary double faults > on a device pfn. Because currently the vmf_insert_pfn_prot does > not make the pfn writable so the pte entry is normally > read-only or dirty catching. > >>> What? How do you got to this conclusion? > >> Sorry. I did not mention that this problem only exists on arm64 > platform. > > Ok, that makes at least a little bit more sense. > > > >> Because on arm64 platform the PTE_RDONLY is automatically > >> attached to the userspace pte entries even through VM_WRITE + > VM_SHARE. > >> The PTE_RDONLY needs to be cleared in vmf_insert_pfn_prot. > >> However vmf_insert_pfn_prot do not make the pte writable passing > >> false @mkwrite to insert_pfn. > > Question is why is arm64 doing this? As far as I can see they must > > have some hardware reason for that. > > > > The mkwrite parameter to insert_pfn() was added by commit > > b2770da642540 to make insert_pfn() look more like insert_pfn_pmd() > > so that the DAX code can insert PTEs which are writable and dirty > > at the same > >>> time. > This is one scenario to do so. In fact on arm64 there are many > scenarios could be to do so. So we can let vmf_insert_pfn_prot > supporting @mkwrite for drivers at core layer and let drivers to > decide whether or not to make writable and dirty at one time. The > patch did this. Otherwise double faults on arm64 when call > >>> vmf_insert_pfn_prot. > >>> > >>> Well, that doesn't answer my question why arm64 is double faulting > >>> in the first place,. > >>> > >> > >> Eh. > >> > >> On arm64 When userspace mmap() with PROT_WRITE and MAP_SHARED > the > >> vma->vm_page_prot has the PTE_RDONLY and PTE_WRITE within > >> PAGE_SHARED_EXEC. (seeing arm64 protection_map) > > Well that's your observation, but not the explanation why arm64 is doing this. > > See this would have quite some negative impact on performance, not only for > gfx drivers but in general. > > So either the observation is incorrect or there is a *really* good reason why > arm64 is taking this performance penalty. > > >> When write the userspace virtual address the first fault happen and > >> call into driver's .fault->ttm_bo_vm_fault_reserved->vmf_insert_pfn_prot- > >insert_pfn. > >> The insert_pfn will establish the pte entry. However the > >> vmf_insert_pfn_prot pass false @mkwrite to insert_pfn by default and > >> so insert_pfn could not make the pfn writable and it do not call > >> maybe_mkwrite(pte_mkdirty(entry), vma) to clear the PTE_RDONLY bit. So > the pte entry is actually write protection for mmu. > >> So when the first fault return and re-execute the store instruction > >> the second fault happen again. And in second fault it just only do > >> pte_mkdirty(entry) which clear the PTE_RDONLY. > > It depends if the ARM64 CPU in question supports hardware dirty bit > > management (DBM). If that is the case and PTE_DBM (ie. PTE_WRITE) is > > set HW will automatically clear PTE_RDONLY bit to mark the entry dirty > > instead of raising a write fault. So you shouldn't see a double fault > > if PTE_DBM/WRITE is set. Thanks. This is reasonable. But I still really had the double faults in my project platform. > > > > On ARM64 you can kind of think of PTE_RDONLY as the HW dirty bit and > > PTE_DBM as the read/write permission bit with SW being responsible for > > updating PTE_RDONLY via the fault handler if DBM is not supported by HW. > > > > At least that's my understanding from having hacked on this in the > > past. You can see all this weirdness happening in the definitions of > > pte_dirty() and pte_write() for ARM64. > > +1 > > Thanks a lot for that, this was exactly the information I was looking for. > > In this light it makes this patch here look unnecessary and questionable at > best. > > Xianrong if you have an arm64 platform which really double faults (confirmed > through a debugger for example) then you have to ask why this platform > shows this behavior and not try to work around it. > > Behaviors like those usually have a very very good reason and without a > confirmed explanation I'm not allowing any patch in which would disable stuff > like that. Thanks. You are very right. I found CONFIG_ARM64_HW_AFDBM is not enabled in my project. So actually PTE_DBM is not hardware bit. It is software bit. Now i understand why arm64 to do this attaching PTE_RDONLY automatically. It is compatible for PTE_DBM hardware or not. This answers your question. So I met double faults in my project when CONFIG_ARM64_HW_AFDBM is false. However these double faults can be eliminated when i replace the vmf_insert_mixed to vmf_insert_mixed_mkwrite in drivers under CONFIG_ARM64_HW_AFDBM = false. The vmf_insert_pfn_prot is similar with vmf_insert_mixed. It should supply @mkwrite parameter rather than false mkwrite by default. So i think if you forgot to e
RE: [PATCH v2] drm/amdkfd: reserve the BO before validating it
[AMD Official Use Only - General] >-Original Message- >From: Kuehling, Felix >Sent: Thursday, January 25, 2024 5:41 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Francis, David >Subject: Re: [PATCH v2] drm/amdkfd: reserve the BO before validating it > >On 2024-01-22 4:08, Lang Yu wrote: >> Fixes: 410f08516e0f ("drm/amdkfd: Move dma unmapping after TLB flush") >> >> v2: >> Avoid unmapping attachment twice when ERESTARTSYS. >> >> [ 41.708711] WARNING: CPU: 0 PID: 1463 at >drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] >> [ 41.708989] Call Trace: >> [ 41.708992] >> [ 41.708996] ? show_regs+0x6c/0x80 >> [ 41.709000] ? ttm_bo_validate+0x146/0x1b0 [ttm] >> [ 41.709008] ? __warn+0x93/0x190 >> [ 41.709014] ? ttm_bo_validate+0x146/0x1b0 [ttm] >> [ 41.709024] ? report_bug+0x1f9/0x210 >> [ 41.709035] ? handle_bug+0x46/0x80 >> [ 41.709041] ? exc_invalid_op+0x1d/0x80 >> [ 41.709048] ? asm_exc_invalid_op+0x1f/0x30 >> [ 41.709057] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 >[amdgpu] >> [ 41.709185] ? ttm_bo_validate+0x146/0x1b0 [ttm] >> [ 41.709197] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 >[amdgpu] >> [ 41.709337] ? srso_alias_return_thunk+0x5/0x7f >> [ 41.709346] kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu] >> [ 41.709467] amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 >[amdgpu] >> [ 41.709586] kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu] >> [ 41.709710] kfd_ioctl+0x1ec/0x650 [amdgpu] >> [ 41.709822] ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 >[amdgpu] >> [ 41.709945] ? srso_alias_return_thunk+0x5/0x7f >> [ 41.709949] ? tomoyo_file_ioctl+0x20/0x30 >> [ 41.709959] __x64_sys_ioctl+0x9c/0xd0 >> [ 41.709967] do_syscall_64+0x3f/0x90 >> [ 41.709973] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 >> >> Signed-off-by: Lang Yu >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +- >> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 >+-- >> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 4 ++- >> 3 files changed, 29 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> index 584a0cea5572..41854417e487 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h >> @@ -311,7 +311,7 @@ int >amdgpu_amdkfd_gpuvm_map_memory_to_gpu(struct amdgpu_device *adev, >>struct kgd_mem *mem, void >*drm_priv); >> int amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu( >> struct amdgpu_device *adev, struct kgd_mem *mem, void >*drm_priv); >> -void amdgpu_amdkfd_gpuvm_dmaunmap_mem(struct kgd_mem *mem, void >> *drm_priv); >> +int amdgpu_amdkfd_gpuvm_dmaunmap_mem(struct kgd_mem *mem, void >> +*drm_priv); >> int amdgpu_amdkfd_gpuvm_sync_memory( >> struct amdgpu_device *adev, struct kgd_mem *mem, bool intr); >> int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct kgd_mem *mem, >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> index 6f3a4cb2a9ef..7a050d46fa4d 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> @@ -2088,21 +2088,43 @@ int >amdgpu_amdkfd_gpuvm_map_memory_to_gpu( >> return ret; >> } >> >> -void amdgpu_amdkfd_gpuvm_dmaunmap_mem(struct kgd_mem *mem, void >> *drm_priv) >> +int amdgpu_amdkfd_gpuvm_dmaunmap_mem(struct kgd_mem *mem, void >> +*drm_priv) >> { >> struct kfd_mem_attachment *entry; >> struct amdgpu_vm *vm; >> +bool reserved = false; >> +int ret = 0; >> >> vm = drm_priv_to_vm(drm_priv); >> >> mutex_lock(&mem->lock); >> >> list_for_each_entry(entry, &mem->attachments, list) { >> -if (entry->bo_va->base.vm == vm) >> -kfd_mem_dmaunmap_attachment(mem, entry); >> +if (entry->bo_va->base.vm != vm) >> +continue; >> +if (entry->type == KFD_MEM_ATT_SHARED || >> +entry->type == KFD_MEM_ATT_DMABUF) >> +continue; >> +if (!entry->bo_va->base.bo->tbo.ttm->sg) >> +continue; > >You're going to great lengths to avoid the reservation when it's not needed by >kfd_mem_dmaunmap_attachment. However, this feels a bit fragile. Any changes >in the kfd_mem_dmaunmap_* functions could break this. > >> + >> +if (!reserved) { >> +ret = amdgpu_bo_reserve(mem->bo, true); >> +if (ret) >> +goto out; >> +reserved = true; >> +} >> + >> +kfd_mem_dmaunmap_attachment(mem, entry); >> } >> >> +if (reserved) >> +amdgpu_bo_unreserve(mem->bo); >> + >> +out: >> mutex_unlock(&mem->lock); >> + >> +return ret; >> } >> >> int amdgpu_amdkfd_gpuv
RE: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for PM abort case
[AMD Official Use Only - General] > From: Alex Deucher > Sent: Thursday, January 25, 2024 11:24 PM > To: Liang, Prike > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Sharma, Deepak > > Subject: Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for > PM abort case > > On Thu, Jan 25, 2024 at 10:22 AM Alex Deucher > wrote: > > > > On Wed, Jan 24, 2024 at 9:39 PM Liang, Prike > wrote: > > > > > > [AMD Official Use Only - General] > > > > > > Hi, Alex > > > > -Original Message- > > > > From: Alex Deucher > > > > Sent: Wednesday, January 24, 2024 11:59 PM > > > > To: Liang, Prike > > > > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > > > > ; Sharma, Deepak > > > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC > > > > registers for PM abort case > > > > > > > > On Wed, Jan 24, 2024 at 2:12 AM Prike Liang > wrote: > > > > > > > > > > In the PM abort cases, the gfx power rail doesn't turn off so > > > > > some GFXDEC registers/CSB can't reset to default vaule. In order > > > > > to avoid unexpected problem now need skip to program GFXDEC > > > > > registers and bypass issue CSB packet for PM abort case. > > > > > > > > > > Signed-off-by: Prike Liang > > > > > --- > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 > > > > > 3 files changed, 6 insertions(+) > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > > index c5f3859fd682..26d983eb831b 100644 > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > > @@ -1079,6 +1079,7 @@ struct amdgpu_device { > > > > > boolin_s3; > > > > > boolin_s4; > > > > > boolin_s0ix; > > > > > + boolpm_complete; > > > > > > > > > > enum pp_mp1_state mp1_state; > > > > > struct amdgpu_doorbell_index doorbell_index; diff --git > > > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > > index 475bd59c9ac2..a01f9b0c2f30 100644 > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > > @@ -2486,6 +2486,7 @@ static int > > > > > amdgpu_pmops_suspend_noirq(struct > > > > device *dev) > > > > > struct drm_device *drm_dev = dev_get_drvdata(dev); > > > > > struct amdgpu_device *adev = drm_to_adev(drm_dev); > > > > > > > > > > + adev->pm_complete = true; > > > > > > > > This needs to be cleared somewhere on resume. > > > [Liang, Prike] This flag is designed to indicate the amdgpu device > suspension process status and will update the patch and clear it at the > amdgpu suspension beginning point. > > > > > > > > > if (amdgpu_acpi_should_gpu_reset(adev)) > > > > > return amdgpu_asic_reset(adev); > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > > index 57808be6e3ec..3bf51f18e13c 100644 > > > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > > @@ -3034,6 +3034,10 @@ static int gfx_v9_0_cp_gfx_start(struct > > > > > amdgpu_device *adev) > > > > > > > > > > gfx_v9_0_cp_gfx_enable(adev, true); > > > > > > > > > > + if (adev->in_suspend && !adev->pm_complete) { > > > > > + DRM_INFO(" will skip the csb ring write\n"); > > > > > + return 0; > > > > > + } > > > > > > > > We probably want a similar fix for other gfx generations as well. > > > > > > > > Alex > > > > > > > [Liang, Prike] IIRC, there's no issue on the Mendocino side even without > the fix. How about keep the other gfx generations unchanged firstly and after > sort out the failed case will add the quirk for each specific gfx > respectively? > > > > Mendocino only supports S0i3 so we don't touch gfx on suspend/resume. > > This would only happen on platforms that support S3. > > E.g., try an aborted suspend on Raphael or PHX2. > > Alex > [Liang, Prike] Thanks for the reminder, but the Mendocino also was verified on the system with S3 enabled from BIOS. I will double confirm if there need the quirk on the RPL or PHX2. > > > > Alex > > > > > > > > > > r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + > > > > > 3); > > > > > if (r) { > > > > > DRM_ERROR("amdgpu: cp failed to lock ring > > > > > (%d).\n", r); > > > > > -- > > > > > 2.34.1 > > > > >
RE: [PATCH] drm/amdgpu: update documentation on new chips
[AMD Official Use Only - General] Acked-by: Feifei Xu -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Friday, January 19, 2024 3:51 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu: update documentation on new chips These have been released now, so add them to the documentation. Signed-off-by: Alex Deucher --- Documentation/gpu/amdgpu/dgpu-asic-info-table.csv | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv index 882d2518f8ed..3825f00ca9fe 100644 --- a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv +++ b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv @@ -16,6 +16,7 @@ Radeon (RX|TM) (PRO|WX) Vega /MI25 /V320 /V340L /8200 /9100 /SSG MxGPU, VEGA10, AMD Radeon (Pro) VII /MI50 /MI60, VEGA20, DCE 12, 9.4.0, VCE 4.1.0 / UVD 7.2.0, 4.2.0 MI100, ARCTURUS, *, 9.4.1, VCN 2.5.0, 4.2.2 MI200, ALDEBARAN, *, 9.4.2, VCN 2.6.0, 4.4.0 +MI300, AQUA_VANGARAM, *, 9.4.3, VCN 4.0.3, 4.4.2 AMD Radeon (RX|Pro) 5600(M|XT) /5700 (M|XT|XTB) /W5700, NAVI10, DCN 2.0.0, 10.1.10, VCN 2.0.0, 5.0.0 AMD Radeon (Pro) 5300 /5500XTB/5500(XT|M) /W5500M /W5500, NAVI14, DCN 2.0.0, 10.1.1, VCN 2.0.2, 5.0.2 AMD Radeon RX 6800(XT) /6900(XT) /W6800, SIENNA_CICHLID, DCN 3.0.0, 10.3.0, VCN 3.0.0, 5.2.0 @@ -23,4 +24,5 @@ AMD Radeon RX 6700 XT / 6800M / 6700M, NAVY_FLOUNDER, DCN 3.0.0, 10.3.2, VCN 3.0 AMD Radeon RX 6600(XT) /6600M /W6600 /W6600M, DIMGREY_CAVEFISH, DCN 3.0.2, 10.3.4, VCN 3.0.16, 5.2.4 AMD Radeon RX 6500M /6300M /W6500M /W6300M, BEIGE_GOBY, DCN 3.0.3, 10.3.5, VCN 3.0.33, 5.2.5 AMD Radeon RX 7900 XT /XTX, , DCN 3.2.0, 11.0.0, VCN 4.0.0, 6.0.0 +AMD Radeon RX 7800 XT, , DCN 3.2.0, 11.0.3, VCN 4.0.0, 6.0.3 AMD Radeon RX 7600M (XT) /7700S /7600S, , DCN 3.2.1, 11.0.2, VCN 4.0.4, 6.0.2 -- 2.42.0
Re: [PATCH 2/2] drm/amdgpu: reset gpu for pm abort case
On Wed, Jan 24, 2024 at 11:11 PM Prike Liang wrote: > > In the pm abort case the gfx power rail not turn off from FCH side and > this will lead to the gfx reinitialized failed base on the unknown gfx > HW status, so let's reset the gpu to a known good power state. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 + > drivers/gpu/drm/amd/amdgpu/soc15.c | 8 +++- > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 56d9dfa61290..4c40ffaaa5c2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -4627,6 +4627,11 @@ int amdgpu_device_resume(struct drm_device *dev, bool > fbcon) > return r; > } > > + if(amdgpu_asic_need_reset_on_init(adev)) { space between if and (. Also, I think we need to check that we are not in S0ix as well otherwise I think we'll always reset in S0ix. We could probably do away with the GPU reset in the suspend_noirq callback with this change, but maybe make that a separate follow up patch. Alex > + DRM_INFO("PM abort case and let's reset asic \n"); > + amdgpu_asic_reset(adev); > + } > + > if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) > return 0; > > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c > b/drivers/gpu/drm/amd/amdgpu/soc15.c > index 15033efec2ba..9329a00b6abc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > @@ -804,9 +804,16 @@ static bool soc15_need_reset_on_init(struct > amdgpu_device *adev) > if (adev->asic_type == CHIP_RENOIR) > return true; > > + sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81); > + > /* Just return false for soc15 GPUs. Reset does not seem to > * be necessary. > */ > + if (adev->in_suspend && !adev->in_s0ix && > + !adev->pm_complete && > + sol_reg) > + return true; > + > if (!amdgpu_passthrough(adev)) > return false; > > @@ -816,7 +823,6 @@ static bool soc15_need_reset_on_init(struct amdgpu_device > *adev) > /* Check sOS sign of life register to confirm sys driver and sOS > * are already been loaded. > */ > - sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81); > if (sol_reg) > return true; > > -- > 2.34.1 >
[PATCH] drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole (v2)
The TBA and TMA, along with an unused IB allocation, reside at low addresses in the VM address space. A stray VM fault which hits these pages must be serviced by making their page table entries invalid. The scheduler depends upon these pages being resident and fails, preventing a debugger from inspecting the failure state. By relocating these pages above 47 bits in the VM address space they can only be reached when bits [63:48] are set to 1. This makes it much less likely for a misbehaving program to generate accesses to them. The current placement at VA (PAGE_SIZE*2) is readily hit by a NULL access with a small offset. v2: - Move it to the reserved space to avoid concflicts with Mesa - Add macros to make reserved space management easier Cc: Arunpravin Paneer Selvam Cc: Christian Koenig Signed-off-by: Jay Cornwall Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c| 7 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 12 ++-- drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 30 +++- 4 files changed, 30 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c index 823d31f4a2a3..53d0a458d78e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c @@ -28,9 +28,9 @@ uint64_t amdgpu_csa_vaddr(struct amdgpu_device *adev) { - uint64_t addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT; + uint64_t addr = AMDGPU_VA_RESERVED_CSA_START( + adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT); - addr -= AMDGPU_VA_RESERVED_CSA_SIZE; addr = amdgpu_gmc_sign_extend(addr); return addr; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c index 3d0d56087d41..9e769ef50f2e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c @@ -45,11 +45,8 @@ */ static inline u64 amdgpu_seq64_get_va_base(struct amdgpu_device *adev) { - u64 addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT; - - addr -= AMDGPU_VA_RESERVED_TOP; - - return addr; + return AMDGPU_VA_RESERVED_SEQ64_START( + adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT); } /** diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index 98a57192..f23b6153d310 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -135,11 +135,19 @@ struct amdgpu_mem_stats; #define AMDGPU_IS_MMHUB0(x) ((x) >= AMDGPU_MMHUB0_START && (x) < AMDGPU_MMHUB1_START) #define AMDGPU_IS_MMHUB1(x) ((x) >= AMDGPU_MMHUB1_START && (x) < AMDGPU_MAX_VMHUBS) -/* Reserve 2MB at top/bottom of address space for kernel use */ +/* Reserve space at top/bottom of address space for kernel use */ #define AMDGPU_VA_RESERVED_CSA_SIZE(2ULL << 20) +#define AMDGPU_VA_RESERVED_CSA_START(top) ((top) \ +- AMDGPU_VA_RESERVED_CSA_SIZE) #define AMDGPU_VA_RESERVED_SEQ64_SIZE (2ULL << 20) +#define AMDGPU_VA_RESERVED_SEQ64_START(top) (AMDGPU_VA_RESERVED_CSA_START(top) \ +- AMDGPU_VA_RESERVED_SEQ64_SIZE) +#define AMDGPU_VA_RESERVED_TRAP_SIZE (2ULL << 12) +#define AMDGPU_VA_RESERVED_TRAP_START(top) (AMDGPU_VA_RESERVED_SEQ64_START(top) \ +- AMDGPU_VA_RESERVED_TRAP_SIZE) #define AMDGPU_VA_RESERVED_BOTTOM (2ULL << 20) -#define AMDGPU_VA_RESERVED_TOP (AMDGPU_VA_RESERVED_SEQ64_SIZE + \ +#define AMDGPU_VA_RESERVED_TOP (AMDGPU_VA_RESERVED_TRAP_SIZE + \ +AMDGPU_VA_RESERVED_SEQ64_SIZE + \ AMDGPU_VA_RESERVED_CSA_SIZE) /* See vm_update_mode */ diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c index 6604a3f99c5e..f899cce25b2a 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c @@ -36,6 +36,7 @@ #include #include #include +#include "amdgpu_vm.h" /* * The primary memory I/O features being added for revisions of gfxip @@ -326,10 +327,16 @@ static void kfd_init_apertures_vi(struct kfd_process_device *pdd, uint8_t id) * with small reserved space for kernel. * Set them to CANONICAL addresses. */ - pdd->gpuvm_base = SVM_USER_BASE; + pdd->gpuvm_base = max(SVM_USER_BASE, AMDGPU_VA_RESERVED_BOTTOM); pdd->gpuvm_limit = pdd->dev->kfd->shared_resources.gpuvm_size - 1; + /* dGPUs: the reserved space for kernel +* before SVM +*/ + pdd->qpd.cwsr_base = SVM_CWSR_BASE; + pdd->qpd.ib_base = S
[pull] amdgpu drm-fixes-6.8
Hi Dave, Sima, Fixes for 6.8. The following changes since commit b16702be210bb49256f8a32df2c310383134dd57: Merge tag 'exynos-drm-fixes-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes (2024-01-25 14:22:15 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.8-2024-01-25 for you to fetch changes up to c82eb25c5f005b33aebb1415a8472fc2eeea0deb: drm/amd/display: "Enable IPS by default" (2024-01-25 16:00:24 -0500) amd-drm-fixes-6.8-2024-01-25: amdgpu: - AC/DC power supply tracking fix - Don't show invalid vram vendor data - SMU 13.0.x fixes - GART fix for umr on systems without VRAM - GFX 10/11 UNORD_DISPATCH fixes - IPS display fixes (required for S0ix on some platforms) - Misc fixes Alex Deucher (2): drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs Alvin Lee (1): drm/amd/display: Add Replay IPS register for DMUB command table ChunTao Tso (1): drm/amd/display: Replay + IPS + ABM in Full Screen VPB Hawking Zhang (1): drm/amdgpu: Fix null pointer dereference Kenneth Feng (1): drm/amd/pm: update the power cap setting Lijo Lazar (3): drm/amdgpu: Avoid fetching vram vendor information drm/amdgpu: Show vram vendor only if available drm/amd/pm: Fetch current power limit from FW Ma Jun (1): drm/amdgpu/pm: Fix the power source flag error Nicholas Kazlauskas (1): drm/amd/display: Allow IPS2 during Replay Roman Li (4): drm/amd/display: Add IPS checks before dcn register access drm/amd/display: Disable ips before dc interrupt setting drm/amd: Add a DC debug mask for IPS drm/amd/display: "Enable IPS by default" Srinivasan Shanmugam (1): drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions Tom St Denis (1): drm/amd/amdgpu: Assign GART pages to AMD device mapping Yang Wang (1): drm/amd/pm: udpate smu v13.0.6 message permission drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 8 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 17 ++- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 1 + drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 1 + drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 - .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c | 5 +- drivers/gpu/drm/amd/display/dc/dc.h| 1 + drivers/gpu/drm/amd/display/dc/dc_types.h | 5 ++ .../drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c| 9 +++- .../drm/amd/display/dc/link/protocols/link_dpcd.c | 4 +- drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h| 47 +++ .../drm/amd/display/modules/power/power_helpers.c | 5 ++ .../drm/amd/display/modules/power/power_helpers.h | 1 + drivers/gpu/drm/amd/include/amd_shared.h | 1 + drivers/gpu/drm/amd/include/amdgpu_reg_state.h | 2 +- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 ++ drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 2 + drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 2 + .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 54 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 4 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 54 +- 24 files changed, 229 insertions(+), 36 deletions(-)
Re: [PATCH] drm/amdgpu: update documentation on new chips
Ping? On Thu, Jan 18, 2024 at 3:47 PM Alex Deucher wrote: > > These have been released now, so add them to the documentation. > > Signed-off-by: Alex Deucher > --- > Documentation/gpu/amdgpu/dgpu-asic-info-table.csv | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv > b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv > index 882d2518f8ed..3825f00ca9fe 100644 > --- a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv > +++ b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv > @@ -16,6 +16,7 @@ Radeon (RX|TM) (PRO|WX) Vega /MI25 /V320 /V340L /8200 /9100 > /SSG MxGPU, VEGA10, > AMD Radeon (Pro) VII /MI50 /MI60, VEGA20, DCE 12, 9.4.0, VCE 4.1.0 / UVD > 7.2.0, 4.2.0 > MI100, ARCTURUS, *, 9.4.1, VCN 2.5.0, 4.2.2 > MI200, ALDEBARAN, *, 9.4.2, VCN 2.6.0, 4.4.0 > +MI300, AQUA_VANGARAM, *, 9.4.3, VCN 4.0.3, 4.4.2 > AMD Radeon (RX|Pro) 5600(M|XT) /5700 (M|XT|XTB) /W5700, NAVI10, DCN 2.0.0, > 10.1.10, VCN 2.0.0, 5.0.0 > AMD Radeon (Pro) 5300 /5500XTB/5500(XT|M) /W5500M /W5500, NAVI14, DCN 2.0.0, > 10.1.1, VCN 2.0.2, 5.0.2 > AMD Radeon RX 6800(XT) /6900(XT) /W6800, SIENNA_CICHLID, DCN 3.0.0, 10.3.0, > VCN 3.0.0, 5.2.0 > @@ -23,4 +24,5 @@ AMD Radeon RX 6700 XT / 6800M / 6700M, NAVY_FLOUNDER, DCN > 3.0.0, 10.3.2, VCN 3.0 > AMD Radeon RX 6600(XT) /6600M /W6600 /W6600M, DIMGREY_CAVEFISH, DCN 3.0.2, > 10.3.4, VCN 3.0.16, 5.2.4 > AMD Radeon RX 6500M /6300M /W6500M /W6300M, BEIGE_GOBY, DCN 3.0.3, 10.3.5, > VCN 3.0.33, 5.2.5 > AMD Radeon RX 7900 XT /XTX, , DCN 3.2.0, 11.0.0, VCN 4.0.0, 6.0.0 > +AMD Radeon RX 7800 XT, , DCN 3.2.0, 11.0.3, VCN 4.0.0, 6.0.3 > AMD Radeon RX 7600M (XT) /7700S /7600S, , DCN 3.2.1, 11.0.2, VCN 4.0.4, 6.0.2 > -- > 2.42.0 >
Re: [PATCH] drm/amd/include: Add missing registers/mask for DCN316 and 350
[AMD Official Use Only - General] Reviewed-by: Aurabindo Pillai -- Regards, Jay From: Siqueira, Rodrigo Sent: Thursday, January 25, 2024 1:37 PM To: amd-gfx@lists.freedesktop.org Cc: Siqueira, Rodrigo ; Lei, Jun ; Pillai, Aurabindo ; Mahfooz, Hamza ; Wentland, Harry ; Deucher, Alexander Subject: [PATCH] drm/amd/include: Add missing registers/mask for DCN316 and 350 Cc: Jun Lei Cc: Aurabindo Pillai Cc: Hamza Mahfooz Cc: Harry Wentland Cc: Alex Deucher Signed-off-by: Rodrigo Siqueira --- .../include/asic_reg/dcn/dcn_3_1_6_offset.h | 4 ++ .../include/asic_reg/dcn/dcn_3_1_6_sh_mask.h | 10 +++ .../include/asic_reg/dcn/dcn_3_5_0_offset.h | 24 +++ .../include/asic_reg/dcn/dcn_3_5_0_sh_mask.h | 65 +++ 4 files changed, 103 insertions(+) diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h index 222fa8d13269..a05bf8e4f58d 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h @@ -626,6 +626,8 @@ #define regDTBCLK_DTO2_MODULO_BASE_IDX 2 #define regDTBCLK_DTO3_MODULO 0x0022 #define regDTBCLK_DTO3_MODULO_BASE_IDX 2 +#define regHDMICHARCLK0_CLOCK_CNTL 0x004a +#define regHDMICHARCLK0_CLOCK_CNTL_BASE_IDX 2 #define regPHYASYMCLK_CLOCK_CNTL 0x0052 #define regPHYASYMCLK_CLOCK_CNTL_BASE_IDX 2 #define regPHYBSYMCLK_CLOCK_CNTL 0x0053 @@ -638,6 +640,8 @@ #define regPHYESYMCLK_CLOCK_CNTL_BASE_IDX 2 #define regPHYFSYMCLK_CLOCK_CNTL 0x0057 #define regPHYFSYMCLK_CLOCK_CNTL_BASE_IDX 2 +#define regHDMISTREAMCLK_CNTL 0x0059 +#define regHDMISTREAMCLK_CNTL_BASE_IDX 2 #define regDCCG_GATE_DISABLE_CNTL3 0x005a #define regDCCG_GATE_DISABLE_CNTL3_BASE_IDX 2 #define regHDMISTREAMCLK0_DTO_PARAM 0x005b diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h index 8ddb03a1dc39..df84941bbe5b 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h @@ -1933,6 +1933,11 @@ //DTBCLK_DTO3_MODULO #define DTBCLK_DTO3_MODULO__DTBCLK_DTO3_MODULO__SHIFT 0x0 #define DTBCLK_DTO3_MODULO__DTBCLK_DTO3_MODULO_MASK 0xL +//HDMICHARCLK0_CLOCK_CNTL +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_EN__SHIFT 0x0 +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_SRC_SEL__SHIFT 0x4 +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_EN_MASK 0x0001L +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_SRC_SEL_MASK 0x0070L //PHYASYMCLK_CLOCK_CNTL #define PHYASYMCLK_CLOCK_CNTL__PHYASYMCLK_FORCE_EN__SHIFT 0x0 #define PHYASYMCLK_CLOCK_CNTL__PHYASYMCLK_FORCE_SRC_SEL__SHIFT 0x4 @@ -1967,6 +1972,11 @@ #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_SRC_SEL__SHIFT 0x4 #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_EN_MASK 0x0001L #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_SRC_SEL_MASK 0x0030L +//HDMISTREAMCLK_CNTL +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_SRC_SEL__SHIFT 0x0 +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_DTO_FORCE_DIS__SHIFT 0x10 +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_SRC_SEL_MASK
Re: [PATCH v2] drm/amdgpu: move the drm client creation behind drm device registration
On 2024-01-25 08:15, Lazar, Lijo wrote: On 1/25/2024 1:37 PM, Le Ma wrote: This patch is to eliminate interrupt warning below: "[drm] Fence fallback timer expired on ring sdma0.0". An early vm pt clearing job is sent to SDMA ahead of interrupt enabled, introduced by patch below: - drm/amdkfd: Export DMABufs from KFD using GEM handles The drm client creation logic in this patch won't work well for dynamic partition switch cases. Better add a ''Fixes' tag also. I agree. It would also be problematic with GPU resets. amdgpu_pci_probe seems to be the right place for this to ensure it only gets called once. The fix looks reasonable to me. Reviewed-by: Felix Kuehling This looks fine to me, needs to be checked by Felix anyway. Thanks, Lijo And re-locating the drm client creation following after drm_dev_register looks like a more proper flow. v2: wrap the drm client creation Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 32 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 +++ 3 files changed, 27 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 77e263660288..41db030ddc4e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -141,11 +141,31 @@ static void amdgpu_amdkfd_reset_work(struct work_struct *work) static const struct drm_client_funcs kfd_client_funcs = { .unregister = drm_client_release, }; + +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev) +{ + int ret; + + if (!adev->kfd.init_complete) + return 0; + + ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", + &kfd_client_funcs); + if (ret) { + dev_err(adev->dev, "Failed to init DRM client: %d\n", + ret); + return ret; + } + + drm_client_register(&adev->kfd.client); + + return 0; +} + void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) { int i; int last_valid_bit; - int ret; amdgpu_amdkfd_gpuvm_init_mem_limits(); @@ -164,12 +184,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) .enable_mes = adev->enable_mes, }; - ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", &kfd_client_funcs); - if (ret) { - dev_err(adev->dev, "Failed to init DRM client: %d\n", ret); - return; - } - /* this is going to have a few of the MSBs set that we need to * clear */ @@ -208,10 +222,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev, &gpu_resources); - if (adev->kfd.init_complete) - drm_client_register(&adev->kfd.client); - else - drm_client_release(&adev->kfd.client); amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 584a0cea5572..da175c384adf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -182,6 +182,8 @@ int amdgpu_queue_mask_bit_to_set_resource_bit(struct amdgpu_device *adev, struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context, struct mm_struct *mm, struct svm_range_bo *svm_bo); + +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev); #if defined(CONFIG_DEBUG_FS) int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 475bd59c9ac2..91d5d9435067 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2255,6 +2255,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, if (ret) goto err_pci; + ret = amdgpu_amdkfd_drm_client_create(adev); + if (ret) + goto err_pci; + /* * 1. don't init fbdev on hw without DCE * 2. don't init fbdev if there are no connectors
Re: [PATCH v2 0/8] Expand and improve AMDGPU documentation
On 1/23/24 09:19, Hamza Mahfooz wrote: On 1/22/24 16:24, Rodrigo Siqueira wrote: This patchset improves how the AMDGPU display documentation is organized, expands the kernel-doc to extract information from the source, and adds more context about DC workflow. Finally, at the end of this series, we also introduce a contribution section for those interested in contributing to the display code. Changes since V1: - Remove unprecise information about the DC process. - Expand the contribution list. - Rebase. Thanks Siqueira Series is: Acked-by: Hamza Mahfooz Rodrigo Siqueira (8): Documentation/gpu: Add basic page for HUBP Documentation/gpu: Add simple doc page for DCHUBBUB Documentation/gpu: Add kernel doc entry for DPP Documentation/gpu: Add kernel doc entry for MPC Documentation/gpu: Add entry for OPP in the kernel doc Documentation/gpu: Add entry for the DIO component Documentation/gpu: Add an explanation about the DC weekly patches Documentation/gpu: Introduce a simple contribution list for display code .../gpu/amdgpu/display/dcn-blocks.rst | 78 ++ .../amdgpu/display/display-contributing.rst | 168 .../gpu/amdgpu/display/display-manager.rst | 3 - Documentation/gpu/amdgpu/display/index.rst | 78 +- drivers/gpu/drm/amd/display/TODO | 110 .../gpu/drm/amd/display/dc/inc/hw/dchubbub.h | 6 + drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h | 26 ++ drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h | 13 +- drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h | 250 -- drivers/gpu/drm/amd/display/dc/inc/hw/opp.h | 16 ++ .../amd/display/dc/link/hwss/link_hwss_dio.h | 10 + 11 files changed, 560 insertions(+), 198 deletions(-) create mode 100644 Documentation/gpu/amdgpu/display/dcn-blocks.rst create mode 100644 Documentation/gpu/amdgpu/display/display-contributing.rst delete mode 100644 drivers/gpu/drm/amd/display/TODO Thanks! Change pushed to amd-staging-drm-next.
[PATCH] drm/amd/include: Add missing registers/mask for DCN316 and 350
Cc: Jun Lei Cc: Aurabindo Pillai Cc: Hamza Mahfooz Cc: Harry Wentland Cc: Alex Deucher Signed-off-by: Rodrigo Siqueira --- .../include/asic_reg/dcn/dcn_3_1_6_offset.h | 4 ++ .../include/asic_reg/dcn/dcn_3_1_6_sh_mask.h | 10 +++ .../include/asic_reg/dcn/dcn_3_5_0_offset.h | 24 +++ .../include/asic_reg/dcn/dcn_3_5_0_sh_mask.h | 65 +++ 4 files changed, 103 insertions(+) diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h index 222fa8d13269..a05bf8e4f58d 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_offset.h @@ -626,6 +626,8 @@ #define regDTBCLK_DTO2_MODULO_BASE_IDX 2 #define regDTBCLK_DTO3_MODULO 0x0022 #define regDTBCLK_DTO3_MODULO_BASE_IDX 2 +#define regHDMICHARCLK0_CLOCK_CNTL 0x004a +#define regHDMICHARCLK0_CLOCK_CNTL_BASE_IDX 2 #define regPHYASYMCLK_CLOCK_CNTL 0x0052 #define regPHYASYMCLK_CLOCK_CNTL_BASE_IDX 2 #define regPHYBSYMCLK_CLOCK_CNTL 0x0053 @@ -638,6 +640,8 @@ #define regPHYESYMCLK_CLOCK_CNTL_BASE_IDX 2 #define regPHYFSYMCLK_CLOCK_CNTL 0x0057 #define regPHYFSYMCLK_CLOCK_CNTL_BASE_IDX 2 +#define regHDMISTREAMCLK_CNTL 0x0059 +#define regHDMISTREAMCLK_CNTL_BASE_IDX 2 #define regDCCG_GATE_DISABLE_CNTL3 0x005a #define regDCCG_GATE_DISABLE_CNTL3_BASE_IDX 2 #define regHDMISTREAMCLK0_DTO_PARAM 0x005b diff --git a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h index 8ddb03a1dc39..df84941bbe5b 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_1_6_sh_mask.h @@ -1933,6 +1933,11 @@ //DTBCLK_DTO3_MODULO #define DTBCLK_DTO3_MODULO__DTBCLK_DTO3_MODULO__SHIFT 0x0 #define DTBCLK_DTO3_MODULO__DTBCLK_DTO3_MODULO_MASK 0xL +//HDMICHARCLK0_CLOCK_CNTL +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_EN__SHIFT 0x0 +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_SRC_SEL__SHIFT 0x4 +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_EN_MASK 0x0001L +#define HDMICHARCLK0_CLOCK_CNTL__HDMICHARCLK0_SRC_SEL_MASK 0x0070L //PHYASYMCLK_CLOCK_CNTL #define PHYASYMCLK_CLOCK_CNTL__PHYASYMCLK_FORCE_EN__SHIFT 0x0 #define PHYASYMCLK_CLOCK_CNTL__PHYASYMCLK_FORCE_SRC_SEL__SHIFT 0x4 @@ -1967,6 +1972,11 @@ #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_SRC_SEL__SHIFT 0x4 #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_EN_MASK 0x0001L #define PHYFSYMCLK_CLOCK_CNTL__PHYFSYMCLK_FORCE_SRC_SEL_MASK 0x0030L +//HDMISTREAMCLK_CNTL +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_SRC_SEL__SHIFT 0x0 +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_DTO_FORCE_DIS__SHIFT 0x10 +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_SRC_SEL_MASK 0x0003L +#define HDMISTREAMCLK_CNTL__HDMISTREAMCLK0_DTO_FORCE_DIS_MASK 0x0001L //DCCG_GATE_DISABLE_CNTL3 #define DCCG_GATE_DISABLE_CNTL3__HDMISTREAMCLK0_GATE_DISABLE__SHIFT 0x0 #define DCCG_GATE_DISABLE_CNTL3__HDMISTREAMCLK1_GATE_DISABLE__SHIFT
RE: [PATCH] drm/amd/display: Fix potential NULL pointer dereferences in 'dcn10_set_output_transfer_func()'
[AMD Official Use Only - General] Reviewed-by: Anthony Koo Thanks, Anthony -Original Message- From: SHANMUGAM, SRINIVASAN Sent: Thursday, January 25, 2024 10:55 AM To: Siqueira, Rodrigo ; Pillai, Aurabindo Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN ; Wyatt Wood ; Koo, Anthony Subject: [PATCH] drm/amd/display: Fix potential NULL pointer dereferences in 'dcn10_set_output_transfer_func()' The 'stream' pointer is used in dcn10_set_output_transfer_func() before the check if 'stream' is NULL. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn10/dcn10_hwseq.c:1892 dcn10_set_output_transfer_func() warn: variable dereferenced before check 'stream' (see line 1875) Fixes: ddef02de0d71 ("drm/amd/display: add null checks before logging") Cc: Wyatt Wood Cc: Anthony Koo Cc: Rodrigo Siqueira Cc: Aurabindo Pillai Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c index d923d8d915f9..22cce2b58f95 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c @@ -1890,6 +1890,9 @@ bool dcn10_set_output_transfer_func(struct dc *dc, struct pipe_ctx *pipe_ctx, { struct dpp *dpp = pipe_ctx->plane_res.dpp; + if (!stream) + return false; + if (dpp == NULL) return false; @@ -1912,8 +1915,8 @@ bool dcn10_set_output_transfer_func(struct dc *dc, struct pipe_ctx *pipe_ctx, } else dpp->funcs->dpp_program_regamma_pwl(dpp, NULL, OPP_REGAMMA_BYPASS); - if (stream != NULL && stream->ctx != NULL && - stream->out_transfer_func != NULL) { + if (stream->ctx && + stream->out_transfer_func) { log_tf(stream->ctx, stream->out_transfer_func, dpp->regamma_params.hw_points_num); -- 2.34.1
[PATCH] drm/amdgpu: Fix the warning info in mode1 reset
From: Ma Jun Fix the warning info below during mode1 reset. [ +0.04] Call Trace: [ +0.04] [ +0.06] ? show_regs+0x6e/0x80 [ +0.11] ? __flush_work.isra.0+0x2e8/0x390 [ +0.05] ? __warn+0x91/0x150 [ +0.09] ? __flush_work.isra.0+0x2e8/0x390 [ +0.06] ? report_bug+0x19d/0x1b0 [ +0.13] ? handle_bug+0x46/0x80 [ +0.12] ? exc_invalid_op+0x1d/0x80 [ +0.11] ? asm_exc_invalid_op+0x1f/0x30 [ +0.14] ? __flush_work.isra.0+0x2e8/0x390 [ +0.07] ? __flush_work.isra.0+0x208/0x390 [ +0.07] ? _prb_read_valid+0x216/0x290 [ +0.08] __cancel_work_timer+0x11d/0x1a0 [ +0.07] ? try_to_grab_pending+0xe8/0x190 [ +0.12] cancel_work_sync+0x14/0x20 [ +0.08] amddrm_sched_stop+0x3c/0x1d0 [amd_sched] [ +0.32] amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu] This warning info was printed after applying the patch "drm/sched: Convert drm scheduler to use a work queue rather than kthread". The root cause is that amdgpu driver tries to use the uninitialized work_struct in the struct drm_gpu_scheduler v2: - Rename the function to amdgpu_ring_sched_ready and move it to amdgpu_ring.c (Alex) v3: - Fix a few more checks based on Vitaly's patch (Alex) Fixes: 11b3b9f461c5 ("drm/sched: Check scheduler ready before calling timeout handling") Reviewed-by: Alex Deucher Signed-off-by: Vitaly Prosyak Signed-off-by: Ma Jun Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c| 8 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +- 5 files changed, 25 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index 899e31e3a5e8..3a3f3ce09f00 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool sus for (i = 0; i < adev->gfx.num_compute_rings; i++) { struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; - if (!(ring && drm_sched_wqueue_ready(&ring->sched))) + if (!amdgpu_ring_sched_ready(ring)) continue; /* stop secheduler and drain ring. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index e485dd3357c6..1afbb2e932c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1678,7 +1678,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused) for (i = 0; i < AMDGPU_MAX_RINGS; i++) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !drm_sched_wqueue_ready(&ring->sched)) + if (!amdgpu_ring_sched_ready(ring)) continue; drm_sched_wqueue_stop(&ring->sched); } @@ -1694,7 +1694,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused) for (i = 0; i < AMDGPU_MAX_RINGS; i++) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !drm_sched_wqueue_ready(&ring->sched)) + if (!amdgpu_ring_sched_ready(ring)) continue; drm_sched_wqueue_start(&ring->sched); } @@ -1916,8 +1916,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) ring = adev->rings[val]; - if (!ring || !ring->funcs->preempt_ib || - !drm_sched_wqueue_ready(&ring->sched)) + if (!amdgpu_ring_sched_ready(ring) || + !ring->funcs->preempt_ib) return -EINVAL; /* the last preemption failed */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 1a04ccba9542..7ff17df7a5ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5042,7 +5042,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev) for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !drm_sched_wqueue_ready(&ring->sched)) + if (!amdgpu_ring_sched_ready(ring)) continue; spin_lock(&ring->sched.job_list_lock); @@ -5181,7 +5181,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !drm_sched_wqueue_ready(&ring->sched)) + if (!amdgpu_ring_sched_ready(ring)) continu
[PATCH] drm/amd/display: Fix potential NULL pointer dereferences in 'dcn10_set_output_transfer_func()'
The 'stream' pointer is used in dcn10_set_output_transfer_func() before the check if 'stream' is NULL. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn10/dcn10_hwseq.c:1892 dcn10_set_output_transfer_func() warn: variable dereferenced before check 'stream' (see line 1875) Fixes: ddef02de0d71 ("drm/amd/display: add null checks before logging") Cc: Wyatt Wood Cc: Anthony Koo Cc: Rodrigo Siqueira Cc: Aurabindo Pillai Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c index d923d8d915f9..22cce2b58f95 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c @@ -1890,6 +1890,9 @@ bool dcn10_set_output_transfer_func(struct dc *dc, struct pipe_ctx *pipe_ctx, { struct dpp *dpp = pipe_ctx->plane_res.dpp; + if (!stream) + return false; + if (dpp == NULL) return false; @@ -1912,8 +1915,8 @@ bool dcn10_set_output_transfer_func(struct dc *dc, struct pipe_ctx *pipe_ctx, } else dpp->funcs->dpp_program_regamma_pwl(dpp, NULL, OPP_REGAMMA_BYPASS); - if (stream != NULL && stream->ctx != NULL && - stream->out_transfer_func != NULL) { + if (stream->ctx && + stream->out_transfer_func) { log_tf(stream->ctx, stream->out_transfer_func, dpp->regamma_params.hw_points_num); -- 2.34.1
Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for PM abort case
On Thu, Jan 25, 2024 at 10:22 AM Alex Deucher wrote: > > On Wed, Jan 24, 2024 at 9:39 PM Liang, Prike wrote: > > > > [AMD Official Use Only - General] > > > > Hi, Alex > > > -Original Message- > > > From: Alex Deucher > > > Sent: Wednesday, January 24, 2024 11:59 PM > > > To: Liang, Prike > > > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > > > ; Sharma, Deepak > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for > > > PM abort case > > > > > > On Wed, Jan 24, 2024 at 2:12 AM Prike Liang wrote: > > > > > > > > In the PM abort cases, the gfx power rail doesn't turn off so some > > > > GFXDEC registers/CSB can't reset to default vaule. In order to avoid > > > > unexpected problem now need skip to program GFXDEC registers and > > > > bypass issue CSB packet for PM abort case. > > > > > > > > Signed-off-by: Prike Liang > > > > --- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 > > > > 3 files changed, 6 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > index c5f3859fd682..26d983eb831b 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > @@ -1079,6 +1079,7 @@ struct amdgpu_device { > > > > boolin_s3; > > > > boolin_s4; > > > > boolin_s0ix; > > > > + boolpm_complete; > > > > > > > > enum pp_mp1_state mp1_state; > > > > struct amdgpu_doorbell_index doorbell_index; diff --git > > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > index 475bd59c9ac2..a01f9b0c2f30 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > > @@ -2486,6 +2486,7 @@ static int amdgpu_pmops_suspend_noirq(struct > > > device *dev) > > > > struct drm_device *drm_dev = dev_get_drvdata(dev); > > > > struct amdgpu_device *adev = drm_to_adev(drm_dev); > > > > > > > > + adev->pm_complete = true; > > > > > > This needs to be cleared somewhere on resume. > > [Liang, Prike] This flag is designed to indicate the amdgpu device > > suspension process status and will update the patch and clear it at the > > amdgpu suspension beginning point. > > > > > > > if (amdgpu_acpi_should_gpu_reset(adev)) > > > > return amdgpu_asic_reset(adev); > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > index 57808be6e3ec..3bf51f18e13c 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > > @@ -3034,6 +3034,10 @@ static int gfx_v9_0_cp_gfx_start(struct > > > > amdgpu_device *adev) > > > > > > > > gfx_v9_0_cp_gfx_enable(adev, true); > > > > > > > > + if (adev->in_suspend && !adev->pm_complete) { > > > > + DRM_INFO(" will skip the csb ring write\n"); > > > > + return 0; > > > > + } > > > > > > We probably want a similar fix for other gfx generations as well. > > > > > > Alex > > > > > [Liang, Prike] IIRC, there's no issue on the Mendocino side even without > > the fix. How about keep the other gfx generations unchanged firstly and > > after sort out the failed case will add the quirk for each specific gfx > > respectively? > > Mendocino only supports S0i3 so we don't touch gfx on suspend/resume. > This would only happen on platforms that support S3. E.g., try an aborted suspend on Raphael or PHX2. Alex > > Alex > > > > > > > r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + > > > > 3); > > > > if (r) { > > > > DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", > > > > r); > > > > -- > > > > 2.34.1 > > > >
Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for PM abort case
On Wed, Jan 24, 2024 at 10:46 PM Prike Liang wrote: > > In the PM abort cases, the gfx power rail doesn't turn off so > some GFXDEC registers/CSB can't reset to default vaule. In order > to avoid unexpected problem now need skip to program GFXDEC registers > and bypass issue CSB packet for PM abort case. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++ > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 6 ++ > 3 files changed, 10 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index c5f3859fd682..9e9e3385b5d3 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -1079,6 +1079,8 @@ struct amdgpu_device { > boolin_s3; > boolin_s4; > boolin_s0ix; > + /* indicate amdgpu suspension status */ > + boolpm_complete; Maybe rename this suspend_complete to better specify that it only applies on the suspend side. Alex > > enum pp_mp1_state mp1_state; > struct amdgpu_doorbell_index doorbell_index; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 475bd59c9ac2..9cb8f7fe55cf 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -2472,6 +2472,7 @@ static int amdgpu_pmops_suspend(struct device *dev) > struct drm_device *drm_dev = dev_get_drvdata(dev); > struct amdgpu_device *adev = drm_to_adev(drm_dev); > > + adev->pm_complete = false; > if (amdgpu_acpi_is_s0ix_active(adev)) > adev->in_s0ix = true; > else if (amdgpu_acpi_is_s3_active(adev)) > @@ -2486,6 +2487,7 @@ static int amdgpu_pmops_suspend_noirq(struct device > *dev) > struct drm_device *drm_dev = dev_get_drvdata(dev); > struct amdgpu_device *adev = drm_to_adev(drm_dev); > > + adev->pm_complete = true; > if (amdgpu_acpi_should_gpu_reset(adev)) > return amdgpu_asic_reset(adev); > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > index 57808be6e3ec..0abdc85eda77 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > @@ -3034,6 +3034,12 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device > *adev) > > gfx_v9_0_cp_gfx_enable(adev, true); > > + /* TODO: double confirm whether need to reinitialize gfxedc and > submit csb packet > +* on the other gfx generations for the pm suspend abort case. */ > + if (adev->in_suspend && !adev->pm_complete) { > + DRM_INFO(" will skip the csb ring write\n"); > + return 0; > + } > r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3); > if (r) { > DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r); > -- > 2.34.1 >
Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for PM abort case
On Wed, Jan 24, 2024 at 9:39 PM Liang, Prike wrote: > > [AMD Official Use Only - General] > > Hi, Alex > > -Original Message- > > From: Alex Deucher > > Sent: Wednesday, January 24, 2024 11:59 PM > > To: Liang, Prike > > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > > ; Sharma, Deepak > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: skip to program GFXDEC registers for > > PM abort case > > > > On Wed, Jan 24, 2024 at 2:12 AM Prike Liang wrote: > > > > > > In the PM abort cases, the gfx power rail doesn't turn off so some > > > GFXDEC registers/CSB can't reset to default vaule. In order to avoid > > > unexpected problem now need skip to program GFXDEC registers and > > > bypass issue CSB packet for PM abort case. > > > > > > Signed-off-by: Prike Liang > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 > > > 3 files changed, 6 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > index c5f3859fd682..26d983eb831b 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > @@ -1079,6 +1079,7 @@ struct amdgpu_device { > > > boolin_s3; > > > boolin_s4; > > > boolin_s0ix; > > > + boolpm_complete; > > > > > > enum pp_mp1_state mp1_state; > > > struct amdgpu_doorbell_index doorbell_index; diff --git > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > index 475bd59c9ac2..a01f9b0c2f30 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > @@ -2486,6 +2486,7 @@ static int amdgpu_pmops_suspend_noirq(struct > > device *dev) > > > struct drm_device *drm_dev = dev_get_drvdata(dev); > > > struct amdgpu_device *adev = drm_to_adev(drm_dev); > > > > > > + adev->pm_complete = true; > > > > This needs to be cleared somewhere on resume. > [Liang, Prike] This flag is designed to indicate the amdgpu device > suspension process status and will update the patch and clear it at the > amdgpu suspension beginning point. > > > > > if (amdgpu_acpi_should_gpu_reset(adev)) > > > return amdgpu_asic_reset(adev); > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > index 57808be6e3ec..3bf51f18e13c 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > @@ -3034,6 +3034,10 @@ static int gfx_v9_0_cp_gfx_start(struct > > > amdgpu_device *adev) > > > > > > gfx_v9_0_cp_gfx_enable(adev, true); > > > > > > + if (adev->in_suspend && !adev->pm_complete) { > > > + DRM_INFO(" will skip the csb ring write\n"); > > > + return 0; > > > + } > > > > We probably want a similar fix for other gfx generations as well. > > > > Alex > > > [Liang, Prike] IIRC, there's no issue on the Mendocino side even without the > fix. How about keep the other gfx generations unchanged firstly and after > sort out the failed case will add the quirk for each specific gfx > respectively? Mendocino only supports S0i3 so we don't touch gfx on suspend/resume. This would only happen on platforms that support S3. Alex > > > > r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3); > > > if (r) { > > > DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", > > > r); > > > -- > > > 2.34.1 > > >
[PATCH v4 3/3] drm/buddy: Add defragmentation support
Add a function to support defragmentation. Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Matthew Auld --- drivers/gpu/drm/drm_buddy.c | 48 - 1 file changed, 37 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index d44172f23f05..3cffa9cc12d7 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -277,7 +277,8 @@ drm_get_buddy(struct drm_buddy_block *block) EXPORT_SYMBOL(drm_get_buddy); static void __drm_buddy_free(struct drm_buddy *mm, -struct drm_buddy_block *block) +struct drm_buddy_block *block, +bool defrag) { struct drm_buddy_block *parent; @@ -289,12 +290,14 @@ static void __drm_buddy_free(struct drm_buddy *mm, if (!drm_buddy_block_is_free(buddy)) break; - if (drm_buddy_block_is_clear(block) != - drm_buddy_block_is_clear(buddy)) - break; + if (!defrag) { + if (drm_buddy_block_is_clear(block) != + drm_buddy_block_is_clear(buddy)) + break; - if (drm_buddy_block_is_clear(block)) - mark_cleared(parent); + if (drm_buddy_block_is_clear(block)) + mark_cleared(parent); + } list_del(&buddy->link); @@ -307,6 +310,19 @@ static void __drm_buddy_free(struct drm_buddy *mm, mark_free(mm, block); } +static void drm_buddy_defrag(struct drm_buddy *mm) +{ + struct drm_buddy_block *block; + struct list_head *list; + + list = &mm->free_list[0]; + if (list_empty(list)) + return; + + list_for_each_entry_reverse(block, list, link) + __drm_buddy_free(mm, block, 1); +} + /** * drm_buddy_free_block - free a block * @@ -321,7 +337,7 @@ void drm_buddy_free_block(struct drm_buddy *mm, if (drm_buddy_block_is_clear(block)) mm->clear_avail += drm_buddy_block_size(mm, block); - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, 0); } EXPORT_SYMBOL(drm_buddy_free_block); @@ -447,7 +463,7 @@ __alloc_range_bias(struct drm_buddy *mm, if (buddy && (drm_buddy_block_is_free(block) && drm_buddy_block_is_free(buddy))) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, 0); return ERR_PTR(err); } @@ -577,7 +593,7 @@ alloc_from_freelist(struct drm_buddy *mm, err_undo: if (tmp != order) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, 0); return ERR_PTR(err); } @@ -657,7 +673,7 @@ static int __alloc_range(struct drm_buddy *mm, if (buddy && (drm_buddy_block_is_free(block) && drm_buddy_block_is_free(buddy))) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, 0); err_free: if (err == -ENOSPC && total_allocated_on_err) { @@ -903,7 +919,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, if (order-- == min_order) { if (flags & DRM_BUDDY_CONTIGUOUS_ALLOCATION && - !(flags & DRM_BUDDY_RANGE_ALLOCATION)) + !(flags & DRM_BUDDY_RANGE_ALLOCATION)) { /* * Try contiguous block allocation through * try harder method @@ -912,6 +928,16 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, original_size, original_min_size, blocks); + + /* +* Defragment the freelist and try contiguous block +* allocation +*/ + drm_buddy_defrag(mm); + if (!IS_ERR(alloc_from_freelist(mm, min_order, flags))) + break; + } + err = -ENOSPC; goto err_free; } -- 2.25.1
[PATCH v4 2/3] drm/amdgpu: Enable clear page functionality
Add clear page support in vram memory region. v1:(Christian) - Dont handle clear page as TTM flag since when moving the BO back in from GTT again we don't need that. - Make a specialized version of amdgpu_fill_buffer() which only clears the VRAM areas which are not already cleared - Drop the TTM_PL_FLAG_WIPE_ON_RELEASE check in amdgpu_object.c v2: - Modify the function name amdgpu_ttm_* (Alex) - Drop the delayed parameter (Christian) - handle amdgpu_res_cleared(&cursor) just above the size calculation (Christian) - Use AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE for clearing the buffers in the free path to properly wait for fences etc.. (Christian) v3:(Christian) - Remove buffer clear code in VRAM manager instead change the AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE handling to set the DRM_BUDDY_CLEARED flag. - Remove ! from amdgpu_res_cleared(&cursor) check. Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 22 --- .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 25 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 61 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 5 ++ 6 files changed, 111 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index b2e9a2f81d82..be32f9852d19 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -39,6 +39,7 @@ #include "amdgpu.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" +#include "amdgpu_vram_mgr.h" /** * DOC: amdgpu_object @@ -595,8 +596,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev, if (!amdgpu_bo_support_uswc(bo->flags)) bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC; - if (adev->ras_enabled) - bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; + bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; bo->tbo.bdev = &adev->mman.bdev; if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA | @@ -626,15 +626,17 @@ int amdgpu_bo_create(struct amdgpu_device *adev, if (bp->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED && bo->tbo.resource->mem_type == TTM_PL_VRAM) { - struct dma_fence *fence; + struct dma_fence *fence = NULL; - r = amdgpu_fill_buffer(bo, 0, bo->tbo.base.resv, &fence, true); + r = amdgpu_ttm_clear_buffer(bo, bo->tbo.base.resv, &fence); if (unlikely(r)) goto fail_unreserve; - dma_resv_add_fence(bo->tbo.base.resv, fence, - DMA_RESV_USAGE_KERNEL); - dma_fence_put(fence); + if (fence) { + dma_resv_add_fence(bo->tbo.base.resv, fence, + DMA_RESV_USAGE_KERNEL); + dma_fence_put(fence); + } } if (!bp->resv) amdgpu_bo_unreserve(bo); @@ -1357,8 +1359,12 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv))) return; - r = amdgpu_fill_buffer(abo, AMDGPU_POISON, bo->base.resv, &fence, true); + r = amdgpu_fill_buffer(abo, 0, bo->base.resv, &fence, true); if (!WARN_ON(r)) { + struct amdgpu_vram_mgr_resource *vres; + + vres = to_amdgpu_vram_mgr_resource(bo->resource); + vres->flags |= DRM_BUDDY_CLEARED; amdgpu_bo_fence(abo, fence, false); dma_fence_put(fence); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h index 381101d2bf05..50fcd86e1033 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h @@ -164,4 +164,29 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t size) } } +/** + * amdgpu_res_cleared - check if blocks are cleared + * + * @cur: the cursor to extract the block + * + * Check if the @cur block is cleared + */ +static inline bool amdgpu_res_cleared(struct amdgpu_res_cursor *cur) +{ + struct drm_buddy_block *block; + + switch (cur->mem_type) { + case TTM_PL_VRAM: + block = cur->node; + + if (!amdgpu_vram_mgr_is_cleared(block)) + return false; + break; + default: + return false; + } + + return true; +} + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 46a24d2308aa..15cdda77573c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/
[PATCH v4 1/3] drm/buddy: Implement tracking clear page feature
- Add tracking clear page feature. - Driver should enable the DRM_BUDDY_CLEARED flag if it successfully clears the blocks in the free path. On the otherhand, DRM buddy marks each block as cleared. - Track the available cleared pages size - If driver requests cleared memory we prefer cleared memory but fallback to uncleared if we can't find the cleared blocks. when driver requests uncleared memory we try to use uncleared but fallback to cleared memory if necessary. - When a block gets freed we clear it and mark the freed block as cleared, when there are buddies which are cleared as well we can merge them. Otherwise, we prefer to keep the blocks as separated. v1: (Christian) - Depends on the flag check DRM_BUDDY_CLEARED, enable the block as cleared. Else, reset the clear flag for each block in the list. - For merging the 2 cleared blocks compare as below, drm_buddy_is_clear(block) != drm_buddy_is_clear(buddy) Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +- drivers/gpu/drm/drm_buddy.c | 169 +++--- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 6 +- drivers/gpu/drm/tests/drm_buddy_test.c| 10 +- include/drm/drm_buddy.h | 18 +- 5 files changed, 168 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index 08916538a615..d0e199cc8f17 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -556,7 +556,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, return 0; error_free_blocks: - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mutex_unlock(&mgr->lock); error_fini: ttm_resource_fini(man, &vres->base); @@ -589,7 +589,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man, amdgpu_vram_mgr_do_reserve(man); - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mutex_unlock(&mgr->lock); atomic64_sub(vis_usage, &mgr->vis_usage); @@ -897,7 +897,7 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev) kfree(rsv); list_for_each_entry_safe(rsv, temp, &mgr->reserved_pages, blocks) { - drm_buddy_free_list(&mgr->mm, &rsv->allocated); + drm_buddy_free_list(&mgr->mm, &rsv->allocated, 0); kfree(rsv); } if (!adev->gmc.is_app_apu) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index f57e6d74fb0e..d44172f23f05 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -57,6 +57,16 @@ static void list_insert_sorted(struct drm_buddy *mm, __list_add(&block->link, node->link.prev, &node->link); } +static void clear_reset(struct drm_buddy_block *block) +{ + block->header &= ~DRM_BUDDY_HEADER_CLEAR; +} + +static void mark_cleared(struct drm_buddy_block *block) +{ + block->header |= DRM_BUDDY_HEADER_CLEAR; +} + static void mark_allocated(struct drm_buddy_block *block) { block->header &= ~DRM_BUDDY_HEADER_STATE; @@ -223,6 +233,12 @@ static int split_block(struct drm_buddy *mm, mark_free(mm, block->left); mark_free(mm, block->right); + if (drm_buddy_block_is_clear(block)) { + mark_cleared(block->left); + mark_cleared(block->right); + clear_reset(block); + } + mark_split(block); return 0; @@ -273,6 +289,13 @@ static void __drm_buddy_free(struct drm_buddy *mm, if (!drm_buddy_block_is_free(buddy)) break; + if (drm_buddy_block_is_clear(block) != + drm_buddy_block_is_clear(buddy)) + break; + + if (drm_buddy_block_is_clear(block)) + mark_cleared(parent); + list_del(&buddy->link); drm_block_free(mm, block); @@ -295,6 +318,9 @@ void drm_buddy_free_block(struct drm_buddy *mm, { BUG_ON(!drm_buddy_block_is_allocated(block)); mm->avail += drm_buddy_block_size(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail += drm_buddy_block_size(mm, block); + __drm_buddy_free(mm, block); } EXPORT_SYMBOL(drm_buddy_free_block); @@ -305,10 +331,20 @@ EXPORT_SYMBOL(drm_buddy_free_block); * @mm: DRM buddy manager * @objects: input list head to free blocks */ -void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects) +void drm_buddy_free_list(struct drm_buddy *mm, +struct list_head *objects, +unsigned long flags) { struct drm_buddy_block *block, *on; + if (flags & DRM_BUDDY_CLEARED) { + lis
Re: [PATCH] mm: Remove double faults once write a device pfn
Am 24.01.24 um 12:04 schrieb Alistair Popple: "Zhou, Xianrong" writes: [AMD Official Use Only - General] The vmf_insert_pfn_prot could cause unnecessary double faults on a device pfn. Because currently the vmf_insert_pfn_prot does not make the pfn writable so the pte entry is normally read-only or dirty catching. What? How do you got to this conclusion? Sorry. I did not mention that this problem only exists on arm64 platform. Ok, that makes at least a little bit more sense. Because on arm64 platform the PTE_RDONLY is automatically attached to the userspace pte entries even through VM_WRITE + VM_SHARE. The PTE_RDONLY needs to be cleared in vmf_insert_pfn_prot. However vmf_insert_pfn_prot do not make the pte writable passing false @mkwrite to insert_pfn. Question is why is arm64 doing this? As far as I can see they must have some hardware reason for that. The mkwrite parameter to insert_pfn() was added by commit b2770da642540 to make insert_pfn() look more like insert_pfn_pmd() so that the DAX code can insert PTEs which are writable and dirty at the same time. This is one scenario to do so. In fact on arm64 there are many scenarios could be to do so. So we can let vmf_insert_pfn_prot supporting @mkwrite for drivers at core layer and let drivers to decide whether or not to make writable and dirty at one time. The patch did this. Otherwise double faults on arm64 when call vmf_insert_pfn_prot. Well, that doesn't answer my question why arm64 is double faulting in the first place,. Eh. On arm64 When userspace mmap() with PROT_WRITE and MAP_SHARED the vma->vm_page_prot has the PTE_RDONLY and PTE_WRITE within PAGE_SHARED_EXEC. (seeing arm64 protection_map) Well that's your observation, but not the explanation why arm64 is doing this. See this would have quite some negative impact on performance, not only for gfx drivers but in general. So either the observation is incorrect or there is a *really* good reason why arm64 is taking this performance penalty. When write the userspace virtual address the first fault happen and call into driver's .fault->ttm_bo_vm_fault_reserved->vmf_insert_pfn_prot->insert_pfn. The insert_pfn will establish the pte entry. However the vmf_insert_pfn_prot pass false @mkwrite to insert_pfn by default and so insert_pfn could not make the pfn writable and it do not call maybe_mkwrite(pte_mkdirty(entry), vma) to clear the PTE_RDONLY bit. So the pte entry is actually write protection for mmu. So when the first fault return and re-execute the store instruction the second fault happen again. And in second fault it just only do pte_mkdirty(entry) which clear the PTE_RDONLY. It depends if the ARM64 CPU in question supports hardware dirty bit management (DBM). If that is the case and PTE_DBM (ie. PTE_WRITE) is set HW will automatically clear PTE_RDONLY bit to mark the entry dirty instead of raising a write fault. So you shouldn't see a double fault if PTE_DBM/WRITE is set. On ARM64 you can kind of think of PTE_RDONLY as the HW dirty bit and PTE_DBM as the read/write permission bit with SW being responsible for updating PTE_RDONLY via the fault handler if DBM is not supported by HW. At least that's my understanding from having hacked on this in the past. You can see all this weirdness happening in the definitions of pte_dirty() and pte_write() for ARM64. +1 Thanks a lot for that, this was exactly the information I was looking for. In this light it makes this patch here look unnecessary and questionable at best. Xianrong if you have an arm64 platform which really double faults (confirmed through a debugger for example) then you have to ask why this platform shows this behavior and not try to work around it. Behaviors like those usually have a very very good reason and without a confirmed explanation I'm not allowing any patch in which would disable stuff like that. Regards, Christian. I think so and hope no wrong. So as long as this isn't sorted out I'm going to reject this patch. Regards, Christian. This is a completely different use case to what you try to use it here for and that looks extremely fishy to me. Regards, Christian. The first fault only sets up the pte entry which actually is dirty catching. And the second immediate fault to the pfn due to first dirty catching when the cpu re-execute the store instruction. It could be that this is done to work around some hw behavior, but not because of dirty catching. Normally if the drivers call vmf_insert_pfn_prot and also supply 'pfn_mkwrite' callback within vm_operations_struct which requires the pte to be dirty catching then the vmf_insert_pfn_prot and the double fault are reasonable. It is not a problem. Well, as far as I can see that behavior absolutely doesn't make sense. When pfn_mkwrite is requested then the driver should use PAGE_COPY, which is exactly what VMWGFX (the only driver using dirty tracking) is doing. Everybody else uses PAGE_SHARED which should make the pte w
Re: [PATCH v2] drm/amdgpu: move the drm client creation behind drm device registration
On 1/25/2024 1:37 PM, Le Ma wrote: > This patch is to eliminate interrupt warning below: > > "[drm] Fence fallback timer expired on ring sdma0.0". > > An early vm pt clearing job is sent to SDMA ahead of interrupt enabled, > introduced by patch below: > > - drm/amdkfd: Export DMABufs from KFD using GEM handles > The drm client creation logic in this patch won't work well for dynamic partition switch cases. Better add a ''Fixes' tag also. This looks fine to me, needs to be checked by Felix anyway. Thanks, Lijo > And re-locating the drm client creation following after drm_dev_register > looks like a more proper flow. > > v2: wrap the drm client creation > > Signed-off-by: Le Ma > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 32 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 +++ > 3 files changed, 27 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > index 77e263660288..41db030ddc4e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > @@ -141,11 +141,31 @@ static void amdgpu_amdkfd_reset_work(struct work_struct > *work) > static const struct drm_client_funcs kfd_client_funcs = { > .unregister = drm_client_release, > }; > + > +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev) > +{ > + int ret; > + > + if (!adev->kfd.init_complete) > + return 0; > + > + ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", > + &kfd_client_funcs); > + if (ret) { > + dev_err(adev->dev, "Failed to init DRM client: %d\n", > + ret); > + return ret; > + } > + > + drm_client_register(&adev->kfd.client); > + > + return 0; > +} > + > void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) > { > int i; > int last_valid_bit; > - int ret; > > amdgpu_amdkfd_gpuvm_init_mem_limits(); > > @@ -164,12 +184,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device > *adev) > .enable_mes = adev->enable_mes, > }; > > - ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", > &kfd_client_funcs); > - if (ret) { > - dev_err(adev->dev, "Failed to init DRM client: %d\n", > ret); > - return; > - } > - > /* this is going to have a few of the MSBs set that we need to >* clear >*/ > @@ -208,10 +222,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device > *adev) > > adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev, > &gpu_resources); > - if (adev->kfd.init_complete) > - drm_client_register(&adev->kfd.client); > - else > - drm_client_release(&adev->kfd.client); > > amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > index 584a0cea5572..da175c384adf 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > @@ -182,6 +182,8 @@ int amdgpu_queue_mask_bit_to_set_resource_bit(struct > amdgpu_device *adev, > struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context, > struct mm_struct *mm, > struct svm_range_bo *svm_bo); > + > +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev); > #if defined(CONFIG_DEBUG_FS) > int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data); > #endif > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 475bd59c9ac2..91d5d9435067 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -2255,6 +2255,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, > if (ret) > goto err_pci; > > + ret = amdgpu_amdkfd_drm_client_create(adev); > + if (ret) > + goto err_pci; > + > /* >* 1. don't init fbdev on hw without DCE >* 2. don't init fbdev if there are no connectors
Re: [PATCH 2/2] drm/amdgpu: reset gpu for pm abort case
On 1/25/2024 8:52 AM, Prike Liang wrote: > In the pm abort case the gfx power rail not turn off from FCH side and > this will lead to the gfx reinitialized failed base on the unknown gfx > HW status, so let's reset the gpu to a known good power state. > >From the description, this an APU only problem (or this patch could only resolve APU abort sequence). However, there is no check for APU in the patch below. > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 + > drivers/gpu/drm/amd/amdgpu/soc15.c | 8 +++- > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 56d9dfa61290..4c40ffaaa5c2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -4627,6 +4627,11 @@ int amdgpu_device_resume(struct drm_device *dev, bool > fbcon) > return r; > } > > + if(amdgpu_asic_need_reset_on_init(adev)) { > + DRM_INFO("PM abort case and let's reset asic \n"); > + amdgpu_asic_reset(adev); > + } > + suspend_noirq is specific for suspend scenarios and not valid for freeze/thaw. I guess this could trigger reset for successful restore on APUs. > if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) > return 0; > > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c > b/drivers/gpu/drm/amd/amdgpu/soc15.c > index 15033efec2ba..9329a00b6abc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > @@ -804,9 +804,16 @@ static bool soc15_need_reset_on_init(struct > amdgpu_device *adev) > if (adev->asic_type == CHIP_RENOIR) > return true; > > + sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81); > + > /* Just return false for soc15 GPUs. Reset does not seem to >* be necessary. >*/ The comment now doesn't make sense. Thanks, Lijo > + if (adev->in_suspend && !adev->in_s0ix && > + !adev->pm_complete && > + sol_reg) > + return true; > + > if (!amdgpu_passthrough(adev)) > return false; > > @@ -816,7 +823,6 @@ static bool soc15_need_reset_on_init(struct amdgpu_device > *adev) > /* Check sOS sign of life register to confirm sys driver and sOS >* are already been loaded. >*/ > - sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81); > if (sol_reg) > return true; >
Re: [PATCH v2 2/2] drm/amdgpu: Implement check_async_props for planes
On Wed, Jan 24, 2024 at 11:14:40AM -0300, André Almeida wrote: > Hi Ville, > > Em 19/01/2024 15:25, Ville Syrjälä escreveu: > > On Fri, Jan 19, 2024 at 03:12:35PM -0300, André Almeida wrote: > >> AMD GPUs can do async flips with changes on more properties than just > >> the FB ID, so implement a custom check_async_props for AMD planes. > >> > >> Allow amdgpu to do async flips with IN_FENCE_ID and FB_DAMAGE_CLIPS > >> properties. For userspace to check if a driver support this two > >> properties, the strategy for now is to use TEST_ONLY commits. > >> > >> Signed-off-by: André Almeida > >> --- > >> v2: Drop overlay plane option for now > >> > >> .../amd/display/amdgpu_dm/amdgpu_dm_plane.c | 29 +++ > >> 1 file changed, 29 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c > >> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c > >> index 116121e647ca..7afe8c1b62d4 100644 > >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c > >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c > >> @@ -25,6 +25,7 @@ > >>*/ > >> > >> #include > >> +#include > >> #include > >> #include > >> #include > >> @@ -1430,6 +1431,33 @@ static void > >> amdgpu_dm_plane_drm_plane_destroy_state(struct drm_plane *plane, > >>drm_atomic_helper_plane_destroy_state(plane, state); > >> } > >> > >> +static int amdgpu_dm_plane_check_async_props(struct drm_property *prop, > >> +struct drm_plane *plane, > >> +struct drm_plane_state *plane_state, > >> +struct drm_mode_object *obj, > >> +u64 prop_value, u64 old_val) > >> +{ > >> + struct drm_mode_config *config = &plane->dev->mode_config; > >> + int ret; > >> + > >> + if (prop != config->prop_fb_id && > >> + prop != config->prop_in_fence_fd && > > > > IN_FENCE should just be allowed always. > > Do you mean that the common path should allow IN_FENCE_FD for all drivers? Yes. -- Ville Syrjälä Intel
Re: [PATCH] mm: Remove double faults once write a device pfn
"Zhou, Xianrong" writes: > [AMD Official Use Only - General] > >> > The vmf_insert_pfn_prot could cause unnecessary double faults on a >> > device pfn. Because currently the vmf_insert_pfn_prot does not >> > make the pfn writable so the pte entry is normally read-only or >> > dirty catching. >> What? How do you got to this conclusion? >> >>> Sorry. I did not mention that this problem only exists on arm64 platform. >> >> Ok, that makes at least a little bit more sense. >> >> >> >>> Because on arm64 platform the PTE_RDONLY is automatically attached >> >>> to the userspace pte entries even through VM_WRITE + VM_SHARE. >> >>> The PTE_RDONLY needs to be cleared in vmf_insert_pfn_prot. However >> >>> vmf_insert_pfn_prot do not make the pte writable passing false >> >>> @mkwrite to insert_pfn. >> >> Question is why is arm64 doing this? As far as I can see they must >> >> have some hardware reason for that. >> >> >> >> The mkwrite parameter to insert_pfn() was added by commit >> >> b2770da642540 to make insert_pfn() look more like insert_pfn_pmd() so >> >> that the DAX code can insert PTEs which are writable and dirty at the same >> time. >> >> >> > This is one scenario to do so. In fact on arm64 there are many >> > scenarios could be to do so. So we can let vmf_insert_pfn_prot >> > supporting @mkwrite for drivers at core layer and let drivers to >> > decide whether or not to make writable and dirty at one time. The >> > patch did this. Otherwise double faults on arm64 when call >> vmf_insert_pfn_prot. >> >> Well, that doesn't answer my question why arm64 is double faulting in the >> first place,. >> > > > Eh. > > On arm64 When userspace mmap() with PROT_WRITE and MAP_SHARED the > vma->vm_page_prot has the PTE_RDONLY and PTE_WRITE within > PAGE_SHARED_EXEC. (seeing arm64 protection_map) > > When write the userspace virtual address the first fault happen and call > into driver's > .fault->ttm_bo_vm_fault_reserved->vmf_insert_pfn_prot->insert_pfn. > The insert_pfn will establish the pte entry. However the vmf_insert_pfn_prot > pass false @mkwrite to insert_pfn by default and so insert_pfn could not make > the pfn writable and it do not call maybe_mkwrite(pte_mkdirty(entry), vma) > to clear the PTE_RDONLY bit. So the pte entry is actually write protection > for mmu. > So when the first fault return and re-execute the store instruction the second > fault happen again. And in second fault it just only do pte_mkdirty(entry) > which > clear the PTE_RDONLY. It depends if the ARM64 CPU in question supports hardware dirty bit management (DBM). If that is the case and PTE_DBM (ie. PTE_WRITE) is set HW will automatically clear PTE_RDONLY bit to mark the entry dirty instead of raising a write fault. So you shouldn't see a double fault if PTE_DBM/WRITE is set. On ARM64 you can kind of think of PTE_RDONLY as the HW dirty bit and PTE_DBM as the read/write permission bit with SW being responsible for updating PTE_RDONLY via the fault handler if DBM is not supported by HW. At least that's my understanding from having hacked on this in the past. You can see all this weirdness happening in the definitions of pte_dirty() and pte_write() for ARM64. > I think so and hope no wrong. > >> So as long as this isn't sorted out I'm going to reject this patch. >> >> Regards, >> Christian. >> >> > >> >> This is a completely different use case to what you try to use it >> >> here for and that looks extremely fishy to me. >> >> >> >> Regards, >> >> Christian. >> >> >> > The first fault only sets up the pte entry which actually is dirty >> > catching. And the second immediate fault to the pfn due to first >> > dirty catching when the cpu re-execute the store instruction. >> It could be that this is done to work around some hw behavior, but >> not because of dirty catching. >> >> > Normally if the drivers call vmf_insert_pfn_prot and also supply >> > 'pfn_mkwrite' callback within vm_operations_struct which requires >> > the pte to be dirty catching then the vmf_insert_pfn_prot and the >> > double fault are reasonable. It is not a problem. >> Well, as far as I can see that behavior absolutely doesn't make sense. >> >> When pfn_mkwrite is requested then the driver should use PAGE_COPY, >> which is exactly what VMWGFX (the only driver using dirty tracking) >> is >> >> doing. >> Everybody else uses PAGE_SHARED which should make the pte writeable >> immediately. >> >> Regards, >> Christian. >> >> > However the most of drivers calling vmf_insert_pfn_prot do not >> > supply the 'pfn_mkwrite' callback so that the second fault is >> unnecessary. >> > >> > So just like vmf_insert_mixed and vmf_insert_mixed_mkwrite pair, >> > we should also supply vmf_insert_pfn_mkwrite for drivers as well. >> > >> > Signed-off-by: Xianrong Zhou >> > --- >> > arch/x86/entry/vdso/vma.c | 3 +
Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address: 0000000000000008
Hi, Ma Jun, Normally, I would reply under the quoted text, but I will adjust to your convention. I have just discovered that your patch causes Ubuntu 22.04 LTS GNOME XWayland session to block at typing password and ENTER in the graphical logon screen (tested several times). After that, I was not able to even log from another box with ssh, or the session would block (tested one time, second time too, thrid time it passed after I connected before attempt to login on XWayland console). You might find useful syslog and dmesg of the freeze on this link (they were +100K): https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/ The exact applied patch was this: marvin@defiant:~/linux/kernel/linux_torvalds$ git diff diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 73f6d7e72c73..6ef333df9adf 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix); - err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name); - /* don't check this. There are apparently firmwares in the wild with -* incorrect size in the header -*/ - if (err == -ENODEV) - goto out; + err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev); if (err) - dev_dbg(adev->dev, - "gfx10: amdgpu_ucode_request() failed \"%s\"\n", - fw_name); + goto out; + + /* don't validate this firmware. There are apparently firmwares +* in the wild with incorrect size in the header +*/ rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data; version_major = le16_to_cpu(rlc_hdr->header.header_version_major); version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor); marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms Linux 6.7.0-xway-09721-g61da593f4458 x86_64 marvin@defiant:~/linux/kernel/linux_torvalds$ So, there seems to be a problem with the way the patch affects XWayland. Checked multiple times the exact commit with and without the diff. Hope this helps, because I am not familiar with the amdgpu driver. Best regards, Mirsad Todorovac On 1/22/24 09:34, Ma, Jun wrote: Perhaps similar to the problem I encountered earlier, you can try the following patch https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html Regards, Ma Jun On 1/21/2024 3:54 AM, Mirsad Todorovac wrote: Hi, The last email did not pass to the most of the recipients due to banned .xz attachment. As the .config is too big to send inline or uncompressed either, I will omit it in this attempt. In the meantime, I had some success in decoding the stack trace, but sadly not complete. I don't think this Oops is deterministic, but I am working on a reproducer. The platform is Ubuntu 22.04 LTS. Complete list of hardware and .config is available here: https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/ Best regards, Mirsad --- kernel: [5.576702] BUG: kernel NULL pointer dereference, address: 0008 kernel: [5.576707] #PF: supervisor read access in kernel mode kernel: [5.576710] #PF: error_code(0x) - not-present page kernel: [5.576712] PGD 0 P4D 0 kernel: [5.576715] Oops: [#1] PREEMPT SMP NOPTI kernel: [5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2 kernel: [5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 kernel: [5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0 All code 0: 8d 55 a8lea-0x58(%rbp),%edx 3: 4c 89 ffmov%r15,%rdi 6: e8 e4 83 ec ff call 0xffec83ef b: 41 89 c2mov%eax,%r10d e: 83 f8 edcmp$0xffed,%eax 11: 0f 84 b3 fd ff ff je 0xfdca 17: 85 c0 test %eax,%eax 19: 74 05 je 0x20 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 20: 49 8b 87 08
Re: [PATCH v3] drm/amdgpu/pm: Use macro definitions in the smu IH process function
On 1/25/2024 2:20 PM, Ma Jun wrote: > Replace the hard-coded numbers with macro definition > > Signed-off-by: Ma Jun Reviewed-by: Lijo Lazar Thanks, Lijo > --- > v3: > - Add new SMU_IH_INTERRUPT_* macros for smu, keeping the original > macro definitions in sync with pmfw (kevin) > --- > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 10 +- > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 14 +++--- > drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c | 2 +- > drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 10 ++ > 4 files changed, 23 insertions(+), 13 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > index fbeb31bf9e48..f6545093bfc1 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > @@ -1432,24 +1432,24 @@ static int smu_v11_0_irq_process(struct amdgpu_device > *adev, > dev_emerg(adev->dev, "ERROR: System is going to shutdown due to > GPU HW CTF!\n"); > orderly_poweroff(true); > } else if (client_id == SOC15_IH_CLIENTID_MP1) { > - if (src_id == 0xfe) { > + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { > /* ACK SMUToHost interrupt */ > data = RREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL); > data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, > INT_ACK, 1); > WREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL, data); > > switch (ctxid) { > - case 0x3: > + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: > dev_dbg(adev->dev, "Switched to AC mode!\n"); > schedule_work(&smu->interrupt_work); > adev->pm.ac_power = true; > break; > - case 0x4: > + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: > dev_dbg(adev->dev, "Switched to DC mode!\n"); > schedule_work(&smu->interrupt_work); > adev->pm.ac_power = false; > break; > - case 0x7: > + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: > /* >* Increment the throttle interrupt counter >*/ > @@ -1508,7 +1508,7 @@ int smu_v11_0_register_irq_handler(struct smu_context > *smu) > return ret; > > ret = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_MP1, > - 0xfe, > + SMU_IH_INTERRUPT_ID_TO_DRIVER, > irq_src); > if (ret) > return ret; > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > index 1db74c0b5257..c13364830500 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > @@ -1368,24 +1368,24 @@ static int smu_v13_0_irq_process(struct amdgpu_device > *adev, > dev_emerg(adev->dev, "ERROR: System is going to shutdown due to > GPU HW CTF!\n"); > orderly_poweroff(true); > } else if (client_id == SOC15_IH_CLIENTID_MP1) { > - if (src_id == 0xfe) { > + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { > /* ACK SMUToHost interrupt */ > data = RREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL); > data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, > INT_ACK, 1); > WREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL, data); > > switch (ctxid) { > - case 0x3: > + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: > dev_dbg(adev->dev, "Switched to AC mode!\n"); > smu_v13_0_ack_ac_dc_interrupt(smu); > adev->pm.ac_power = true; > break; > - case 0x4: > + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: > dev_dbg(adev->dev, "Switched to DC mode!\n"); > smu_v13_0_ack_ac_dc_interrupt(smu); > adev->pm.ac_power = false; > break; > - case 0x7: > + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: > /* >* Increment the throttle interrupt counter >*/ > @@ -1398,7 +1398,7 @@ static int smu_v13_0_irq_process(struct amdgpu_device > *adev, > > schedule_wor
RE: [PATCH v3] drm/amdgpu/pm: Use macro definitions in the smu IH process function
[AMD Official Use Only - General] Reviewed-by: Yang Wang Best Regards, Kevin -Original Message- From: Ma, Jun Sent: Thursday, January 25, 2024 4:50 PM To: amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Feng, Kenneth ; Deucher, Alexander ; Wang, Yang(Kevin) ; Ma, Jun Subject: [PATCH v3] drm/amdgpu/pm: Use macro definitions in the smu IH process function Replace the hard-coded numbers with macro definition Signed-off-by: Ma Jun --- v3: - Add new SMU_IH_INTERRUPT_* macros for smu, keeping the original macro definitions in sync with pmfw (kevin) --- drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 10 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 14 +++--- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 10 ++ 4 files changed, 23 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c index fbeb31bf9e48..f6545093bfc1 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c @@ -1432,24 +1432,24 @@ static int smu_v11_0_irq_process(struct amdgpu_device *adev, dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU HW CTF!\n"); orderly_poweroff(true); } else if (client_id == SOC15_IH_CLIENTID_MP1) { - if (src_id == 0xfe) { + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { /* ACK SMUToHost interrupt */ data = RREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL); data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, INT_ACK, 1); WREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL, data); switch (ctxid) { - case 0x3: + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: dev_dbg(adev->dev, "Switched to AC mode!\n"); schedule_work(&smu->interrupt_work); adev->pm.ac_power = true; break; - case 0x4: + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: dev_dbg(adev->dev, "Switched to DC mode!\n"); schedule_work(&smu->interrupt_work); adev->pm.ac_power = false; break; - case 0x7: + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: /* * Increment the throttle interrupt counter */ @@ -1508,7 +1508,7 @@ int smu_v11_0_register_irq_handler(struct smu_context *smu) return ret; ret = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_MP1, - 0xfe, + SMU_IH_INTERRUPT_ID_TO_DRIVER, irq_src); if (ret) return ret; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 1db74c0b5257..c13364830500 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -1368,24 +1368,24 @@ static int smu_v13_0_irq_process(struct amdgpu_device *adev, dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU HW CTF!\n"); orderly_poweroff(true); } else if (client_id == SOC15_IH_CLIENTID_MP1) { - if (src_id == 0xfe) { + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { /* ACK SMUToHost interrupt */ data = RREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL); data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, INT_ACK, 1); WREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL, data); switch (ctxid) { - case 0x3: + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: dev_dbg(adev->dev, "Switched to AC mode!\n"); smu_v13_0_ack_ac_dc_interrupt(smu); adev->pm.ac_power = true; break; - case 0x4: + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: dev_dbg(adev->dev, "Switched to DC mode!\n"); smu_v13_0_ack_ac_dc_interrupt(smu); adev->pm.ac_power = false; break; - case 0x7: + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: /* *
[PATCH v3] drm/amdgpu/pm: Use macro definitions in the smu IH process function
Replace the hard-coded numbers with macro definition Signed-off-by: Ma Jun --- v3: - Add new SMU_IH_INTERRUPT_* macros for smu, keeping the original macro definitions in sync with pmfw (kevin) --- drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 10 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 14 +++--- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 10 ++ 4 files changed, 23 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c index fbeb31bf9e48..f6545093bfc1 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c @@ -1432,24 +1432,24 @@ static int smu_v11_0_irq_process(struct amdgpu_device *adev, dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU HW CTF!\n"); orderly_poweroff(true); } else if (client_id == SOC15_IH_CLIENTID_MP1) { - if (src_id == 0xfe) { + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { /* ACK SMUToHost interrupt */ data = RREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL); data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, INT_ACK, 1); WREG32_SOC15(MP1, 0, mmMP1_SMN_IH_SW_INT_CTRL, data); switch (ctxid) { - case 0x3: + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: dev_dbg(adev->dev, "Switched to AC mode!\n"); schedule_work(&smu->interrupt_work); adev->pm.ac_power = true; break; - case 0x4: + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: dev_dbg(adev->dev, "Switched to DC mode!\n"); schedule_work(&smu->interrupt_work); adev->pm.ac_power = false; break; - case 0x7: + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: /* * Increment the throttle interrupt counter */ @@ -1508,7 +1508,7 @@ int smu_v11_0_register_irq_handler(struct smu_context *smu) return ret; ret = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_MP1, - 0xfe, + SMU_IH_INTERRUPT_ID_TO_DRIVER, irq_src); if (ret) return ret; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 1db74c0b5257..c13364830500 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -1368,24 +1368,24 @@ static int smu_v13_0_irq_process(struct amdgpu_device *adev, dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU HW CTF!\n"); orderly_poweroff(true); } else if (client_id == SOC15_IH_CLIENTID_MP1) { - if (src_id == 0xfe) { + if (src_id == SMU_IH_INTERRUPT_ID_TO_DRIVER) { /* ACK SMUToHost interrupt */ data = RREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL); data = REG_SET_FIELD(data, MP1_SMN_IH_SW_INT_CTRL, INT_ACK, 1); WREG32_SOC15(MP1, 0, regMP1_SMN_IH_SW_INT_CTRL, data); switch (ctxid) { - case 0x3: + case SMU_IH_INTERRUPT_CONTEXT_ID_AC: dev_dbg(adev->dev, "Switched to AC mode!\n"); smu_v13_0_ack_ac_dc_interrupt(smu); adev->pm.ac_power = true; break; - case 0x4: + case SMU_IH_INTERRUPT_CONTEXT_ID_DC: dev_dbg(adev->dev, "Switched to DC mode!\n"); smu_v13_0_ack_ac_dc_interrupt(smu); adev->pm.ac_power = false; break; - case 0x7: + case SMU_IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING: /* * Increment the throttle interrupt counter */ @@ -1398,7 +1398,7 @@ static int smu_v13_0_irq_process(struct amdgpu_device *adev, schedule_work(&smu->throttling_logging_work); break; - case 0x8: + case SMU_IH_INTERRUPT_CONTE
[PATCH v2] drm/amdgpu: move the drm client creation behind drm device registration
This patch is to eliminate interrupt warning below: "[drm] Fence fallback timer expired on ring sdma0.0". An early vm pt clearing job is sent to SDMA ahead of interrupt enabled, introduced by patch below: - drm/amdkfd: Export DMABufs from KFD using GEM handles And re-locating the drm client creation following after drm_dev_register looks like a more proper flow. v2: wrap the drm client creation Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 32 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 +++ 3 files changed, 27 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 77e263660288..41db030ddc4e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -141,11 +141,31 @@ static void amdgpu_amdkfd_reset_work(struct work_struct *work) static const struct drm_client_funcs kfd_client_funcs = { .unregister = drm_client_release, }; + +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev) +{ + int ret; + + if (!adev->kfd.init_complete) + return 0; + + ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", + &kfd_client_funcs); + if (ret) { + dev_err(adev->dev, "Failed to init DRM client: %d\n", + ret); + return ret; + } + + drm_client_register(&adev->kfd.client); + + return 0; +} + void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) { int i; int last_valid_bit; - int ret; amdgpu_amdkfd_gpuvm_init_mem_limits(); @@ -164,12 +184,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) .enable_mes = adev->enable_mes, }; - ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd", &kfd_client_funcs); - if (ret) { - dev_err(adev->dev, "Failed to init DRM client: %d\n", ret); - return; - } - /* this is going to have a few of the MSBs set that we need to * clear */ @@ -208,10 +222,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev, &gpu_resources); - if (adev->kfd.init_complete) - drm_client_register(&adev->kfd.client); - else - drm_client_release(&adev->kfd.client); amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 584a0cea5572..da175c384adf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -182,6 +182,8 @@ int amdgpu_queue_mask_bit_to_set_resource_bit(struct amdgpu_device *adev, struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context, struct mm_struct *mm, struct svm_range_bo *svm_bo); + +int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev); #if defined(CONFIG_DEBUG_FS) int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 475bd59c9ac2..91d5d9435067 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2255,6 +2255,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, if (ret) goto err_pci; + ret = amdgpu_amdkfd_drm_client_create(adev); + if (ret) + goto err_pci; + /* * 1. don't init fbdev on hw without DCE * 2. don't init fbdev if there are no connectors -- 2.38.1