[PATCH] drm/amdkfd: Fix isa version for the GC 10.3.7
Correct the ISA version for handling the KFD test.

Fixes: 7c4f4f197e0c ("drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions")
Signed-off-by: Prike Liang
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fdad1415f8bd..5ebbeac61379 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -388,7 +388,7 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
 			f2g = &gfx_v10_3_kfd2kgd;
 		break;
 	case IP_VERSION(10, 3, 7):
-		gfx_target_version = 100307;
+		gfx_target_version = 100306;
 		if (!vf)
 			f2g = &gfx_v10_3_kfd2kgd;
 		break;
--
2.25.1
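For readers unfamiliar with the value being changed, here is a minimal sketch of the apparent gfx_target_version encoding, assuming it packs major/minor/stepping as decimal fields (this is an inference from the values in the patch, not taken from the KFD sources). GC 10.3.7 shares its ISA with GC 10.3.6, hence the target 100306:

/* Illustrative only -- not driver code. */
#include <stdio.h>

static unsigned int gfx_target_version(unsigned int major, unsigned int minor,
                                       unsigned int stepping)
{
        /* assumed encoding: major * 10000 + minor * 100 + stepping */
        return major * 10000 + minor * 100 + stepping;
}

int main(void)
{
        /* GC 10.3.7 hardware reporting the 10.3.6 ISA target */
        printf("%u\n", gfx_target_version(10, 3, 6)); /* prints 100306 */
        return 0;
}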
Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU
On 8/23/2022 10:34 PM, Tom Seewald wrote:
> On Sat, Aug 20, 2022 at 2:53 AM Lazar, Lijo wrote:
>> Missed the remap part, the offset is here -
>> https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nv.c#L680
>>
>> The trace is coming from *_flush_hdp. You may also check if
>> *_remap_hdp_registers() is getting called. It is done in nbio_vx_y files,
>> most likely this one for your device -
>> https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c#L68
>>
>> Thanks,
>> Lijo
>
> Hi Lijo,
>
> I would be happy to test any patches that you think would shed some
> light on this.

Unfortunately, I don't have any NV platforms to test. Attached is an
'untested-patch' based on your trace logs.

Thanks,
Lijo

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d7eb23b8d692..743a3ac909ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 				DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r);
 				goto init_failed;
 			}
+
+			/* remap HDP registers to a hole in mmio space,
+			 * for the purpose of expose those registers
+			 * to process space. This is needed for any early HDP
+			 * flush operation
+			 */
+			if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
+				adev->nbio.funcs->remap_hdp_registers(adev);
+
 			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
 			if (r) {
 				DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
 		nv_program_aspm(adev);
 	/* setup nbio registers */
 	adev->nbio.funcs->init_registers(adev);
-	/* remap HDP registers to a hole in mmio space,
-	 * for the purpose of expose those registers
-	 * to process space
-	 */
-	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-		adev->nbio.funcs->remap_hdp_registers(adev);
 	/* enable the doorbell aperture */
 	nv_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
 		soc15_program_aspm(adev);
 	/* setup nbio registers */
 	adev->nbio.funcs->init_registers(adev);
-	/* remap HDP registers to a hole in mmio space,
-	 * for the purpose of expose those registers
-	 * to process space
-	 */
-	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-		adev->nbio.funcs->remap_hdp_registers(adev);
 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 1ff7fc7bb340..60a1cf03fddc 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -643,12 +643,6 @@ static int soc21_common_hw_init(void *handle)
 		soc21_program_aspm(adev);
 	/* setup nbio registers */
 	adev->nbio.funcs->init_registers(adev);
-	/* remap HDP registers to a hole in mmio space,
-	 * for the purpose of expose those registers
-	 * to process space
-	 */
-	if (adev->nbio.funcs->remap_hdp_registers)
-		adev->nbio.funcs->remap_hdp_registers(adev);
 	/* enable the doorbell aperture */
 	soc21_enable_doorbell_aperture(adev, true);
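To make the ordering issue concrete, here is a small self-contained sketch (not driver code; the struct and helper names are stand-ins, only remap_hdp_registers()/flush_hdp() mirror the amdgpu callbacks discussed above) of why the remap has to happen before the first HDP flush issued during hw_init:

#include <stdbool.h>
#include <stdio.h>

struct fake_adev {
        bool hdp_remapped;
};

static void remap_hdp_registers(struct fake_adev *adev)
{
        adev->hdp_remapped = true;  /* HDP regs now visible at the remapped hole */
}

static void flush_hdp(struct fake_adev *adev)
{
        if (!adev->hdp_remapped)
                printf("flush hits the un-remapped offset -> spurious error\n");
        else
                printf("flush uses the remapped offset -> ok\n");
}

int main(void)
{
        struct fake_adev adev = { 0 };

        /* Old order: an IP block's hw_init may flush HDP before the
         * soc15/nv/soc21 common hw_init has done the remap. */
        flush_hdp(&adev);

        /* New order (the attached patch): remap once in
         * amdgpu_device_ip_init() before any hw_init runs. */
        remap_hdp_registers(&adev);
        flush_hdp(&adev);
        return 0;
}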
[PATCH v4] drm/amd/display: Fix vblank refcount in vrr transition
manage_dm_interrupts disables/enables vblank using drm_crtc_vblank_off/on,
which causes drm_crtc_vblank_get in vrr_transition to fail, and later when
drm_crtc_vblank_put is called the refcount on vblank will be messed up.
Therefore move the call to after manage_dm_interrupts.

Signed-off-by: Yunxiang Li
---
v2: check the return code for calls that might fail and warn on them
v3/v4: make the sequence closer to the original and remove redundant local variables

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bc2493a2a90e..de80b61b8d8e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7488,15 +7488,15 @@ static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
 		 * We also need vupdate irq for the actual core vblank handling
 		 * at end of vblank.
 		 */
-		dm_set_vupdate_irq(new_state->base.crtc, true);
-		drm_crtc_vblank_get(new_state->base.crtc);
+		WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, true) != 0);
+		WARN_ON(drm_crtc_vblank_get(new_state->base.crtc) != 0);
 		DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
 				 __func__, new_state->base.crtc->base.id);
 	} else if (old_vrr_active && !new_vrr_active) {
 		/* Transition VRR active -> inactive:
 		 * Allow vblank irq disable again for fixed refresh rate.
 		 */
-		dm_set_vupdate_irq(new_state->base.crtc, false);
+		WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, false) != 0);
 		drm_crtc_vblank_put(new_state->base.crtc);
 		DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
 				 __func__, new_state->base.crtc->base.id);
@@ -8261,23 +8261,6 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 		mutex_unlock(&dm->dc_lock);
 	}

-	/* Count number of newly disabled CRTCs for dropping PM refs later. */
-	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
-				      new_crtc_state, i) {
-		if (old_crtc_state->active && !new_crtc_state->active)
-			crtc_disable_count++;
-
-		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
-		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
-
-		/* For freesync config update on crtc state and params for irq */
-		update_stream_irq_parameters(dm, dm_new_crtc_state);
-
-		/* Handle vrr on->off / off->on transitions */
-		amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
-						dm_new_crtc_state);
-	}
-
 	/**
 	 * Enable interrupts for CRTCs that are newly enabled or went through
 	 * a modeset. It was intentionally deferred until after the front end
@@ -8287,16 +8270,29 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
 		struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 #ifdef CONFIG_DEBUG_FS
-		bool configure_crc = false;
 		enum amdgpu_dm_pipe_crc_source cur_crc_src;
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
-		struct crc_rd_work *crc_rd_wrk = dm->crc_rd_wrk;
+		struct crc_rd_work *crc_rd_wrk;
+#endif
+#endif
+		/* Count number of newly disabled CRTCs for dropping PM refs later. */
+		if (old_crtc_state->active && !new_crtc_state->active)
+			crtc_disable_count++;
+
+		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+		/* For freesync config update on crtc state and params for irq */
+		update_stream_irq_parameters(dm, dm_new_crtc_state);
+
+#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+		crc_rd_wrk = dm->crc_rd_wrk;
 #endif
 		spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
 		cur_crc_src = acrtc->dm_irq_params.crc_src;
 		spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
 #endif
-		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);

 		if (new_crtc_state->active &&
 		    (!old_crtc_state->active ||
@@ -8304,16 +8300,19 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 			dc_stream_retain(dm_new_crtc_state->stream);
 			acrtc->dm_
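The refcount problem the WARN_ONs now surface can be reduced to a small self-contained sketch (the names only mirror the DRM calls; this is not the DRM implementation): a get that fails while vblank is still off, followed by an unconditional put, underflows the count.

#include <stdbool.h>
#include <stdio.h>

struct crtc {
        bool vblank_enabled;
        int refcount;
};

static int vblank_get(struct crtc *c)
{
        if (!c->vblank_enabled)
                return -1;      /* the silent failure the WARN_ON now catches */
        c->refcount++;
        return 0;
}

static void vblank_put(struct crtc *c)
{
        if (c->refcount-- == 0)
                printf("refcount underflow!\n");
}

int main(void)
{
        struct crtc c = { .vblank_enabled = false, .refcount = 0 };

        /* VRR off->on transition while manage_dm_interrupts still has
         * vblank disabled: the get fails and takes no reference ... */
        if (vblank_get(&c))
                printf("vblank_get failed, no reference taken\n");

        /* ... so the matching put later corrupts the count. */
        vblank_put(&c);
        return 0;
}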
Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
On Tue, Aug 23, 2022 at 12:16 PM Christian König wrote: > > Hi Bas, > > I've just pushed an updated drm-exec branch to fdo which should now > include the bo_list bug fix. Still getting the same thing: [ 103.598784] [ cut here ] [ 103.598787] WARNING: CPU: 2 PID: 2505 at drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667 amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu] [ 103.599016] Modules linked in: uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4 cmac algif_hash algif_skcipher af_alg bnep intel_ra pl_msr snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_mce_amd rtw88_core kvm_amd mac80211 kvm btusb btrtl libarc4 btbcm irqby pass rapl btintel pcspkr btmtk joydev cfg80211 snd_hda_codec_hdmi bluetooth i2c_piix4 snd_soc_nau8821 snd_hda_intel snd_intel_dspcfg snd_soc_core snd_intel_sdw_acpi snd_hda_codec snd_pci_ acp5x snd_hda_core snd_rn_pci_acp3x snd_compress snd_acp_config ac97_bus snd_soc_acpi snd_pcm_dmaengine snd_hwdep ecdh_generic snd_pci_acp3x cdc_acm mousedev ecc rfkill snd_pcm ina2xx_adc kfifo_buf snd_timer snd opt3001 ina2xx industrialio soundcore acpi_cpufreq spi_amd mac_hid fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2 mmc_block vfat fat usbhid amdgpu drm_tt m_helper ttm agpgart drm_exec gpu_sched i2c_algo_bit [ 103.599064] drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm serio_raw sdhci_pci atkbd libps2 cqhci vivaldi_fmap ccp crct10dif_pclmul sdhci i8042 cr c32_pclmul xhci_pci hid_multitouch ghash_clmulni_intel aesni_intel crypto_simd cryptd wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video i2c_hid_acpi 8250_dw i2c_hid b trfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user [ 103.599091] CPU: 2 PID: 2505 Comm: ForzaHorizon5.e Not tainted 5.18.0-1-neptune-00172-g067e00b76d9c #25 [ 103.599093] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022 [ 103.599095] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu] [ 103.599232] Code: 66 ed c0 48 c7 c7 60 cb 09 c1 e8 5f 87 c1 cb eb 9c 48 c7 c6 c5 8e f3 c0 bf 02 00 00 00 e8 8c a4 ee ff 41 be f2 ff ff ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff ff ff 48 83 b8 a0 00 00 00 [ 103.599234] RSP: 0018:c195020afb98 EFLAGS: 00010282 [ 103.599235] RAX: 9d840b878000 RBX: 9d8300b1e1c0 RCX: 0001 [ 103.599236] RDX: 0dc0 RSI: 9d840b878000 RDI: 9d835bdb3000 [ 103.599237] RBP: 9d84d2617520 R08: 1000 R09: 9d840b878000 [ 103.599238] R10: 0005 R11: R12: 9d831030e240 [ 103.599239] R13: 3208 R14: 0001 R15: 9d8313c30060 [ 103.599240] FS: 1c86f640() GS:9d862fe8() knlGS:1abf [ 103.599242] CS: 0010 DS: ES: CR0: 80050033 [ 103.599242] CR2: 7f0be80210e8 CR3: 00017fa08000 CR4: 00350ee0 [ 103.599244] Call Trace: [ 103.599247] [ 103.599250] amdgpu_cs_ioctl+0x9cc/0x2080 [amdgpu] [ 103.599390] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [ 103.599525] drm_ioctl_kernel+0xc5/0x170 [drm] [ 103.599547] drm_ioctl+0x229/0x400 [drm] [ 103.599563] ? 
amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [ 103.599700] amdgpu_drm_ioctl+0x4a/0x80 [amdgpu] [ 103.599832] __x64_sys_ioctl+0x8c/0xc0 [ 103.599837] do_syscall_64+0x3a/0x80 [ 103.599841] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 103.599844] RIP: 0033:0x7f0c9f59b59b [ 103.599846] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48 [ 103.599847] RSP: 002b:1c86d498 EFLAGS: 0246 ORIG_RAX: 0010 [ 103.599849] RAX: ffda RBX: 1c86d510 RCX: 7f0c9f59b59b [ 103.599850] RDX: 1c86d510 RSI: c0186444 RDI: 0021 [ 103.599850] RBP: c0186444 R08: 7f0c3ad55840 R09: 1c86d4e0 [ 103.599851] R10: R11: 0246 R12: 7f0c3a19b200 [ 103.599852] R13: 0021 R14: 7f0c3aff8cf0 R15: 1c86d6e0 [ 103.599854] [ 103.599854] ---[ end trace ]--- [ 103.599856] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -14! [ 103.55] [ cut here ] [ 103.55] refcount_t: underflow; use-after-free. [ 103.62] WARNING: CPU: 2 PID: 2505 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0 [ 103.66] Modules linked in: uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4 cmac algif_hash algif_skcipher af_alg bnep intel_ra pl_msr snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_m
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
On 2022-08-23 14:57, Andrey Grodzovsky wrote: > On 2022-08-23 14:30, Luben Tuikov wrote: > >> >> On 2022-08-23 14:13, Andrey Grodzovsky wrote: >>> On 2022-08-23 12:58, Luben Tuikov wrote: Inlined: On 2022-08-22 16:09, Andrey Grodzovsky wrote: > Poblem: Given many entities competing for same rq on ^Problem > same scheduler an uncceptabliy long wait time for some ^unacceptably > jobs waiting stuck in rq before being picked up are > observed (seen using GPUVis). > The issue is due to Round Robin policy used by scheduler > to pick up the next entity for execution. Under stress > of many entities and long job queus within entity some ^queues > jobs could be stack for very long time in it's entity's > queue before being popped from the queue and executed > while for other entites with samller job queues a job ^entities; smaller > might execute ealier even though that job arrived later ^earlier > then the job in the long queue. > > Fix: > Add FIFO selection policy to entites in RQ, chose next enitity > on rq in such order that if job on one entity arrived > ealrier then job on another entity the first job will start > executing ealier regardless of the length of the entity's job > queue. > > Signed-off-by: Andrey Grodzovsky > Tested-by: Li Yunxiang (Teddy) > --- >drivers/gpu/drm/scheduler/sched_entity.c | 2 + >drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- >include/drm/gpu_scheduler.h | 8 +++ >3 files changed, 71 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > b/drivers/gpu/drm/scheduler/sched_entity.c > index 6b25b2f4f5a3..3bb7f69306ef 100644 > --- a/drivers/gpu/drm/scheduler/sched_entity.c > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job > *sched_job) > atomic_inc(entity->rq->sched->score); > WRITE_ONCE(entity->last_user, current->group_leader); > first = spsc_queue_push(&entity->job_queue, > &sched_job->queue_node); > + sched_job->submit_ts = ktime_get(); > + > > /* first job wakes up scheduler */ > if (first) { > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 68317d3a7a27..c123aa120d06 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -59,6 +59,19 @@ >#define CREATE_TRACE_POINTS >#include "gpu_scheduler_trace.h" > > + > + > +int drm_sched_policy = -1; > + > +/** > + * DOC: sched_policy (int) > + * Used to override default entites scheduling policy in a run queue. > + */ > +MODULE_PARM_DESC(sched_policy, > + "specify schedule policy for entites on a runqueue (-1 = > auto(default) value, 0 = Round Robin,1 = use FIFO"); > +module_param_named(sched_policy, drm_sched_policy, int, 0444); As per Christian's comments, you can drop the "auto" and perhaps leave one as the default, say the RR. I do think it is beneficial to have a module parameter control the scheduling policy, as shown above. >>> >>> Christian is not against it, just against adding 'auto' here - like the >>> default. >> Exactly what I said. >> >> Also, I still think an O(1) scheduling (picking next to run) should be >> what we strive for in such a FIFO patch implementation. >> A FIFO mechanism is by it's nature an O(1) mechanism for picking the next >> element. 
>> >> Regards, >> Luben > > > The only solution i see for this now is keeping a global per rq jobs > list parallel to SPCP queue per entity - we use this list when we switch > to FIFO scheduling, we can even start building it ONLY when we switch > to FIFO building it gradually as more jobs come. Do you have other solution > in mind ? The idea is to "sort" on insertion, not on picking the next one to run. cont'd below: > > Andrey > >> >>> > + > + >#define to_drm_sched_job(sched_job)\ > container_of((sched_job), struct drm_sched_job, > queue_node) > > @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq > *rq, >} > >/** > - * drm_sched_rq_select_entity - Select an entity which could provide a > job to run > + * drm_sched_rq_select_entity_rr - Select an entity which could provide > a job to run > * > * @rq: scheduler run queue to check. > * > - * Try to find a ready entity, returns NULL if none found. > + * Try to find a ready entity, in round robin manner. > + * > + * Returns NULL if none found. >
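One way to realize the "sort on insertion" idea discussed above is a per-rq rbtree keyed by each entity's oldest pending submit timestamp. The sketch below is illustrative only and is not the posted patch: the struct and helper names are made up, only the kernel rbtree/ktime APIs are real. Insertion is O(log n); picking the next entity is a walk to the leftmost node (O(1) with rb_root_cached) instead of a full list scan.

#include <linux/rbtree.h>
#include <linux/ktime.h>

struct fifo_entity {
        struct rb_node rb;
        ktime_t oldest_submit_ts;       /* ts of the head job in this entity */
};

/* O(log n) on job push / entity re-queue */
static void fifo_rq_insert(struct rb_root *root, struct fifo_entity *ent)
{
        struct rb_node **link = &root->rb_node, *parent = NULL;

        while (*link) {
                struct fifo_entity *cur;

                parent = *link;
                cur = rb_entry(parent, struct fifo_entity, rb);
                if (ktime_before(ent->oldest_submit_ts, cur->oldest_submit_ts))
                        link = &parent->rb_left;
                else
                        link = &parent->rb_right;
        }
        rb_link_node(&ent->rb, parent, link);
        rb_insert_color(&ent->rb, root);
}

/* Picking the next entity is just "leftmost node": no per-pick list walk. */
static struct fifo_entity *fifo_rq_pick(struct rb_root *root)
{
        struct rb_node *first = rb_first(root);

        return first ? rb_entry(first, struct fifo_entity, rb) : NULL;
}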
Re: [RFC v4 00/17] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only
Actually, talked with airlied and they suggested at this point I should just go ahead and push. So, pushed! Have fun getting nice DSC support everyone :) On Tue, 2022-08-23 at 13:26 -0400, Lyude Paul wrote: > Would anyone have any issues if I merged this today? The whole series is > acked, but I'm not sure if we would like to wait for R-b's? > > > On Wed, 2022-08-17 at 15:38 -0400, Lyude Paul wrote: > > For quite a while we've been carrying around a lot of legacy modesetting > > code in the MST helpers that has been rather annoying to keep around, > > and very often gets in the way of trying to implement additional > > functionality in MST such as fallback link rate retraining, dynamic BPC > > management and DSC support, etc. because of the fact that we can't rely > > on atomic for everything. > > > > Luckily, we only actually have one user of the legacy MST code in the > > kernel - radeon. Originally I was thinking of trying to maintain this > > code and keep it around in some form, but I'm pretty unconvinced anyone > > is actually using this. My reasoning for that is because I've seen > > nearly no issues regarding MST on radeon for quite a while now - despite > > the fact my local testing seems to indicate it's quite broken. This > > isn't too surprising either, as MST support in radeon.ko is gated behind > > a module parameter that isn't enabled by default. This isn't to say I > > wouldn't be open to alternative suggestions, but I'd rather not be the > > one to have to spend time on that if at all possible! Plus, I already > > floated the idea of dropping this code by AMD folks a few times and > > didn't get much resistance. > > > > As well, this series has some basic refactoring that I did along the way > > and some bugs I had to fix in order to get my atomic-only MST code > > working. Most of this is pretty straight forward and simply renaming > > things to more closely match the DisplayPort specification, as I think > > this will also make maintaining this code a lot easier in the long run > > (I've gotten myself confused way too many times because of this). > > > > So far I've tested this on all three MST drivers: amdgpu, i915 and > > nouveau, along with making sure that removing the radeon MST code > > doesn't break anything else. The one thing I very much could use help > > with regarding testing though is making sure that this works with > > amdgpu's DSC support on MST. > > > > So, with this we should be using the atomic state as much as possible > > with MST modesetting, hooray! 
> > > > V4: > > * Get rid of fix that Wayne pointed out isn't needed > > > > Cc: Wayne Lin > > Cc: Ville Syrjälä > > Cc: Fangzhi Zuo > > Cc: Jani Nikula > > Cc: Imre Deak > > Cc: Daniel Vetter > > Cc: Sean Paul > > > > Lyude Paul (17): > > drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table) > > drm/amdgpu/dm/mst: Rename get_payload_table() > > drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation > > drm/display/dp_mst: Call them time slots, not VCPI slots > > drm/display/dp_mst: Fix confusing docs for > > drm_dp_atomic_release_time_slots() > > drm/display/dp_mst: Add some missing kdocs for atomic MST structs > > drm/display/dp_mst: Add helper for finding payloads in atomic MST > > state > > drm/display/dp_mst: Add nonblocking helpers for DP MST > > drm/display/dp_mst: Don't open code modeset checks for releasing time > > slots > > drm/display/dp_mst: Fix modeset tracking in > > drm_dp_atomic_release_vcpi_slots() > > drm/nouveau/kms: Cache DP encoders in nouveau_connector > > drm/nouveau/kms: Pull mst state in for all modesets > > drm/display/dp_mst: Add helpers for serializing SST <-> MST > > transitions > > drm/display/dp_mst: Drop all ports from topology on CSNs before > > queueing link address work > > drm/display/dp_mst: Maintain time slot allocations when deleting > > payloads > > drm/radeon: Drop legacy MST support > > drm/display/dp_mst: Move all payload info into the atomic state > > > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 68 +- > > .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 108 +- > > .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 125 +- > > drivers/gpu/drm/amd/display/dc/core/dc_link.c | 10 +- > > drivers/gpu/drm/amd/display/dc/dm_helpers.h |4 +- > > .../amd/display/include/link_service_types.h | 14 +- > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 1137 - > > drivers/gpu/drm/i915/display/intel_display.c |6 + > > drivers/gpu/drm/i915/display/intel_dp.c |9 + > > drivers/gpu/drm/i915/display/intel_dp_mst.c | 91 +- > > drivers/gpu/drm/i915/display/intel_hdcp.c | 24 +- > > drivers/gpu/drm/nouveau/dispnv50/disp.c | 197 ++- > > drivers/gpu/drm/nouveau/dispnv50/disp.h |2 + > > drivers/gpu/drm/nouveau/nouveau_connector.c | 18 +- > > drivers/gpu/drm/nouveau/nouveau_connector.h |3 + > > drivers/gpu/drm/radeon/Makef
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
On 2022-08-23 14:30, Luben Tuikov wrote: On 2022-08-23 14:13, Andrey Grodzovsky wrote: On 2022-08-23 12:58, Luben Tuikov wrote: Inlined: On 2022-08-22 16:09, Andrey Grodzovsky wrote: Poblem: Given many entities competing for same rq on ^Problem same scheduler an uncceptabliy long wait time for some ^unacceptably jobs waiting stuck in rq before being picked up are observed (seen using GPUVis). The issue is due to Round Robin policy used by scheduler to pick up the next entity for execution. Under stress of many entities and long job queus within entity some ^queues jobs could be stack for very long time in it's entity's queue before being popped from the queue and executed while for other entites with samller job queues a job ^entities; smaller might execute ealier even though that job arrived later ^earlier then the job in the long queue. Fix: Add FIFO selection policy to entites in RQ, chose next enitity on rq in such order that if job on one entity arrived ealrier then job on another entity the first job will start executing ealier regardless of the length of the entity's job queue. Signed-off-by: Andrey Grodzovsky Tested-by: Li Yunxiang (Teddy) --- drivers/gpu/drm/scheduler/sched_entity.c | 2 + drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- include/drm/gpu_scheduler.h | 8 +++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 6b25b2f4f5a3..3bb7f69306ef 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) atomic_inc(entity->rq->sched->score); WRITE_ONCE(entity->last_user, current->group_leader); first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); + sched_job->submit_ts = ktime_get(); + /* first job wakes up scheduler */ if (first) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 68317d3a7a27..c123aa120d06 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -59,6 +59,19 @@ #define CREATE_TRACE_POINTS #include "gpu_scheduler_trace.h" + + +int drm_sched_policy = -1; + +/** + * DOC: sched_policy (int) + * Used to override default entites scheduling policy in a run queue. + */ +MODULE_PARM_DESC(sched_policy, + "specify schedule policy for entites on a runqueue (-1 = auto(default) value, 0 = Round Robin,1 = use FIFO"); +module_param_named(sched_policy, drm_sched_policy, int, 0444); As per Christian's comments, you can drop the "auto" and perhaps leave one as the default, say the RR. I do think it is beneficial to have a module parameter control the scheduling policy, as shown above. Christian is not against it, just against adding 'auto' here - like the default. Exactly what I said. Also, I still think an O(1) scheduling (picking next to run) should be what we strive for in such a FIFO patch implementation. A FIFO mechanism is by it's nature an O(1) mechanism for picking the next element. Regards, Luben The only solution i see for this now is keeping a global per rq jobs list parallel to SPCP queue per entity - we use this list when we switch to FIFO scheduling, we can even start building it ONLY when we switch to FIFO building it gradually as more jobs come. Do you have other solution in mind ? 
Andrey + + #define to_drm_sched_job(sched_job) \ container_of((sched_job), struct drm_sched_job, queue_node) @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, } /** - * drm_sched_rq_select_entity - Select an entity which could provide a job to run + * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run * * @rq: scheduler run queue to check. * - * Try to find a ready entity, returns NULL if none found. + * Try to find a ready entity, in round robin manner. + * + * Returns NULL if none found. */ static struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_sched_rq *rq) +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) { struct drm_sched_entity *entity; @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) return NULL; } +/** + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a job to run + * + * @rq: scheduler run queue to check. + * + * Try to find a ready entity, based on FIFO order of jobs arrivals. + * + * Returns NULL if none found. + */ +static struct drm_sched_entity * +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) +{ + struct drm_sched_entity *tmp, *entity = NULL; + ktime_t oldest_ts = KTIME_MAX; + struct drm_sched_job *sched_job; + + spin_lock(
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
On 2022-08-23 14:13, Andrey Grodzovsky wrote: > > On 2022-08-23 12:58, Luben Tuikov wrote: >> Inlined: >> >> On 2022-08-22 16:09, Andrey Grodzovsky wrote: >>> Poblem: Given many entities competing for same rq on >> ^Problem >> >>> same scheduler an uncceptabliy long wait time for some >> ^unacceptably >> >>> jobs waiting stuck in rq before being picked up are >>> observed (seen using GPUVis). >>> The issue is due to Round Robin policy used by scheduler >>> to pick up the next entity for execution. Under stress >>> of many entities and long job queus within entity some >> ^queues >> >>> jobs could be stack for very long time in it's entity's >>> queue before being popped from the queue and executed >>> while for other entites with samller job queues a job >> ^entities; smaller >> >>> might execute ealier even though that job arrived later >> ^earlier >> >>> then the job in the long queue. >>> >>> Fix: >>> Add FIFO selection policy to entites in RQ, chose next enitity >>> on rq in such order that if job on one entity arrived >>> ealrier then job on another entity the first job will start >>> executing ealier regardless of the length of the entity's job >>> queue. >>> >>> Signed-off-by: Andrey Grodzovsky >>> Tested-by: Li Yunxiang (Teddy) >>> --- >>> drivers/gpu/drm/scheduler/sched_entity.c | 2 + >>> drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- >>> include/drm/gpu_scheduler.h | 8 +++ >>> 3 files changed, 71 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c >>> b/drivers/gpu/drm/scheduler/sched_entity.c >>> index 6b25b2f4f5a3..3bb7f69306ef 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c >>> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job >>> *sched_job) >>> atomic_inc(entity->rq->sched->score); >>> WRITE_ONCE(entity->last_user, current->group_leader); >>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); >>> + sched_job->submit_ts = ktime_get(); >>> + >>> >>> /* first job wakes up scheduler */ >>> if (first) { >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>> b/drivers/gpu/drm/scheduler/sched_main.c >>> index 68317d3a7a27..c123aa120d06 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -59,6 +59,19 @@ >>> #define CREATE_TRACE_POINTS >>> #include "gpu_scheduler_trace.h" >>> >>> + >>> + >>> +int drm_sched_policy = -1; >>> + >>> +/** >>> + * DOC: sched_policy (int) >>> + * Used to override default entites scheduling policy in a run queue. >>> + */ >>> +MODULE_PARM_DESC(sched_policy, >>> + "specify schedule policy for entites on a runqueue (-1 = >>> auto(default) value, 0 = Round Robin,1 = use FIFO"); >>> +module_param_named(sched_policy, drm_sched_policy, int, 0444); >> As per Christian's comments, you can drop the "auto" and perhaps leave one >> as the default, >> say the RR. >> >> I do think it is beneficial to have a module parameter control the >> scheduling policy, as shown above. > > > Christian is not against it, just against adding 'auto' here - like the > default. Exactly what I said. Also, I still think an O(1) scheduling (picking next to run) should be what we strive for in such a FIFO patch implementation. A FIFO mechanism is by it's nature an O(1) mechanism for picking the next element. 
Regards, Luben > > >> >>> + >>> + >>> #define to_drm_sched_job(sched_job) \ >>> container_of((sched_job), struct drm_sched_job, queue_node) >>> >>> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq >>> *rq, >>> } >>> >>> /** >>> - * drm_sched_rq_select_entity - Select an entity which could provide a job >>> to run >>> + * drm_sched_rq_select_entity_rr - Select an entity which could provide a >>> job to run >>>* >>>* @rq: scheduler run queue to check. >>>* >>> - * Try to find a ready entity, returns NULL if none found. >>> + * Try to find a ready entity, in round robin manner. >>> + * >>> + * Returns NULL if none found. >>>*/ >>> static struct drm_sched_entity * >>> -drm_sched_rq_select_entity(struct drm_sched_rq *rq) >>> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) >>> { >>> struct drm_sched_entity *entity; >>> >>> @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) >>> return NULL; >>> } >>> >>> +/** >>> + * drm_sched_rq_select_entity_fifo - Select an entity which could provide >>> a job to run >>> + * >>> + * @rq: scheduler run queue to check. >>> + * >>> + * Try to find a ready entity, based on FIFO order of jobs arrivals. >>> + * >>> + * Returns NULL if none found. >>> + */ >>> +static struct drm_sched_entity * >>> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) >>> +{ >>> + struct drm_sched_entity *tmp, *entity = NULL; >>> + ktime_t oldest_ts
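For reference, dropping the "auto" value and defaulting to Round Robin, as suggested in the review, could look like the dispatch sketch below. This is illustrative only: the two _rr/_fifo selectors come from the posted patch, the wrapper and enum are assumptions.

#include <drm/gpu_scheduler.h>

enum { DRM_SCHED_POLICY_RR = 0, DRM_SCHED_POLICY_FIFO = 1 };

static int drm_sched_policy = DRM_SCHED_POLICY_RR;      /* default: Round Robin */

/* Selectors from the posted patch, declared here only for the sketch. */
static struct drm_sched_entity *drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq);
static struct drm_sched_entity *drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq);

static struct drm_sched_entity *
drm_sched_rq_select_entity(struct drm_sched_rq *rq)
{
        /* Single dispatch point keeps the policy choice out of the callers. */
        if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
                return drm_sched_rq_select_entity_fifo(rq);
        return drm_sched_rq_select_entity_rr(rq);
}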
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
On 2022-08-23 12:58, Luben Tuikov wrote: Inlined: On 2022-08-22 16:09, Andrey Grodzovsky wrote: Poblem: Given many entities competing for same rq on ^Problem same scheduler an uncceptabliy long wait time for some ^unacceptably jobs waiting stuck in rq before being picked up are observed (seen using GPUVis). The issue is due to Round Robin policy used by scheduler to pick up the next entity for execution. Under stress of many entities and long job queus within entity some ^queues jobs could be stack for very long time in it's entity's queue before being popped from the queue and executed while for other entites with samller job queues a job ^entities; smaller might execute ealier even though that job arrived later ^earlier then the job in the long queue. Fix: Add FIFO selection policy to entites in RQ, chose next enitity on rq in such order that if job on one entity arrived ealrier then job on another entity the first job will start executing ealier regardless of the length of the entity's job queue. Signed-off-by: Andrey Grodzovsky Tested-by: Li Yunxiang (Teddy) --- drivers/gpu/drm/scheduler/sched_entity.c | 2 + drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- include/drm/gpu_scheduler.h | 8 +++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 6b25b2f4f5a3..3bb7f69306ef 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) atomic_inc(entity->rq->sched->score); WRITE_ONCE(entity->last_user, current->group_leader); first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); + sched_job->submit_ts = ktime_get(); + /* first job wakes up scheduler */ if (first) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 68317d3a7a27..c123aa120d06 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -59,6 +59,19 @@ #define CREATE_TRACE_POINTS #include "gpu_scheduler_trace.h" + + +int drm_sched_policy = -1; + +/** + * DOC: sched_policy (int) + * Used to override default entites scheduling policy in a run queue. + */ +MODULE_PARM_DESC(sched_policy, + "specify schedule policy for entites on a runqueue (-1 = auto(default) value, 0 = Round Robin,1 = use FIFO"); +module_param_named(sched_policy, drm_sched_policy, int, 0444); As per Christian's comments, you can drop the "auto" and perhaps leave one as the default, say the RR. I do think it is beneficial to have a module parameter control the scheduling policy, as shown above. Christian is not against it, just against adding 'auto' here - like the default. + + #define to_drm_sched_job(sched_job) \ container_of((sched_job), struct drm_sched_job, queue_node) @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, } /** - * drm_sched_rq_select_entity - Select an entity which could provide a job to run + * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run * * @rq: scheduler run queue to check. * - * Try to find a ready entity, returns NULL if none found. + * Try to find a ready entity, in round robin manner. + * + * Returns NULL if none found. 
*/ static struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_sched_rq *rq) +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) { struct drm_sched_entity *entity; @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) return NULL; } +/** + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a job to run + * + * @rq: scheduler run queue to check. + * + * Try to find a ready entity, based on FIFO order of jobs arrivals. + * + * Returns NULL if none found. + */ +static struct drm_sched_entity * +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) +{ + struct drm_sched_entity *tmp, *entity = NULL; + ktime_t oldest_ts = KTIME_MAX; + struct drm_sched_job *sched_job; + + spin_lock(&rq->lock); + + list_for_each_entry(tmp, &rq->entities, list) { + + if (drm_sched_entity_is_ready(tmp)) { + sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue)); + + if (ktime_before(sched_job->submit_ts, oldest_ts)) { + oldest_ts = sched_job->submit_ts; + entity = tmp; + } + } + } Here I think we need an O(1) lookup of the next job to pick out to run. I see a number of optimizations, for instance keeping the current/oldest timestamp in the rq struct itself, This was my original design
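A sketch of the "keep the oldest timestamp in the rq itself" optimization mentioned here, illustrative only (these fields and helpers do not exist in the posted patch): pushing a job updates the cached minimum in O(1), and a rescan of the entities, or a sorted structure, is only needed once the cached entity's queue empties.

#include <drm/gpu_scheduler.h>
#include <linux/ktime.h>

struct rq_fifo_cache {
        ktime_t oldest_ts;
        struct drm_sched_entity *oldest_entity;
};

/* Called under rq->lock when a job is pushed to an entity on this rq. */
static void rq_fifo_note_push(struct rq_fifo_cache *c,
                              struct drm_sched_entity *entity, ktime_t ts)
{
        if (!c->oldest_entity || ktime_before(ts, c->oldest_ts)) {
                c->oldest_ts = ts;
                c->oldest_entity = entity;
        }
}

/* O(1) while the cache is valid; invalidate and rescan when the cached
 * entity's queue runs dry. */
static struct drm_sched_entity *rq_fifo_pick(struct rq_fifo_cache *c)
{
        return c->oldest_entity;
}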
Re: [RFC v4 00/17] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only
Would anyone have any issues if I merged this today? The whole series is acked, but I'm not sure if we would like to wait for R-b's? On Wed, 2022-08-17 at 15:38 -0400, Lyude Paul wrote: > For quite a while we've been carrying around a lot of legacy modesetting > code in the MST helpers that has been rather annoying to keep around, > and very often gets in the way of trying to implement additional > functionality in MST such as fallback link rate retraining, dynamic BPC > management and DSC support, etc. because of the fact that we can't rely > on atomic for everything. > > Luckily, we only actually have one user of the legacy MST code in the > kernel - radeon. Originally I was thinking of trying to maintain this > code and keep it around in some form, but I'm pretty unconvinced anyone > is actually using this. My reasoning for that is because I've seen > nearly no issues regarding MST on radeon for quite a while now - despite > the fact my local testing seems to indicate it's quite broken. This > isn't too surprising either, as MST support in radeon.ko is gated behind > a module parameter that isn't enabled by default. This isn't to say I > wouldn't be open to alternative suggestions, but I'd rather not be the > one to have to spend time on that if at all possible! Plus, I already > floated the idea of dropping this code by AMD folks a few times and > didn't get much resistance. > > As well, this series has some basic refactoring that I did along the way > and some bugs I had to fix in order to get my atomic-only MST code > working. Most of this is pretty straight forward and simply renaming > things to more closely match the DisplayPort specification, as I think > this will also make maintaining this code a lot easier in the long run > (I've gotten myself confused way too many times because of this). > > So far I've tested this on all three MST drivers: amdgpu, i915 and > nouveau, along with making sure that removing the radeon MST code > doesn't break anything else. The one thing I very much could use help > with regarding testing though is making sure that this works with > amdgpu's DSC support on MST. > > So, with this we should be using the atomic state as much as possible > with MST modesetting, hooray! 
> > V4: > * Get rid of fix that Wayne pointed out isn't needed > > Cc: Wayne Lin > Cc: Ville Syrjälä > Cc: Fangzhi Zuo > Cc: Jani Nikula > Cc: Imre Deak > Cc: Daniel Vetter > Cc: Sean Paul > > Lyude Paul (17): > drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table) > drm/amdgpu/dm/mst: Rename get_payload_table() > drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation > drm/display/dp_mst: Call them time slots, not VCPI slots > drm/display/dp_mst: Fix confusing docs for > drm_dp_atomic_release_time_slots() > drm/display/dp_mst: Add some missing kdocs for atomic MST structs > drm/display/dp_mst: Add helper for finding payloads in atomic MST > state > drm/display/dp_mst: Add nonblocking helpers for DP MST > drm/display/dp_mst: Don't open code modeset checks for releasing time > slots > drm/display/dp_mst: Fix modeset tracking in > drm_dp_atomic_release_vcpi_slots() > drm/nouveau/kms: Cache DP encoders in nouveau_connector > drm/nouveau/kms: Pull mst state in for all modesets > drm/display/dp_mst: Add helpers for serializing SST <-> MST > transitions > drm/display/dp_mst: Drop all ports from topology on CSNs before > queueing link address work > drm/display/dp_mst: Maintain time slot allocations when deleting > payloads > drm/radeon: Drop legacy MST support > drm/display/dp_mst: Move all payload info into the atomic state > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 68 +- > .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 108 +- > .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 125 +- > drivers/gpu/drm/amd/display/dc/core/dc_link.c | 10 +- > drivers/gpu/drm/amd/display/dc/dm_helpers.h |4 +- > .../amd/display/include/link_service_types.h | 14 +- > drivers/gpu/drm/display/drm_dp_mst_topology.c | 1137 - > drivers/gpu/drm/i915/display/intel_display.c |6 + > drivers/gpu/drm/i915/display/intel_dp.c |9 + > drivers/gpu/drm/i915/display/intel_dp_mst.c | 91 +- > drivers/gpu/drm/i915/display/intel_hdcp.c | 24 +- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 197 ++- > drivers/gpu/drm/nouveau/dispnv50/disp.h |2 + > drivers/gpu/drm/nouveau/nouveau_connector.c | 18 +- > drivers/gpu/drm/nouveau/nouveau_connector.h |3 + > drivers/gpu/drm/radeon/Makefile |2 +- > drivers/gpu/drm/radeon/atombios_crtc.c| 11 +- > drivers/gpu/drm/radeon/atombios_encoders.c| 59 - > drivers/gpu/drm/radeon/radeon_atombios.c |2 - > drivers/gpu/drm/radeon/radeon_connectors.c| 61 +- > drivers/gpu/drm/radeon/radeon_device.c|1 - > drivers/gpu/drm/radeon/radeon_dp_mst.c| 778 --- > drive
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
Inlined: On 2022-08-22 16:09, Andrey Grodzovsky wrote: > Poblem: Given many entities competing for same rq on ^Problem > same scheduler an uncceptabliy long wait time for some ^unacceptably > jobs waiting stuck in rq before being picked up are > observed (seen using GPUVis). > The issue is due to Round Robin policy used by scheduler > to pick up the next entity for execution. Under stress > of many entities and long job queus within entity some ^queues > jobs could be stack for very long time in it's entity's > queue before being popped from the queue and executed > while for other entites with samller job queues a job ^entities; smaller > might execute ealier even though that job arrived later ^earlier > then the job in the long queue. > > Fix: > Add FIFO selection policy to entites in RQ, chose next enitity > on rq in such order that if job on one entity arrived > ealrier then job on another entity the first job will start > executing ealier regardless of the length of the entity's job > queue. > > Signed-off-by: Andrey Grodzovsky > Tested-by: Li Yunxiang (Teddy) > --- > drivers/gpu/drm/scheduler/sched_entity.c | 2 + > drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- > include/drm/gpu_scheduler.h | 8 +++ > 3 files changed, 71 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > b/drivers/gpu/drm/scheduler/sched_entity.c > index 6b25b2f4f5a3..3bb7f69306ef 100644 > --- a/drivers/gpu/drm/scheduler/sched_entity.c > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job > *sched_job) > atomic_inc(entity->rq->sched->score); > WRITE_ONCE(entity->last_user, current->group_leader); > first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); > + sched_job->submit_ts = ktime_get(); > + > > /* first job wakes up scheduler */ > if (first) { > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 68317d3a7a27..c123aa120d06 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -59,6 +59,19 @@ > #define CREATE_TRACE_POINTS > #include "gpu_scheduler_trace.h" > > + > + > +int drm_sched_policy = -1; > + > +/** > + * DOC: sched_policy (int) > + * Used to override default entites scheduling policy in a run queue. > + */ > +MODULE_PARM_DESC(sched_policy, > + "specify schedule policy for entites on a runqueue (-1 = > auto(default) value, 0 = Round Robin,1 = use FIFO"); > +module_param_named(sched_policy, drm_sched_policy, int, 0444); As per Christian's comments, you can drop the "auto" and perhaps leave one as the default, say the RR. I do think it is beneficial to have a module parameter control the scheduling policy, as shown above. > + > + > #define to_drm_sched_job(sched_job) \ > container_of((sched_job), struct drm_sched_job, queue_node) > > @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, > } > > /** > - * drm_sched_rq_select_entity - Select an entity which could provide a job > to run > + * drm_sched_rq_select_entity_rr - Select an entity which could provide a > job to run > * > * @rq: scheduler run queue to check. > * > - * Try to find a ready entity, returns NULL if none found. > + * Try to find a ready entity, in round robin manner. > + * > + * Returns NULL if none found. 
> */ > static struct drm_sched_entity * > -drm_sched_rq_select_entity(struct drm_sched_rq *rq) > +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) > { > struct drm_sched_entity *entity; > > @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) > return NULL; > } > > +/** > + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a > job to run > + * > + * @rq: scheduler run queue to check. > + * > + * Try to find a ready entity, based on FIFO order of jobs arrivals. > + * > + * Returns NULL if none found. > + */ > +static struct drm_sched_entity * > +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) > +{ > + struct drm_sched_entity *tmp, *entity = NULL; > + ktime_t oldest_ts = KTIME_MAX; > + struct drm_sched_job *sched_job; > + > + spin_lock(&rq->lock); > + > + list_for_each_entry(tmp, &rq->entities, list) { > + > + if (drm_sched_entity_is_ready(tmp)) { > + sched_job = > to_drm_sched_job(spsc_queue_peek(&tmp->job_queue)); > + > + if (ktime_before(sched_job->submit_ts, oldest_ts)) { > + oldest_ts = sched_job->submit_ts; > + entity = tmp; > + } > + } > + } Here I think we need an O(1) lookup of the next job to pick out to run. I see a number of optimizations, for instance keeping the current/oldest timestamp in the rq struct it
Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
On 2022-08-22 11:52, Mukul Joshi wrote:

When translate_further is enabled, page table depth needs to be updated.
This was missing on Arcturus MMHUB init. This was causing address
translations to fail for SDMA user-mode queues.

Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Signed-off-by: Mukul Joshi

Reviewed-by: Felix Kuehling

---
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
index 6e0145b2b408..445cb06b9d26 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
@@ -295,9 +295,17 @@ static void mmhub_v9_4_disable_identity_aperture(struct amdgpu_device *adev,
 static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
 {
 	struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
+	unsigned int num_level, block_size;
 	uint32_t tmp;
 	int i;

+	num_level = adev->vm_manager.num_level;
+	block_size = adev->vm_manager.block_size;
+	if (adev->gmc.translate_further)
+		num_level -= 1;
+	else
+		block_size -= 9;
+
 	for (i = 0; i <= 14; i++) {
 		tmp = RREG32_SOC15_OFFSET(MMHUB, 0, mmVML2VC0_VM_CONTEXT1_CNTL,
 				hubid * MMHUB_INSTANCE_REGISTER_OFFSET + i);
@@ -305,7 +313,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
 				    ENABLE_CONTEXT, 1);
 		tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
 				    PAGE_TABLE_DEPTH,
-				    adev->vm_manager.num_level);
+				    num_level);
 		tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
 				    RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
 		tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
@@ -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
 				    EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
 		tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
 				    PAGE_TABLE_BLOCK_SIZE,
-				    adev->vm_manager.block_size - 9);
+				    block_size);
 		/* Send no-retry XNACK on fault to suppress VM fault storm. */
 		tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
 				    RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
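For clarity, the value selection the fix introduces can be summarized in a small helper. This is illustrative only and mirrors the hunk above; the helper itself is not part of the patch, and the interpretation of the "- 9" is an assumption.

#include <stdbool.h>

static void mmhub_vmid_config_values(bool translate_further,
                                     unsigned int num_level,
                                     unsigned int block_size,
                                     unsigned int *page_table_depth,
                                     unsigned int *page_table_block_size)
{
        if (translate_further) {
                /* walker is programmed one level shallower, block size unchanged */
                *page_table_depth = num_level - 1;
                *page_table_block_size = block_size;
        } else {
                /* register takes block_size - 9 (2^9 = 512-entry default block) */
                *page_table_depth = num_level;
                *page_table_block_size = block_size - 9;
        }
}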
Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq
On 2022-08-23 08:15, Christian König wrote: Am 22.08.22 um 22:09 schrieb Andrey Grodzovsky: Poblem: Given many entities competing for same rq on same scheduler an uncceptabliy long wait time for some jobs waiting stuck in rq before being picked up are observed (seen using GPUVis). The issue is due to Round Robin policy used by scheduler to pick up the next entity for execution. Under stress of many entities and long job queus within entity some jobs could be stack for very long time in it's entity's queue before being popped from the queue and executed while for other entites with samller job queues a job might execute ealier even though that job arrived later then the job in the long queue. Fix: Add FIFO selection policy to entites in RQ, chose next enitity on rq in such order that if job on one entity arrived ealrier then job on another entity the first job will start executing ealier regardless of the length of the entity's job queue. Signed-off-by: Andrey Grodzovsky Tested-by: Li Yunxiang (Teddy) --- drivers/gpu/drm/scheduler/sched_entity.c | 2 + drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- include/drm/gpu_scheduler.h | 8 +++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 6b25b2f4f5a3..3bb7f69306ef 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) atomic_inc(entity->rq->sched->score); WRITE_ONCE(entity->last_user, current->group_leader); first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); + sched_job->submit_ts = ktime_get(); + /* first job wakes up scheduler */ if (first) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 68317d3a7a27..c123aa120d06 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -59,6 +59,19 @@ #define CREATE_TRACE_POINTS #include "gpu_scheduler_trace.h" + + +int drm_sched_policy = -1; + +/** + * DOC: sched_policy (int) + * Used to override default entites scheduling policy in a run queue. + */ +MODULE_PARM_DESC(sched_policy, + "specify schedule policy for entites on a runqueue (-1 = auto(default) value, 0 = Round Robin,1 = use FIFO"); Well we don't really have an autodetect at the moment, so I would drop that. +module_param_named(sched_policy, drm_sched_policy, int, 0444); + + #define to_drm_sched_job(sched_job) \ container_of((sched_job), struct drm_sched_job, queue_node) @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, } /** - * drm_sched_rq_select_entity - Select an entity which could provide a job to run + * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run * * @rq: scheduler run queue to check. * - * Try to find a ready entity, returns NULL if none found. + * Try to find a ready entity, in round robin manner. + * + * Returns NULL if none found. */ static struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_sched_rq *rq) +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) { struct drm_sched_entity *entity; @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) return NULL; } +/** + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a job to run + * + * @rq: scheduler run queue to check. + * + * Try to find a ready entity, based on FIFO order of jobs arrivals. 
+ * + * Returns NULL if none found. + */ +static struct drm_sched_entity * +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) +{ + struct drm_sched_entity *tmp, *entity = NULL; + ktime_t oldest_ts = KTIME_MAX; + struct drm_sched_job *sched_job; + + spin_lock(&rq->lock); + + list_for_each_entry(tmp, &rq->entities, list) { + + if (drm_sched_entity_is_ready(tmp)) { + sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue)); + + if (ktime_before(sched_job->submit_ts, oldest_ts)) { + oldest_ts = sched_job->submit_ts; + entity = tmp; + } + } + } + + if (entity) { + rq->current_entity = entity; + reinit_completion(&entity->entity_idle); + } That should probably be a separate function or at least outside of this here. Apart from that totally straight forward implementation. Any idea how much extra overhead that is? Regards, Christian. Well, memory wise you have the extra long for each job struct for the time stamp, and then for each next job extraction you have to iterate the entire rq to find the next entity with oldest job so always linear in number of entitles. Today the worst case is also O(# entities) in case none of them are
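A sketch of the "separate function" Christian asks for above, illustrative and not part of the posted patch: factor the found-entity bookkeeping out of both selectors so the FIFO loop only does the search.

#include <drm/gpu_scheduler.h>
#include <linux/completion.h>

/* Shared tail for both the RR and FIFO selectors. */
static struct drm_sched_entity *
drm_sched_rq_set_current(struct drm_sched_rq *rq, struct drm_sched_entity *entity)
{
        if (entity) {
                rq->current_entity = entity;
                reinit_completion(&entity->entity_idle);
        }
        return entity;
}

Each selector would then end with "entity = drm_sched_rq_set_current(rq, entity);" before dropping rq->lock and returning.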
RE: [PATCH 00/14] DC Patches August 22, 2022
[Public]

Hi all,

This week this patchset was tested on the following systems:

Sapphire Pulse RX5700XT
Reference AMD RX6800
Engineering board with Ryzen 9 5900H

These systems were tested on the following display types:

eDP, (1080p 60hz [4500U, 5650U, 5900H])
VGA and DVI (1680x1050 60HZ [DP to VGA/DVI, USB-C to DVI/VGA])
DP/HDMI/USB-C (1440p 170hz, 4k 60hz, 4k 144hz [Includes USB-C to DP/HDMI adapters])

MST tested with Startech MST14DP123DP and 2x 4k 60Hz displays
DSC tested with Cable Matters 101075 (DP to 3x DP), and 201375 (USB-C to 3x DP) with 3x 4k60 displays

The testing is a mix of automated and manual tests. Manual testing includes (but is not limited to):

Changing display configurations and settings
Benchmark testing
Feature testing (Freesync, etc.)

Automated testing includes (but is not limited to):

Script testing (scripts to automate some of the manual checks)
IGT testing

The patchset consists of the amd-staging-drm-next branch (Head commit - 6339782c00f4e5dda276b1ecc5cb167c5789aa3a) with new patches added on top of it. This branch is used for both Ubuntu and Chrome OS testing (ChromeOS on a bi-weekly basis).

Tested on Ubuntu 22.04

Tested-by: Daniel Wheeler

Thank you,

Dan Wheeler
Sr. Technologist | AMD SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
amd.com

-----Original Message-----
From: amd-gfx On Behalf Of brichang
Sent: August 22, 2022 5:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon); Chang, Brian; Li, Sun peng (Leo); Wentland, Harry; Zhuo, Qingqing (Lillian); Siqueira, Rodrigo; Li, Roman; Chiu, Solomon; Pillai, Aurabindo; Lin, Wayne; Lakha, Bhawanpreet; Gutierrez, Agustin; Kotarac, Pavle
Subject: [PATCH 00/14] DC Patches August 22, 2022

This DC patchset brings improvements in multiple areas. In summary, we have:

* Remove redundant check in atomic_check.
* Add log clock table for SMU.
* Add SubVP scaling.
* Add interface to track PHY state.
* Free phantom plane after removing from the context.
* Add k1/k2 driver for virtual signal for DCN32.
* Fix cursor flicker when entering PSRSU.
* Change reg offset for DCN321 and DCN32 during runtime initialization.
* Change AUX NACK behavior.
* Correct HDMI ODM combine policy.
* Fix odm 2:1 policy in 4k144 mode.
* Fix pipe split policy for RV2

Alvin Lee (2):
  drm/amd/display: Free phantom plane and stream properly
  drm/amd/display: Uncomment SubVP scaling case

Aric Cyr (1):
  drm/amd/display: 3.2.199

Aurabindo Pillai (3):
  drm/amd/display: change to runtime initialization for reg offsets for DCN32
  drm/amd/display: change to runtime initialization for reg offsets for DCN321
  drm/amd/display: program k1/k2 divider for virtual signal for DCN32

Derek Lai (1):
  drm/amd/display: do not change pipe split policy for RV2

Ilya Bakoulin (1):
  drm/amd/display: Change AUX NACK behavior

Leo Chen (1):
  drm/amd/display: Adding log clock table from SMU

Robin Chen (1):
  drm/amd/display: Cursor flicker when entering PSRSU

Roman Li (1):
  drm/amd/display: Remove redundant check in atomic_check

Saaem Rizvi (1):
  drm/amd/display: HDMI ODM Combine Policy Correction

Samson Tam (1):
  drm/amd/display: fix odm 2:1 policy not being applied consistently in 4k144 modes

Taimur Hassan (1):
  drm/amd/display: Add interface to track PHY state

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   3 -
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  |  46 +
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c        |  48 +
 .../dc/clk_mgr/dcn315/dcn315_clk_mgr.c        |  46 +
 drivers/gpu/drm/amd/display/dc/core/dc.c      |  27 +-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  12 +-
 .../gpu/drm/amd/display/dc/core/dc_resource.c |   3 -
 drivers/gpu/drm/amd/display/dc/dc.h           |   3 +-
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  |   5 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h      |   1 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h    |   2 -
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c  |  12 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |   8 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |   3 +
 .../drm/amd/display/dc/dcn10/dcn10_resource.c |  18 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c    |   9 +-
 .../drm/amd/display/dc/dcn31/dcn31_hwseq.c    |   9 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c    |  34 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h    |   3 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   1 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c | 658 ++--
 .../drm/amd/display/dc/dcn32/dcn32_resource.h | 997 ++
 .../amd/display/dc/dcn321/dcn321_resource.c   | 638 ++-
 .../drm/amd/display/dc/dml/calcs/dcn_calcs.c  |  32 +-
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |   5 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   6 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |   2 +
 27 fi
Re: [PATCH] drm/sched: Add FIFO policy for scheduler rq
Am 22.08.22 um 22:09 schrieb Andrey Grodzovsky: Problem: Given many entities competing for the same rq on the same scheduler, an unacceptably long wait time for some jobs stuck in the rq before being picked up is observed (seen using GPUVis). The issue is due to the Round Robin policy used by the scheduler to pick the next entity for execution. Under stress of many entities and long job queues within an entity, some jobs can be stuck for a very long time in their entity's queue before being popped and executed, while for other entities with smaller job queues a job might execute earlier even though it arrived later than the job in the long queue. Fix: Add a FIFO selection policy for entities in the rq; choose the next entity on the rq in such an order that if a job on one entity arrived earlier than a job on another entity, the first job starts executing earlier, regardless of the length of the entity's job queue. Signed-off-by: Andrey Grodzovsky Tested-by: Li Yunxiang (Teddy) --- drivers/gpu/drm/scheduler/sched_entity.c | 2 + drivers/gpu/drm/scheduler/sched_main.c | 65 ++-- include/drm/gpu_scheduler.h | 8 +++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 6b25b2f4f5a3..3bb7f69306ef 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) atomic_inc(entity->rq->sched->score); WRITE_ONCE(entity->last_user, current->group_leader); first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); + sched_job->submit_ts = ktime_get(); + /* first job wakes up scheduler */ if (first) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 68317d3a7a27..c123aa120d06 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -59,6 +59,19 @@ #define CREATE_TRACE_POINTS #include "gpu_scheduler_trace.h" + + +int drm_sched_policy = -1; + +/** + * DOC: sched_policy (int) + * Used to override default entites scheduling policy in a run queue. + */ +MODULE_PARM_DESC(sched_policy, + "specify schedule policy for entites on a runqueue (-1 = auto(default) value, 0 = Round Robin,1 = use FIFO"); Well we don't really have an autodetect at the moment, so I would drop that. +module_param_named(sched_policy, drm_sched_policy, int, 0444); + + #define to_drm_sched_job(sched_job) \ container_of((sched_job), struct drm_sched_job, queue_node) @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, } /** - * drm_sched_rq_select_entity - Select an entity which could provide a job to run + * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run * * @rq: scheduler run queue to check. * - * Try to find a ready entity, returns NULL if none found. + * Try to find a ready entity, in round robin manner. + * + * Returns NULL if none found. */ static struct drm_sched_entity * -drm_sched_rq_select_entity(struct drm_sched_rq *rq) +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq) { struct drm_sched_entity *entity; @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq) return NULL; } +/** + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a job to run + * + * @rq: scheduler run queue to check. + * + * Try to find a ready entity, based on FIFO order of jobs arrivals. + * + * Returns NULL if none found. 
+ */ +static struct drm_sched_entity * +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) +{ + struct drm_sched_entity *tmp, *entity = NULL; + ktime_t oldest_ts = KTIME_MAX; + struct drm_sched_job *sched_job; + + spin_lock(&rq->lock); + + list_for_each_entry(tmp, &rq->entities, list) { + + if (drm_sched_entity_is_ready(tmp)) { + sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue)); + + if (ktime_before(sched_job->submit_ts, oldest_ts)) { + oldest_ts = sched_job->submit_ts; + entity = tmp; + } + } + } + + if (entity) { + rq->current_entity = entity; + reinit_completion(&entity->entity_idle); + } That should probably be a separate function or at least outside of this here. Apart from that, a totally straightforward implementation. Any idea how much extra overhead that is? Regards, Christian. + + spin_unlock(&rq->lock); + return entity; +} + /** * drm_sched_job_done - complete a job * @s_job: pointer to the job which is done @@ -804,7 +858,10 @@ drm_sched_select_entity(struct drm
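For reference, since the last hunk is truncated above, here is a minimal sketch of how the run-queue selection presumably dispatches on the new drm_sched_policy parameter. The wrapper shape and the numeric policy values are assumptions taken from the MODULE_PARM_DESC string, not from the missing hunk:

/*
 * Sketch only: assumed dispatch between the RR and FIFO helpers based on
 * the drm_sched_policy module parameter (0 = Round Robin, 1 = FIFO).
 */
static struct drm_sched_entity *
drm_sched_rq_select_entity(struct drm_sched_rq *rq)
{
	/* Anything other than an explicit 1 falls back to round robin. */
	if (drm_sched_policy == 1)
		return drm_sched_rq_select_entity_fifo(rq);

	return drm_sched_rq_select_entity_rr(rq);
}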
Re: [PATCH] drm/amdkfd: Allocate doorbells only when needed
No one else seems to have any thoughts on it, so Reviewed-by: Kent Russell Kent From: amd-gfx on behalf of Russell, Kent Sent: Monday, August 22, 2022 10:10 AM To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/amdkfd: Allocate doorbells only when needed [AMD Official Use Only - General] I can throw an Acked-by: Kent Russell since we don't have an RB yet. Kent > -Original Message- > From: amd-gfx On Behalf Of Felix > Kuehling > Sent: Wednesday, August 3, 2022 2:56 PM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH] drm/amdkfd: Allocate doorbells only when needed > > Only allocate doorbells when the first queue is created on a GPU or the > doorbells need to be mapped into CPU or GPU virtual address space. This > avoids allocating doorbells unnecessarily and can allow more processes > to use KFD on multi-GPU systems. > > Signed-off-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 + > drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 9 + > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 - > 3 files changed, 22 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index 2b3d8bc8f0aa..907f4711abce 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -327,6 +327,12 @@ static int kfd_ioctl_create_queue(struct file *filep, > struct kfd_process *p, >goto err_bind_process; >} > > + if (!pdd->doorbell_index && > + kfd_alloc_process_doorbells(dev, &pdd->doorbell_index) < 0) { > + err = -ENOMEM; > + goto err_alloc_doorbells; > + } > + >/* Starting with GFX11, wptr BOs must be mapped to GART for MES to > determine work > * on unmapped queues for usermode queue oversubscription (no > aggregated doorbell) > */ > @@ -404,6 +410,7 @@ static int kfd_ioctl_create_queue(struct file *filep, > struct > kfd_process *p, >if (wptr_bo) >amdgpu_amdkfd_free_gtt_mem(dev->adev, wptr_bo); > err_wptr_map_gart: > +err_alloc_doorbells: > err_bind_process: > err_pdd: >mutex_unlock(&p->mutex); > @@ -1092,6 +1099,10 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file > *filep, >goto err_unlock; >} >offset = kfd_get_process_doorbells(pdd); > + if (!offset) { > + err = -ENOMEM; > + goto err_unlock; > + } >} else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) { >if (args->size != PAGE_SIZE) { >err = -EINVAL; > @@ -2173,6 +2184,8 @@ static int criu_restore_memory_of_gpu(struct > kfd_process_device *pdd, >return -EINVAL; > >offset = kfd_get_process_doorbells(pdd); > + if (!offset) > + return -ENOMEM; >} else if (bo_bucket->alloc_flags & > KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) { >/* MMIO BOs need remapped bus address */ >if (bo_bucket->size != PAGE_SIZE) { > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c > b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c > index cb3d2ccc5100..b33798f89ef0 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c > @@ -157,6 +157,8 @@ int kfd_doorbell_mmap(struct kfd_dev *dev, struct > kfd_process *process, > >/* Calculate physical address of doorbell */ >address = kfd_get_process_doorbells(pdd); > + if (!address) > + return -ENOMEM; >vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | > VM_NORESERVE | >VM_DONTDUMP | VM_PFNMAP; > > @@ -275,6 +277,13 @@ uint64_t kfd_get_number_elems(struct kfd_dev *kfd) > > phys_addr_t kfd_get_process_doorbells(struct kfd_process_device *pdd) > { > + if (!pdd->doorbell_index) { > + int r = kfd_alloc_process_doorbells(pdd->dev, > 
+ &pdd->doorbell_index); > + if (r) > + return 0; > + } > + >return pdd->dev->doorbell_base + >pdd->doorbell_index * kfd_doorbell_process_slice(pdd->dev); > } > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > index 6c83a519b3a1..951b63677248 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > @@ -1499,11 +1499,6 @@ struct kfd_process_device > *kfd_create_process_device_data(struct kfd_dev *dev, >if (!pdd) >return NULL; > > - if (kfd_alloc_process_doorbells(dev, &pdd->doorbell_index) < 0) { > - pr_err("Failed to alloc doorbell for pdd\n"); > - goto err_free_pdd; > - } > - >
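The net effect of the patch is plain lazy allocation: the doorbell index is reserved the first time a queue creation, doorbell mmap, or CRIU restore actually needs it, rather than when the process device is created. A simplified sketch of that pattern follows; the struct and helper wrapper below are illustrative stand-ins, only kfd_alloc_process_doorbells() and its usage are taken from the quoted patch:

struct kfd_dev;

/* From the patch: reserves a process doorbell page on @dev, returns 0 on success. */
int kfd_alloc_process_doorbells(struct kfd_dev *dev, unsigned int *doorbell_index);

struct pdd_sketch {
	struct kfd_dev *dev;
	unsigned int doorbell_index;	/* 0 means "not allocated yet" */
};

/* Allocate on first use, as kfd_get_process_doorbells() now effectively does. */
static int ensure_process_doorbells(struct pdd_sketch *pdd)
{
	if (pdd->doorbell_index)
		return 0;		/* already reserved */

	return kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index);
}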
amd-gfx@lists.freedesktop.org
Hi Bernard, On 8/23/22 04:14, Bernard Zhao wrote: This patch trf to fis cocci warning: I believe that there are a couple of typos in this description. Maybe you could fix them with s/trf/try and s/fis/fix. drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 2349:8-34: duplicated argument to && or || drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 3680:8-55: duplicated argument to && or || Signed-off-by: Bernard Zhao Also, it would be nice to have a changelog between the versions. Other than those small nits, Reviewed-by: Maíra Canal Best Regards, - Maíra Canal --- .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c| 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c index cb2025771646..f99c1696a1f6 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c @@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.ForcedOutputLinkBPP[k] != 0) mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true; - if ((mode_lib->vba.DSCEnable[k] || mode_lib->vba.DSCEnable[k]) - && mode_lib->vba.OutputFormat[k] == dm_n422 + if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.OutputFormat[k] == dm_n422 && !mode_lib->vba.DSC422NativeSupport) mode_lib->vba.DSC422NativeNotSupported = true; @@ -3678,7 +3677,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l if (mode_lib->vba.SourcePixelFormat[k] != dm_444_64 && mode_lib->vba.SourcePixelFormat[k] != dm_444_32 && mode_lib->vba.SourcePixelFormat[k] != dm_444_16 - && mode_lib->vba.SourcePixelFormat[k] != dm_444_16 && mode_lib->vba.SourcePixelFormat[k] != dm_444_8 && mode_lib->vba.SourcePixelFormat[k] != dm_rgbe) { if (mode_lib->vba.ViewportWidthChroma[k] > mode_lib->vba.SurfaceWidthC[k]
Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
Hi Bas, I've just pushed an updated drm-exec branch to fdo which should now include the bo_list bug fix. Can you please test that with Forza? I'm still fighting getting a new kernel on my Steamdeck. Thanks, Christian. Am 22.08.22 um 01:08 schrieb Bas Nieuwenhuizen: On Thu, Aug 18, 2022 at 3:20 PM Christian König wrote: Hi Bas, I've just pushed the branch drm-exec to my fdo repository: https://gitlab.freedesktop.org/ckoenig/linux-drm.git This branch contains all the gang submit patches as well as the latest drm-exec stuff. VCN3/4 video decoding has some issues on it, but that probably shouldn't bother your work. Hi Christian, The drm-exec branch doesn't seem to be capable of running Forza Horizon 5. First bad commit seems to be commit 8bb3e919ce0109512f6631422f3fe52169836261 Author: Christian König Date: Thu Jul 14 10:23:38 2022 +0200 drm/amdgpu: revert "partial revert "remove ctx->lock" v2" This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec. We found that the bo_list is missing a protection for its list entries. Since that is fixed now this workaround can be removed again. Signed-off-by: Christian König and https://patchwork.freedesktop.org/patch/497679/ ("drm/amdgpu: Fix use-after-free on amdgpu_bo_list mutex") seems to fix things at that patch, but I'm not seeing the obvious rebase over "drm/amdgpu: cleanup and reorder amdgpu_cs.c" yet (and/or whether further issues were introduced). 
Error logs: [ 124.821691] [ cut here ] [ 124.821696] WARNING: CPU: 3 PID: 2485 at drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667 amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu] [ 124.821955] Modules linked in: uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4 cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass rapl rtw88_core pcspkr joydev mac80211 btusb s nd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821 snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2 mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec gpu_sched i2c_algo_bit [ 124.822016] drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042 ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video i2c_hid_acpi 8250_dw i2c_hid btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user [ 124.822051] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Not tainted 5.18.0-1-neptune-00172-g067e00b76d9c #23 [ 124.822054] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022 [ 124.822055] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu] [ 124.822262] Code: e1 ef c0 48 c7 c7 10 4a 0c c1 e8 5f f7 3e dd eb 9c 48 c7 c6 85 0a f6 c0 bf 02 00 00 00 e8 8c 74 e2 ff 41 be f2 ff ff ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff ff ff 48 83 b8 a0 00 00 00 [ 124.822264] RSP: 0018:a257827afb98 EFLAGS: 00010282 [ 124.822267] RAX: 8b82240e6000 RBX: 8b8200a31100 RCX: 0001 [ 124.822268] RDX: 0dc0 RSI: 8b82240e6000 RDI: 8b82a4c7e800 [ 124.822269] RBP: 8b82ee809320 R08: 1000 R09: 8b82240e6000 [ 124.822270] R10: 0006 R11: R12: 8b82ee6dc9c0 [ 124.822272] R13: 3188 R14: 0001 R15: 8b823face440 [ 124.822273] FS: 2773f640() GS:8b852fec() knlGS:1aba [ 124.822275] CS: 0010 DS: ES:
Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
Am 22.08.22 um 17:52 schrieb Mukul Joshi: When translate_further is enabled, page table depth needs to be updated. This was missing on Arcturus MMHUB init. This was causing address translations to fail for SDMA user-mode queues. Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Signed-off-by: Mukul Joshi Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c index 6e0145b2b408..445cb06b9d26 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c @@ -295,9 +295,17 @@ static void mmhub_v9_4_disable_identity_aperture(struct amdgpu_device *adev, static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid) { struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0]; + unsigned int num_level, block_size; uint32_t tmp; int i; + num_level = adev->vm_manager.num_level; + block_size = adev->vm_manager.block_size; + if (adev->gmc.translate_further) + num_level -= 1; + else + block_size -= 9; + for (i = 0; i <= 14; i++) { tmp = RREG32_SOC15_OFFSET(MMHUB, 0, mmVML2VC0_VM_CONTEXT1_CNTL, hubid * MMHUB_INSTANCE_REGISTER_OFFSET + i); @@ -305,7 +313,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid) ENABLE_CONTEXT, 1); tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, PAGE_TABLE_DEPTH, - adev->vm_manager.num_level); + num_level); tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1); tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, @@ -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid) EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1); tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, PAGE_TABLE_BLOCK_SIZE, - adev->vm_manager.block_size - 9); + block_size); /* Send no-retry XNACK on fault to suppress VM fault storm. */ tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL, RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
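For context, the adjustment being added appears to mirror what the GFXHUB-side VMID setup already does: with translate_further the last page-table level covers a larger block, so the walk is one level shorter; otherwise the block size written to the register is reduced by the 9-bit (512-entry) base granularity. A hedged sketch of just that adjustment, pulled out into a helper for illustration (the helper name and the comments are assumptions; the real values come from adev->vm_manager and adev->gmc at runtime):

/* Illustration only: the depth/block-size adjustment the patch adds. */
static void mmhub_adjust_vmid_config(bool translate_further,
				     unsigned int *num_level,
				     unsigned int *block_size)
{
	if (translate_further)
		*num_level -= 1;	/* last level translates further: one level less to walk */
	else
		*block_size -= 9;	/* presumably relative to the 9-bit (512-entry) base granularity */
}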
[PATCH] drm/amd: remove possible condition with no effect (if == else)
This patch fixes a cocci warning: drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c:1816:6-8: WARNING: possible condition with no effect (if == else). Signed-off-by: Bernard Zhao --- drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c index 85f32206a766..dccc9794e6a2 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c @@ -1813,8 +1813,6 @@ static bool dcn314_resource_construct( if (dc->ctx->dce_environment == DCE_ENV_PRODUCTION_DRV) dc->debug = debug_defaults_drv; - else if (dc->ctx->dce_environment == DCE_ENV_FPGA_MAXIMUS) - dc->debug = debug_defaults_diags; else dc->debug = debug_defaults_diags; // Init the vm_helper -- 2.33.1
amd-gfx@lists.freedesktop.org
This patch tries to fix cocci warnings: drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 2349:8-34: duplicated argument to && or || drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 3680:8-55: duplicated argument to && or || Signed-off-by: Bernard Zhao --- drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c index cb2025771646..fa26834daf56 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c @@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.ForcedOutputLinkBPP[k] != 0) mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true; - if ((mode_lib->vba.DSCEnable[k] || mode_lib->vba.DSCEnable[k]) - && mode_lib->vba.OutputFormat[k] == dm_n422 + if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.OutputFormat[k] == dm_n422 && !mode_lib->vba.DSC422NativeSupport) mode_lib->vba.DSC422NativeNotSupported = true; -- 2.33.1
[PATCH] drm/amd: fix potential memory leak
This patch fixes a potential memory leak of clk_src when the function falls through to the final return NULL. Signed-off-by: Bernard Zhao --- drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c index 85f32206a766..c7bb76a2a8c2 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c @@ -1643,6 +1643,7 @@ static struct clock_source *dcn31_clock_source_create( } BREAK_TO_DEBUGGER(); + kfree(clk_src); return NULL; } -- 2.33.1
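The leak follows the usual shape of the DC clock-source constructors: the wrapper kzalloc()s the object, hands it to a construct helper, and returns it on success, but the failure path only hit BREAK_TO_DEBUGGER() and returned NULL without releasing the allocation. A trimmed-down sketch of the corrected shape (types and the construct helper are simplified stand-ins, not the exact dcn31/dcn314 code):

#include <linux/slab.h>
#include <linux/types.h>

struct clock_source { int id; };
struct dce110_clk_src { struct clock_source base; };	/* simplified */

/* Assumed construct helper: returns true when the clock source is usable. */
bool clk_src_construct(struct dce110_clk_src *clk_src);

static struct clock_source *clock_source_create_sketch(void)
{
	struct dce110_clk_src *clk_src =
		kzalloc(sizeof(*clk_src), GFP_KERNEL);

	if (!clk_src)
		return NULL;

	if (clk_src_construct(clk_src))
		return &clk_src->base;

	/* The fix: release the allocation on the failure path too. */
	kfree(clk_src);
	return NULL;
}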
[PATCH] drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly
From: Qu Huang The mmVM_L2_CNTL3 register is not assigned an initial value, so its fields are set on top of the stale value left in the local variable from programming VM_L2_CNTL2. Signed-off-by: Qu Huang --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c index 1da2ec692057e..b8a987a032a8e 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c @@ -176,6 +176,7 @@ static void mmhub_v1_0_init_cache_regs(struct amdgpu_device *adev) tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1); WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL2, tmp); + tmp = mmVM_L2_CNTL3_DEFAULT; if (adev->gmc.translate_further) { tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 12); tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, -- 2.31.1
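A compressed before/after view of the pattern, for readers skimming the hunk; the comments and the final WREG32 of VM_L2_CNTL3 are assumed from the surrounding function, only the lines shown in the diff are taken from the patch:

/* Before: VM_L2_CNTL3 fields were OR-ed into the leftover VM_L2_CNTL2 value. */
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL2, tmp);

/* After: VM_L2_CNTL3 programming starts from the register's default value. */
tmp = mmVM_L2_CNTL3_DEFAULT;
if (adev->gmc.translate_further)
	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 12);
/* ... remaining field updates, then (assumed) WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL3, tmp); */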
amd-gfx@lists.freedesktop.org
This patch tries to fix cocci warnings: drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 2349:8-34: duplicated argument to && or || drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c: 3680:8-55: duplicated argument to && or || Signed-off-by: Bernard Zhao --- .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c| 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c index cb2025771646..f99c1696a1f6 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c @@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.ForcedOutputLinkBPP[k] != 0) mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true; - if ((mode_lib->vba.DSCEnable[k] || mode_lib->vba.DSCEnable[k]) - && mode_lib->vba.OutputFormat[k] == dm_n422 + if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.OutputFormat[k] == dm_n422 && !mode_lib->vba.DSC422NativeSupport) mode_lib->vba.DSC422NativeNotSupported = true; @@ -3678,7 +3677,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l if (mode_lib->vba.SourcePixelFormat[k] != dm_444_64 && mode_lib->vba.SourcePixelFormat[k] != dm_444_32 && mode_lib->vba.SourcePixelFormat[k] != dm_444_16 - && mode_lib->vba.SourcePixelFormat[k] != dm_444_16 && mode_lib->vba.SourcePixelFormat[k] != dm_444_8 && mode_lib->vba.SourcePixelFormat[k] != dm_rgbe) { if (mode_lib->vba.ViewportWidthChroma[k] > mode_lib->vba.SurfaceWidthC[k] -- 2.33.1
[PATCH] drm/amd: fix potential memory leak
This patch fixes a potential memory leak of clk_src when the function falls through to the final return NULL. Signed-off-by: Bernard Zhao --- drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c index 85f32206a766..76f263846c6b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c @@ -1715,6 +1715,7 @@ static struct clock_source *dcn30_clock_source_create( } BREAK_TO_DEBUGGER(); + kfree(clk_src); return NULL; } -- 2.33.1
[PATCH] drm/amd: remove possible condition with no effect (if == else)
This patch fixes a cocci warning: drivers/gpu/drm/amd/display/dc/core/dc.c:3335:2-4: WARNING: possible condition with no effect (if == else). Signed-off-by: Bernard Zhao --- drivers/gpu/drm/amd/display/dc/core/dc.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index aeecca68dea7..2d4c44083d79 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3332,13 +3332,8 @@ static void commit_planes_for_stream(struct dc *dc, /* Since phantom pipe programming is moved to post_unlock_program_front_end, * move the SubVP lock to after the phantom pipes have been setup */ - if (should_lock_all_pipes && dc->hwss.interdependent_update_lock) { - if (dc->hwss.subvp_pipe_control_lock) - dc->hwss.subvp_pipe_control_lock(dc, context, false, should_lock_all_pipes, NULL, subvp_prev_use); - } else { - if (dc->hwss.subvp_pipe_control_lock) - dc->hwss.subvp_pipe_control_lock(dc, context, false, should_lock_all_pipes, NULL, subvp_prev_use); - } + if (dc->hwss.subvp_pipe_control_lock) + dc->hwss.subvp_pipe_control_lock(dc, context, false, should_lock_all_pipes, NULL, subvp_prev_use); return; } -- 2.33.1