[PATCH] drm/amdkfd: Fix isa version for the GC 10.3.7

2022-08-23 Thread Prike Liang
Correct the ISA version used for handling the KFD tests.

Fixes: 7c4f4f197e0c ("drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions")
Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fdad1415f8bd..5ebbeac61379 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -388,7 +388,7 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
f2g = &gfx_v10_3_kfd2kgd;
break;
case IP_VERSION(10, 3, 7):
-   gfx_target_version = 100307;
+   gfx_target_version = 100306;
if (!vf)
f2g = &gfx_v10_3_kfd2kgd;
break;
-- 
2.25.1



Re: [Bug 216373] New: Uncorrected errors reported for AMD GPU

2022-08-23 Thread Lazar, Lijo



On 8/23/2022 10:34 PM, Tom Seewald wrote:

On Sat, Aug 20, 2022 at 2:53 AM Lazar, Lijo  wrote:


Missed the remap part, the offset is here -

https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nv.c#L680


The trace is coming from *_flush_hdp.

You may also check if *_remap_hdp_registers() is getting called. It is
done in nbio_vx_y files, most likely this one for your device -
https://elixir.bootlin.com/linux/v6.0-rc1/source/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c#L68
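
One quick way to confirm that (untested sketch; the dev_info() is the only
addition, the two register writes are what the v6.0-rc1 function linked above
already does) is to instrument the callback and compare its dmesg timestamp
against the first flush_hdp warning:

static void nbio_v2_3_remap_hdp_registers(struct amdgpu_device *adev)
{
	/* Debug aid: confirm the remap runs before the first *_flush_hdp() */
	dev_info(adev->dev, "remapping HDP flush regs, rmmio_remap.reg_offset=0x%x\n",
		 adev->rmmio_remap.reg_offset);

	WREG32_SOC15(NBIO, 0, mmREMAP_HDP_MEM_FLUSH_CNTL,
		     adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
	WREG32_SOC15(NBIO, 0, mmREMAP_HDP_REG_FLUSH_CNTL,
		     adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
}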

Thanks,
Lijo


Hi Lijo,

I would be happy to test any patches that you think would shed some
light on this.

Unfortunately, I don't have any NV platforms to test. Attached is an 
'untested-patch' based on your trace logs.


Thanks,
Lijo

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d7eb23b8d692..743a3ac909ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2376,6 +2376,15 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
goto init_failed;
}
+
+   /* remap HDP registers to a hole in mmio space,
+* for the purpose of expose those registers
+* to process space. This is needed for any early HDP
+* flush operation
+*/
+	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
+   adev->nbio.funcs->remap_hdp_registers(adev);
+
		r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
if (r) {
DRM_ERROR("hw_init %d failed %d\n", i, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index b3fba8dea63c..3ac7fef74277 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle)
nv_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
nv_enable_doorbell_aperture(adev, true);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index fde6154f2009..a0481e37d7cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle)
soc15_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-   adev->nbio.funcs->remap_hdp_registers(adev);
 
/* enable the doorbell aperture */
soc15_enable_doorbell_aperture(adev, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 1ff7fc7bb340..60a1cf03fddc 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -643,12 +643,6 @@ static int soc21_common_hw_init(void *handle)
soc21_program_aspm(adev);
/* setup nbio registers */
adev->nbio.funcs->init_registers(adev);
-   /* remap HDP registers to a hole in mmio space,
-* for the purpose of expose those registers
-* to process space
-*/
-   if (adev->nbio.funcs->remap_hdp_registers)
-   adev->nbio.funcs->remap_hdp_registers(adev);
/* enable the doorbell aperture */
soc21_enable_doorbell_aperture(adev, true);
 


[PATCH v4] drm/amd/display: Fix vblank refcount in vrr transition

2022-08-23 Thread Yunxiang Li
manage_dm_interrupts disables/enables vblank using drm_crtc_vblank_off/on,
which causes the drm_crtc_vblank_get in the vrr transition to fail; later,
when drm_crtc_vblank_put is called, the vblank refcount gets corrupted.
Therefore move the vrr transition handling to after manage_dm_interrupts.

Signed-off-by: Yunxiang Li 
---
v2: check the return code for calls that might fail and warn on them
v3/v4: make the sequence closer to the original and remove redundant local variables

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bc2493a2a90e..de80b61b8d8e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7488,15 +7488,15 @@ static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
 * We also need vupdate irq for the actual core vblank handling
 * at end of vblank.
 */
-   dm_set_vupdate_irq(new_state->base.crtc, true);
-   drm_crtc_vblank_get(new_state->base.crtc);
+   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, true) != 0);
+   WARN_ON(drm_crtc_vblank_get(new_state->base.crtc) != 0);
DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
 __func__, new_state->base.crtc->base.id);
} else if (old_vrr_active && !new_vrr_active) {
/* Transition VRR active -> inactive:
 * Allow vblank irq disable again for fixed refresh rate.
 */
-   dm_set_vupdate_irq(new_state->base.crtc, false);
+   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, false) != 0);
drm_crtc_vblank_put(new_state->base.crtc);
DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
 __func__, new_state->base.crtc->base.id);
@@ -8261,23 +8261,6 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
mutex_unlock(&dm->dc_lock);
}
 
-   /* Count number of newly disabled CRTCs for dropping PM refs later. */
-   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
- new_crtc_state, i) {
-   if (old_crtc_state->active && !new_crtc_state->active)
-   crtc_disable_count++;
-
-   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
-   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
-
-		/* For freesync config update on crtc state and params for irq */
-   update_stream_irq_parameters(dm, dm_new_crtc_state);
-
-   /* Handle vrr on->off / off->on transitions */
-   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
-   dm_new_crtc_state);
-   }
-
/**
 * Enable interrupts for CRTCs that are newly enabled or went through
 * a modeset. It was intentionally deferred until after the front end
@@ -8287,16 +8270,29 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 #ifdef CONFIG_DEBUG_FS
-   bool configure_crc = false;
enum amdgpu_dm_pipe_crc_source cur_crc_src;
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
-   struct crc_rd_work *crc_rd_wrk = dm->crc_rd_wrk;
+   struct crc_rd_work *crc_rd_wrk;
+#endif
+#endif
+		/* Count number of newly disabled CRTCs for dropping PM refs later. */
+   if (old_crtc_state->active && !new_crtc_state->active)
+   crtc_disable_count++;
+
+   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+		/* For freesync config update on crtc state and params for irq */
+   update_stream_irq_parameters(dm, dm_new_crtc_state);
+
+#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+   crc_rd_wrk = dm->crc_rd_wrk;
 #endif
spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
cur_crc_src = acrtc->dm_irq_params.crc_src;
spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
 #endif
-   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 
if (new_crtc_state->active &&
(!old_crtc_state->active ||
@@ -8304,16 +8300,19 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
dc_stream_retain(dm_new_crtc_state->stream);
acrtc->dm_

Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.

2022-08-23 Thread Bas Nieuwenhuizen
On Tue, Aug 23, 2022 at 12:16 PM Christian König
 wrote:
>
> Hi Bas,
>
> I've just pushed an updated drm-exec branch to fdo which should now
> include the bo_list bug fix.

Still getting the same thing:

[  103.598784] [ cut here ]
[  103.598787] WARNING: CPU: 2 PID: 2505 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667
amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  103.599016] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_ra
pl_msr snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_mce_amd
rtw88_core kvm_amd mac80211 kvm btusb btrtl libarc4 btbcm irqby
pass rapl btintel pcspkr btmtk joydev cfg80211 snd_hda_codec_hdmi
bluetooth i2c_piix4 snd_soc_nau8821 snd_hda_intel snd_intel_dspcfg
snd_soc_core snd_intel_sdw_acpi snd_hda_codec snd_pci_
acp5x snd_hda_core snd_rn_pci_acp3x snd_compress snd_acp_config
ac97_bus snd_soc_acpi snd_pcm_dmaengine snd_hwdep ecdh_generic
snd_pci_acp3x cdc_acm mousedev ecc rfkill snd_pcm ina2xx_adc
kfifo_buf snd_timer snd opt3001 ina2xx industrialio soundcore
acpi_cpufreq spi_amd mac_hid fuse ip_tables x_tables overlay ext4
crc16 mbcache jbd2 mmc_block vfat fat usbhid amdgpu drm_tt
m_helper ttm agpgart drm_exec gpu_sched i2c_algo_bit
[  103.599064]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw sdhci_pci atkbd libps2
cqhci vivaldi_fmap ccp crct10dif_pclmul sdhci i8042 cr
c32_pclmul xhci_pci hid_multitouch ghash_clmulni_intel aesni_intel
crypto_simd cryptd wdat_wdt mmc_core cec sp5100_tco rng_core
xhci_pci_renesas serio video i2c_hid_acpi 8250_dw i2c_hid b
trfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor
raid6_pq dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser
crypto_user
[  103.599091] CPU: 2 PID: 2505 Comm: ForzaHorizon5.e Not tainted
5.18.0-1-neptune-00172-g067e00b76d9c #25
[  103.599093] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  103.599095] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  103.599232] Code: 66 ed c0 48 c7 c7 60 cb 09 c1 e8 5f 87 c1 cb eb
9c 48 c7 c6 c5 8e f3 c0 bf 02 00 00 00 e8 8c a4 ee ff 41 be f2 ff ff
ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff
ff ff 48 83 b8 a0 00 00 00
[  103.599234] RSP: 0018:c195020afb98 EFLAGS: 00010282
[  103.599235] RAX: 9d840b878000 RBX: 9d8300b1e1c0 RCX: 0001
[  103.599236] RDX: 0dc0 RSI: 9d840b878000 RDI: 9d835bdb3000
[  103.599237] RBP: 9d84d2617520 R08: 1000 R09: 9d840b878000
[  103.599238] R10: 0005 R11:  R12: 9d831030e240
[  103.599239] R13: 3208 R14: 0001 R15: 9d8313c30060
[  103.599240] FS:  1c86f640() GS:9d862fe8()
knlGS:1abf
[  103.599242] CS:  0010 DS:  ES:  CR0: 80050033
[  103.599242] CR2: 7f0be80210e8 CR3: 00017fa08000 CR4: 00350ee0
[  103.599244] Call Trace:
[  103.599247]  
[  103.599250]  amdgpu_cs_ioctl+0x9cc/0x2080 [amdgpu]
[  103.599390]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.599525]  drm_ioctl_kernel+0xc5/0x170 [drm]
[  103.599547]  drm_ioctl+0x229/0x400 [drm]
[  103.599563]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.599700]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  103.599832]  __x64_sys_ioctl+0x8c/0xc0
[  103.599837]  do_syscall_64+0x3a/0x80
[  103.599841]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.599844] RIP: 0033:0x7f0c9f59b59b
[  103.599846] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5
a8 0c 00 f7 d8 64 89 01 48
[  103.599847] RSP: 002b:1c86d498 EFLAGS: 0246 ORIG_RAX:
0010
[  103.599849] RAX: ffda RBX: 1c86d510 RCX: 7f0c9f59b59b
[  103.599850] RDX: 1c86d510 RSI: c0186444 RDI: 0021
[  103.599850] RBP: c0186444 R08: 7f0c3ad55840 R09: 1c86d4e0
[  103.599851] R10:  R11: 0246 R12: 7f0c3a19b200
[  103.599852] R13: 0021 R14: 7f0c3aff8cf0 R15: 1c86d6e0
[  103.599854]  
[  103.599854] ---[ end trace  ]---
[  103.599856] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
process the buffer list -14!
[  103.55] [ cut here ]
[  103.55] refcount_t: underflow; use-after-free.
[  103.62] WARNING: CPU: 2 PID: 2505 at lib/refcount.c:28
refcount_warn_saturate+0xa6/0xf0
[  103.66] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_ra
pl_msr snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_m

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov



On 2022-08-23 14:57, Andrey Grodzovsky wrote:
> On 2022-08-23 14:30, Luben Tuikov wrote:
> 
>>
>> On 2022-08-23 14:13, Andrey Grodzovsky wrote:
>>> On 2022-08-23 12:58, Luben Tuikov wrote:
 Inlined:

 On 2022-08-22 16:09, Andrey Grodzovsky wrote:
> Poblem: Given many entities competing for same rq on
 ^Problem

> same scheduler an uncceptabliy long wait time for some
 ^unacceptably

> jobs waiting stuck in rq before being picked up are
> observed (seen using  GPUVis).
> The issue is due to Round Robin policy used by scheduler
> to pick up the next entity for execution. Under stress
> of many entities and long job queus within entity some
 ^queues

> jobs could be stack for very long time in it's entity's
> queue before being popped from the queue and executed
> while for other entites with samller job queues a job
 ^entities; smaller

> might execute ealier even though that job arrived later
 ^earlier

> then the job in the long queue.
>
> Fix:
> Add FIFO selection policy to entites in RQ, chose next enitity
> on rq in such order that if job on one entity arrived
> ealrier then job on another entity the first job will start
> executing ealier regardless of the length of the entity's job
> queue.
>
> Signed-off-by: Andrey Grodzovsky 
> Tested-by: Li Yunxiang (Teddy) 
> ---
>drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>include/drm/gpu_scheduler.h  |  8 +++
>3 files changed, 71 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index 6b25b2f4f5a3..3bb7f69306ef 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
> *sched_job)
>   atomic_inc(entity->rq->sched->score);
>   WRITE_ONCE(entity->last_user, current->group_leader);
>   first = spsc_queue_push(&entity->job_queue, 
> &sched_job->queue_node);
> + sched_job->submit_ts = ktime_get();
> +
>
>   /* first job wakes up scheduler */
>   if (first) {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 68317d3a7a27..c123aa120d06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -59,6 +59,19 @@
>#define CREATE_TRACE_POINTS
>#include "gpu_scheduler_trace.h"
>
> +
> +
> +int drm_sched_policy = -1;
> +
> +/**
> + * DOC: sched_policy (int)
> + * Used to override default entites scheduling policy in a run queue.
> + */
> +MODULE_PARM_DESC(sched_policy,
> + "specify schedule policy for entites on a runqueue (-1 = 
> auto(default) value, 0 = Round Robin,1  = use FIFO");
> +module_param_named(sched_policy, drm_sched_policy, int, 0444);
 As per Christian's comments, you can drop the "auto" and perhaps leave one 
 as the default,
 say the RR.

 I do think it is beneficial to have a module parameter control the 
 scheduling policy, as shown above.
>>>
>>> Christian is not against it, just against adding 'auto' here - like the
>>> default.
>> Exactly what I said.
>>
>> Also, I still think an O(1) scheduling (picking next to run) should be
>> what we strive for in such a FIFO patch implementation.
>> A FIFO mechanism is by its nature an O(1) mechanism for picking the next
>> element.
>>
>> Regards,
>> Luben
> 
> 
> The only solution I see for this now is keeping a global per-rq job list
> parallel to the per-entity SPSC queue - we use this list when we switch
> to FIFO scheduling, and we can even start building it ONLY when we switch
> to FIFO, building it gradually as more jobs come. Do you have another
> solution in mind?

The idea is to "sort" on insertion, not on picking the next one to run.

cont'd below:

> 
> Andrey
> 
>>
>>>
> +
> +
>#define to_drm_sched_job(sched_job)\
>   container_of((sched_job), struct drm_sched_job, 
> queue_node)
>
> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq 
> *rq,
>}
>
>/**
> - * drm_sched_rq_select_entity - Select an entity which could provide a 
> job to run
> + * drm_sched_rq_select_entity_rr - Select an entity which could provide 
> a job to run
> *
> * @rq: scheduler run queue to check.
> *
> - * Try to find a ready entity, returns NULL if none found.
> + * Try to find a ready entity, in round robin manner.
> + *
> + * Returns NULL if none found.
>  

Re: [RFC v4 00/17] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only

2022-08-23 Thread Lyude Paul
Actually, talked with airlied and they suggested at this point I should just
go ahead and push. So, pushed! Have fun getting nice DSC support everyone :)

On Tue, 2022-08-23 at 13:26 -0400, Lyude Paul wrote:
> Would anyone have any issues if I merged this today? The whole series is
> acked, but I'm not sure if we would like to wait for R-b's?
> 
> 
> On Wed, 2022-08-17 at 15:38 -0400, Lyude Paul wrote:
> > For quite a while we've been carrying around a lot of legacy modesetting
> > code in the MST helpers that has been rather annoying to keep around,
> > and very often gets in the way of trying to implement additional
> > functionality in MST such as fallback link rate retraining, dynamic BPC
> > management and DSC support, etc. because of the fact that we can't rely
> > on atomic for everything.
> > 
> > Luckily, we only actually have one user of the legacy MST code in the
> > kernel - radeon. Originally I was thinking of trying to maintain this
> > code and keep it around in some form, but I'm pretty unconvinced anyone
> > is actually using this. My reasoning for that is because I've seen
> > nearly no issues regarding MST on radeon for quite a while now - despite
> > the fact my local testing seems to indicate it's quite broken. This
> > isn't too surprising either, as MST support in radeon.ko is gated behind
> > a module parameter that isn't enabled by default. This isn't to say I
> > wouldn't be open to alternative suggestions, but I'd rather not be the
> > one to have to spend time on that if at all possible! Plus, I already
> > floated the idea of dropping this code by AMD folks a few times and
> > didn't get much resistance.
> > 
> > As well, this series has some basic refactoring that I did along the way
> > and some bugs I had to fix in order to get my atomic-only MST code
> > working. Most of this is pretty straight forward and simply renaming
> > things to more closely match the DisplayPort specification, as I think
> > this will also make maintaining this code a lot easier in the long run
> > (I've gotten myself confused way too many times because of this).
> > 
> > So far I've tested this on all three MST drivers: amdgpu, i915 and
> > nouveau, along with making sure that removing the radeon MST code
> > doesn't break anything else. The one thing I very much could use help
> > with regarding testing though is making sure that this works with
> > amdgpu's DSC support on MST.
> > 
> > So, with this we should be using the atomic state as much as possible
> > with MST modesetting, hooray!
> > 
> > V4:
> > * Get rid of fix that Wayne pointed out isn't needed
> > 
> > Cc: Wayne Lin 
> > Cc: Ville Syrjälä 
> > Cc: Fangzhi Zuo 
> > Cc: Jani Nikula 
> > Cc: Imre Deak 
> > Cc: Daniel Vetter 
> > Cc: Sean Paul 
> > 
> > Lyude Paul (17):
> >   drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table)
> >   drm/amdgpu/dm/mst: Rename get_payload_table()
> >   drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation
> >   drm/display/dp_mst: Call them time slots, not VCPI slots
> >   drm/display/dp_mst: Fix confusing docs for
> > drm_dp_atomic_release_time_slots()
> >   drm/display/dp_mst: Add some missing kdocs for atomic MST structs
> >   drm/display/dp_mst: Add helper for finding payloads in atomic MST
> > state
> >   drm/display/dp_mst: Add nonblocking helpers for DP MST
> >   drm/display/dp_mst: Don't open code modeset checks for releasing time
> > slots
> >   drm/display/dp_mst: Fix modeset tracking in
> > drm_dp_atomic_release_vcpi_slots()
> >   drm/nouveau/kms: Cache DP encoders in nouveau_connector
> >   drm/nouveau/kms: Pull mst state in for all modesets
> >   drm/display/dp_mst: Add helpers for serializing SST <-> MST
> > transitions
> >   drm/display/dp_mst: Drop all ports from topology on CSNs before
> > queueing link address work
> >   drm/display/dp_mst: Maintain time slot allocations when deleting
> > payloads
> >   drm/radeon: Drop legacy MST support
> >   drm/display/dp_mst: Move all payload info into the atomic state
> > 
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   68 +-
> >  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  108 +-
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  125 +-
> >  drivers/gpu/drm/amd/display/dc/core/dc_link.c |   10 +-
> >  drivers/gpu/drm/amd/display/dc/dm_helpers.h   |4 +-
> >  .../amd/display/include/link_service_types.h  |   14 +-
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 1137 -
> >  drivers/gpu/drm/i915/display/intel_display.c  |6 +
> >  drivers/gpu/drm/i915/display/intel_dp.c   |9 +
> >  drivers/gpu/drm/i915/display/intel_dp_mst.c   |   91 +-
> >  drivers/gpu/drm/i915/display/intel_hdcp.c |   24 +-
> >  drivers/gpu/drm/nouveau/dispnv50/disp.c   |  197 ++-
> >  drivers/gpu/drm/nouveau/dispnv50/disp.h   |2 +
> >  drivers/gpu/drm/nouveau/nouveau_connector.c   |   18 +-
> >  drivers/gpu/drm/nouveau/nouveau_connector.h   |3 +
> >  drivers/gpu/drm/radeon/Makef

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky

On 2022-08-23 14:30, Luben Tuikov wrote:



On 2022-08-23 14:13, Andrey Grodzovsky wrote:

On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
   drivers/gpu/drm/scheduler/sched_entity.c |  2 +
   drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
   include/drm/gpu_scheduler.h  |  8 +++
   3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
   
   	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
   #define CREATE_TRACE_POINTS
   #include "gpu_scheduler_trace.h"
   
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.


Christian is not against it, just against adding 'auto' here - like the
default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben



The only solution I see for this now is keeping a global per-rq job list
parallel to the per-entity SPSC queue - we use this list when we switch
to FIFO scheduling, and we can even start building it ONLY when we switch
to FIFO, building it gradually as more jobs come. Do you have another
solution in mind?

Andrey






+
+
   #define to_drm_sched_job(sched_job)  \
container_of((sched_job), struct drm_sched_job, queue_node)
   
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

   }
   
   /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
*
* @rq: scheduler run queue to check.
*
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
*/
   static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
   {
struct drm_sched_entity *entity;
   
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
   }
   
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov



On 2022-08-23 14:13, Andrey Grodzovsky wrote:
> 
> On 2022-08-23 12:58, Luben Tuikov wrote:
>> Inlined:
>>
>> On 2022-08-22 16:09, Andrey Grodzovsky wrote:
>>> Poblem: Given many entities competing for same rq on
>> ^Problem
>>
>>> same scheduler an uncceptabliy long wait time for some
>> ^unacceptably
>>
>>> jobs waiting stuck in rq before being picked up are
>>> observed (seen using  GPUVis).
>>> The issue is due to Round Robin policy used by scheduler
>>> to pick up the next entity for execution. Under stress
>>> of many entities and long job queus within entity some
>> ^queues
>>
>>> jobs could be stack for very long time in it's entity's
>>> queue before being popped from the queue and executed
>>> while for other entites with samller job queues a job
>> ^entities; smaller
>>
>>> might execute ealier even though that job arrived later
>> ^earlier
>>
>>> then the job in the long queue.
>>>
>>> Fix:
>>> Add FIFO selection policy to entites in RQ, chose next enitity
>>> on rq in such order that if job on one entity arrived
>>> ealrier then job on another entity the first job will start
>>> executing ealier regardless of the length of the entity's job
>>> queue.
>>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> Tested-by: Li Yunxiang (Teddy) 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>>>   drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>>>   include/drm/gpu_scheduler.h  |  8 +++
>>>   3 files changed, 71 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 6b25b2f4f5a3..3bb7f69306ef 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>> atomic_inc(entity->rq->sched->score);
>>> WRITE_ONCE(entity->last_user, current->group_leader);
>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>> +   sched_job->submit_ts = ktime_get();
>>> +
>>>   
>>> /* first job wakes up scheduler */
>>> if (first) {
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 68317d3a7a27..c123aa120d06 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -59,6 +59,19 @@
>>>   #define CREATE_TRACE_POINTS
>>>   #include "gpu_scheduler_trace.h"
>>>   
>>> +
>>> +
>>> +int drm_sched_policy = -1;
>>> +
>>> +/**
>>> + * DOC: sched_policy (int)
>>> + * Used to override default entites scheduling policy in a run queue.
>>> + */
>>> +MODULE_PARM_DESC(sched_policy,
>>> +   "specify schedule policy for entites on a runqueue (-1 = 
>>> auto(default) value, 0 = Round Robin,1  = use FIFO");
>>> +module_param_named(sched_policy, drm_sched_policy, int, 0444);
>> As per Christian's comments, you can drop the "auto" and perhaps leave one 
>> as the default,
>> say the RR.
>>
>> I do think it is beneficial to have a module parameter control the 
>> scheduling policy, as shown above.
> 
> 
> Christian is not against it, just against adding 'auto' here - like the 
> default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben

> 
> 
>>
>>> +
>>> +
>>>   #define to_drm_sched_job(sched_job)   \
>>> container_of((sched_job), struct drm_sched_job, queue_node)
>>>   
>>> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq 
>>> *rq,
>>>   }
>>>   
>>>   /**
>>> - * drm_sched_rq_select_entity - Select an entity which could provide a job 
>>> to run
>>> + * drm_sched_rq_select_entity_rr - Select an entity which could provide a 
>>> job to run
>>>*
>>>* @rq: scheduler run queue to check.
>>>*
>>> - * Try to find a ready entity, returns NULL if none found.
>>> + * Try to find a ready entity, in round robin manner.
>>> + *
>>> + * Returns NULL if none found.
>>>*/
>>>   static struct drm_sched_entity *
>>> -drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>>   {
>>> struct drm_sched_entity *entity;
>>>   
>>> @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>> return NULL;
>>>   }
>>>   
>>> +/**
>>> + * drm_sched_rq_select_entity_fifo - Select an entity which could provide 
>>> a job to run
>>> + *
>>> + * @rq: scheduler run queue to check.
>>> + *
>>> + * Try to find a ready entity, based on FIFO order of jobs arrivals.
>>> + *
>>> + * Returns NULL if none found.
>>> + */
>>> +static struct drm_sched_entity *
>>> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>> +{
>>> +   struct drm_sched_entity *tmp, *entity = NULL;
>>> +   ktime_t oldest_ts 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky



On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
  
  	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.



Christian is not against it, just against adding 'auto' here - like the 
default.






+
+
  #define to_drm_sched_job(sched_job)   \
container_of((sched_job), struct drm_sched_job, queue_node)
  
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

  }
  
  /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
struct drm_sched_entity *entity;
  
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
  }
  
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(&rq->lock);
+
+   list_for_each_entry(tmp, &rq->entities, list) {
+
+   if (drm_sched_entity_is_ready(tmp)) {
+   sched_job = 
to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
+
+   if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+   oldest_ts = sched_job->submit_ts;
+   entity = tmp;
+   }
+   }
+   }

Here I think we need an O(1) lookup of the next job to pick out to run.
I see a number of optimizations, for instance keeping the current/oldest
timestamp in the rq struct itself,
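
For illustration, that caching idea could look roughly like this (untested
sketch; the oldest_ts/oldest_entity members of drm_sched_rq are hypothetical,
and a full O(1) pick would still need the cache refreshed once that entity's
queue drains):

/* Update the cached minimum on the push path, so the common case in
 * select_entity can return rq->oldest_entity without walking the list.
 */
static void drm_sched_rq_note_push(struct drm_sched_rq *rq,
				   struct drm_sched_entity *entity,
				   ktime_t submit_ts)
{
	spin_lock(&rq->lock);

	if (!rq->oldest_entity || ktime_before(submit_ts, rq->oldest_ts)) {
		rq->oldest_ts = submit_ts;
		rq->oldest_entity = entity;
	}

	spin_unlock(&rq->lock);
}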



This was my original design

Re: [RFC v4 00/17] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only

2022-08-23 Thread Lyude Paul
Would anyone have any issues if I merged this today? The whole series is
acked, but I'm not sure if we would like to wait for R-b's?


On Wed, 2022-08-17 at 15:38 -0400, Lyude Paul wrote:
> For quite a while we've been carrying around a lot of legacy modesetting
> code in the MST helpers that has been rather annoying to keep around,
> and very often gets in the way of trying to implement additional
> functionality in MST such as fallback link rate retraining, dynamic BPC
> management and DSC support, etc. because of the fact that we can't rely
> on atomic for everything.
> 
> Luckily, we only actually have one user of the legacy MST code in the
> kernel - radeon. Originally I was thinking of trying to maintain this
> code and keep it around in some form, but I'm pretty unconvinced anyone
> is actually using this. My reasoning for that is because I've seen
> nearly no issues regarding MST on radeon for quite a while now - despite
> the fact my local testing seems to indicate it's quite broken. This
> isn't too surprising either, as MST support in radeon.ko is gated behind
> a module parameter that isn't enabled by default. This isn't to say I
> wouldn't be open to alternative suggestions, but I'd rather not be the
> one to have to spend time on that if at all possible! Plus, I already
> floated the idea of dropping this code by AMD folks a few times and
> didn't get much resistance.
> 
> As well, this series has some basic refactoring that I did along the way
> and some bugs I had to fix in order to get my atomic-only MST code
> working. Most of this is pretty straight forward and simply renaming
> things to more closely match the DisplayPort specification, as I think
> this will also make maintaining this code a lot easier in the long run
> (I've gotten myself confused way too many times because of this).
> 
> So far I've tested this on all three MST drivers: amdgpu, i915 and
> nouveau, along with making sure that removing the radeon MST code
> doesn't break anything else. The one thing I very much could use help
> with regarding testing though is making sure that this works with
> amdgpu's DSC support on MST.
> 
> So, with this we should be using the atomic state as much as possible
> with MST modesetting, hooray!
> 
> V4:
> * Get rid of fix that Wayne pointed out isn't needed
> 
> Cc: Wayne Lin 
> Cc: Ville Syrjälä 
> Cc: Fangzhi Zuo 
> Cc: Jani Nikula 
> Cc: Imre Deak 
> Cc: Daniel Vetter 
> Cc: Sean Paul 
> 
> Lyude Paul (17):
>   drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table)
>   drm/amdgpu/dm/mst: Rename get_payload_table()
>   drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation
>   drm/display/dp_mst: Call them time slots, not VCPI slots
>   drm/display/dp_mst: Fix confusing docs for
> drm_dp_atomic_release_time_slots()
>   drm/display/dp_mst: Add some missing kdocs for atomic MST structs
>   drm/display/dp_mst: Add helper for finding payloads in atomic MST
> state
>   drm/display/dp_mst: Add nonblocking helpers for DP MST
>   drm/display/dp_mst: Don't open code modeset checks for releasing time
> slots
>   drm/display/dp_mst: Fix modeset tracking in
> drm_dp_atomic_release_vcpi_slots()
>   drm/nouveau/kms: Cache DP encoders in nouveau_connector
>   drm/nouveau/kms: Pull mst state in for all modesets
>   drm/display/dp_mst: Add helpers for serializing SST <-> MST
> transitions
>   drm/display/dp_mst: Drop all ports from topology on CSNs before
> queueing link address work
>   drm/display/dp_mst: Maintain time slot allocations when deleting
> payloads
>   drm/radeon: Drop legacy MST support
>   drm/display/dp_mst: Move all payload info into the atomic state
> 
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   68 +-
>  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  108 +-
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  125 +-
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c |   10 +-
>  drivers/gpu/drm/amd/display/dc/dm_helpers.h   |4 +-
>  .../amd/display/include/link_service_types.h  |   14 +-
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 1137 -
>  drivers/gpu/drm/i915/display/intel_display.c  |6 +
>  drivers/gpu/drm/i915/display/intel_dp.c   |9 +
>  drivers/gpu/drm/i915/display/intel_dp_mst.c   |   91 +-
>  drivers/gpu/drm/i915/display/intel_hdcp.c |   24 +-
>  drivers/gpu/drm/nouveau/dispnv50/disp.c   |  197 ++-
>  drivers/gpu/drm/nouveau/dispnv50/disp.h   |2 +
>  drivers/gpu/drm/nouveau/nouveau_connector.c   |   18 +-
>  drivers/gpu/drm/nouveau/nouveau_connector.h   |3 +
>  drivers/gpu/drm/radeon/Makefile   |2 +-
>  drivers/gpu/drm/radeon/atombios_crtc.c|   11 +-
>  drivers/gpu/drm/radeon/atombios_encoders.c|   59 -
>  drivers/gpu/drm/radeon/radeon_atombios.c  |2 -
>  drivers/gpu/drm/radeon/radeon_connectors.c|   61 +-
>  drivers/gpu/drm/radeon/radeon_device.c|1 -
>  drivers/gpu/drm/radeon/radeon_dp_mst.c|  778 ---
>  drive

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov
Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:
> Poblem: Given many entities competing for same rq on
^Problem

> same scheduler an uncceptabliy long wait time for some
^unacceptably

> jobs waiting stuck in rq before being picked up are
> observed (seen using  GPUVis).
> The issue is due to Round Robin policy used by scheduler
> to pick up the next entity for execution. Under stress
> of many entities and long job queus within entity some
^queues

> jobs could be stack for very long time in it's entity's
> queue before being popped from the queue and executed
> while for other entites with samller job queues a job
^entities; smaller

> might execute ealier even though that job arrived later
^earlier

> then the job in the long queue.
> 
> Fix:
> Add FIFO selection policy to entites in RQ, chose next enitity
> on rq in such order that if job on one entity arrived
> ealrier then job on another entity the first job will start
> executing ealier regardless of the length of the entity's job
> queue.
> 
> Signed-off-by: Andrey Grodzovsky 
> Tested-by: Li Yunxiang (Teddy) 
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>  include/drm/gpu_scheduler.h  |  8 +++
>  3 files changed, 71 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index 6b25b2f4f5a3..3bb7f69306ef 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
> *sched_job)
>   atomic_inc(entity->rq->sched->score);
>   WRITE_ONCE(entity->last_user, current->group_leader);
>   first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
> + sched_job->submit_ts = ktime_get();
> +
>  
>   /* first job wakes up scheduler */
>   if (first) {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 68317d3a7a27..c123aa120d06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -59,6 +59,19 @@
>  #define CREATE_TRACE_POINTS
>  #include "gpu_scheduler_trace.h"
>  
> +
> +
> +int drm_sched_policy = -1;
> +
> +/**
> + * DOC: sched_policy (int)
> + * Used to override default entites scheduling policy in a run queue.
> + */
> +MODULE_PARM_DESC(sched_policy,
> + "specify schedule policy for entites on a runqueue (-1 = 
> auto(default) value, 0 = Round Robin,1  = use FIFO");
> +module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.

> +
> +
>  #define to_drm_sched_job(sched_job)  \
>   container_of((sched_job), struct drm_sched_job, queue_node)
>  
> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>  }
>  
>  /**
> - * drm_sched_rq_select_entity - Select an entity which could provide a job 
> to run
> + * drm_sched_rq_select_entity_rr - Select an entity which could provide a 
> job to run
>   *
>   * @rq: scheduler run queue to check.
>   *
> - * Try to find a ready entity, returns NULL if none found.
> + * Try to find a ready entity, in round robin manner.
> + *
> + * Returns NULL if none found.
>   */
>  static struct drm_sched_entity *
> -drm_sched_rq_select_entity(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>  {
>   struct drm_sched_entity *entity;
>  
> @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>   return NULL;
>  }
>  
> +/**
> + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
> job to run
> + *
> + * @rq: scheduler run queue to check.
> + *
> + * Try to find a ready entity, based on FIFO order of jobs arrivals.
> + *
> + * Returns NULL if none found.
> + */
> +static struct drm_sched_entity *
> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> +{
> + struct drm_sched_entity *tmp, *entity = NULL;
> + ktime_t oldest_ts = KTIME_MAX;
> + struct drm_sched_job *sched_job;
> +
> + spin_lock(&rq->lock);
> +
> + list_for_each_entry(tmp, &rq->entities, list) {
> +
> + if (drm_sched_entity_is_ready(tmp)) {
> + sched_job = 
> to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
> +
> + if (ktime_before(sched_job->submit_ts, oldest_ts)) {
> + oldest_ts = sched_job->submit_ts;
> + entity = tmp;
> + }
> + }
> + }

Here I think we need an O(1) lookup of the next job to pick out to run.
I see a number of optimizations, for instance keeping the current/oldest
timestamp in the rq struct it

Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-23 Thread Felix Kuehling

Am 2022-08-22 um 11:52 schrieb Mukul Joshi:

When translate_further is enabled, the page table depth needs to
be updated. This was missing in the Arcturus MMHUB init, which
caused address translations to fail for SDMA user-mode queues.
Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Signed-off-by: Mukul Joshi 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
index 6e0145b2b408..445cb06b9d26 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
@@ -295,9 +295,17 @@ static void mmhub_v9_4_disable_identity_aperture(struct amdgpu_device *adev,
 static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
  {
struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
+   unsigned int num_level, block_size;
uint32_t tmp;
int i;
  
+	num_level = adev->vm_manager.num_level;

+   block_size = adev->vm_manager.block_size;
+   if (adev->gmc.translate_further)
+   num_level -= 1;
+   else
+   block_size -= 9;
+
for (i = 0; i <= 14; i++) {
tmp = RREG32_SOC15_OFFSET(MMHUB, 0, mmVML2VC0_VM_CONTEXT1_CNTL,
hubid * MMHUB_INSTANCE_REGISTER_OFFSET + i);
@@ -305,7 +313,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
ENABLE_CONTEXT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
PAGE_TABLE_DEPTH,
-   adev->vm_manager.num_level);
+   num_level);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
@@ -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int hubid)
EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
PAGE_TABLE_BLOCK_SIZE,
-   adev->vm_manager.block_size - 9);
+   block_size);
/* Send no-retry XNACK on fault to suppress VM fault storm. */
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,


Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky

On 2022-08-23 08:15, Christian König wrote:




Am 22.08.22 um 22:09 schrieb Andrey Grodzovsky:

Poblem: Given many entities competing for same rq on
same scheduler an uncceptabliy long wait time for some
jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some
jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job
might execute ealier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c

index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct 
drm_sched_job *sched_job)

  atomic_inc(entity->rq->sched->score);
  WRITE_ONCE(entity->last_user, current->group_leader);
  first = spsc_queue_push(&entity->job_queue, 
&sched_job->queue_node);

+    sched_job->submit_ts = ktime_get();
+
    /* first job wakes up scheduler */
  if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  +
+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+    "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");


Well we don't really have an autodetect at the moment, so I would drop 
that.



+module_param_named(sched_policy, drm_sched_policy, int, 0444);
+
+
  #define to_drm_sched_job(sched_job)    \
  container_of((sched_job), struct drm_sched_job, queue_node)
  @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct 
drm_sched_rq *rq,

  }
    /**
- * drm_sched_rq_select_entity - Select an entity which could provide 
a job to run
+ * drm_sched_rq_select_entity_rr - Select an entity which could 
provide a job to run

   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
  struct drm_sched_entity *entity;
  @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq 
*rq)

  return NULL;
  }
  +/**
+ * drm_sched_rq_select_entity_fifo - Select an entity which could 
provide a job to run

+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+    struct drm_sched_entity *tmp, *entity = NULL;
+    ktime_t oldest_ts = KTIME_MAX;
+    struct drm_sched_job *sched_job;
+
+    spin_lock(&rq->lock);
+
+    list_for_each_entry(tmp, &rq->entities, list) {
+
+    if (drm_sched_entity_is_ready(tmp)) {
+    sched_job = 
to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));

+
+    if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+    oldest_ts = sched_job->submit_ts;
+    entity = tmp;
+    }
+    }
+    }
+
+    if (entity) {
+    rq->current_entity = entity;
+    reinit_completion(&entity->entity_idle);
+    }


That should probably be a separate function or at least outside of 
this here.
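
E.g. something like this (just a sketch, helper name made up), so that the RR
and the FIFO variant can share it:

static void drm_sched_rq_set_current(struct drm_sched_rq *rq,
				     struct drm_sched_entity *entity)
{
	/* Same finalization both select paths do today */
	rq->current_entity = entity;
	reinit_completion(&entity->entity_idle);
}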


Apart from that, a totally straightforward implementation. Any idea how
much extra overhead that is?


Regards,
Christian.



Well, memory-wise you have the extra long in each job struct for the
time stamp, and then for each job extraction you have to iterate the
entire rq to find the entity with the oldest job, so it is always linear
in the number of entities. Today the worst case is also O(# entities) in
case none of them are

RE: [PATCH 00/14] DC Patches August 22, 2022

2022-08-23 Thread Wheeler, Daniel
[Public]

Hi all,
 
This week this patchset was tested on the following systems:
 
Sapphire Pulse RX5700XT 
Reference AMD RX6800
Engineering board with Ryzen 9 5900H
 
These systems were tested on the following display types: 
eDP, (1080p 60hz [4500U, 5650U, 5900H])
VGA and DVI (1680x1050 60HZ [DP to VGA/DVI, USB-C to DVI/VGA])
DP/HDMI/USB-C (1440p 170hz, 4k 60hz, 4k 144hz [Includes USB-C to DP/HDMI 
adapters])
 
MST tested with Startech MST14DP123DP and 2x 4k 60Hz displays
DSC tested with Cable Matters 101075 (DP to 3x DP), and 201375 (USB-C to 3x DP) 
with 3x 4k60 displays
 
The testing is a mix of automated and manual tests. Manual testing includes 
(but is not limited to):
Changing display configurations and settings
Benchmark testing
Feature testing (Freesync, etc.)
 
Automated testing includes (but is not limited to):
Script testing (scripts to automate some of the manual checks)
IGT testing
 
The patchset consists of the amd-staging-drm-next branch (Head commit - 
6339782c00f4e5dda276b1ecc5cb167c5789aa3a) with new patches added on top of it. 
This branch is used for both Ubuntu and Chrome OS testing (ChromeOS on a 
bi-weekly basis).

 
Tested on Ubuntu 22.04
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Sr. Technologist | AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
amd.com

-Original Message-
From: amd-gfx  On Behalf Of brichang
Sent: August 22, 2022 5:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon) ; Chang, Brian 
; Li, Sun peng (Leo) ; Wentland, Harry 
; Zhuo, Qingqing (Lillian) ; 
Siqueira, Rodrigo ; Li, Roman ; 
Chiu, Solomon ; Pillai, Aurabindo 
; Lin, Wayne ; Lakha, Bhawanpreet 
; Gutierrez, Agustin ; 
Kotarac, Pavle 
Subject: [PATCH 00/14] DC Patches August 22, 2022

This DC patchset brings improvements in multiple areas. In summary, we have:

* Remove redundant check in atomic_check.
* Add log clock table for SMU.
* Add SubVP scaling.
* Add interface to track PHY state.
* Free phantom plane after removing from the context.
* Add k1/k2 divider for virtual signal for DCN32.
* Fix cursor flicker when entering PSRSU.
* Change reg offset for DCN321 and DCN32 during runtime initialization.
* Change AUX NACK behavior.
* Correct HDMI ODM combine policy.
* Fix odm 2:1 policy in 4k144 mode.
* Fix pipe split policy for RV2

Alvin Lee (2):
  drm/amd/display: Free phantom plane and stream properly
  drm/amd/display: Uncomment SubVP scaling case

Aric Cyr (1):
  drm/amd/display: 3.2.199

Aurabindo Pillai (3):
  drm/amd/display: change to runtime initialization for reg offsets for
DCN32
  drm/amd/display: change to runtime initialization for reg offsets for
DCN321
  drm/amd/display: program k1/k2 divider for virtual signal for DCN32

Derek Lai (1):
  drm/amd/display: do not change pipe split policy for RV2

Ilya Bakoulin (1):
  drm/amd/display: Change AUX NACK behavior

Leo Chen (1):
  drm/amd/display: Adding log clock table from SMU

Robin Chen (1):
  drm/amd/display: Cursor flicker when entering PSRSU

Roman Li (1):
  drm/amd/display: Remove redundant check in atomic_check

Saaem Rizvi (1):
  drm/amd/display: HDMI ODM Combine Policy Correction

Samson Tam (1):
  drm/amd/display: fix odm 2:1 policy not being applied consistently in
4k144 modes

Taimur Hassan (1):
  drm/amd/display: Add interface to track PHY state

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   3 -
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  |  46 +
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c|  48 +
 .../dc/clk_mgr/dcn315/dcn315_clk_mgr.c|  46 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  27 +-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  12 +-
 .../gpu/drm/amd/display/dc/core/dc_resource.c |   3 -
 drivers/gpu/drm/amd/display/dc/dc.h   |   3 +-
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  |   5 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   1 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h|   2 -
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c  |  12 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |   8 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |   3 +
 .../drm/amd/display/dc/dcn10/dcn10_resource.c |  18 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   9 +-
 .../drm/amd/display/dc/dcn31/dcn31_hwseq.c|   9 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c|  34 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|   3 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   1 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c | 658 ++--  
.../drm/amd/display/dc/dcn32/dcn32_resource.h | 997 ++
 .../amd/display/dc/dcn321/dcn321_resource.c   | 638 ++-
 .../drm/amd/display/dc/dml/calcs/dcn_calcs.c  |  32 +-
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |   5 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   6 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |   2 +
 27 fi

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Christian König




On 22.08.22 at 22:09, Andrey Grodzovsky wrote:

Problem: Given many entities competing for the same rq on the
same scheduler, an unacceptably long wait time is observed for some
jobs stuck in the rq before being picked up (seen using GPUVis).
The issue is due to the Round Robin policy used by the scheduler
to pick the next entity for execution. Under stress,
with many entities and long job queues within an entity, some
jobs can be stuck for a very long time in their entity's
queue before being popped and executed,
while for other entities with smaller job queues a job
might execute earlier even though it arrived later
than the job in the long queue.

Fix:
Add a FIFO selection policy for entities in the rq: choose the next entity
on the rq in such an order that if a job on one entity arrived
earlier than a job on another entity, the first job will start
executing earlier, regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
  
  	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override the default entity scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify scheduling policy for entities on a runqueue (-1 = auto (default), 0 = Round Robin, 1 = FIFO)");


Well we don't really have an autodetect at the moment, so I would drop that.
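
For reference, a usage sketch of the new parameter (illustrative only; the
gpu_sched module name and the sysfs path are assumptions, not something this
patch adds):

    # pick the FIFO policy at boot or module load
    gpu_sched.sched_policy=1
    # read the active policy back
    cat /sys/module/gpu_sched/parameters/sched_policy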


+module_param_named(sched_policy, drm_sched_policy, int, 0444);
+
+
  #define to_drm_sched_job(sched_job)   \
container_of((sched_job), struct drm_sched_job, queue_node)
  
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

  }
  
  /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run
   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
struct drm_sched_entity *entity;
  
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
  }
  
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(&rq->lock);
+
+   list_for_each_entry(tmp, &rq->entities, list) {
+
+   if (drm_sched_entity_is_ready(tmp)) {
+   sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
+
+   if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+   oldest_ts = sched_job->submit_ts;
+   entity = tmp;
+   }
+   }
+   }
+
+   if (entity) {
+   rq->current_entity = entity;
+   reinit_completion(&entity->entity_idle);
+   }


That should probably be a separate function or at least outside of this 
here.
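
A rough sketch of that split (helper name purely illustrative, not taken from
the patch):

   /* illustrative only: commit the chosen entity while rq->lock is held */
   static void drm_sched_rq_set_current(struct drm_sched_rq *rq,
                                        struct drm_sched_entity *entity)
   {
           rq->current_entity = entity;
           reinit_completion(&entity->entity_idle);
   }

Both the round robin and FIFO paths could then call this right before
dropping rq->lock.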


Apart from that totally straight forward implementation. Any idea how 
much extra overhead that is?


Regards,
Christian.


+
+   spin_unlock(&rq->lock);
+   return entity;
+}
+
  /**
   * drm_sched_job_done - complete a job
   * @s_job: pointer to the job which is done
@@ -804,7 +858,10 @@ drm_sched_select_entity(struct drm

Re: [PATCH] drm/amdkfd: Allocate doorbells only when needed

2022-08-23 Thread Russell, Kent
No one else seems to have any thoughts on it, so Reviewed-by: Kent Russell 


 Kent


From: amd-gfx  on behalf of Russell, 
Kent 
Sent: Monday, August 22, 2022 10:10 AM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org 

Subject: RE: [PATCH] drm/amdkfd: Allocate doorbells only when needed

[AMD Official Use Only - General]

I can throw an Acked-by: Kent Russell  since we don't 
have an RB yet.

 Kent

> -Original Message-
> From: amd-gfx  On Behalf Of Felix
> Kuehling
> Sent: Wednesday, August 3, 2022 2:56 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdkfd: Allocate doorbells only when needed
>
> Only allocate doorbells when the first queue is created on a GPU or the
> doorbells need to be mapped into CPU or GPU virtual address space. This
> avoids allocating doorbells unnecessarily and can allow more processes
> to use KFD on multi-GPU systems.
>
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 13 +
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c |  9 +
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  5 -
>  3 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 2b3d8bc8f0aa..907f4711abce 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -327,6 +327,12 @@ static int kfd_ioctl_create_queue(struct file *filep,
> struct kfd_process *p,
>goto err_bind_process;
>}
>
> + if (!pdd->doorbell_index &&
> + kfd_alloc_process_doorbells(dev, &pdd->doorbell_index) < 0) {
> + err = -ENOMEM;
> + goto err_alloc_doorbells;
> + }
> +
>/* Starting with GFX11, wptr BOs must be mapped to GART for MES to
> determine work
> * on unmapped queues for usermode queue oversubscription (no
> aggregated doorbell)
> */
> @@ -404,6 +410,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
> struct
> kfd_process *p,
>if (wptr_bo)
>amdgpu_amdkfd_free_gtt_mem(dev->adev, wptr_bo);
>  err_wptr_map_gart:
> +err_alloc_doorbells:
>  err_bind_process:
>  err_pdd:
>mutex_unlock(&p->mutex);
> @@ -1092,6 +1099,10 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file
> *filep,
>goto err_unlock;
>}
>offset = kfd_get_process_doorbells(pdd);
> + if (!offset) {
> + err = -ENOMEM;
> + goto err_unlock;
> + }
>} else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) {
>if (args->size != PAGE_SIZE) {
>err = -EINVAL;
> @@ -2173,6 +2184,8 @@ static int criu_restore_memory_of_gpu(struct
> kfd_process_device *pdd,
>return -EINVAL;
>
>offset = kfd_get_process_doorbells(pdd);
> + if (!offset)
> + return -ENOMEM;
>} else if (bo_bucket->alloc_flags &
> KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) {
>/* MMIO BOs need remapped bus address */
>if (bo_bucket->size != PAGE_SIZE) {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index cb3d2ccc5100..b33798f89ef0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -157,6 +157,8 @@ int kfd_doorbell_mmap(struct kfd_dev *dev, struct
> kfd_process *process,
>
>/* Calculate physical address of doorbell */
>address = kfd_get_process_doorbells(pdd);
> + if (!address)
> + return -ENOMEM;
>vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND |
> VM_NORESERVE |
>VM_DONTDUMP | VM_PFNMAP;
>
> @@ -275,6 +277,13 @@ uint64_t kfd_get_number_elems(struct kfd_dev *kfd)
>
>  phys_addr_t kfd_get_process_doorbells(struct kfd_process_device *pdd)
>  {
> + if (!pdd->doorbell_index) {
> + int r = kfd_alloc_process_doorbells(pdd->dev,
> + &pdd->doorbell_index);
> + if (r)
> + return 0;
> + }
> +
>return pdd->dev->doorbell_base +
>pdd->doorbell_index * kfd_doorbell_process_slice(pdd->dev);
>  }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 6c83a519b3a1..951b63677248 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -1499,11 +1499,6 @@ struct kfd_process_device
> *kfd_create_process_device_data(struct kfd_dev *dev,
>if (!pdd)
>return NULL;
>
> - if (kfd_alloc_process_doorbells(dev, &pdd->doorbell_index) < 0) {
> - pr_err("Failed to alloc doorbell for pdd\n");
> - goto err_free_pdd;
> - }
> -
>   

amd-gfx@lists.freedesktop.org

2022-08-23 Thread Maíra Canal

Hi Bernard

On 8/23/22 04:14, Bernard Zhao wrote:

This patch trf to fis cocci warning:


I believe that there are a couple of typos in this description. Maybe 
you could fix it with s/trf/try/ and s/fis/fix/.



drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
2349:8-34: duplicated argument to && or ||
drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
3680:8-55: duplicated argument to && or ||

Signed-off-by: Bernard Zhao 


Also, it would be nice to have a changelog between the versions.

Other than those small nits,

Reviewed-by: Maíra Canal 

Best Regards,
- Maíra Canal


---
  .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c| 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index cb2025771646..f99c1696a1f6 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
  
  			if (mode_lib->vba.DSCEnable[k] && mode_lib->vba.ForcedOutputLinkBPP[k] != 0)

mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true;
-   if ((mode_lib->vba.DSCEnable[k] || 
mode_lib->vba.DSCEnable[k])
-   && mode_lib->vba.OutputFormat[k] == 
dm_n422
+   if (mode_lib->vba.DSCEnable[k] && 
mode_lib->vba.OutputFormat[k] == dm_n422
&& !mode_lib->vba.DSC422NativeSupport)
mode_lib->vba.DSC422NativeNotSupported = true;
  
@@ -3678,7 +3677,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l

if (mode_lib->vba.SourcePixelFormat[k] != dm_444_64
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_32
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_16
-   && mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_16
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_8
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_rgbe) {
if (mode_lib->vba.ViewportWidthChroma[k] > 
mode_lib->vba.SurfaceWidthC[k]


Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.

2022-08-23 Thread Christian König

Hi Bas,

I've just pushed an updated drm-exec branch to fdo which should now 
include the bo_list bug fix.


Can you please test that with Forza? I'm still fighting getting a new 
kernel on my Steamdeck.


Thanks,
Christian.

On 22.08.22 at 01:08, Bas Nieuwenhuizen wrote:

On Thu, Aug 18, 2022 at 3:20 PM Christian König
 wrote:

Hi Bas,

I've just pushed the branch drm-exec to my fdo repository:
https://gitlab.freedesktop.org/ckoenig/linux-drm.git

This branch contains all the gang submit patches as well as the latest
drm-exec stuff. VCN3/4 video decoding has some issues on it, but that
probably shouldn't bother your work.

Hi Christian,

The drm-exec branch doesn't seem to be capable of running Forza
Horizon 5. First bad commit seems to be

commit 8bb3e919ce0109512f6631422f3fe52169836261
Author: Christian König 
Date:   Thu Jul 14 10:23:38 2022 +0200

drm/amdgpu: revert "partial revert "remove ctx->lock" v2"

This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.

We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.

Signed-off-by: Christian König 


and

https://patchwork.freedesktop.org/patch/497679/ ("drm/amdgpu: Fix
use-after-free on amdgpu_bo_list mutex")

seems to fix things at that patch, but I'm not seeing the obvious
rebase over "drm/amdgpu: cleanup and reorder amdgpu_cs.c" yet (and/or
whether further issues were introduced).


Error logs:

[  124.821691] [ cut here ]
[  124.821696] WARNING: CPU: 3 PID: 2485 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667
amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  124.821955] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass
rapl rtw88_core pcspkr joydev mac80211 btusb s
nd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel
snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821
snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core
snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic
snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep
ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf
spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid
acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2
mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec
gpu_sched i2c_algo_bit
[  124.822016]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul
libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042
ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd
wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video
i2c_hid_acpi 8250_dw i2c_hid btrfs
blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
[  124.822051] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Not tainted
5.18.0-1-neptune-00172-g067e00b76d9c #23
[  124.822054] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  124.822055] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  124.822262] Code: e1 ef c0 48 c7 c7 10 4a 0c c1 e8 5f f7 3e dd eb
9c 48 c7 c6 85 0a f6 c0 bf 02 00 00 00 e8 8c 74 e2 ff 41 be f2 ff ff
ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff ff ff 48 83 b8 a0 00
00 00
[  124.822264] RSP: 0018:a257827afb98 EFLAGS: 00010282
[  124.822267] RAX: 8b82240e6000 RBX: 8b8200a31100 RCX: 0001
[  124.822268] RDX: 0dc0 RSI: 8b82240e6000 RDI: 8b82a4c7e800
[  124.822269] RBP: 8b82ee809320 R08: 1000 R09: 8b82240e6000
[  124.822270] R10: 0006 R11:  R12: 8b82ee6dc9c0
[  124.822272] R13: 3188 R14: 0001 R15: 8b823face440
[  124.822273] FS:  2773f640() GS:8b852fec()
knlGS:1aba
[  124.822275] CS:  0010 DS:  ES: 

Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus

2022-08-23 Thread Christian König

On 22.08.22 at 17:52, Mukul Joshi wrote:

When translate_further is enabled, page table depth needs to
be updated. This was missing on Arcturus MMHUB init. This was
causing address translations to fail for SDMA user-mode queues.

Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Signed-off-by: Mukul Joshi 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
index 6e0145b2b408..445cb06b9d26 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
@@ -295,9 +295,17 @@ static void mmhub_v9_4_disable_identity_aperture(struct 
amdgpu_device *adev,
  static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int 
hubid)
  {
struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
+   unsigned int num_level, block_size;
uint32_t tmp;
int i;
  
+	num_level = adev->vm_manager.num_level;

+   block_size = adev->vm_manager.block_size;
+   if (adev->gmc.translate_further)
+   num_level -= 1;
+   else
+   block_size -= 9;
+
for (i = 0; i <= 14; i++) {
tmp = RREG32_SOC15_OFFSET(MMHUB, 0, mmVML2VC0_VM_CONTEXT1_CNTL,
hubid * MMHUB_INSTANCE_REGISTER_OFFSET + i);
@@ -305,7 +313,7 @@ static void mmhub_v9_4_setup_vmid_config(struct 
amdgpu_device *adev, int hubid)
ENABLE_CONTEXT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
PAGE_TABLE_DEPTH,
-   adev->vm_manager.num_level);
+   num_level);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
@@ -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct 
amdgpu_device *adev, int hubid)
EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
PAGE_TABLE_BLOCK_SIZE,
-   adev->vm_manager.block_size - 9);
+   block_size);
/* Send no-retry XNACK on fault to suppress VM fault storm. */
tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,




[PATCH] drm/amd: remove possible condition with no effect (if == else)

2022-08-23 Thread Bernard Zhao
This patch fixes the following cocci warning:
drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c:1816:6-8:
WARNING: possible condition with no effect (if == else).

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
index 85f32206a766..dccc9794e6a2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
@@ -1813,8 +1813,6 @@ static bool dcn314_resource_construct(
 
if (dc->ctx->dce_environment == DCE_ENV_PRODUCTION_DRV)
dc->debug = debug_defaults_drv;
-   else if (dc->ctx->dce_environment == DCE_ENV_FPGA_MAXIMUS)
-   dc->debug = debug_defaults_diags;
else
dc->debug = debug_defaults_diags;
// Init the vm_helper
-- 
2.33.1



amd-gfx@lists.freedesktop.org

2022-08-23 Thread Bernard Zhao
This patch trf to fis cocci warning:
drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
2349:8-34: duplicated argument to && or ||
drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
3680:8-55: duplicated argument to && or ||

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index cb2025771646..fa26834daf56 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
 
if (mode_lib->vba.DSCEnable[k] && 
mode_lib->vba.ForcedOutputLinkBPP[k] != 0)
mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true;
-   if ((mode_lib->vba.DSCEnable[k] || 
mode_lib->vba.DSCEnable[k])
-   && mode_lib->vba.OutputFormat[k] == 
dm_n422
+   if (mode_lib->vba.DSCEnable[k] && 
mode_lib->vba.OutputFormat[k] == dm_n422
&& !mode_lib->vba.DSC422NativeSupport)
mode_lib->vba.DSC422NativeNotSupported = true;
 
-- 
2.33.1



[PATCH] drm/amd: fix potential memory leak

2022-08-23 Thread Bernard Zhao
This patch fixes a potential memory leak (clk_src) when the function
falls through to the final return NULL.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
index 85f32206a766..c7bb76a2a8c2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
@@ -1643,6 +1643,7 @@ static struct clock_source *dcn31_clock_source_create(
}
 
BREAK_TO_DEBUGGER();
+   kfree(clk_src);
return NULL;
 }
 
-- 
2.33.1



[PATCH] drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly

2022-08-23 Thread jinsdb
From: Qu Huang 

The mmVM_L2_CNTL3 register is not assigned an initial value

Signed-off-by: Qu Huang 
---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 1da2ec692057e..b8a987a032a8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -176,6 +176,7 @@ static void mmhub_v1_0_init_cache_regs(struct amdgpu_device 
*adev)
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL2, tmp);

+   tmp = mmVM_L2_CNTL3_DEFAULT;
if (adev->gmc.translate_further) {
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 12);
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3,
--
2.31.1



amd-gfx@lists.freedesktop.org

2022-08-23 Thread Bernard Zhao
This patch trf to fis cocci warning:
drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
2349:8-34: duplicated argument to && or ||
drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c:
3680:8-55: duplicated argument to && or ||

Signed-off-by: Bernard Zhao 
---
 .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c| 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index cb2025771646..f99c1696a1f6 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -2346,8 +2346,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
 
if (mode_lib->vba.DSCEnable[k] && 
mode_lib->vba.ForcedOutputLinkBPP[k] != 0)
mode_lib->vba.DSCOnlyIfNecessaryWithBPP = true;
-   if ((mode_lib->vba.DSCEnable[k] || 
mode_lib->vba.DSCEnable[k])
-   && mode_lib->vba.OutputFormat[k] == 
dm_n422
+   if (mode_lib->vba.DSCEnable[k] && 
mode_lib->vba.OutputFormat[k] == dm_n422
&& !mode_lib->vba.DSC422NativeSupport)
mode_lib->vba.DSC422NativeNotSupported = true;
 
@@ -3678,7 +3677,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
if (mode_lib->vba.SourcePixelFormat[k] != dm_444_64
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_32
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_16
-   && mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_16
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_444_8
&& mode_lib->vba.SourcePixelFormat[k] 
!= dm_rgbe) {
if (mode_lib->vba.ViewportWidthChroma[k] > 
mode_lib->vba.SurfaceWidthC[k]
-- 
2.33.1



[PATCH] drm/amd: fix potential memory leak

2022-08-23 Thread Bernard Zhao
This patch fixes a potential memory leak (clk_src) when the function
falls through to the final return NULL.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
index 85f32206a766..76f263846c6b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
@@ -1715,6 +1715,7 @@ static struct clock_source *dcn30_clock_source_create(
}
 
BREAK_TO_DEBUGGER();
+   kfree(clk_src);
return NULL;
 }
 
-- 
2.33.1



[PATCH] drm/amd: remove possible condition with no effect (if == else)

2022-08-23 Thread Bernard Zhao
This patch fixes the following cocci warning:
drivers/gpu/drm/amd/display/dc/core/dc.c:3335:2-4: WARNING:
possible condition with no effect (if == else).

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index aeecca68dea7..2d4c44083d79 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3332,13 +3332,8 @@ static void commit_planes_for_stream(struct dc *dc,
/* Since phantom pipe programming is moved to 
post_unlock_program_front_end,
 * move the SubVP lock to after the phantom pipes have been 
setup
 */
-   if (should_lock_all_pipes && 
dc->hwss.interdependent_update_lock) {
-   if (dc->hwss.subvp_pipe_control_lock)
-   dc->hwss.subvp_pipe_control_lock(dc, context, 
false, should_lock_all_pipes, NULL, subvp_prev_use);
-   } else {
-   if (dc->hwss.subvp_pipe_control_lock)
-   dc->hwss.subvp_pipe_control_lock(dc, context, 
false, should_lock_all_pipes, NULL, subvp_prev_use);
-   }
+   if (dc->hwss.subvp_pipe_control_lock)
+   dc->hwss.subvp_pipe_control_lock(dc, context, false, 
should_lock_all_pipes, NULL, subvp_prev_use);
return;
}
 
-- 
2.33.1