On 01.06.23 at 13:14, Chong Li wrote:
enforce process isolation between graphics and compute via using the same 
reserved vmid.

Signed-off-by: Chong Li <chong...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  9 +++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 13 ++++++++++++-
  3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ce196badf42d..48c5c547d85a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -215,6 +215,7 @@ extern int amdgpu_force_asic_type;
  extern int amdgpu_smartshift_bias;
  extern int amdgpu_use_xgmi_p2p;
  extern int amdgpu_mtype_local;
+extern int enforce_isolation;
  #ifdef CONFIG_HSA_AMD
  extern int sched_policy;
  extern bool debug_evictions;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3d91e123f9bd..2e0ebd92b4cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -973,6 +973,15 @@ MODULE_PARM_DESC(
                                                4 = AMDGPU_CPX_PARTITION_MODE)");
  module_param_named(user_partt_mode, amdgpu_user_partt_mode, uint, 0444);
+
+/**
+ * DOC: enforce_isolation (int)
+ * enforce process isolation between graphics and compute via using the same reserved vmid.
+ */
+int enforce_isolation = 0;

Please move that to the other declarations above.

+module_param(enforce_isolation, int, 0444);

IIRC you can also use bool here.

+MODULE_PARM_DESC(enforce_isolation, "enforce process isolation between graphics and compute . 1 = On, 0 = Off");

This way you can drop the "1 = On, 0 = Off" part from the description, because "enforce_isolation=on" should then be accepted on the kernel command line as well.
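
To illustrate why a bool parameter makes the "1 = On, 0 = Off" text redundant, here is a minimal userspace sketch of the kind of parsing a bool parameter goes through. It is only an approximation modeled loosely on the kernel's kstrtobool() and covers a subset of the spellings; parse_bool_param() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace approximation (assumption: loosely mirrors kstrtobool()).
 * Accepts "1"/"y"/"on" style values as true and "0"/"n"/"off" style
 * values as false. Only the leading characters are inspected.
 * Returns 0 on success and stores the result in *res, -1 otherwise.
 */
static int parse_bool_param(const char *s, bool *res)
{
	if (s == NULL || res == NULL)
		return -1;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		/* "on" and "off" are told apart by the second character */
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		default:
			break;
		}
		break;
	default:
		break;
	}

	return -1;
}
```

With parsing like this, "enforce_isolation=on", "enforce_isolation=1" and "enforce_isolation=y" all end up as the same value, so the description does not need to enumerate the numeric forms.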

+
  /* These devices are not supported by amdgpu.
   * They are supported by the mach64, r128, radeon drivers
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index c991ca0b7a1c..33efa17d08ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -409,7 +409,7 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
        if (r || !idle)
                goto error;
-       if (vm->reserved_vmid[vmhub]) {
+       if (vm->reserved_vmid[vmhub] || (enforce_isolation && (vmhub == AMDGPU_GFXHUB(0)))) {
                r = amdgpu_vmid_grab_reserved(vm, ring, job, &id, fence);
                if (r || !id)
                        goto error;
@@ -578,6 +578,17 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
                        list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
                }
        }
+
+       if (enforce_isolation) {
+               struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[AMDGPU_GFXHUB(0)];
+               struct amdgpu_vmid *id = NULL;

Empty line between declaration and code please.

+               ++id_mgr->reserved_use_count;
+               id = list_first_entry(&id_mgr->ids_lru, struct amdgpu_vmid,
+                                       list);
+               /* Remove from normal round robin handling */
+               list_del_init(&id->list);
+               id_mgr->reserved = id;

It would be good if we didn't duplicate this hunk here and in amdgpu_vmid_alloc_reserved().

We should probably clean up amdgpu_vmid_alloc_reserved() a bit and move the check for vm->reserved_vmid into amdgpu_vm_ioctl().

This way we could call amdgpu_vmid_alloc_reserved() here as well.
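
A rough userspace sketch of what I mean, not the actual driver code: the types, the hand-rolled list and the names vmid_reserve() and vm_reserve_vmid() are all hypothetical stand-ins. The point is only that the per-VM check lives in the caller, so the same helper can serve both the ioctl path and the init path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures, for illustration only. */
struct vmid {
	struct vmid *prev, *next;
};

struct vmid_mgr {
	struct vmid lru;                 /* sentinel head of the LRU list */
	struct vmid *reserved;           /* the single reserved ID, if any */
	unsigned int reserved_use_count; /* number of users of the reserved ID */
};

struct vm {
	bool reserved_vmid;              /* this VM already holds a reference */
};

/* Put all n IDs on the LRU list in order, with nothing reserved yet. */
static void vmid_mgr_init(struct vmid_mgr *mgr, struct vmid *ids, size_t n)
{
	mgr->lru.prev = mgr->lru.next = &mgr->lru;
	mgr->reserved = NULL;
	mgr->reserved_use_count = 0;

	for (size_t i = 0; i < n; i++) {
		struct vmid *id = &ids[i];

		id->prev = mgr->lru.prev;
		id->next = &mgr->lru;
		mgr->lru.prev->next = id;
		mgr->lru.prev = id;
	}
}

/*
 * Shared helper: take a reference on the reserved ID. The first caller
 * pulls the first LRU entry out of the normal round robin handling.
 * Returns 0 on success, -1 if no ID is available.
 */
static int vmid_reserve(struct vmid_mgr *mgr)
{
	if (!mgr->reserved) {
		struct vmid *id = mgr->lru.next;

		if (id == &mgr->lru)
			return -1;

		/* Unlink and self-link, like list_del_init() */
		id->prev->next = id->next;
		id->next->prev = id->prev;
		id->prev = id->next = id;
		mgr->reserved = id;
	}

	++mgr->reserved_use_count;
	return 0;
}

/* Caller-side check, standing in for the check moved into the ioctl. */
static int vm_reserve_vmid(struct vm *vm, struct vmid_mgr *mgr)
{
	int r;

	if (vm->reserved_vmid)
		return 0;

	r = vmid_reserve(mgr);
	if (!r)
		vm->reserved_vmid = true;
	return r;
}
```

In this sketch the init path would call vmid_reserve() directly when enforce_isolation is set, and the ioctl path would go through vm_reserve_vmid(), so the list handling exists in one place only.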

Apart from that, the patch looks good from the technical side.

Regards,
Christian.

+       }
  }
/**
