-----Original Message-----
From: Jack Zhang <jack.zha...@amd.com> 
Sent: Monday, May 18, 2020 12:45 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Jack (Jian) <jack.zha...@amd.com>
Subject: [PATCH] drm/amdgpu fix incorrect sysfs remove behavior for xgmi

Under an xgmi setup, some sysfs nodes fail to be created the second time the
kmd driver is loaded. This is because the sysfs nodes were not removed
properly during the previous unload.

Changes in this patch:
1. Remove the sysfs file for dev_attr_xgmi_error.
2. Remove the sysfs link on adev->dev->kobj by its target name.
   It only needs to be removed once per xgmi setup.
3. Remove the sysfs link on hive->kobj by its target name.

In amdgpu_xgmi_remove_device:
1. amdgpu_xgmi_sysfs_rem_dev_info needs to be run per device.
2. amdgpu_xgmi_sysfs_destroy needs to be run when the last device of the
   hive is removed.

Signed-off-by: Jack Zhang <jack.zha...@amd.com>
---
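The change from a post-decrement to a pre-decrement check in
amdgpu_xgmi_remove_device can be illustrated with a minimal standalone
sketch. The struct and helper names below (hive_sketch, is_last_old,
is_last_new, count_last_hits) are hypothetical and exist only for this
illustration; the sketch ignores locking and everything else the driver does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hive's device counter; not the driver's struct. */
struct hive_sketch {
    int number_devices;
};

/* Old check: post-decrement tests the value *before* decrementing,
 * so it is true only if the counter was already 0 on entry. */
static bool is_last_old(struct hive_sketch *h)
{
    return !(h->number_devices--);
}

/* New check: pre-decrement tests the value *after* decrementing,
 * so it is true when the device being removed was the last one. */
static bool is_last_new(struct hive_sketch *h)
{
    return !(--h->number_devices);
}

/* Remove n devices one by one and count how many removals
 * take the "last device" branch. */
static int count_last_hits(int n, bool use_new)
{
    struct hive_sketch h = { .number_devices = n };
    int hits = 0;

    for (int i = 0; i < n; i++)
        hits += use_new ? is_last_new(&h) : is_last_old(&h);
    return hits;
}
```

With two devices, the old check never takes the "last device" branch during
the two removals, while the new check takes it exactly once, on the second
removal.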
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index e9e59bc..bfe2468 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -325,9 +325,17 @@ static int amdgpu_xgmi_sysfs_add_dev_info(struct amdgpu_device *adev,
 static void amdgpu_xgmi_sysfs_rem_dev_info(struct amdgpu_device *adev,
                                          struct amdgpu_hive_info *hive)
 {
+       char node[10] = { 0 };
        device_remove_file(adev->dev, &dev_attr_xgmi_device_id);
-       sysfs_remove_link(&adev->dev->kobj, adev->ddev->unique);
-       sysfs_remove_link(hive->kobj, adev->ddev->unique);
+       device_remove_file(adev->dev, &dev_attr_xgmi_error);
+
+       if (adev != hive->adev) {
+               sysfs_remove_link(&adev->dev->kobj,"xgmi_hive_info");
+       }
+
+       sprintf(node, "node%d", hive->number_devices);
+       sysfs_remove_link(hive->kobj, node);
+
 }
 
 
@@ -583,14 +591,14 @@ int amdgpu_xgmi_remove_device(struct amdgpu_device *adev)
        if (!hive)
                return -EINVAL;
 
-       if (!(hive->number_devices--)) {
+       task_barrier_rem_task(&hive->tb);
+       amdgpu_xgmi_sysfs_rem_dev_info(adev, hive);
+       mutex_unlock(&hive->hive_lock);
+
+       if(!(--hive->number_devices)){
                amdgpu_xgmi_sysfs_destroy(adev, hive);
                mutex_destroy(&hive->hive_lock);
                mutex_destroy(&hive->reset_lock);
-       } else {
-               task_barrier_rem_task(&hive->tb);
-               amdgpu_xgmi_sysfs_rem_dev_info(adev, hive);
-               mutex_unlock(&hive->hive_lock);
        }
 
        return psp_xgmi_terminate(&adev->psp);
--
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx