On 10/16/20 3:56 PM, Alex Deucher wrote:
On Wed, Oct 14, 2020 at 9:53 AM Nirmoy Das <nirmoy....@amd.com> wrote:
Because of a firmware bug, Raven ASICs can't handle jobs
scheduled to multiple compute queues. So enable only one
compute queue until we have a firmware fix.

Signed-off-by: Nirmoy Das <nirmoy....@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |  4 ++++
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 11 ++++++++++-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8c9bacfdbc30..ca2ac985b300 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -195,6 +195,10 @@ static bool amdgpu_gfx_is_multipipe_capable(struct amdgpu_device *adev)
  bool amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
                                                int queue)
  {
+       /* We only enable one compute queue for Raven */
+       if (adev->asic_type == CHIP_RAVEN)
+               return false;
+
         /* Policy: make queue 0 of each pipe as high priority compute queue */
         return (queue == 0);

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 0d8e203b10ef..f3fc9ad8bc20 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4633,7 +4633,16 @@ static int gfx_v9_0_early_init(void *handle)
                 adev->gfx.num_gfx_rings = 0;
         else
                 adev->gfx.num_gfx_rings = GFX9_NUM_GFX_RINGS;
-       adev->gfx.num_compute_rings = amdgpu_num_kcq;
+
+       /* raven firmware currently can not load balance jobs
+        * among multiple compute queues. Enable only one
+        * compute queue till we have a firmware fix.
+        */
+       if (adev->asic_type == CHIP_RAVEN)
+               adev->gfx.num_compute_rings = 1;
+       else
+               adev->gfx.num_compute_rings = amdgpu_num_kcq;
+
I would suggest something like this instead so we can override easily
for testing:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index abddcd9dab3d..a2954b41e59d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1376,6 +1376,12 @@ static int amdgpu_device_check_arguments(struct amdgpu_device *adev)

         if (amdgpu_num_kcq == -1) {
                 amdgpu_num_kcq = 8;
+               /* raven firmware currently can not load balance jobs
+                * among multiple compute queues. Enable only one
+                * compute queue till we have a firmware fix.
+                */
+               if (adev->asic_type == CHIP_RAVEN)
+                       amdgpu_num_kcq = 1;
         } else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
                 amdgpu_num_kcq = 8;
                 dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");

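For anyone who wants to check the resulting behaviour without building a kernel, the selection logic in the hunk above can be modelled in plain C. This is only a sketch under assumptions: resolve_num_kcq(), KCQ_MAX and KCQ_AUTO are hypothetical names that do not exist in amdgpu, and the is_raven flag stands in for the adev->asic_type == CHIP_RAVEN check.

```c
#include <stdbool.h>

#define KCQ_MAX   8     /* upper bound used by the hunk above */
#define KCQ_AUTO  (-1)  /* "not set by the user" sentinel */

/*
 * Hypothetical userspace model of the suggested check, not driver code.
 * Returns the effective number of kernel compute queues and sets *warn
 * when the requested value was out of range.
 */
static int resolve_num_kcq(int requested, bool is_raven, bool *warn)
{
	*warn = false;

	if (requested == KCQ_AUTO) {
		/* Default path: Raven gets a single queue, everything else 8. */
		return is_raven ? 1 : KCQ_MAX;
	}

	if (requested > KCQ_MAX || requested < 0) {
		/* Invalid user value: fall back to 8, as in the hunk, even on Raven. */
		*warn = true;
		return KCQ_MAX;
	}

	/* An explicit valid value always wins, which keeps the override for testing. */
	return requested;
}
```

With this model, resolve_num_kcq(-1, true, &warn) gives 1, while an explicit 4 on Raven stays 4. That is the point of moving the check next to the default instead of hard-coding it in gfx_v9_0_early_init().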

Thanks, this looks much better.

I will update.


Nirmoy


Alex

         gfx_v9_0_set_kiq_pm4_funcs(adev);
         gfx_v9_0_set_ring_funcs(adev);
         gfx_v9_0_set_irq_funcs(adev);
--
2.28.0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx