Re: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix

2021-09-27 Thread Alex Deucher
On Sun, Sep 26, 2021 at 9:14 AM Prike Liang  wrote:
>
> In the s2idle stress test, sdma resume fails occasionally; in the
> failed case the GPU is in the gfxoff state. This issue may be caused
> by FSDL mishandling doorbell S/R, so temporarily fix the issue
> by forcing an exit from gfxoff for sdma resume.
>
> Signed-off-by: Prike Liang 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> index e4a96e7e386d..81906955ef52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> @@ -868,6 +868,12 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
>  			msleep(1000);
>  	}
>
> +	/* TODO: check whether can submit a doorbell request to raise
> +	 * a doorbell fence to exit gfxoff.
> +	 */
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, false);
> +
>  	sdma_v5_2_soft_reset(adev);
>  	/* unhalt the MEs */
>  	sdma_v5_2_enable(adev, true);
> @@ -876,6 +882,8 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
>
>  	/* start the gfx rings and rlc compute queues */
>  	r = sdma_v5_2_gfx_resume(adev);
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, true);
>  	if (r)
>  		return r;
>  	r = sdma_v5_2_rlc_resume(adev);
> --
> 2.17.1
>


[PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix

2021-09-26 Thread Prike Liang
In the s2idle stress test, sdma resume fails occasionally; in the
failed case the GPU is in the gfxoff state. This issue may be caused
by FSDL mishandling doorbell S/R, so temporarily fix the issue
by forcing an exit from gfxoff for sdma resume.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index e4a96e7e386d..81906955ef52 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -868,6 +868,12 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
 			msleep(1000);
 	}
 
+	/* TODO: check whether can submit a doorbell request to raise
+	 * a doorbell fence to exit gfxoff.
+	 */
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, false);
+
 	sdma_v5_2_soft_reset(adev);
 	/* unhalt the MEs */
 	sdma_v5_2_enable(adev, true);
@@ -876,6 +882,8 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
 
 	/* start the gfx rings and rlc compute queues */
 	r = sdma_v5_2_gfx_resume(adev);
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, true);
 	if (r)
 		return r;
 	r = sdma_v5_2_rlc_resume(adev);
-- 
2.17.1
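
For readers following the thread, the bracket that this version of the
patch places around the SDMA resume can be modelled in user space. The
sketch below is only an illustration under stated assumptions: struct
mock_device and the mock_* functions are hypothetical stand-ins, not
kernel code, and the request-count behaviour is an assumed
simplification of what amdgpu_gfx_off_ctrl() does. The mock also makes
the resume fail every time gfxoff is active, whereas the thread reports
only occasional failures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device; only what the sketch needs. */
struct mock_device {
	bool in_s0ix;          /* true while resuming from s0ix (s2idle) */
	int gfx_off_req_count; /* outstanding "stay out of gfxoff" requests */
	bool gfx_off_state;    /* true while the GPU may sit in gfxoff */
};

/*
 * Assumed semantics: enable == false adds a request to stay out of
 * gfxoff, enable == true drops one; gfxoff is allowed again only when
 * no requests remain.
 */
static void mock_gfx_off_ctrl(struct mock_device *dev, bool enable)
{
	if (!enable) {
		dev->gfx_off_req_count++;
		dev->gfx_off_state = false;
	} else {
		if (dev->gfx_off_req_count > 0)
			dev->gfx_off_req_count--;
		if (dev->gfx_off_req_count == 0)
			dev->gfx_off_state = true;
	}
}

/* Mock ring resume: deterministically fails when gfxoff is active. */
static int mock_sdma_gfx_resume(struct mock_device *dev)
{
	return dev->gfx_off_state ? -1 : 0;
}

/* Mirrors the placement in the patch: disable, resume, re-enable, then check r. */
static int mock_sdma_start(struct mock_device *dev)
{
	int r;

	if (dev->in_s0ix)
		mock_gfx_off_ctrl(dev, false);

	r = mock_sdma_gfx_resume(dev);

	/* Re-allow gfxoff before the error check so both paths stay balanced. */
	if (dev->in_s0ix)
		mock_gfx_off_ctrl(dev, true);

	return r;
}
```

With in_s0ix set, mock_sdma_start() returns 0 and leaves the request
count where it started, because gfxoff is re-allowed before the error
check, matching the placement in the patch. With in_s0ix clear, the
mock resume sees gfxoff active and returns -1.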



RE: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix

2021-09-23 Thread Liang, Prike
[Public]

Hold on, I still need to further check the gfxoff control residence and will
update the patch.

Thanks,
Prike
> -----Original Message-----
> From: Liang, Prike 
> Sent: Friday, September 24, 2021 1:18 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander; Huang, Ray; Liang, Prike
> Subject: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb
> s0ix
>
> In the s2idle stress test, sdma resume fails occasionally; in the failed case
> the GPU is in the gfxoff state. This issue may be caused by FSDL mishandling
> doorbell S/R, so temporarily fix the issue by forcing an exit from gfxoff for
> sdma resume.
>
> Signed-off-by: Prike Liang 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 24b0195..af759ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7608,6 +7608,14 @@ static int gfx_v10_0_suspend(void *handle)
>
>  static int gfx_v10_0_resume(void *handle)
>  {
> +	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +	/* TODO: check whether can submit a doorbell request to raise
> +	 * a doorbell fence to exit gfxoff.
> +	 */
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, false);
> +
>  	return gfx_v10_0_hw_init(handle);
>  }
>
> @@ -7819,6 +7827,9 @@ static int gfx_v10_0_late_init(void *handle)
>  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>  	int r;
>
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, true);
> +
>  	r = amdgpu_irq_get(adev, &adev->gfx.priv_reg_irq, 0);
>  	if (r)
>  		return r;
> --
> 2.7.4



[PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix

2021-09-23 Thread Prike Liang
In the s2idle stress test, sdma resume fails occasionally; in the
failed case the GPU is in the gfxoff state. This issue may be caused
by FSDL mishandling doorbell S/R, so temporarily fix the issue
by forcing an exit from gfxoff for sdma resume.

Signed-off-by: Prike Liang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 24b0195..af759ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7608,6 +7608,14 @@ static int gfx_v10_0_suspend(void *handle)
 
 static int gfx_v10_0_resume(void *handle)
 {
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	/* TODO: check whether can submit a doorbell request to raise
+	 * a doorbell fence to exit gfxoff.
+	 */
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, false);
+
 	return gfx_v10_0_hw_init(handle);
 }
 
@@ -7819,6 +7827,9 @@ static int gfx_v10_0_late_init(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	int r;
 
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, true);
+
 	r = amdgpu_irq_get(adev, &adev->gfx.priv_reg_irq, 0);
 	if (r)
 		return r;
-- 
2.7.4