Re: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
On Sun, Sep 26, 2021 at 9:14 AM Prike Liang wrote:
>
> In the s2idle stress test, sdma resume fails occasionally; in the
> failed case the GPU is in the gfxoff state. This issue may be
> introduced by FSDL mishandling doorbell S/R, so temporarily fix it
> by forcing an exit from gfxoff for sdma resume.
>
> Signed-off-by: Prike Liang

Reviewed-by: Alex Deucher

> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> index e4a96e7e386d..81906955ef52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> @@ -868,6 +868,12 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
>  			msleep(1000);
>  	}
>
> +	/* TODO: check whether we can submit a doorbell request to raise
> +	 * a doorbell fence to exit gfxoff.
> +	 */
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, false);
> +
>  	sdma_v5_2_soft_reset(adev);
>  	/* unhalt the MEs */
>  	sdma_v5_2_enable(adev, true);
> @@ -876,6 +882,8 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
>
>  	/* start the gfx rings and rlc compute queues */
>  	r = sdma_v5_2_gfx_resume(adev);
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, true);
>  	if (r)
>  		return r;
>  	r = sdma_v5_2_rlc_resume(adev);
> --
> 2.17.1
>
[PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test, sdma resume fails occasionally; in the
failed case the GPU is in the gfxoff state. This issue may be
introduced by FSDL mishandling doorbell S/R, so temporarily fix it
by forcing an exit from gfxoff for sdma resume.

Signed-off-by: Prike Liang
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index e4a96e7e386d..81906955ef52 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -868,6 +868,12 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
 			msleep(1000);
 	}

+	/* TODO: check whether we can submit a doorbell request to raise
+	 * a doorbell fence to exit gfxoff.
+	 */
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, false);
+
 	sdma_v5_2_soft_reset(adev);
 	/* unhalt the MEs */
 	sdma_v5_2_enable(adev, true);
@@ -876,6 +882,8 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)

 	/* start the gfx rings and rlc compute queues */
 	r = sdma_v5_2_gfx_resume(adev);
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, true);
 	if (r)
 		return r;
 	r = sdma_v5_2_rlc_resume(adev);
--
2.17.1
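The pairing in the sdma_v5_2.c patch above (block gfxoff, resume the engine, re-allow gfxoff before checking the return code) can be sketched outside the kernel. This is only a toy model under stated assumptions: `struct toy_dev`, `toy_gfx_off_ctrl()`, `toy_engine_resume()` and `toy_sdma_start()` are hypothetical names invented for illustration, the simple counter is a simplification, and none of this is the driver's actual `amdgpu_gfx_off_ctrl()` implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; NOT the real struct amdgpu_device. */
struct toy_dev {
	bool in_s0ix;      /* true while resuming from s0ix (s2idle) */
	int gfxoff_block;  /* outstanding "keep gfxoff disabled" requests */
	bool fail_resume;  /* test knob: make the engine resume step fail */
};

/*
 * Toy analogue of the call shape amdgpu_gfx_off_ctrl(adev, enable):
 * enable == false blocks gfxoff, enable == true drops one block.
 */
static void toy_gfx_off_ctrl(struct toy_dev *dev, bool enable)
{
	if (enable) {
		assert(dev->gfxoff_block > 0); /* catches an unbalanced enable */
		dev->gfxoff_block--;
	} else {
		dev->gfxoff_block++;
	}
}

/* gfxoff may be entered only when nobody is blocking it. */
static bool toy_gfxoff_allowed(const struct toy_dev *dev)
{
	return dev->gfxoff_block == 0;
}

/* Stand-in for the engine resume step. */
static int toy_engine_resume(struct toy_dev *dev)
{
	/* Models the reported failure: resuming while gfxoff is still possible. */
	if (dev->in_s0ix && toy_gfxoff_allowed(dev))
		return -1;
	return dev->fail_resume ? -2 : 0;
}

/* Mirrors the ordering used in the patch for sdma_v5_2_start(). */
static int toy_sdma_start(struct toy_dev *dev)
{
	int r;

	if (dev->in_s0ix)
		toy_gfx_off_ctrl(dev, false);

	r = toy_engine_resume(dev);

	/* Re-allow gfxoff before looking at r, so the error path stays balanced. */
	if (dev->in_s0ix)
		toy_gfx_off_ctrl(dev, true);

	return r;
}
```

Calling `toy_sdma_start()` on a device with `in_s0ix` set leaves `gfxoff_block` at its starting value whether or not the resume step fails, which is the property the placement of the second `amdgpu_gfx_off_ctrl(adev, true)` ahead of `if (r)` appears intended to preserve.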
RE: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
[Public]

Hold on, I still need to further check the gfxoff control residence and will update the patch.

Thanks,
Prike

> -----Original Message-----
> From: Liang, Prike
> Sent: Friday, September 24, 2021 1:18 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander; Huang, Ray; Liang, Prike
> Subject: [PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
>
> In the s2idle stress test, sdma resume fails occasionally; in the
> failed case the GPU is in the gfxoff state. This issue may be
> introduced by FSDL mishandling doorbell S/R, so temporarily fix it
> by forcing an exit from gfxoff for sdma resume.
>
> Signed-off-by: Prike Liang
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 24b0195..af759ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7608,6 +7608,14 @@ static int gfx_v10_0_suspend(void *handle)
>
>  static int gfx_v10_0_resume(void *handle)
>  {
> +	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +	/* TODO: check whether we can submit a doorbell request to raise
> +	 * a doorbell fence to exit gfxoff.
> +	 */
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, false);
> +
>  	return gfx_v10_0_hw_init(handle);
>  }
>
> @@ -7819,6 +7827,9 @@ static int gfx_v10_0_late_init(void *handle)
>  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>  	int r;
>
> +	if (adev->in_s0ix)
> +		amdgpu_gfx_off_ctrl(adev, true);
> +
>  	r = amdgpu_irq_get(adev, &adev->gfx.priv_reg_irq, 0);
>  	if (r)
>  		return r;
> --
> 2.7.4
[PATCH] drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test, sdma resume fails occasionally; in the
failed case the GPU is in the gfxoff state. This issue may be
introduced by FSDL mishandling doorbell S/R, so temporarily fix it
by forcing an exit from gfxoff for sdma resume.

Signed-off-by: Prike Liang
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 24b0195..af759ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7608,6 +7608,14 @@ static int gfx_v10_0_suspend(void *handle)

 static int gfx_v10_0_resume(void *handle)
 {
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	/* TODO: check whether we can submit a doorbell request to raise
+	 * a doorbell fence to exit gfxoff.
+	 */
+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, false);
+
 	return gfx_v10_0_hw_init(handle);
 }

@@ -7819,6 +7827,9 @@ static int gfx_v10_0_late_init(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	int r;

+	if (adev->in_s0ix)
+		amdgpu_gfx_off_ctrl(adev, true);
+
 	r = amdgpu_irq_get(adev, &adev->gfx.priv_reg_irq, 0);
 	if (r)
 		return r;
--
2.7.4