Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending

2019-12-11 Thread Alex Deucher
On Wed, Dec 11, 2019 at 8:07 AM Christian König wrote:
>
> On 11.12.19 at 03:26, zhoucm1 wrote:
> >
> > On 2019/12/11 6:08 AM, Alex Deucher wrote:
> >> Add a safety check to runtime suspend to make sure all outstanding
> >> fences have signaled before we suspend.  Doesn't fix any known issue.
> >>
> >> We already do this via the fence driver suspend function, but we
> >> just force completion rather than bailing.  This bails on runtime
> >> suspend so we can try again later once the fences are signaled to
> >> avoid missing any outstanding work.
> >
> > The idea sounds OK to me, but if you want to drain the rings, you
> > should make sure there are no more submissions, right?
> >
> > So you should park all schedulers before waiting for all outstanding
> > fences to complete.
>
> At that point userspace should already be put on hold, so there are no
> new submissions. But it probably won't hurt to stop the scheduler anyway.
>

Any ioctl calls will wake the hw again or increase the usage count.

> But another issue I see is what happens if we locked up the hardware?
>

Regular GPU reset would kick in eventually.

Alex

> Christian.
>
> >
> > -David
> >
> >>
> >> Signed-off-by: Alex Deucher 
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++++++++++-
> >>  1 file changed, 11 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> index 2f367146c72c..81322b0a8acf 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> @@ -1214,13 +1214,23 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
> >>  	struct pci_dev *pdev = to_pci_dev(dev);
> >>  	struct drm_device *drm_dev = pci_get_drvdata(pdev);
> >>  	struct amdgpu_device *adev = drm_dev->dev_private;
> >> -	int ret;
> >> +	int ret, i;
> >>
> >>  	if (!adev->runpm) {
> >>  		pm_runtime_forbid(dev);
> >>  		return -EBUSY;
> >>  	}
> >>
> >> +	/* wait for all rings to drain before suspending */
> >> +	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> >> +		struct amdgpu_ring *ring = adev->rings[i];
> >> +		if (ring && ring->sched.ready) {
> >> +			ret = amdgpu_fence_wait_empty(ring);
> >> +			if (ret)
> >> +				return -EBUSY;
> >> +		}
> >> +	}
> >> +
> >>  	if (amdgpu_device_supports_boco(drm_dev))
> >>  		drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
> >>  	drm_kms_helper_poll_disable(drm_dev);
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
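
The point above about ioctl calls waking the hardware or raising the usage count can be modeled in plain C outside the kernel. What follows is a toy model, not amdgpu code: struct pm_model and every pm_model_* function are hypothetical names invented for illustration. In the kernel this kind of pairing is normally done with pm_runtime_get_sync() and pm_runtime_put_autosuspend(); a bare counter stands in for them here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device's runtime-PM state. */
struct pm_model {
	int usage_count;	/* number of active users (e.g. in-flight ioctls) */
	bool suspended;		/* true while the device is powered down */
};

/* Take a reference; in this model that also wakes the device. */
static void pm_model_get(struct pm_model *pm)
{
	pm->usage_count++;
	pm->suspended = false;
}

/* Drop a reference taken with pm_model_get(). */
static void pm_model_put(struct pm_model *pm)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
}

/* Suspend only when nobody holds a reference, otherwise bail. */
static int pm_model_try_suspend(struct pm_model *pm)
{
	if (pm->usage_count > 0)
		return -EBUSY;
	pm->suspended = true;
	return 0;
}

/* Sample handler: attempts a suspend while the ioctl is still running. */
static int pm_model_probe_handler(struct pm_model *pm)
{
	return pm_model_try_suspend(pm);
}

/* Wrap a handler in get/put so the device stays awake for its duration. */
static int pm_model_ioctl(struct pm_model *pm,
			  int (*handler)(struct pm_model *))
{
	int ret;

	pm_model_get(pm);
	ret = handler(pm);
	pm_model_put(pm);
	return ret;
}
```

In this model a suspend attempted while an ioctl is in flight is refused, and it can succeed again once the reference has been dropped.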


Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending

2019-12-11 Thread Christian König

On 11.12.19 at 03:26, zhoucm1 wrote:
>
> On 2019/12/11 6:08 AM, Alex Deucher wrote:
>> Add a safety check to runtime suspend to make sure all outstanding
>> fences have signaled before we suspend.  Doesn't fix any known issue.
>>
>> We already do this via the fence driver suspend function, but we
>> just force completion rather than bailing.  This bails on runtime
>> suspend so we can try again later once the fences are signaled to
>> avoid missing any outstanding work.
>
> The idea sounds OK to me, but if you want to drain the rings, you
> should make sure there are no more submissions, right?
>
> So you should park all schedulers before waiting for all outstanding
> fences to complete.

At that point userspace should already be put on hold, so there are no
new submissions. But it probably won't hurt to stop the scheduler anyway.

But another issue I see is what happens if we locked up the hardware?

Christian.

>
> -David
>
>>
>> Signed-off-by: Alex Deucher
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++++++++++-
>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 2f367146c72c..81322b0a8acf 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -1214,13 +1214,23 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
>>  	struct pci_dev *pdev = to_pci_dev(dev);
>>  	struct drm_device *drm_dev = pci_get_drvdata(pdev);
>>  	struct amdgpu_device *adev = drm_dev->dev_private;
>> -	int ret;
>> +	int ret, i;
>>
>>  	if (!adev->runpm) {
>>  		pm_runtime_forbid(dev);
>>  		return -EBUSY;
>>  	}
>>
>> +	/* wait for all rings to drain before suspending */
>> +	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
>> +		struct amdgpu_ring *ring = adev->rings[i];
>> +		if (ring && ring->sched.ready) {
>> +			ret = amdgpu_fence_wait_empty(ring);
>> +			if (ret)
>> +				return -EBUSY;
>> +		}
>> +	}
>> +
>>  	if (amdgpu_device_supports_boco(drm_dev))
>>  		drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
>>  	drm_kms_helper_poll_disable(drm_dev);



Re: [PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending

2019-12-10 Thread zhoucm1


On 2019/12/11 6:08 AM, Alex Deucher wrote:
> Add a safety check to runtime suspend to make sure all outstanding
> fences have signaled before we suspend.  Doesn't fix any known issue.
>
> We already do this via the fence driver suspend function, but we
> just force completion rather than bailing.  This bails on runtime
> suspend so we can try again later once the fences are signaled to
> avoid missing any outstanding work.

The idea sounds OK to me, but if you want to drain the rings, you should
make sure there are no more submissions, right?

So you should park all schedulers before waiting for all outstanding
fences to complete.

-David

>
> Signed-off-by: Alex Deucher
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 2f367146c72c..81322b0a8acf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1214,13 +1214,23 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
>  	struct pci_dev *pdev = to_pci_dev(dev);
>  	struct drm_device *drm_dev = pci_get_drvdata(pdev);
>  	struct amdgpu_device *adev = drm_dev->dev_private;
> -	int ret;
> +	int ret, i;
>
>  	if (!adev->runpm) {
>  		pm_runtime_forbid(dev);
>  		return -EBUSY;
>  	}
>
> +	/* wait for all rings to drain before suspending */
> +	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> +		struct amdgpu_ring *ring = adev->rings[i];
> +		if (ring && ring->sched.ready) {
> +			ret = amdgpu_fence_wait_empty(ring);
> +			if (ret)
> +				return -EBUSY;
> +		}
> +	}
> +
>  	if (amdgpu_device_supports_boco(drm_dev))
>  		drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
>  	drm_kms_helper_poll_disable(drm_dev);


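The ordering suggested in this reply (stop new submissions first, then check that the rings are empty) can be sketched as a small userspace model. Nothing below is driver code: struct sketch_ring and the sketch_* functions are hypothetical, and rejecting a submission on a parked ring is a simplification chosen for the model, since a real scheduler may simply leave the job queued instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_RINGS 2	/* hypothetical ring count for the model */

/* Hypothetical model of one ring and its scheduler. */
struct sketch_ring {
	bool parked;		/* true while the scheduler accepts no work */
	unsigned int pending;	/* fences emitted but not yet signaled */
};

/* Submit one job; refused while the ring's scheduler is parked. */
static int sketch_submit(struct sketch_ring *ring)
{
	if (ring->parked)
		return -EAGAIN;
	ring->pending++;
	return 0;
}

/* Signal completion of one outstanding job. */
static void sketch_complete_one(struct sketch_ring *ring)
{
	assert(ring->pending > 0);
	ring->pending--;
}

/*
 * Park every scheduler first so nothing new can arrive, then check that
 * all rings are drained. On failure, unpark again so the caller can retry
 * the suspend later.
 */
static int sketch_suspend(struct sketch_ring *rings, size_t n)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < n; i++)
		rings[i].parked = true;

	for (i = 0; i < n; i++) {
		if (rings[i].pending) {
			ret = -EBUSY;
			break;
		}
	}

	if (ret) {
		for (i = 0; i < n; i++)
			rings[i].parked = false;
	}
	return ret;
}
```

Parking before the check closes the window in which a job could be submitted between the drain check and the actual power-down.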

[PATCH] drm/amdgpu: wait for all rings to drain before runtime suspending

2019-12-10 Thread Alex Deucher
Add a safety check to runtime suspend to make sure all outstanding
fences have signaled before we suspend.  Doesn't fix any known issue.

We already do this via the fence driver suspend function, but we
just force completion rather than bailing.  This bails on runtime
suspend so we can try again later once the fences are signaled to
avoid missing any outstanding work.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2f367146c72c..81322b0a8acf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1214,13 +1214,23 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 	struct amdgpu_device *adev = drm_dev->dev_private;
-	int ret;
+	int ret, i;
 
 	if (!adev->runpm) {
 		pm_runtime_forbid(dev);
 		return -EBUSY;
 	}
 
+	/* wait for all rings to drain before suspending */
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+		if (ring && ring->sched.ready) {
+			ret = amdgpu_fence_wait_empty(ring);
+			if (ret)
+				return -EBUSY;
+		}
+	}
+
 	if (amdgpu_device_supports_boco(drm_dev))
 		drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
 	drm_kms_helper_poll_disable(drm_dev);
-- 
2.23.0
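
The "bail rather than force completion" behavior described in the commit message can be illustrated with a userspace model of the loop in the hunk above. This is only a sketch under stated assumptions: the model_* types and functions and MODEL_MAX_RINGS are hypothetical stand-ins, and model_fence_wait_empty() merely compares two counters without blocking, so it does not reproduce what amdgpu_fence_wait_empty() actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RINGS 4	/* hypothetical; stands in for AMDGPU_MAX_RINGS */

/* Hypothetical stand-in for struct amdgpu_ring. */
struct model_ring {
	bool ready;		/* mirrors ring->sched.ready in the patch */
	unsigned int emitted;	/* sequence number of the last emitted fence */
	unsigned int signaled;	/* sequence number of the last signaled fence */
};

/* Hypothetical stand-in for struct amdgpu_device. */
struct model_device {
	struct model_ring *rings[MODEL_MAX_RINGS];
	bool suspended;
};

/* Return 0 if no fence is outstanding on the ring, -EBUSY otherwise. */
static int model_fence_wait_empty(const struct model_ring *ring)
{
	assert(ring->signaled <= ring->emitted);
	return ring->signaled == ring->emitted ? 0 : -EBUSY;
}

/*
 * Same shape as the patched runtime suspend path: skip unused or
 * not-ready rings, and refuse to suspend while any ring has work left.
 */
static int model_runtime_suspend(struct model_device *dev)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_RINGS; i++) {
		struct model_ring *ring = dev->rings[i];

		if (ring && ring->ready && model_fence_wait_empty(ring))
			return -EBUSY;
	}

	dev->suspended = true;
	return 0;
}
```

Returning -EBUSY leaves the model device running, so a later suspend attempt can succeed once the outstanding fences have signaled.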
