Dear Emily,

Thank you for your reply.

Am 09.08.2018 um 04:11 schrieb Deng, Emily:
Hi Paul,
      Answers as below:

It’d be great if you had an mail user agent (email program) that
supports quoting. Maybe you could configure Outlook(?) that way? That
would tremendously useful for the mailing list archives too.

-----Original Message-----
From: Paul Menzel <pmenzel+amd-...@molgen.mpg.de>
Sent: Wednesday, August 8, 2018 10:29 PM
To: Deng, Emily <emily.d...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Liu, Monk <monk....@amd.com>
Subject: Re: [PATCH] drm/amdgpu/sriov: give 8s for recover vram under RUNTIME

On 08/08/18 04:13, Emily Deng wrote:

[…]

Extend the timeout for recovering vram bos from shadows on sr-iov
to cover the worst case scenario for timeslices and VFs

Under runtime, the wait fence time could be quite long when
other VFs are in exclusive mode. For example, for 4 VF, every
VF's exclusive timeout time is set to 3s, then the worst case is
9s. If the VF number is more than 4,then the worst case time will
be longer.

Nit: Missing space after the comma.

How did you get to nine seconds? Isn’t it four times three, which is
twelve?
[Emily] It is 3 times three, not 4, as the VF only need to wait the
other three VF's exclusive timeout.

Thank you, I didn’t know that.

(It looks like you missed my inline comments for the change itself
below.)

The 8s is the test data, with setting to 8s, it will pass the TDR
test for 1000 times.

SWDEV-161490

Change-Id: Ifc32d56ca7fde01b1f4fe2b0db6959b51909008a
Signed-off-by: Monk Liu <monk....@amd.com>
Signed-off-by: Emily Deng <emily.d...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d933db..ef82ad1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3124,7 +3124,7 @@ static int amdgpu_device_handle_vram_lost(struct 
amdgpu_device *adev)
        long tmo;
if (amdgpu_sriov_runtime(adev))
-               tmo = msecs_to_jiffies(amdgpu_lockup_timeout);
+               tmo = msecs_to_jiffies(8000);

Actually, I do not understand the change at all. Isn’t that a module
parameter?

```
$ git grep amdgpu_lockup_timeout
drivers/gpu/drm/amd/amdgpu/amdgpu.h:extern int amdgpu_lockup_timeout;
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:     if (amdgpu_lockup_timeout == 0) 
{
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:             amdgpu_lockup_timeout = 
10000;
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:             tmo = 
msecs_to_jiffies(amdgpu_lockup_timeout);
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:int amdgpu_lockup_timeout = 10000;
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:module_param_named(lockup_timeout, 
amdgpu_lockup_timeout, int, 0444);
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:                      timeout = 
msecs_to_jiffies(amdgpu_lockup_timeout);
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:  if (amdgpu_lockup_timeout == 0)
```

So, if it’s not set, the time-out is set to 10(!) seconds anyway,
isn’t it? What am I missing?

        else
                tmo = msecs_to_jiffies(100);

In my opinion, such time-outs need to have a big FIXME added to
it, and VF’s exclusive time-outs should be drastically decreased.


Kind regards,

Paul
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to