Hi Andrey,

I need to understand more about PCI saved state. So, excluding patch 5, the series is Acked-by: Nirmoy Das <nirmoy....@amd.com>.



Regards,

Nirmoy


On 8/31/20 5:50 PM, Andrey Grodzovsky wrote:
Many PCI bus controllers are able to detect a variety of hardware PCI errors
on the bus, such as parity errors on the data and address buses. A typical
action taken is to disconnect the affected device, halting all I/O to it.
Typically, a reconnection mechanism is also offered, so that the affected PCI
device(s) are reset and put back into working condition.

In our case the reconnection mechanism is provided by the kernel Downstream
Port Containment (DPC) driver, which intercepts the PCIe error and removes
(isolates) the faulting device, after which it calls into the PCIe recovery
code of the PCI core.

This code calls the hooks implemented in this patchset. First the error is
reported, at which point we block the GPU scheduler. Next, DPC resets the PCI
link, which generates a HW interrupt that is intercepted by SMU/PSP, which
start executing a mode1 reset of the ASIC. Then the slot reset hook is called,
at which point we wait for the ASIC reset to complete, restore PCI config
space and run the HW suspend/resume sequence to reinit the ASIC. The last hook
called is resume normal operation, at which point we restart the GPU
scheduler.
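
The ordering of the hooks described above can be sketched as a small userland
model. This is only an illustration of the sequence, not the code from the
patches: every name in it (gpu_model, error_handlers, the model_* functions,
run_recovery) is hypothetical, the types are stand-ins for the kernel ones,
and the ASIC reset is modeled as completing instantly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the result codes the recovery hooks return (hypothetical). */
enum ers_result { ERS_NEED_RESET, ERS_RECOVERED, ERS_DISCONNECT };

/* Minimal model of the device state touched during recovery (hypothetical). */
struct gpu_model {
	bool sched_blocked;      /* GPU scheduler is stopped */
	bool asic_reset_started; /* SMU/PSP began the mode1 reset */
	bool config_restored;    /* PCI config space was restored */
	bool hw_reinit_done;     /* HW suspend/resume sequence was run */
};

/* Table of recovery hooks, modeled after the three steps in the text. */
struct error_handlers {
	enum ers_result (*error_detected)(struct gpu_model *gpu);
	enum ers_result (*slot_reset)(struct gpu_model *gpu);
	void (*resume)(struct gpu_model *gpu);
};

/* Step 1: the error is reported, so block the GPU scheduler. */
static enum ers_result model_error_detected(struct gpu_model *gpu)
{
	gpu->sched_blocked = true;
	return ERS_NEED_RESET;
}

/* Between hooks: the link reset makes SMU/PSP start the mode1 reset. */
static void model_link_reset(struct gpu_model *gpu)
{
	gpu->asic_reset_started = true;
}

/* Step 2: after the ASIC reset, restore config space and reinit the HW. */
static enum ers_result model_slot_reset(struct gpu_model *gpu)
{
	if (!gpu->asic_reset_started)
		return ERS_DISCONNECT; /* no reset happened, give up */
	gpu->config_restored = true;
	gpu->hw_reinit_done = true;
	return ERS_RECOVERED;
}

/* Step 3: resume normal operation by restarting the GPU scheduler. */
static void model_resume(struct gpu_model *gpu)
{
	gpu->sched_blocked = false;
}

static const struct error_handlers model_handlers = {
	.error_detected = model_error_detected,
	.slot_reset     = model_slot_reset,
	.resume         = model_resume,
};

/* Drives the hooks in the order described; returns true on recovery. */
static bool run_recovery(const struct error_handlers *h, struct gpu_model *gpu)
{
	assert(h != NULL && gpu != NULL);
	if (h->error_detected(gpu) != ERS_NEED_RESET)
		return false;
	model_link_reset(gpu);
	if (h->slot_reset(gpu) != ERS_RECOVERED)
		return false;
	h->resume(gpu);
	return true;
}
```

Calling run_recovery(&model_handlers, &gpu) on a zero-initialized gpu_model
walks through all three steps and leaves the scheduler unblocked. If the slot
reset hook reports failure, the resume hook is never called and the scheduler
stays blocked.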

Andrey Grodzovsky (8):
   drm/amdgpu: Implement DPC recovery
   drm/amdgpu: Avoid accessing HW when suspending SW state
   drm/amdgpu: Block all job scheduling activity during DPC recovery
   drm/amdgpu: Fix SMU error failure
   drm/amdgpu: Fix consecutive DPC recovery failures.
   drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code.
   drm/amdgpu: Disable DPC for XGMI for now.
   drm/amdgpu: Minor checkpatch fix

  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  16 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 298 ++++++++++++++++++++++++++++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  13 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    |   6 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c    |   6 +
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c     |  18 +-
  drivers/gpu/drm/amd/amdgpu/nv.c            |   4 +-
  drivers/gpu/drm/amd/amdgpu/soc15.c         |   4 +-
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c     |   3 +
  9 files changed, 346 insertions(+), 22 deletions(-)

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
