Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-18 Thread Alistair Popple
> >>> wouldn't you also need to do that somewhere? Unless the driver
> >>> does it at startup?
> >>
> >> VFIO performs GPU reset so I'd expect the GPUs to flush their caches
> >> without any software interactions. Am I hoping for too much here?
> >
> > Sadly you are. It's not the GPU caches that ne

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-18 Thread Alexey Kardashevskiy
On 18/10/2018 12:05, Alistair Popple wrote:
> Hi Alexey,
>
>>> wouldn't you also need to do that somewhere? Unless the driver
>>> does it at startup?
>>
>> VFIO performs GPU reset so I'd expect the GPUs to flush their caches
>> without any software interactions. Am I hoping for too much here?
>

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-17 Thread Alistair Popple
Hi Alexey,
> > wouldn't you also need to do that somewhere? Unless the driver
> > does it at startup?
>
> VFIO performs GPU reset so I'd expect the GPUs to flush their caches
> without any software interactions. Am I hoping for too much here?
Sadly you are. It's not the GPU caches that need flushi

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-16 Thread Alexey Kardashevskiy
On 16/10/2018 18:32, Alistair Popple wrote:
> On Tuesday, 16 October 2018 1:22:53 PM AEDT Alexey Kardashevskiy wrote:
>>
>> On 16/10/2018 13:19, Alistair Popple wrote:
>>>> reset_ntl() does what npu2_dev_procedure_reset() does plus more stuff,
>>>> there is nothing really in npu2_dev_procedure_rese

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-16 Thread Alistair Popple
On Tuesday, 16 October 2018 1:22:53 PM AEDT Alexey Kardashevskiy wrote:
>
> On 16/10/2018 13:19, Alistair Popple wrote:
> >> reset_ntl() does what npu2_dev_procedure_reset() does plus more stuff,
> >> there is nothing really in npu2_dev_procedure_reset() which reset_ntl()
> >> does not do already fro

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alexey Kardashevskiy
On 16/10/2018 13:19, Alistair Popple wrote:
>> reset_ntl() does what npu2_dev_procedure_reset() does plus more stuff,
>> there is nothing really in npu2_dev_procedure_reset() which reset_ntl()
>> does not do already from the hardware standpoint. And it did stop HMIs
>> for me though.
>>
>> but ok,

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alistair Popple
> reset_ntl() does what npu2_dev_procedure_reset() does plus more stuff,
> there is nothing really in npu2_dev_procedure_reset() which reset_ntl()
> does not do already from the hardware standpoint. And it did stop HMIs
> for me though.
>
> but ok, what will be sufficient then if not reset_ntl()? Ar

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alexey Kardashevskiy
On 16/10/2018 12:44, Alistair Popple wrote:
> Hi Alexey,
>
> On Tuesday, 16 October 2018 12:37:49 PM AEDT Alexey Kardashevskiy wrote:
>>
>> On 16/10/2018 11:38, Alistair Popple wrote:
>>> Hi Alexey,
>>>
>>> Looking at the skiboot side I think we only fence the NVLink bricks as part
>>> of a
>>

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alistair Popple
Hi Alexey,
On Tuesday, 16 October 2018 12:37:49 PM AEDT Alexey Kardashevskiy wrote:
>
> On 16/10/2018 11:38, Alistair Popple wrote:
> > Hi Alexey,
> >
> > Looking at the skiboot side I think we only fence the NVLink bricks as part
> > of a
> > PCIe function level reset (FLR) rather than a PCI H

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alexey Kardashevskiy
On 16/10/2018 11:38, Alistair Popple wrote:
> Hi Alexey,
>
> Looking at the skiboot side I think we only fence the NVLink bricks as part
> of a
> PCIe function level reset (FLR) rather than a PCI Hot or Fundamental reset
> which
> I believe is what the code here does. So to fence the bricks y

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alistair Popple
Hi Alexey,
Looking at the skiboot side I think we only fence the NVLink bricks as part of a PCIe function level reset (FLR) rather than a PCI Hot or Fundamental reset which I believe is what the code here does. So to fence the bricks you would need to do either a FLR on the given link or alter Ski

Re: [PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-15 Thread Alexey Kardashevskiy
Ping?
On 02/10/2018 13:20, Alexey Kardashevskiy wrote:
> The skiboot firmware has a hot reset handler which fences the NVIDIA V100
> GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs:
> https://github.com/open-power/skiboot/commit/fca2b2b839a67
>
> Now we are going to pas

[PATCH kernel v2] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-10-01 Thread Alexey Kardashevskiy
The skiboot firmware has a hot reset handler which fences the NVIDIA V100 GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs:
https://github.com/open-power/skiboot/commit/fca2b2b839a67
Now we are going to pass V100 via VFIO which most certainly involves KVM guests which are