Re: Poor windows VFIO performance, GPU stalls (bisected)

2020-07-26 Thread Chris Wilson
Quoting Alex Williamson (2020-07-26 14:30:52)
> On Sun, 26 Jul 2020 17:49:07 +1000
> Geoffrey McRae  wrote:
> 
> > Hi All,
> > 
> > The commit 22540ca3d00d2990a4148a13b92209c3dc5422db causes a Windows KVM 
> > guest running under QEMU with a VFIO passthrough GPU to randomly stall 
> > when using the GPU leading to the guest assuming that the driver has 
> > hung. Reverting this commit resolves the problem.
> 
> Please double check this commit ID, I can't find it in mainline or
> linux-next.  Thanks,

See commit aa202f1f5696 ("workqueue: don't use wq_select_unbound_cpu()
for bound works"). 22540ca3 is the cherry-pick into v5.4.26.
-Chris


Re: Poor windows VFIO performance, GPU stalls (bisected)

2020-07-26 Thread Geoffrey McRae




On 2020-07-26 23:32, Geoffrey McRae wrote:

On 2020-07-26 23:30, Alex Williamson wrote:

On Sun, 26 Jul 2020 17:49:07 +1000
Geoffrey McRae  wrote:


Hi All,

The commit 22540ca3d00d2990a4148a13b92209c3dc5422db causes a Windows KVM
guest running under QEMU with a VFIO passthrough GPU to randomly stall
when using the GPU leading to the guest assuming that the driver has
hung. Reverting this commit resolves the problem.


Please double check this commit ID, I can't find it in mainline or
linux-next.  Thanks,

Alex


Confirmed:

https://github.com/torvalds/linux/commit/22540ca3d00d2990a4148a13b92209c3dc5422db


Sorry, I just noticed my error; it should be:
aa202f1f56960c60e7befaa0f49c72b8fa11b0a8

The host system is configured with the following kernel arguments which
may be related:
   isolcpus=0-5,24-29,6-11,30-35 rcu_nocbs=0-5,24-29,6-11,30-35

The system is an AMD Threadripper 2970WX on a Gigabyte x399 AORUS Gaming
7 board.
It has two GPUs each being passed through to two separate KVM guests,
one is an AMD Radeon 7 in a Linux guest, the other is a GeForce 1080Ti
in a Windows guest.
The cores used for these two guests are isolated from the host for
performance reasons.

Any insight as to why this is occurring would be appreciated. If you
need any more information or would like to test patches please let me
know.

Kind Regards,
Geoffrey McRae
HostFission

https://hostfission.com



Re: Poor windows VFIO performance, GPU stalls (bisected)

2020-07-26 Thread Geoffrey McRae




On 2020-07-26 23:30, Alex Williamson wrote:

On Sun, 26 Jul 2020 17:49:07 +1000
Geoffrey McRae  wrote:


Hi All,

The commit 22540ca3d00d2990a4148a13b92209c3dc5422db causes a Windows KVM
guest running under QEMU with a VFIO passthrough GPU to randomly stall
when using the GPU leading to the guest assuming that the driver has
hung. Reverting this commit resolves the problem.


Please double check this commit ID, I can't find it in mainline or
linux-next.  Thanks,

Alex


Confirmed:

https://github.com/torvalds/linux/commit/22540ca3d00d2990a4148a13b92209c3dc5422db



The host system is configured with the following kernel arguments which
may be related:
   isolcpus=0-5,24-29,6-11,30-35 rcu_nocbs=0-5,24-29,6-11,30-35

The system is an AMD Threadripper 2970WX on a Gigabyte x399 AORUS Gaming
7 board.
It has two GPUs each being passed through to two separate KVM guests,
one is an AMD Radeon 7 in a Linux guest, the other is a GeForce 1080Ti
in a Windows guest.
The cores used for these two guests are isolated from the host for
performance reasons.

Any insight as to why this is occurring would be appreciated. If you
need any more information or would like to test patches please let me
know.

Kind Regards,
Geoffrey McRae
HostFission

https://hostfission.com



Re: Poor windows VFIO performance, GPU stalls (bisected)

2020-07-26 Thread Alex Williamson
On Sun, 26 Jul 2020 17:49:07 +1000
Geoffrey McRae  wrote:

> Hi All,
> 
> The commit 22540ca3d00d2990a4148a13b92209c3dc5422db causes a Windows KVM 
> guest running under QEMU with a VFIO passthrough GPU to randomly stall 
> when using the GPU leading to the guest assuming that the driver has 
> hung. Reverting this commit resolves the problem.

Please double check this commit ID, I can't find it in mainline or
linux-next.  Thanks,

Alex
 
> The host system is configured with the following kernel arguments which 
> may be related:
>   isolcpus=0-5,24-29,6-11,30-35 rcu_nocbs=0-5,24-29,6-11,30-35
> 
> The system is an AMD Threadripper 2970WX on a Gigabyte x399 AORUS Gaming 
> 7 board.
> It has two GPUs each being passed through to two separate KVM guests, 
> one is an AMD Radeon 7 in a Linux guest, the other is a GeForce 1080Ti 
> in a Windows guest.
> The cores used for these two guests are isolated from the host for 
> performance reasons.
> 
> Any insight as to why this is occurring would be appreciated. If you 
> need any more information or would like to test patches please let me 
> know.
> 
> Kind Regards,
> Geoffrey McRae
> HostFission
> 
> https://hostfission.com
> 



Poor windows VFIO performance, GPU stalls (bisected)

2020-07-26 Thread Geoffrey McRae

Hi All,

The commit 22540ca3d00d2990a4148a13b92209c3dc5422db causes a Windows KVM 
guest running under QEMU with a VFIO passthrough GPU to randomly stall 
when using the GPU leading to the guest assuming that the driver has 
hung. Reverting this commit resolves the problem.


The host system is configured with the following kernel arguments which 
may be related:

  isolcpus=0-5,24-29,6-11,30-35 rcu_nocbs=0-5,24-29,6-11,30-35
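
For anyone comparing their own setup, the CPU list above can be expanded into
individual CPU ids to see exactly which logical CPUs are isolated. This is only
a rough sketch: `parse_cpu_list` is a hypothetical helper name, and it assumes
the plain "N" and "N-M" forms used in this report. It does not handle the
stride syntax or the flag prefixes that the kernel also accepts for isolcpus.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0-5,24-29" into a sorted list of CPU ids.

    Only single ids ("7") and inclusive ranges ("0-5") are supported here.
    """
    cpus = set()
    for chunk in spec.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate stray commas
        if "-" in chunk:
            first_text, last_text = chunk.split("-", 1)
            first, last = int(first_text), int(last_text)
            if last < first:
                raise ValueError("descending range: %r" % chunk)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


if __name__ == "__main__":
    isolated = parse_cpu_list("0-5,24-29,6-11,30-35")
    print(len(isolated), "isolated CPUs:", isolated)
```

With the arguments from this report the list expands to 24 logical CPUs, namely
0-11 and 24-35. The same list is given to both isolcpus and rcu_nocbs.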

The system is an AMD Threadripper 2970WX on a Gigabyte x399 AORUS Gaming 
7 board.
It has two GPUs each being passed through to two separate KVM guests, 
one is an AMD Radeon 7 in a Linux guest, the other is a GeForce 1080Ti 
in a Windows guest.
The cores used for these two guests are isolated from the host for 
performance reasons.


Any insight as to why this is occurring would be appreciated. If you 
need any more information or would like to test patches please let me 
know.


Kind Regards,
Geoffrey McRae
HostFission

https://hostfission.com