On Tue, Nov 13, 2018 at 11:20 AM, Radoslaw Szkodzinski <astralst...@gmail.com> wrote:
> On Tue, Nov 13, 2018 at 12:02 PM John Mitchell <sonwo...@gmail.com> wrote:
>> On Tuesday, November 13, 2018 at 5:01:06 AM UTC+1, Radosław Szkodziński wrote:
>> > On Thu, Nov 8, 2018 at 11:48 AM Marek Marczykowski-Górecki
>> > <marmarek@<snip>> wrote:
>> > > Yes, the main problem with GPU passthrough is complexity - those
>> > > devices abuse the PCI specification in so many ways you wouldn't
>> > > believe... There are some workarounds, but most require qemu
>> > > running in dom0, which is a security issue.
>> >
>> > The main actual problem - given that you would never keep the
>> > passed-through GPU in dom0 (it stays bound to vfio-pci) - is how to
>> > reset the device.
>> > There are known bugs in the Vega and Polaris implementations of PCIe
>> > reset, so those cards do not reset correctly - and additionally some
>> > mainboards do not reprogram the PCIe configuration space for
>> > secondary bridges.
>> > Typically the GPU is passed in as a secondary card, skipping the
>> > whole VGA passthrough business.
>> > With older driver versions you also had to specify a ROM file
>> > (containing a dump of the video ROM), but the latest ones do not
>> > require it, as they have a fallback interface to access AtomBIOS.
>> >
>> > nVidia, on the other hand, strictly requires hiding the VM as much
>> > as possible unless you own a Quadro, and you always have to add the
>> > video ROM - but if that succeeds, it just works.
>> >
>> > On top of that, some IOMMU implementations do not reset devices
>> > fully either.
>> > Passing the GPU between VM instances securely requires believing
>> > that device reset for such a complex device has no data leaks...
>> > If you never remove the GPU from the VM and never stop the VM using
>> > it, this shouldn't be a problem - but ensuring that in the Windows
>> > world is pretty hard with those silly automatic updates.
>> > An always-on GPU VM that is never reset, sitting behind some sort of
>> > proxy, could avoid this issue. I think the Looking Glass project has
>> > some reasonable software that can be investigated for this.
>> >
>> > R.
>>
>> Thank you for the informative reply.
>>
>> It appears the Vega reset bug is related to the VM presenting the
>> card as a legacy PCI device rather than a PCIe device.
>
> AMD has said that Vega needs a PSP reset issued before the IOMMU is
> disabled; otherwise its internal bus ends up thoroughly messed up and
> unavailable.
> Alternatively, you can quirk this device so it is never reset or put
> into D3 at all; it will then be reset by the driver on the next guest
> boot, but that has some security implications.
>
> I'm not entirely sure what is currently wrong with the IOMMU on my TR4
> X399, but 4.17 works, while 4.18+ kernels fail completely on reset,
> even with the new AGESA, which does attempt to rewrite the PCIe option
> ROM.
> I suspect this commit may fix it:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c120143f584360a13614787e23ae2cdcb5e5ccd
> If not, I'll go bisecting.
>
> The card is also messed up after S3 if it has been passed through; the
> commit above is suspected there as well. Otherwise S3 actually fully
> resets the GPU, as tested when the GPU is booted by EFI - in that case
> you can use S3 to reset it - but not if you pass it through the IOMMU
> somewhere. Likely the same TLB/IOVA handling bug: the TLB/IOVA cannot
> be touched during S3 resume.
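For anyone wanting to try this at home, the vfio-pci recipe sketched
above looks roughly like the following. Treat it as a sketch, not
gospel: the PCI address 0000:0a:00.0, the 1002:687f vendor:device pair,
and the ROM path are placeholders for your own hardware.

  # Unbind the GPU from its host driver and hand it to vfio-pci.
  modprobe vfio-pci
  echo 0000:0a:00.0 > /sys/bus/pci/devices/0000:0a:00.0/driver/unbind
  echo 1002 687f > /sys/bus/pci/drivers/vfio-pci/new_id

  # See which reset mechanisms the card actually advertises; the
  # Vega/Polaris trouble starts when only a bus reset is available.
  lspci -vvs 0a:00.0 | grep -i reset

  # QEMU sketch: romfile= feeds older guest drivers the video ROM dump,
  # and kvm=off (plus a bogus Hyper-V vendor id) is the usual trick for
  # hiding the VM from the nVidia driver.
  qemu-system-x86_64 -machine q35,accel=kvm -m 8G \
      -cpu host,kvm=off,hv_vendor_id=0123456789ab \
      -device vfio-pci,host=0a:00.0,romfile=/path/to/vbios.rom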
> At least 4.20-rc gained some real debugfs support for the AMD IOMMU,
> which will help in finding out what's wrong.
>
>> Looking Glass appears to be at the alpha stage.
>
> Yes, and it is not quite the right way around - it lets you show a
> headless Windows guest on a Linux or Windows target. Mostly useful for
> not having to run a monitor cable to the GPU.
> It seems to be doing mostly what the Qubes driver is doing? It works
> well enough if you want the VM visible in a window or fullscreen on
> another GPU.
> Perhaps it even works when the passed-through GPU is a laptop's
> secondary GPU, since that needs no output mapping - and thus neither
> framebuffer mapping nor a hardware output mux and vga_switcheroo.
>
> Another way would be to use Virgil 3D and write a virtual Windows
> driver for it, or extend the Qubes one: https://virgil3d.github.io/
> Make sure to still put the GPU and vfio-virgl in a VM for any
> security, scrape the GPU framebuffer from there, and redisplay it
> safely in the host after improving the Qubes framebuffer proxy to
> handle vsync better.
> That option would run the GPU code in a Linux VM behind the proxy.
> It has relatively high potential for bugs and issues, though, as the
> shader implementation will likely have compatibility problems... On
> the other hand, it might be harder to exploit, since it acts as a sort
> of proxy.
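For reference, the two plumbing options mentioned above look roughly
like this on the QEMU side. Again a sketch: the shared-memory size and
paths are illustrative, the ivshmem wiring follows Looking Glass's
documented setup, and the virgl variant assumes a QEMU built with
virglrenderer.

  # Looking Glass style: the guest GPU copies its framebuffer into a
  # shared-memory region that a host-side client then redisplays.
  qemu-system-x86_64 -machine q35,accel=kvm -m 8G \
      -object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=32M \
      -device ivshmem-plain,memdev=ivshmem

  # Virgil 3D style: no hardware passthrough at all - the guest gets a
  # paravirtual GPU and its OpenGL command stream is replayed by
  # virglrenderer on the host (or, as suggested above, in a GPU VM).
  qemu-system-x86_64 -machine q35,accel=kvm -m 8G \
      -device virtio-vga,virgl=on \
      -display sdl,gl=on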
I would be very happy to help fund development in this area if anyone
qualified & serious is interested.

Regards,
Jean-Philippe