On Wed, 31 May 2023, Jan Beulich wrote:
> On 31.05.2023 00:38, Stefano Stabellini wrote:
> > On Fri, 26 May 2023, Jan Beulich wrote:
> >> On 25.05.2023 21:24, Stefano Stabellini wrote:
> >>> On Thu, 25 May 2023, Jan Beulich wrote:
> >>>> On 25.05.2023 01:37, Stefano Stabellini wrote:
> >>>>> On Wed, 24 May 2023, Jan Beulich wrote:
> >>>>>>>> RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
> >>>>>>>> modify_bars() to consistently respect BARs of hidden devices
> >>>>>>>> while setting up "normal" ones (i.e. to avoid as much as
> >>>>>>>> possible the "continue" path introduced here), setting up of
> >>>>>>>> the former may want doing first.
> >>>>>>>
> >>>>>>> But BARs of hidden devices should be mapped into dom0 physmap?
> >>>>>>
> >>>>>> Yes.
> >>>>>
> >>>>> The BARs would be mapped read-only (not read-write), right? Otherwise we
> >>>>> let dom0 access devices that belong to Xen, which doesn't seem like a
> >>>>> good idea.
> >>>>>
> >>>>> But even if we map the BARs read-only, what is the benefit of mapping
> >>>>> them to Dom0? If Dom0 loads a driver for it and the driver wants to
> >>>>> initialize the device, the driver will crash because the MMIO region is
> >>>>> read-only instead of read-write, right?
> >>>>>
> >>>>> How does this device hiding work for dom0? How does dom0 know not to
> >>>>> access a device that is present on the PCI bus but is used by Xen?
> >>>>
> >>>> None of these are new questions - this has all been this way for PV Dom0,
> >>>> and so far we've limped along quite okay. That's not to say that we
> >>>> shouldn't improve things if we can, but that first requires ideas as to
> >>>> how.
> >>>
> >>> For PV, that was OK because PV requires extensive guest modifications
> >>> anyway. We only run Linux and few BSDs as Dom0. So, making the interface
> >>> cleaner and reducing guest changes is nice-to-have but not critical.
> >>>
> >>> For PVH, this is different. One of the top reasons for AMD to work on
> >>> PVH is to enable arbitrary non-Linux OSes as Dom0 (when paired with
> >>> dom0less/hyperlaunch). It could be anything from Zephyr to a
> >>> proprietary RTOS like VxWorks. Minimal guest changes for advanced
> >>> features (e.g. Dom0 S3) might be OK but in general I think we should aim
> >>> at (almost) zero guest changes. On ARM, it is already the case (with some
> >>> non-upstream patches for dom0less PCI.)
> >>>
> >>> For this specific patch, which is necessary to enable PVH on AMD x86 in
> >>> gitlab-ci, we can do anything we want to make it move faster. But
> >>> medium/long term I think we should try to make non-Xen-aware PVH Dom0
> >>> possible.
> >>
> >> I don't think Linux could boot as PVH Dom0 without any awareness. Hence
> >> I guess it's not easy to see how other OSes might. What you're after
> >> looks rather like a HVM Dom0 to me, with it being unclear where the
> >> external emulator then would run (in a stubdom maybe, which might be
> >> possible to arrange for via the dom0less way of creating boot time
> >> DomU-s) and how it would get any necessary xenstore based information.
> >
> > I know that Linux has lots of Xen awareness scattered everywhere so it
> > is difficult to tell what's what. Leaving the PVH entry point aside for
> > this discussion, what else is really needed for a Linux without
> > CONFIG_XEN to boot as PVH Dom0?
> >
> > Same question from a different angle: let's say that we boot Zephyr or
> > another RTOS as HVM Dom0, what is really required for the emulator to
> > emulate? I am hoping that the answer is "nothing" except for maybe a
> > UART.
> >
> > It comes down to how much legacy stuff the guest OS expects to find.
> > Legacy stuff that would normally be emulated by QEMU. I am counting on
> > the fact that a modern OS doesn't expect any of the legacy stuff (e.g.
> > PIIX3/Q35/E1000) if it is not advertised in the firmware tables.
>
> And that's where I expect the problems start: We don't really alter
> things like the DSDT and SSDTs, and we also don't parse them. So we
> won't know what firmware describes there. Hence we have to expect that
> any legacy device might be present in the underlying platform, and
> hence would also need offering either by passing through or by
> emulation. Yet we can't sensibly emulate everything in Xen itself.

I see your point, thanks for the explanation. I can see it might require
some work in that area, either by removing those devices from the
firmware tables (which we currently don't even parse) or by passing
through those devices when possible.

FYI there is also an ACPI STAO table [1] that could maybe be used for
this, but nobody has ever used it so far.

[1] https://wiki.xenproject.org/images/0/02/Status-override-table.pdf