Re: qemu and Xen ABI-unstable libs

2020-09-21 Thread Roger Pau Monné
On Mon, Sep 21, 2020 at 08:36:55AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel  On Behalf Of Ian Jackson
> > Sent: 18 September 2020 17:39
> > To: Debian folks: Michael Tokarev ; Hans van Kranenburg ;
> > Xen upstream folks with an interest: Andrew Cooper ; Roger Pau Monné
> > Cc: pkg-xen-de...@lists.alioth.debian.org; xen-devel@lists.xenproject.org;
> > My Xen upstream tools co-maintainer: Wei Liu
> > Subject: RFC: qemu and Xen ABI-unstable libs
> > 
> > Hi all.  Michael Tokarev has been looking into the problem that qemu
> > is using Xen libraries with unstable ABIs.  We did an experiment to
> > see which abi-unstable symbols qemu links to, by suppressing libxc
> > from the link line.  The results are below.[1]
> > 
> > Things are not looking too bad.  After some discussion on #xendevel I
> > have tried to summarise the situation for each of the troublesome
> > symbols.
> > 
> > Also, we discovered that upstream qemu does not link against any
> > abi-unstable Xen libraries if PCI passthrough is disabled.
> > 
> > Please would my Xen colleagues correct me if I have made any mistakes.
> > Michael, I hope this is helpful and clear.
> > 
> > 
> > In order from easy to hard:
> > 
> > 
> > xc_domain_shutdown
> > 
> > This call in qemu needs to be replaced with a call to the existing
> > function xendevicemodel_shutdown in libxendevicemodel.  I think it is
> > likely that this call is fixed in qemu upstream.
> > 
> 
> I just pulled QEMU master and it appears that destroy_hvm_domain() is still 
> calling xc_domain_shutdown().
> 
> > 
> > xc_get_hvm_param
> > 
> > There are three references in qemu's
> > xen_get_default_ioreq_server_info, relating to ioreq servers.  These
> > uses (and perhaps surrounding code at this function's call site)
> > should be replaced by use of xendevicemodel_create_ioreq_server
> > etc. from libxendevicemodel.  I think it is likely that this call is
> > fixed in qemu upstream.
> > 
> 
> These references are in compat code for Xen < 4.6. Use of (non-default) ioreq 
> server has been present in the code for a long time.
> We can remove them by retiring the compat code.
> 
> > 
> > xc_physdev_map_pirq
> > xc_physdev_map_pirq_msi
> > xc_physdev_unmap_pirq
> > 
> > These are all small wrappers for the PHYSDEVOP_map_pirq hypercall.
> > PHYSDEVOP is already reasonably abi-stable at the hypervisor level (in
> > theory it's versioned, but changing it would break all dom0's).
> 
> The hypercalls are non-tools and directly called from the Linux kernel code 
> so they are ABI.
> 
> > These calls could just be provided as-is by a new stable abi
> > entrypoint.  We think this should probably go in libxendevicemodel.
> > 
> 
> Rather than simply moving these calls into libxendevicemodel, we should
> think about their interactions with calls such as
> xc_domain_bind_pt_pci_irq() below and maybe have a stable library that
> actually provides a better API/ABI for interrupt mapping/triggering
> although...

I've thought the same when speaking with Ian about this, as (for HVM
passthrough) we use the physdev op to obtain a pirq from a physical
device interrupt source (an MSI entry in the QEMU case, because the
legacy interrupt is bound by the toolstack IIRC) and then use that
pirq to bind it to a guest lapic vector.

I think in a sense such physical interrupt abstraction (the pirq) is
helpful in order to simplify the binding, as you don't end up with a
hypercall with a massive number of parameters to identify both the
source and destination interrupt data. It's also helpful when the
guest changes the interrupt binding, as you then only update the guest
side and keep using the same pirq.
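
The two-step binding described above can be illustrated with a toy model. This is only a sketch of the indirection, not Xen code: the class and method names are hypothetical, the pirq numbering is arbitrary, and nothing here corresponds to an actual hypercall signature.

```python
from itertools import count


class ToyInterruptRouter:
    """Toy model of the pirq indirection; none of these names are Xen APIs."""

    def __init__(self):
        self._next_pirq = count(16)      # arbitrary starting handle number
        self.source_by_pirq = {}         # pirq -> (device, MSI entry)
        self.vector_by_pirq = {}         # pirq -> guest vector

    def map_source(self, device, msi_entry):
        """Step 1: turn a physical interrupt source into a pirq handle."""
        pirq = next(self._next_pirq)
        self.source_by_pirq[pirq] = (device, msi_entry)
        return pirq

    def bind_guest(self, pirq, guest_vector):
        """Step 2: bind (or rebind) an existing pirq to a guest vector."""
        if pirq not in self.source_by_pirq:
            raise KeyError(f"pirq {pirq} was never mapped")
        self.vector_by_pirq[pirq] = guest_vector

    def deliver(self, device, msi_entry):
        """Return the guest vector a physical interrupt would be routed to."""
        for pirq, source in self.source_by_pirq.items():
            if source == (device, msi_entry):
                return self.vector_by_pirq.get(pirq)
        return None


router = ToyInterruptRouter()
pirq = router.map_source("0000:03:00.0", 0)   # physical side, done once
router.bind_guest(pirq, 0x41)                 # initial guest binding
router.bind_guest(pirq, 0x52)                 # guest changes its binding
```

Note that the rebind touches only the guest side: the pirq and its physical source stay untouched, which is the simplification being argued for.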

We might, however, want to have an interface more specific to
passthrough, such that the pirqs (or maybe we could just call them
handles) returned by such interface can only be used with guest
specific bind hypercalls?

> I've long felt PCI pass-through should not be done by QEMU anyway (not least 
> because we currently
> have no mechanism for PCI pass-through to PVH domains).

Having xenpt in tree would be fine IMO. Now we have all the proper
infrastructure in place to allow different PCI devices to be handled
by different emulators IIRC, which is all that's required for this to
work correctly.

Thanks, Roger.



RE: qemu and Xen ABI-unstable libs

2020-09-21 Thread Paul Durrant

RE: qemu and Xen ABI-unstable libs

2020-09-21 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich 
> Sent: 21 September 2020 10:41
> To: p...@xen.org
> Cc: 'Ian Jackson' ; 'Debian folks: Michael Tokarev' ; 'Hans van Kranenburg' ;
> 'Xen upstream folks with an interest: Andrew Cooper' ; 'Roger Pau Monné' ;
> pkg-xen-de...@lists.alioth.debian.org; xen-devel@lists.xenproject.org;
> 'My Xen upstream tools co-maintainer: Wei Liu' 
> Subject: Re: qemu and Xen ABI-unstable libs
> 
> On 21.09.2020 09:36, Paul Durrant wrote:
> >> From: Xen-devel  On Behalf Of Ian 
> >> Jackson
> >> Sent: 18 September 2020 17:39
> >>
> >> xc_domain_iomem_permission
> >> xc_domain_populate_physmap_exact
> >> xc_domain_ioport_mapping
> >> xc_domain_memory_mapping
> >>
> >> The things done by these calls in qemu should be done by the Xen
> >> toolstack (libxl), during domain creation etc., instead.
> >
> > I don't think that is practical. E.g. if a guest re-programs a PCI I/O BAR
> > then it may necessitate re-calling xc_domain_ioport_mapping(); the
> > tool-stack cannot know a priori where PCI BARs will end up in guest
> > port/memory space.
> 
> In your reply I assume you meant just the latter two of the four?
> For these I agree, and as a result they shouldn't be domctl in
> the new model.
> 

Sorry if I wasn't clear. Yes, the latter two are what I was referring to.

  Paul

> Jan




Re: qemu and Xen ABI-unstable libs

2020-09-21 Thread Jan Beulich
On 21.09.2020 09:36, Paul Durrant wrote:
>> From: Xen-devel  On Behalf Of Ian 
>> Jackson
>> Sent: 18 September 2020 17:39
>>
>> xc_domain_iomem_permission
>> xc_domain_populate_physmap_exact
>> xc_domain_ioport_mapping
>> xc_domain_memory_mapping
>>
>> The things done by these calls in qemu should be done by the Xen
>> toolstack (libxl), during domain creation etc., instead.
> 
> I don't think that is practical. E.g. if a guest re-programs a PCI I/O BAR 
> then it may necessitate re-calling
> xc_domain_ioport_mapping(); the tool-stack cannot know a priori where PCI 
> BARs will end up in guest port/memory space.

In your reply I assume you meant just the latter two of the four?
For these I agree, and as a result they shouldn't be domctl in
the new model.

Jan



RE: qemu and Xen ABI-unstable libs

2020-09-21 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel  On Behalf Of Ian Jackson
> Sent: 18 September 2020 17:39
> To: Debian folks: Michael Tokarev ; Hans van Kranenburg ;
> Xen upstream folks with an interest: Andrew Cooper ; Roger Pau Monné
> Cc: pkg-xen-de...@lists.alioth.debian.org; xen-devel@lists.xenproject.org;
> My Xen upstream tools co-maintainer: Wei Liu 
> Subject: RFC: qemu and Xen ABI-unstable libs
> 
> Hi all.  Michael Tokarev has been looking into the problem that qemu
> is using Xen libraries with unstable ABIs.  We did an experiment to
> see which abi-unstable symbols qemu links to, by suppressing libxc
> from the link line.  The results are below.[1]
> 
> Things are not looking too bad.  After some discussion on #xendevel I
> have tried to summarise the situation for each of the troublesome
> symbols.
> 
> Also, we discovered that upstream qemu does not link against any
> abi-unstable Xen libraries if PCI passthrough is disabled.
> 
> Please would my Xen colleagues correct me if I have made any mistakes.
> Michael, I hope this is helpful and clear.
> 
> 
> In order from easy to hard:
> 
> 
> xc_domain_shutdown
> 
> This call in qemu needs to be replaced with a call to the existing
> function xendevicemodel_shutdown in libxendevicemodel.  I think it is
> likely that this call is fixed in qemu upstream.
> 

I just pulled QEMU master and it appears that destroy_hvm_domain() is still 
calling xc_domain_shutdown().

> 
> xc_get_hvm_param
> 
> There are three references in qemu's
> xen_get_default_ioreq_server_info, relating to ioreq servers.  These
> uses (and perhaps surrounding code at this function's call site)
> should be replaced by use of xendevicemodel_create_ioreq_server
> etc. from libxendevicemodel.  I think it is likely that this call is
> fixed in qemu upstream.
> 

These references are in compat code for Xen < 4.6. Use of (non-default) ioreq 
server has been present in the code for a long time.
We can remove them by retiring the compat code.

> 
> xc_physdev_map_pirq
> xc_physdev_map_pirq_msi
> xc_physdev_unmap_pirq
> 
> These are all small wrappers for the PHYSDEVOP_map_pirq hypercall.
> PHYSDEVOP is already reasonably abi-stable at the hypervisor level (in
> theory it's versioned, but changing it would break all dom0's).

The hypercalls are non-tools and directly called from the Linux kernel code so 
they are ABI.

> These calls could just be provided as-is by a new stable abi
> entrypoint.  We think this should probably go in libxendevicemodel.
> 

Rather than simply moving these calls into libxendevicemodel, we should think
about their interactions with calls such as xc_domain_bind_pt_pci_irq() below
and maybe have a stable library that actually provides a better API/ABI for
interrupt mapping/triggering, although... I've long felt PCI pass-through
should not be done by QEMU anyway (not least because we currently have no
mechanism for PCI pass-through to PVH domains).

> So, what's needed is a Xen upstream change to add versions of
> these three functions to tools/libs/devicemodel, and then a qemu
> change to use them.
> 
> 
> xc_domain_iomem_permission
> xc_domain_populate_physmap_exact
> xc_domain_ioport_mapping
> xc_domain_memory_mapping
> 
> The things done by these calls in qemu should be done by the Xen
> toolstack (libxl), during domain creation etc., instead.

I don't think that is practical. E.g. if a guest re-programs a PCI I/O BAR then 
it may necessitate re-calling
xc_domain_ioport_mapping(); the tool-stack cannot know a priori where PCI BARs 
will end up in guest port/memory space.
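
To make the BAR case concrete, here is a toy sketch of why the mapping has to follow the guest at run time. It assumes nothing about the real libxc interface: ToyPortMapper.ioport_mapping is a hypothetical stand-in for the role xc_domain_ioport_mapping() plays, and the port numbers are made up.

```python
class ToyPortMapper:
    """Hypothetical stand-in for the hypervisor-side port mapping table."""

    def __init__(self):
        self.mappings = {}   # first guest port -> (first machine port, count)
        self.calls = []      # record of every add/remove request

    def ioport_mapping(self, first_gport, first_mport, nr_ports, add):
        self.calls.append((first_gport, first_mport, nr_ports, add))
        if add:
            self.mappings[first_gport] = (first_mport, nr_ports)
        else:
            del self.mappings[first_gport]


class ToyIoBar:
    """One passed-through I/O BAR as seen by the device model."""

    def __init__(self, mapper, machine_base, size):
        self.mapper = mapper
        self.machine_base = machine_base   # fixed host-side port range
        self.size = size
        self.guest_base = None             # unknown until the guest programs it

    def guest_writes_bar(self, new_guest_base):
        """Called when the guest programs the BAR; remap if it moved."""
        if new_guest_base == self.guest_base:
            return
        if self.guest_base is not None:
            # Tear down the mapping at the old guest address first.
            self.mapper.ioport_mapping(
                self.guest_base, self.machine_base, self.size, add=False)
        self.mapper.ioport_mapping(
            new_guest_base, self.machine_base, self.size, add=True)
        self.guest_base = new_guest_base


mapper = ToyPortMapper()
bar = ToyIoBar(mapper, machine_base=0xE000, size=0x20)
bar.guest_writes_bar(0xC000)   # firmware places the BAR
bar.guest_writes_bar(0xD000)   # guest OS later moves it
```

The second write forces a remove and a fresh add, and only whoever sees the guest's config-space writes knows the new address; a toolstack acting once at domain creation does not.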

> 
> For at least some of them, there are patches on xen-devel, see
>   From: Grzegorz Uriasz 
>   Subject: [PATCH 1/3] tools/libxl: Grant VGA IO port permission for
>    stubdom/target domain
>   Date: Sun, 14 Jun 2020 23:12:01 +0100
> et seq.  These patches have been reviewed and as far as I can tell
> from the thread we are awaiting a resend.
> 

For legacy ranges, such as VGA, it is practical.

> 
> xc_set_hvm_param
> 
> Two calls both relating to HVM_PARAM_ACPI_S_STATE.
> 
> These need to be turned into DMOP hypercalls (ie, new hypercalls added
> to the hypervisor) and entrypoints provided in libxendevicemodel.
> 

Yes, this is certainly a candidate for a DM op.

> 
> xc_domain_bind_pt_pci_irq
> xc_domain_unbind_msi_irq
> xc_domain_unbind_pt_irq
> xc_domain_update_msi_irq
> 
> These are currently XEN_DOMCTL_* hypercalls.  These do not have a
> stable ABI at the hypervisor interface.  AIUI Xen hypervisor folks
> think they should be changed to use the DMOP or PHYSDEVOP hypercalls.
> 
> Additionally, we need calls for these in a userspace library with a
> stable ABI.  We think that should be libxendevicemodel.
> 

What I said above: This needs more consideration.

A while ago I hacked together xenpt
(https://xenbits.xen.org/gitweb/?p=people/pauldu/xenpt.git), a stand-alone
PCI pass-through emulator. One option would be to get this into shape and
pull it into the Xen