[qemu-mainline test] 167509: tolerable FAIL - PUSHED

2021-12-21 Thread osstest service owner
flight 167509 qemu-mainline real [real]
flight 167514 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167509/
http://logs.test-lab.xenproject.org/osstest/logs/167514/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-shadow 20 guest-localmigrate/x10 fail pass in 167514-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167501
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167501
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167501
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167501
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167501
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167501
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167501
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167501
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass

version targeted for testing:
 qemuu5316e12bb2b4408a1597b283ef4bb4794dd7b4f7
baseline version:
 qemuu2bf40d0841b942e7ba12953d515e62a436f0af84

Last test of basis   167501  2021-12-21 06:49:30 Z  

[linux-linus test] 167508: tolerable FAIL - PUSHED

2021-12-21 Thread osstest service owner
flight 167508 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167508/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167500
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167500
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167500
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167500
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167500
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167500
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167500
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167500
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 linux1c3e979bf3e225e5b4b810b24712b16254d608b6
baseline version:
 linux6e0567b7305209c2d689ce57180a63d8dc657ad8

Last test of basis   167500  2021-12-21 05:30:08 Z0 days
Testing same since   167508  2021-12-21 18:09:47 Z0 days1 attempts


People who touched revisions under test:
  Benjamin Tissoires 
  Jiasheng Jiang 
  Linus Torvalds 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64

[xen-unstable test] 167504: tolerable FAIL - PUSHED

2021-12-21 Thread osstest service owner
flight 167504 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167504/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 14 guest-start  fail REGR. vs. 167497

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167497
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167497
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167497
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167497
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167497
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167497
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167497
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167497
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167497
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167497
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167497
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167497
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  f1e268b9fd13647e1f69c8ce0ae7be401d319fc8
baseline version:
 xen  8e3edefb880caeeaaf80123d5599139e8c2

Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs

2021-12-21 Thread Stefano Stabellini
On Tue, 21 Dec 2021, Anthony PERARD wrote:
> On Fri, Dec 17, 2021 at 12:15:25PM +, Oleksii Moisieiev wrote:
> > > On 14.12.21 11:34, Oleksii Moisieiev wrote:
> > > > @@ -1817,17 +1858,24 @@ static void libxl__add_dtdevs(libxl__egc *egc, 
> > > > libxl__ao *ao, uint32_t domid,
> > > >   {
> > > >   AO_GC;
> > > >   libxl__ao_device *aodev = libxl__multidev_prepare(multidev);
> > > > -int i, rc = 0;
> > > > +int i, rc = 0, rc_sci = 0;
> > > >   for (i = 0; i < d_config->num_dtdevs; i++) {
> > > >   const libxl_device_dtdev *dtdev = &d_config->dtdevs[i];
> > > >   LOGD(DEBUG, domid, "Assign device \"%s\" to domain", 
> > > > dtdev->path);
> > > >   rc = xc_assign_dt_device(CTX->xch, domid, dtdev->path);
> > > > -if (rc < 0) {
> > > > -LOGD(ERROR, domid, "xc_assign_dtdevice failed: %d", rc);
> > > > +rc_sci = xc_domain_add_sci_device(CTX->xch, domid, 
> > > > dtdev->path);
> > > > +
> > > > +if ((rc < 0) && (rc_sci < 0)) {
> > > > +LOGD(ERROR, domid, "xc_assign_dt_device failed: %d; "
> > > > + "xc_domain_add_sci_device failed: %d",
> > > > + rc, rc_sci);
> > > >   goto out;
> > > >   }
> > > > +
> > > > +if (rc)
> > > > +rc = rc_sci;
> > > 
> > > 
> > > If I get this code right, it sounds like the dom.cfg's dtdev property is
> > > reused to describe sci devices as well, but the libxl__add_dtdevs() does 
> > > not
> > > (and can not) differentiate them. So it has no option but to send two
> > > domctls for each device in dtdevs[] hoping to at least one domctl to
> > > succeeded. Or I really missed something?
> > > 
> > > It feels to me that:
> > >  - either new dom.cfg's property could be introduced (scidev?) to describe
> > > sci devices together with new parsing logic/management code, so you will 
> > > end
> > > up having new libxl__add_scidevs() to invoke XEN_DOMCTL_add_sci_device,
> > > so no mixing things.
> > >  - or indeed dtdev logic could be *completely* reused including extending
> > > XEN_DOMCTL_assign_device to cover your use-case, although sounds generic, 
> > > it
> > > is used to describe devices for the passthrough (to configure an IOMMU for
> > > the device), in that case you wouldn't need an extra
> > > XEN_DOMCTL_add_sci_device introduced by current patch.

I realize I did my review before reading Oleksandr's comments. I fully
agree with his feedback. Having seen how difficult is for users to setup
a domU configuration correctly today, I would advise to try to reuse the
existing dtdev property instead of adding yet one new property to make
the life of our users easier.



> > > Personally I would use the first option as I am not sure that second 
> > > option
> > > is conceptually correct && possible. I would leave this for the 
> > > maintainers
> > > to clarify.
> > > 
> > 
> > Thank you for the point. I agree that reusing XEN_DOMCTL_assign_device
> > seems not to be conceptually correct. Introducing new dom.cfg property
> > seems to be the only way to avoid extra Domctl calls.
> > I will handle this in v2 if there will be no complains from Maintainers.
> 
> I don't know if there will be a complains in v2 or not :-), but using
> something different from dtdev sound good.
> 
> If I understand correctly, dtdev seems to be about passing through an
> existing device to a guest, and this new sci device seems to be about having 
> Xen
> "emulating" an sci device which the guest will use. So introducing
> something new (scidev or other name) sounds good.

Users already have to provide 4 properties (dtdev, iomem, irqs,
device_tree) to setup device assignment. I think that making it 5
properties would not be a step forward :-)

To me dtdev and XEN_DOMCTL_assign_device are appropriate because they
signify device assignment of one or more devices. We are not adding any
additional "meaning" to them. It is just that we'll automatically detect
and generate any SCMI configurations based on the list of assigned
devices. Because if SCMI is enabled and a device is assigned to the
guest, then I think we want to add the device to the SCMI list of
devices automatically.

If we really want to introduce a new list of devices, please make it
optional so that most times the user can skip it unless really required.

Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver

2021-12-21 Thread Stefano Stabellini
On Tue, 21 Dec 2021, Oleksii Moisieiev wrote:
> Hi Stefano,
> 
> On Mon, Dec 20, 2021 at 04:52:01PM -0800, Stefano Stabellini wrote:
> > On Mon, 20 Dec 2021, Oleksii Moisieiev wrote:
> > > Hi Stefano,
> > > 
> > > On Fri, Dec 17, 2021 at 06:14:55PM -0800, Stefano Stabellini wrote:
> > > > On Tue, 14 Dec 2021, Oleksii Moisieiev wrote:
> > > > > This is the implementation of SCI interface, called SCMI-SMC driver,
> > > > > which works as the mediator between XEN Domains and Firmware (SCP, 
> > > > > ATF etc).
> > > > > This allows devices from the Domains to work with clocks, resets and
> > > > > power-domains without access to CPG.
> > > > > 
> > > > > The following features are implemented:
> > > > > - request SCMI channels from ATF and pass channels to Domains;
> > > > > - set device permissions for Domains based on the Domain partial
> > > > > device-tree. Devices with permissions are able to work with clocks,
> > > > > resets and power-domains via SCMI;
> > > > > - redirect scmi messages from Domains to ATF.
> > > > > 
> > > > > Signed-off-by: Oleksii Moisieiev 
> > > > > ---
> > > > >  xen/arch/arm/Kconfig  |   2 +
> > > > >  xen/arch/arm/sci/Kconfig  |  10 +
> > > > >  xen/arch/arm/sci/Makefile |   1 +
> > > > >  xen/arch/arm/sci/scmi_smc.c   | 795 
> > > > > ++
> > > > >  xen/include/public/arch-arm.h |   1 +
> > > > >  5 files changed, 809 insertions(+)
> > > > >  create mode 100644 xen/arch/arm/sci/Kconfig
> > > > >  create mode 100644 xen/arch/arm/sci/scmi_smc.c
> > > > > 
> > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > > index 186e1db389..02d96c6cfc 100644
> > > > > --- a/xen/arch/arm/Kconfig
> > > > > +++ b/xen/arch/arm/Kconfig
> > > > > @@ -114,6 +114,8 @@ config SCI
> > > > > support. It allows guests to control system resourcess via 
> > > > > one of
> > > > > SCI mediators implemented in XEN.
> > > > >  
> > > > > +source "arch/arm/sci/Kconfig"
> > > > > +
> > > > >  endmenu
> > > > >  
> > > > >  menu "ARM errata workaround via the alternative framework"
> > > > > diff --git a/xen/arch/arm/sci/Kconfig b/xen/arch/arm/sci/Kconfig
> > > > > new file mode 100644
> > > > > index 00..9563067ddc
> > > > > --- /dev/null
> > > > > +++ b/xen/arch/arm/sci/Kconfig
> > > > > @@ -0,0 +1,10 @@
> > > > > +config SCMI_SMC
> > > > > + bool "Enable SCMI-SMC mediator driver"
> > > > > + default n
> > > > > + depends on SCI
> > > > > + ---help---
> > > > > +
> > > > > + Enables mediator in XEN to pass SCMI requests from Domains to 
> > > > > ATF.
> > > > > + This feature allows drivers from Domains to work with System
> > > > > + Controllers (such as power,resets,clock etc.). SCP is used as 
> > > > > transport
> > > > > + for communication.
> > > > > diff --git a/xen/arch/arm/sci/Makefile b/xen/arch/arm/sci/Makefile
> > > > > index 837dc7492b..67f2611872 100644
> > > > > --- a/xen/arch/arm/sci/Makefile
> > > > > +++ b/xen/arch/arm/sci/Makefile
> > > > > @@ -1 +1,2 @@
> > > > >  obj-y += sci.o
> > > > > +obj-$(CONFIG_SCMI_SMC) += scmi_smc.o
> > > > > diff --git a/xen/arch/arm/sci/scmi_smc.c b/xen/arch/arm/sci/scmi_smc.c
> > > > > new file mode 100644
> > > > > index 00..2eb01ea82d
> > > > > --- /dev/null
> > > > > +++ b/xen/arch/arm/sci/scmi_smc.c
> > > > > @@ -0,0 +1,795 @@
> > > > > +/*
> > > > > + * xen/arch/arm/sci/scmi_smc.c
> > > > > + *
> > > > > + * SCMI mediator driver, using SCP as transport.
> > > > > + *
> > > > > + * Oleksii Moisieiev 
> > > > > + * Copyright (C) 2021, EPAM Systems.
> > > > > + *
> > > > > + * This program is free software; you can redistribute it and/or 
> > > > > modify
> > > > > + * it under the terms of the GNU General Public License as published 
> > > > > by
> > > > > + * the Free Software Foundation; either version 2 of the License, or
> > > > > + * (at your option) any later version.
> > > > > + *
> > > > > + * This program is distributed in the hope that it will be useful,
> > > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > > + * GNU General Public License for more details.
> > > > > + */
> > > > > +
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +
> > > > > +#define SCMI_BASE_PROTOCOL  0x10
> > > > > +#define SCMI_BASE_PROTOCOL_ATTIBUTES0x1
> > > > > +#define SCMI_BASE_SET_DEVICE_PERMISSIONS0x9
> > > > > +#define SCMI_BASE_RESET_AGENT_CONFIGURATION 0xB
> > > > > +#define SCMI_BASE_DISCOVER_AGENT0x7
> > > > > +
> > > > > +/* SCMI return codes. See section 4.1.4 of SCMI spec (DEN0056C) */
> > > > >

Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver

2021-12-21 Thread Oleksii Moisieiev
Hi Stefano,

On Mon, Dec 20, 2021 at 04:52:01PM -0800, Stefano Stabellini wrote:
> On Mon, 20 Dec 2021, Oleksii Moisieiev wrote:
> > Hi Stefano,
> > 
> > On Fri, Dec 17, 2021 at 06:14:55PM -0800, Stefano Stabellini wrote:
> > > On Tue, 14 Dec 2021, Oleksii Moisieiev wrote:
> > > > This is the implementation of SCI interface, called SCMI-SMC driver,
> > > > which works as the mediator between XEN Domains and Firmware (SCP, ATF 
> > > > etc).
> > > > This allows devices from the Domains to work with clocks, resets and
> > > > power-domains without access to CPG.
> > > > 
> > > > The following features are implemented:
> > > > - request SCMI channels from ATF and pass channels to Domains;
> > > > - set device permissions for Domains based on the Domain partial
> > > > device-tree. Devices with permissions are able to work with clocks,
> > > > resets and power-domains via SCMI;
> > > > - redirect scmi messages from Domains to ATF.
> > > > 
> > > > Signed-off-by: Oleksii Moisieiev 
> > > > ---
> > > >  xen/arch/arm/Kconfig  |   2 +
> > > >  xen/arch/arm/sci/Kconfig  |  10 +
> > > >  xen/arch/arm/sci/Makefile |   1 +
> > > >  xen/arch/arm/sci/scmi_smc.c   | 795 ++
> > > >  xen/include/public/arch-arm.h |   1 +
> > > >  5 files changed, 809 insertions(+)
> > > >  create mode 100644 xen/arch/arm/sci/Kconfig
> > > >  create mode 100644 xen/arch/arm/sci/scmi_smc.c
> > > > 
> > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > index 186e1db389..02d96c6cfc 100644
> > > > --- a/xen/arch/arm/Kconfig
> > > > +++ b/xen/arch/arm/Kconfig
> > > > @@ -114,6 +114,8 @@ config SCI
> > > >   support. It allows guests to control system resourcess via 
> > > > one of
> > > >   SCI mediators implemented in XEN.
> > > >  
> > > > +source "arch/arm/sci/Kconfig"
> > > > +
> > > >  endmenu
> > > >  
> > > >  menu "ARM errata workaround via the alternative framework"
> > > > diff --git a/xen/arch/arm/sci/Kconfig b/xen/arch/arm/sci/Kconfig
> > > > new file mode 100644
> > > > index 00..9563067ddc
> > > > --- /dev/null
> > > > +++ b/xen/arch/arm/sci/Kconfig
> > > > @@ -0,0 +1,10 @@
> > > > +config SCMI_SMC
> > > > +   bool "Enable SCMI-SMC mediator driver"
> > > > +   default n
> > > > +   depends on SCI
> > > > +   ---help---
> > > > +
> > > > +   Enables mediator in XEN to pass SCMI requests from Domains to 
> > > > ATF.
> > > > +   This feature allows drivers from Domains to work with System
> > > > +   Controllers (such as power,resets,clock etc.). SCP is used as 
> > > > transport
> > > > +   for communication.
> > > > diff --git a/xen/arch/arm/sci/Makefile b/xen/arch/arm/sci/Makefile
> > > > index 837dc7492b..67f2611872 100644
> > > > --- a/xen/arch/arm/sci/Makefile
> > > > +++ b/xen/arch/arm/sci/Makefile
> > > > @@ -1 +1,2 @@
> > > >  obj-y += sci.o
> > > > +obj-$(CONFIG_SCMI_SMC) += scmi_smc.o
> > > > diff --git a/xen/arch/arm/sci/scmi_smc.c b/xen/arch/arm/sci/scmi_smc.c
> > > > new file mode 100644
> > > > index 00..2eb01ea82d
> > > > --- /dev/null
> > > > +++ b/xen/arch/arm/sci/scmi_smc.c
> > > > @@ -0,0 +1,795 @@
> > > > +/*
> > > > + * xen/arch/arm/sci/scmi_smc.c
> > > > + *
> > > > + * SCMI mediator driver, using SCP as transport.
> > > > + *
> > > > + * Oleksii Moisieiev 
> > > > + * Copyright (C) 2021, EPAM Systems.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License as published by
> > > > + * the Free Software Foundation; either version 2 of the License, or
> > > > + * (at your option) any later version.
> > > > + *
> > > > + * This program is distributed in the hope that it will be useful,
> > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > + * GNU General Public License for more details.
> > > > + */
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#define SCMI_BASE_PROTOCOL  0x10
> > > > +#define SCMI_BASE_PROTOCOL_ATTIBUTES0x1
> > > > +#define SCMI_BASE_SET_DEVICE_PERMISSIONS0x9
> > > > +#define SCMI_BASE_RESET_AGENT_CONFIGURATION 0xB
> > > > +#define SCMI_BASE_DISCOVER_AGENT0x7
> > > > +
> > > > +/* SCMI return codes. See section 4.1.4 of SCMI spec (DEN0056C) */
> > > > +#define SCMI_SUCCESS  0
> > > > +#define SCMI_NOT_SUPPORTED  (-1)
> > > > +#define SCMI_INVALID_PARAMETERS (-2)
> > > > +#define SCMI_DENIED (-3)
> > > > +#define SCMI_NOT_FOUND  (-4)
> > > > +#define SCMI_OUT_OF_RANGE   (-5)
> > > > +#define SCMI_BUSY   

Re: Possible bug? DOM-U network stopped working after fatal error reported in DOM0

2021-12-21 Thread Roger Pau Monné
On Wed, Dec 22, 2021 at 02:19:03AM +0800, G.R. wrote:
> > > I omitted all operational details with the assumption that you are 
> > > familiar
> > > with TrueNAS and iSCSI setup.
> >
> > Not really. Ideally I would like a way to reproduce that can be done
> > using iperf, nc or similar simple command line tool, without requiring
> > to setup iSCSI.
> I think it would be tricky then. The problem hide itself well enough
> that I wasn't
> aware soon after upgrading since everything else works flawlessly --
> nfs, ssh, web etc.
> 
> > Can you also paste the output of `ifconfig xn0`?
> Here it is:
> xn0: flags=8843 metric 0 mtu 1500
> options=503
> ether 00:18:3c:51:6e:4c
> inet 192.168.1.9 netmask 0xff00 broadcast 192.168.1.255
> media: Ethernet manual
> status: active
> nd6 options=1
> 
> >
> > If I provided a patch for the FreeBSD kernel, would you be able to
> > apply and test it?
> Probably. I did this before when your XEN support for freeBSD was not
> available out-of-box.
> Just need to recreate all the required environments to apply the patch.

Could you build a debug kernel with the following patch applied and
give me the trace when it explodes?

Thanks, Roger.
---
diff --git a/sys/dev/xen/netfront/netfront.c b/sys/dev/xen/netfront/netfront.c
index fd2d97a7c70c..87bc3ecfc4dd 100644
--- a/sys/dev/xen/netfront/netfront.c
+++ b/sys/dev/xen/netfront/netfront.c
@@ -1519,8 +1519,12 @@ xn_count_frags(struct mbuf *m)
 {
int nfrags;
 
-   for (nfrags = 0; m != NULL; m = m->m_next)
+   for (nfrags = 0; m != NULL; m = m->m_next) {
+   KASSERT(
+  (mtod(m, vm_offset_t) & PAGE_MASK) + m->m_len <= PAGE_SIZE,
+   ("mbuf fragment crosses a page boundary"));
nfrags++;
+   }
 
return (nfrags);
 }




[qemu-mainline test] 167501: tolerable FAIL - PUSHED

2021-12-21 Thread osstest service owner
flight 167501 qemu-mainline real [real]
flight 167507 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167501/
http://logs.test-lab.xenproject.org/osstest/logs/167507/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-libvirt-raw 12 debian-di-install   fail pass in 167507-retest

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail in 167507 never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail in 167507 never 
pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167496
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167496
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167496
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167496
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167496
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167496
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167496
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167496
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass

version targeted for testing:
 qemuu2bf40d0841b942e7ba12953d515e62a436f0af84
baseline version:
 qemuuc7d773ae49688463b59ade6989f8d612fecb973d

Last test of basis   167496  2021-12-20 21:38:

Re: Possible bug? DOM-U network stopped working after fatal error reported in DOM0

2021-12-21 Thread G.R.
> > I omitted all operational details with the assumption that you are familiar
> > with TrueNAS and iSCSI setup.
>
> Not really. Ideally I would like a way to reproduce that can be done
> using iperf, nc or similar simple command line tool, without requiring
> to setup iSCSI.
I think it would be tricky then. The problem hide itself well enough
that I wasn't
aware soon after upgrading since everything else works flawlessly --
nfs, ssh, web etc.

> Can you also paste the output of `ifconfig xn0`?
Here it is:
xn0: flags=8843 metric 0 mtu 1500
options=503
ether 00:18:3c:51:6e:4c
inet 192.168.1.9 netmask 0xff00 broadcast 192.168.1.255
media: Ethernet manual
status: active
nd6 options=1

>
> If I provided a patch for the FreeBSD kernel, would you be able to
> apply and test it?
Probably. I did this before when your XEN support for freeBSD was not
available out-of-box.
Just need to recreate all the required environments to apply the patch.

BTW, uname -a gives me the following;
>12.2-RELEASE-p11 FreeBSD 12.2-RELEASE-p11 75566f060d4(HEAD) TRUENAS  amd64

Thanks,
Timothy



Re: [XEN PATCH 47/57] libs/stat: Fix and rework python-bindings build

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 06:47:17PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > Fix the dependency on the library, $(SHLIB) variable doesn't exist
> > anymore.
> >
> > Rework dependency on the include file, we can let `swig` generate the
> > dependency for us with the use of "-M*" flags.
> 
> Hmm.  At no point is swig mentioned in README/etc, and it's not in any
> of the CI logic.  I wasn't even aware that we had python bindings here.
> 
> Unless someone tries and confirms them not to be broken, I'd be tempted
> to drop it all.  I bet this has been dead since we dropped Xend.

How about the perl binding? Nothing to do with xend for them.

The only way to build the binding is to set a variable. It's not
possible to enable the binding via ./configure at the moment. So, yes,
maybe no one uses them.

Cheers,

-- 
Anthony PERARD



Re: [XEN PATCH 41/57] libs: Remove need for *installlocal targets

2021-12-21 Thread Anthony PERARD
On Tue, Dec 07, 2021 at 02:20:29PM +0100, Juergen Gross wrote:
> On 06.12.21 18:02, Anthony PERARD wrote:
> > There is no need for an extra "installlocal" target, we can use
> > double-colon rules instead.
> 
> Doesn't that mean that with the ultimate goal of including all
> Makefiles of the tree that all "install" and "uninstall" rules in the
> tree will have to be double-colon rules?

So, the install targets will be double-colon but for a different reason
(because of the framework been used). All the "install" target will
actually have different name, they will be prefixed by the name of the
directory where the Makefile is.

For example, "install" in "xl" will be named "xl/install", but there
will be something that will allow to run "make -C xl install" which will
call the rule "xl/install".

> With above remark I don't see how tools/libs/stat/Makefile can be left
> unmodified, as it includes libs.mk and it contains an "intall:" rule.

I've missed that one, just because ./configure doesn't allow to enable
the binding...

Thanks,

-- 
Anthony PERARD



[linux-linus test] 167500: tolerable FAIL - PUSHED

2021-12-21 Thread osstest service owner
flight 167500 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167500/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 167495

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10   fail  like 167495
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167495
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167495
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167495
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167495
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167495
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167495
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167495
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167495
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 linux6e0567b7305209c2d689ce57180a63d8dc657ad8
baseline version:
 linux86085fe79e3c1a66e32f2acae0ae64f4cceb8d28

Last test of basis   167495  2021-12-20 21:12:11 Z0 days
Testing same since   167500  2021-12-21 05:30:08 Z0 days1 attempts


People who touched revisions under test:
  Jason Gunthorpe 
  Jiacheng Shi 
  José Expósito 
  Linus Torvalds 
  Mike Marciniszyn 
  Wenpeng Liang 
  Yangyang Li 

jobs:
 build-amd64-xsm   

Re: [XEN PATCH 26/57] tools/firmware/hvmloader: rework Makefile

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 06:03:54PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > Setup proper dependencies with libacpi so we don't need to run "make
> > hvmloader" in the "all" target. ("build.o" new prerequisite isn't
> > exactly proper but a side effect of building the $(DSDT_FILES) is to
> > generate the "ssdt_*.h" needed by "build.o".)
> >
> > Make use if "-iquote" instead of a plain "-I".
> 
> So I've read up on what this means, but is it really that important in
> the grand scheme of things?

It something that Jan proposed to do in some cases in the hypervisor
build system. I thought it was something good to do so I start using
-iquote in the toolstack as well.

> Can't we actually make our problems go away by turning libacpi into a
> real static library?  (Also, the "kill hvmloader plans" will turn
> libacpi back into only having one single user, so that too)

Well, libacpi also have some headers that needs to be generated, so I'm
not sure which problem are going away when turning it into a static lib.
Also, I don't think hvmloader and libxl would want the same library.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 25/57] tools/examples: cleanup Makefile

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 05:57:43PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > diff --git a/tools/examples/Makefile b/tools/examples/Makefile
> > index 14e24f4cb3..48b520e133 100644
> > --- a/tools/examples/Makefile
> > +++ b/tools/examples/Makefile
> > @@ -26,10 +22,8 @@ uninstall: uninstall-readmes uninstall-configs
> >  
> >  .PHONY: install-readmes
> >  install-readmes:
> > -   [ -d $(DESTDIR)$(XEN_CONFIG_DIR) ] || \
> > -   $(INSTALL_DIR) $(DESTDIR)$(XEN_CONFIG_DIR)
> > -   set -e; for i in $(XEN_READMES); \
> > -   do [ -e $(DESTDIR)$(XEN_CONFIG_DIR)/$$i ] || \
> > +   $(INSTALL_DIR) $(DESTDIR)$(XEN_CONFIG_DIR)
> > +   set -e; for i in $(XEN_READMES); do \
> > $(INSTALL_DATA) $$i $(DESTDIR)$(XEN_CONFIG_DIR); \
> 
> Hmm.  Do we need a shell loop here at all?  Can't $(INSTALL_DATA) take
> multiple $i's at the same time?

Probably, even if the only time they are multiple filed install by
INSTALL_DATA in our build system is when shell globing is involve.
So, it's probably fine to remove the loop.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 24/57] tools/debugger/gdbsx: Fix and cleanup makefiles

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 05:55:29PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > diff --git a/tools/debugger/gdbsx/Makefile b/tools/debugger/gdbsx/Makefile
> > index 8d7cd94a31..4aaf427c45 100644
> > --- a/tools/debugger/gdbsx/Makefile
> > +++ b/tools/debugger/gdbsx/Makefile
> > @@ -28,7 +28,7 @@ uninstall:
> >  gdbsx: gx/gx_all.a xg/xg_all.a 
> > $(CC) $(LDFLAGS) -o $@ $^
> >  
> > -xg/xg_all.a:
> > +xg/xg_all.a: FORCE
> > $(MAKE) -C xg
> > -gx/gx_all.a:
> > +gx/gx_all.a: FORCE
> > $(MAKE) -C gx
> 
> Shouldn't these be in the sub Make's ?

No, this is how we tell make how to build some of the prerequisite
needed to build "gdbsx", we tell make that they are build in
sub-directory.

> > diff --git a/tools/debugger/gdbsx/gx/Makefile 
> > b/tools/debugger/gdbsx/gx/Makefile
> > -#%.o: %.c $(GX_HDRS) Makefile
> > -#  $(CC) -c $(CFLAGS) -o $@ $<
> > -
> > -gx_all.a: $(GX_OBJS) Makefile $(GX_HDRS)
> > -   ar cr $@ $(GX_OBJS)# problem with ld using -m32 
> > +gx_all.a: $(GX_OBJS) Makefile
> > +   ar cr $@ $(GX_OBJS)
> 
> There's probably an $(AR) we ought to be using.

Yes, I'll look at that.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 22/57] tools/console: have one Makefile per program/directory

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 05:26:49PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > Sources of both xenconsoled and xenconsole are already separated into
> > different directory and don't share anything in common. Having two
> > different Makefile means it's easier to deal with *FLAGS.
> >
> > Some common changes:
> > Rename $(BIN) to $(TARGETS), this will be useful later.
> > Stop removing *.so *.rpm *.a as they aren't created here.
> > Use $(OBJS-y) to list objects.
> > Update $(CFLAGS) for the directory rather than a single object.
> >
> > daemon:
> > Remove the need for $(LDLIBS_xenconsoled), use $(LDLIBS) instead.
> > Remove the need for $(CONSOLE_CFLAGS-y) and use $(CFLAGS-y)
> > instead.
> >
> > client:
> > Remove the unused $(LDLIBS_xenconsole)
> >
> > Signed-off-by: Anthony PERARD 
> > ---
> >  .gitignore|  4 +--
> >  tools/console/Makefile| 49 +++---
> >  tools/console/client/Makefile | 39 +++
> >  tools/console/daemon/Makefile | 50 +++
> >  4 files changed, 94 insertions(+), 48 deletions(-)
> >  create mode 100644 tools/console/client/Makefile
> >  create mode 100644 tools/console/daemon/Makefile
> >
> > diff --git a/.gitignore b/.gitignore
> > index b39b996718..c31fa9b841 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -159,8 +159,8 @@ tools/libs/util/libxenutil.map
> >  tools/libs/vchan/headers.chk
> >  tools/libs/vchan/libxenvchan.map
> >  tools/libs/vchan/xenvchan.pc
> > -tools/console/xenconsole
> > -tools/console/xenconsoled
> > +tools/console/client/xenconsole
> > +tools/console/daemon/xenconsoled
> 
> $ git ls-files -- :/ | grep gitignore
> ../.gitignore
> ../tools/fuzz/cpu-policy/.gitignore
> ../tools/libs/.gitignore
> ../tools/misc/.gitignore
> ../tools/tests/cpu-policy/.gitignore
> ../tools/tests/resource/.gitignore
> ../tools/tests/tsx/.gitignore
> ../tools/tests/vhpet/.gitignore
> ../tools/tests/xenstore/.gitignore
> tools/kconfig/.gitignore
> xsm/flask/.gitignore
> 
> 
> We're starting to use per-dir gitignores, because it has far less
> problematic behaviour for code movement.

You mean "we", I don't think everyone agree we that yet ;-). They aren't
any "xen/.gitignore" yet, despite me trying to add one at some point.

> I think we ought to take this opportunity to clean things up for the better.

Sounds good to me, I'll make the change.

> > diff --git a/tools/console/client/Makefile b/tools/console/client/Makefile
> > new file mode 100644
> > index 00..44176c6d93
> > --- /dev/null
> > +++ b/tools/console/client/Makefile
> > @@ -0,0 +1,39 @@
> > +XEN_ROOT=$(CURDIR)/../../..
> > +include $(XEN_ROOT)/tools/Rules.mk
> > +
> > +CFLAGS += -Werror
> 
> -Werror really ought to come from somewhere common, seeing as we expect
> it to be unilaterally set.

Yes. I think I'll also look at having "./configure --disable-werror" or
similar so -Werror could be easly disable, by for example someone
building an ancient release and not wanting to deal with warnings.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 19/57] tools/configure.ac: Create ZLIB_LIBS and ZLIB_CFLAGS

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 05:16:05PM +, Andrew Cooper wrote:
> On 06/12/2021 17:02, Anthony PERARD wrote:
> > diff --git a/tools/libs/guest/Makefile b/tools/libs/guest/Makefile
> > index 8f5f3acd21..1f4b7f7c58 100644
> > --- a/tools/libs/guest/Makefile
> > +++ b/tools/libs/guest/Makefile
> > @@ -103,8 +102,7 @@ NO_HEADERS_CHK := y
> >  
> >  include $(XEN_ROOT)/tools/libs/libs.mk
> >  
> > -libxenguest.so.$(MAJOR).$(MINOR): COMPRESSION_LIBS = $(filter 
> > -l%,$(zlib-options))
> > -libxenguest.so.$(MAJOR).$(MINOR): APPEND_LDFLAGS += $(COMPRESSION_LIBS) -lz
> > +libxenguest.so.$(MAJOR).$(MINOR): APPEND_LDFLAGS += $(ZLIB_LIBS) -lz
> 
> Looking ZLIB vs the other compression libs, shouldn't -lz be inside
> $(ZLIB_LIBS) ?

I don't think so because -lz is also used by libxenhypfs for example,
and maybe go binding for libxl. It seems that the name of the variable
is misleading, as it seems that -lz is zlib but $ZLIB doesn't contain it
or anything that would be called zlib.

-lz is check explicitly by configure and fail if missing, while $ZLIB
can be empty.

> Also, shouldn't this be LDLIBS rather than APPEND_LDFLAGS ?

Yes, I think I'm doing that in a different patch.
libs: rename LDUSELIBS to LDLIBS and use it instead of APPEND_LDFLAGS

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 14/57] tools/flask/utils: remove unused variables/targets from Makefile

2021-12-21 Thread Anthony PERARD
On Thu, Dec 16, 2021 at 12:36:41PM +, Andrew Cooper wrote:
> On 06/12/2021 17:01, Anthony PERARD wrote:
> > @@ -35,21 +29,13 @@ flask-set-bool: set-bool.o
> >  
> >  .PHONY: clean
> >  clean: 
> > -   rm -f *.o *.opic *.so
> > +   rm -f *.o
> > rm -f $(CLIENTS)
> > $(RM) $(DEPS_RM)
> 
> Here and in plenty of subsequent patches, there's manipulation of raw
> `rm -f`'s which ought to be cleaned up to $(RM)

About using $(RM) or `rm -f`, I don't think we need to care if one or
the other is used in Makefiles, they needs to be equivalent or things
are going to fails. GNU make manual says that `rm` should exist,
especially when things like autoconf are used. For example our
./configure scrips is using `rm -f` and I don't think we can configure
that. So having RM configurable doesn't really serve any purpose.

Beyond that, using $(RM) is good in order to be consistent, and changing
"rm -f" to "$(RM)" is fine, I might not doing it myself or ask for it.

> I can fix this on commit if you're happy.

For the avoidance of doubt, yes, I'm happy with the patch that have been
committed.

Thanks,

-- 
Anthony PERARD



Re: [PATCH V6 1/2] libxl: Add support for Virtio disk configuration

2021-12-21 Thread Anthony PERARD
On Fri, Dec 17, 2021 at 06:50:02PM +0200, Oleksandr wrote:
> On 17.12.21 17:26, Juergen Gross wrote:
> > On 15.12.21 22:36, Oleksandr wrote:
> > > On 15.12.21 17:58, Juergen Gross wrote:
> > > > In practice we are having something like the "protocol" already today:
> > > > the disk device name is encoding that ("xvd*" is a Xen PV disk, while
> > > > "sd*" is an emulated SCSI disk, which happens to be presented to the
> > > > guest as "xvd*", too). And this is an additional information not
> > > > related to the backendtype.

You mean in theory? ;-) In practice, xvd* is the same as hd* (or sd*?).
I tried once to have xvd* mean PV disk only, but the patch was rejected.
So at the moment, we always get an emulated disk, we can't have PV disk
alone, at least on x86.

> > > > 
> > > > So we have basically the following configuration items, which are
> > > > orthogonal to each other (some combinations might not make sense,
> > > > but in theory most would be possible):
> > > > 
> > > > 1. protocol: emulated (not PV), Xen (like today), virtio
> > > > 
> > > > 2. backendtype: phy (blkback), qdisk (qemu), other (e.g. a daemon)
> > > > 
> > > > 3. format: raw, qcow, qcow2, vhd, qed
> > > > 
> > > > The combination virtio+phy would be equivalent to vhost, BTW. And
> > > > virtio+other might even use vhost-user, depending on the daemon.
> > > yes, BTW the combination virtio+other is close to our use-case.
> > > 
> > > 
> > > Thank you for the detailed explanation, now I see your point why
> > > using backendtype=virtio is not flexible option in the long term
> > > and why we would want/need to an extra configuration option such as
> > > protocol, etc. I think, it makes sense and would be correct.
> > > 
> > > If we take a disk as an example, then from the configuration PoV we
> > > will need to:
> > > - add an optional "protocol" option
> > 
> > I'm not sure regarding the name of the option. "protocol" was just a
> > suggestion by me.
> 
> Yes, personally I would be fine with either "protocol" or "specification",
> with a little preference for the former. What other people think of the
> name?

I don't have a better idea. "protocol" sound fine, as long as the description of
this new field is about how a guest kernel will communicate with the
backend.

We could start with "default" and "virtio-mmio" as options. "default" is
what we have now and usually mean emulated+xen-pv.

> > 
> > > - add new backendtype: external/other/daemon/etc.
> > > This seems to cover all possible combinations describe above
> > > (although I agree that some of them might not make sense). Is my
> > > understanding correct?
> > 
> > Yes.
> 
> ok, thank you for confirming.
> 
> > > Unfortunately, disk configuration/management code is spread over
> > > multiple sources (including auto-generated) in the toolstack which
> > > is not so easy to follow (at least to me
> > > who is not familiar enough with all this stuff), but anyway may I
> > > please clarify what is the minimum required amount of things that I
> > > need to do in order to get this basic virtio-mmio
> > > support series accepted?
> > 
> > I'd say we should first get consensus that others agree with my
> > suggestion.
> This is fair. Personally I share your opinion (what you propose sounds
> reasonable to me in general). Are there any other opinions? Any feedback
> would be highly appreciated.

The new proposed property sound good to me as well.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH v8 00/47] xen: Build system improvements, now with out-of-tree build!

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
> br.build-system-xen-v8
> 
> v8:
> Mostly rework of v7. With many patch already applied.
> Some detail changes that are spread through many patches:
> - `make cloc` recipe should now work throughout the series, update of it 
> is
>   done in 3 patches.
> - new patch "build: fix enforce unique symbols for recent clang version"
>   to fix an issue with clang.
> - introducing $(srctree) and $(objtree) earlier
> - introducing $(srcdir) as shortcut for $(srctree)/$(src)
> - introduce usage of -iquote instead of -I in some cases
> More detail change log can be found in patches notes.
> 
> Also this v8 present a work-in-progress of the ability to do out-of-tree
> build without setting VPATH. This is presented as an alternative to force
> use of out-of-tree build. As the last patch show, it allows to build the
> xen-shim without the linkfarm and we don't need to make any other changes
> to any thing that build xen (osstest, distribution packages, xen.git, ...,
> and developers finger macros). The patches are only there as WIP / RFC as
> they were some concern about the usefulness and extra changes needed.
> We can decide whether those changes are good or if this is too much and we
> should force out-of-tree build for the hypervisor.

I'm afraid I'm of two minds here. I don't think we want to force people to
do out-of-tree builds, but I also dislike the idea of mixing in-tree and
out-of-tree builds. Yet reading the above I understand that the shim build
would conflict with an in-tree build because certain files would be picked
(the shim build being an out-of-tree one) from the (dirtied) source tree,
rather than the shim's build tree. Perhaps the extra path prefixes that I
commented upon in an individual patch are then indeed the least bad route
to take.

There's one compromise which comes to mind, but which may also not be
liked: We could simply fail out-of-tree builds when the source tree is
dirty. Then people wanting the shim built would need to use out-of-tree
builds also for the "main" hypervisor, but people suppressing the shim
build anyway (or doing it separately, e.g. using a non-default .config)
could continue to do in-tree builds. The one puzzle piece I'm lacking so
far (perhaps simply because of having overlooked where it is) is how, for
a full-build-of-everything, one would control where the xen/ part of the
build would go _outside_ of the source (sub-)tree.

Jan




Re: [XEN PATCH v8 36/47] RFC, no-VPATH: Kconfig: only ready auto.conf from objtree

2021-12-21 Thread Anthony PERARD
On Tue, Dec 21, 2021 at 03:23:48PM +0100, Jan Beulich wrote:
> On 25.11.2021 14:39, Anthony PERARD wrote:
> > From: Anthony PERARD 
> > 
> > zconf_fopen() will open file in $srctree if the file isn't in
> > $objtree. This is an issue for "auto.conf" as it could be in the
> > source tree if it is dirty. With "auto.conf" read from the source
> > tree, kconfig will not properly set the file associated with a
> > CONFIG_* properly in "include/config/" and instead only touch the
> > files that correspond to new CONFIG_* option that were not set in the
> > "auto.conf" found in the source tree.
> 
> Do we really mean to support mixing in-tree and out-of-tree builds
> from one source tree? Without such mixing, iiuc there would be no
> such problem.

Could you read and reply to the cover letter instead? The second
paragraph of the v8 changelog speaks about this. I would rather have a
discussion about this on the cover letter thread than in a patch.

Thanks,

-- 
Anthony PERARD



Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs

2021-12-21 Thread Anthony PERARD
On Fri, Dec 17, 2021 at 12:15:25PM +, Oleksii Moisieiev wrote:
> > On 14.12.21 11:34, Oleksii Moisieiev wrote:
> > > @@ -1817,17 +1858,24 @@ static void libxl__add_dtdevs(libxl__egc *egc, 
> > > libxl__ao *ao, uint32_t domid,
> > >   {
> > >   AO_GC;
> > >   libxl__ao_device *aodev = libxl__multidev_prepare(multidev);
> > > -int i, rc = 0;
> > > +int i, rc = 0, rc_sci = 0;
> > >   for (i = 0; i < d_config->num_dtdevs; i++) {
> > >   const libxl_device_dtdev *dtdev = &d_config->dtdevs[i];
> > >   LOGD(DEBUG, domid, "Assign device \"%s\" to domain", 
> > > dtdev->path);
> > >   rc = xc_assign_dt_device(CTX->xch, domid, dtdev->path);
> > > -if (rc < 0) {
> > > -LOGD(ERROR, domid, "xc_assign_dtdevice failed: %d", rc);
> > > +rc_sci = xc_domain_add_sci_device(CTX->xch, domid, dtdev->path);
> > > +
> > > +if ((rc < 0) && (rc_sci < 0)) {
> > > +LOGD(ERROR, domid, "xc_assign_dt_device failed: %d; "
> > > + "xc_domain_add_sci_device failed: %d",
> > > + rc, rc_sci);
> > >   goto out;
> > >   }
> > > +
> > > +if (rc)
> > > +rc = rc_sci;
> > 
> > 
> > If I get this code right, it sounds like the dom.cfg's dtdev property is
> > reused to describe sci devices as well, but the libxl__add_dtdevs() does not
> > (and can not) differentiate them. So it has no option but to send two
> > domctls for each device in dtdevs[] hoping to at least one domctl to
> > succeeded. Or I really missed something?
> > 
> > It feels to me that:
> >  - either new dom.cfg's property could be introduced (scidev?) to describe
> > sci devices together with new parsing logic/management code, so you will end
> > up having new libxl__add_scidevs() to invoke XEN_DOMCTL_add_sci_device,
> > so no mixing things.
> >  - or indeed dtdev logic could be *completely* reused including extending
> > XEN_DOMCTL_assign_device to cover your use-case, although sounds generic, it
> > is used to describe devices for the passthrough (to configure an IOMMU for
> > the device), in that case you wouldn't need an extra
> > XEN_DOMCTL_add_sci_device introduced by current patch.
> > 
> > Personally I would use the first option as I am not sure that second option
> > is conceptually correct && possible. I would leave this for the maintainers
> > to clarify.
> > 
> 
> Thank you for the point. I agree that reusing XEN_DOMCTL_assign_device
> seems not to be conceptually correct. Introducing new dom.cfg property
> seems to be the only way to avoid extra Domctl calls.
> I will handle this in v2 if there will be no complains from Maintainers.

I don't know if there will be a complains in v2 or not :-), but using
something different from dtdev sound good.

If I understand correctly, dtdev seems to be about passing through an
existing device to a guest, and this new sci device seems to be about having Xen
"emulating" an sci device which the guest will use. So introducing
something new (scidev or other name) sounds good.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH v8 36/47] RFC, no-VPATH: Kconfig: only ready auto.conf from objtree

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> From: Anthony PERARD 
> 
> zconf_fopen() will open file in $srctree if the file isn't in
> $objtree. This is an issue for "auto.conf" as it could be in the
> source tree if it is dirty. With "auto.conf" read from the source
> tree, kconfig will not properly set the file associated with a
> CONFIG_* properly in "include/config/" and instead only touch the
> files that correspond to new CONFIG_* option that were not set in the
> "auto.conf" found in the source tree.

Do we really mean to support mixing in-tree and out-of-tree builds
from one source tree? Without such mixing, iiuc there would be no
such problem.

Jan




Re: [XEN PATCH v8 34/47] build: add %.E targets

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> I guess it's easier to remember that %.E does "$(CC) -E" or "$(CPP)".
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Anthony PERARD 

I understand that this patch isn't going to be needed anymore.

Jan




Re: [XEN PATCH v8 32/47] build: shuffle main Makefile

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> Reorganize a bit the Makefile ahead of patch
> "build: adding out-of-tree support to the xen build"

Without you saying so it's not clear why that's good of even
necessary. Moving things around always has the potential of
breaking subtle ordering requirements, so I'd prefer to see
what good the movement does.

Jan




Re: [XEN PATCH v8 31/47] build: specify source tree in include/ for prerequisite

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> When doing an out-of-tree build, and thus setting VPATH,
> GNU Make 3.81 on Ubuntu Trusty complains about Circular dependency of
> include/Makefile and include/xlat.lst and drop them. The build fails
> later due to headers malformed.

A circular dependency would mean that besides the expected dependency
there is also one of include/Makefile on include/xlat.lst. Where is
that? I'm not aware of anything include/Makefile depends on. Is there
any dependency being introduced in this series, perhaps by way of new
(generated) dependency files? It would be good to have a clear
understanding of the issue - as you describe it, it could as well be
a make flaw.

The adjustments themselves look okay to me, but of course they don't
help readability.

Jan




Re: [XEN PATCH v8 30/47] build: rework "headers*.chk" prerequisite in include/

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> @@ -81,10 +81,21 @@ ifeq ($(XEN_TARGET_ARCH),$(XEN_COMPILE_ARCH))
>  
>  all: $(obj)/headers.chk $(obj)/headers99.chk $(obj)/headers++.chk
>  
> -PUBLIC_HEADERS := $(filter-out $(src)/public/arch-% 
> $(src)/public/dom0_ops.h, $(wildcard $(src)/public/*.h $(src)/public/*/*.h) 
> $(public-y))
> +public-hdrs-path := $(srcdir)/public
>  
> -PUBLIC_C99_HEADERS := $(src)/public/io/9pfs.h $(src)/public/io/pvcalls.h
> -PUBLIC_ANSI_HEADERS := $(filter-out $(src)/public/%ctl.h $(src)/public/xsm/% 
> $(src)/public/%hvm/save.h $(PUBLIC_C99_HEADERS), $(PUBLIC_HEADERS))
> +public-list-headers = $(wildcard $1/*.h $1/*/*.h)
> +public-filter-headers = $(filter-out $(addprefix 
> $(public-hdrs-path)/,$($1-filter)), $($1))
> +
> +public-c99-headers := io/9pfs.h io/pvcalls.h
> +public-headers := $(call public-list-headers,$(public-hdrs-path)) $(public-y)
> +public-ansi-headers := $(public-headers)

Personally I find it odd for public-c99-headers to be first in this group.
Further down, in the upper-case counterparts, you have it at the end (still
in context below).

> +public-headers-filter := dom0_ops.h arch-%

What is the criteria to be listed here vs ...

> +public-ansi-headers-filter := %ctl.h xsm/% %hvm/save.h 
> $(public-headers-filter) $(public-c99-headers)

... outside of that macro's expansion here? There's no other use of the
macro that I can spot, so its presence is puzzling me.

> +
> +PUBLIC_HEADERS := $(call public-filter-headers,public-headers)
> +PUBLIC_ANSI_HEADERS := $(call public-filter-headers,public-ansi-headers)
> +PUBLIC_C99_HEADERS := $(addprefix $(public-hdrs-path)/, 
> $(public-c99-headers))

While benign right now, wouldn't it be more consistent if
public-filter-headers was also applied in this last case? Or is
there a reason not to?

Jan




Re: [XEN PATCH v8 29/47] build: replace $(BASEDIR) and use $(srctree)

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> $(srctree) is a better description for the source directory than
> $(BASEDIR) that has been used for both source and build directory
> (which where the same).
> 
> This adds $(srctree) to a few path where make's VPATH=$(srctree) won't
> apply. And replace $(BASEDIR) by $(srctree).
> 
> Introduce "$(srcdir)" as a shortcut for "$(srctree)/$(src)" as the
> later is used often enough.
> 
> Signed-off-by: Anthony PERARD 

Acked-by: Jan Beulich 

One remark:

> --- a/xen/scripts/Makefile.clean
> +++ b/xen/scripts/Makefile.clean
> @@ -5,9 +5,12 @@
>  
>  src := $(obj)
>  
> +# shortcut for $(srctree)/$(src)
> +srcdir := $(srctree)/$(src)

Might it make sense to generalize the comment to "# shortcuts" right
away, in case further ones appear? There seems little reason to have
the comment duplicate what the assignment actually does.

Jan




Re: [XEN PATCH v8 28/47] build: replace $(BASEDIR) by $(objtree)

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> We need to differentiate between source files and generated/built
> files. We will be replacing $(BASEDIR) by $(objtree) for files that
> are generated.
> 
> Signed-off-by: Anthony PERARD 

Acked-by: Jan Beulich 




Re: [XEN PATCH v8 27/47] build: grab common EFI source files in arch specific dir

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> Rather than preparing the efi source file, we will make the symbolic
> link as needed from the build location.
> 
> The `ln` command is run every time to allow to update the link in case
> the source tree change location.

Btw, since symlinks aren't being liked, would there be a way to make
things work using vpath?

> This patch also introduce "efi_common.mk" which allow to reuse the
> common make instructions without having to duplicate them into each
> arch.
> 
> And now that we have a list of common source file, we can start to
> remove the links to the source files on clean.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Notes:
> v8:
> - use symbolic link instead of making a copy of the source
> - introduce efi_common.mk
> - remove links to source file on clean
> - use -iquote for "efi.h" headers in common/efi
> 
>  xen/Makefile |  5 -
>  xen/arch/arm/efi/Makefile|  4 ++--
>  xen/arch/x86/Makefile|  1 +
>  xen/arch/x86/efi/Makefile|  5 +
>  xen/common/efi/efi_common.mk | 12 

Could I talk you into avoiding _ when - is suitable, which is the case not
only for (non-exported) make variables, but also file names?

> --- a/xen/arch/arm/efi/Makefile
> +++ b/xen/arch/arm/efi/Makefile
> @@ -1,4 +1,4 @@
> -CFLAGS-y += -fshort-wchar
> +include $(srctree)/common/efi/efi_common.mk
>  
> -obj-y += boot.init.o pe.init.o ebmalloc.o runtime.o
> +obj-y += $(EFIOBJ-y)
>  obj-$(CONFIG_ACPI) +=  efi-dom0.init.o
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index e8151bf4b111..eabd8d3919a4 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -79,6 +79,7 @@ endif
>  
>  # Allows "clean" to descend into boot/
>  subdir- += boot
> +subdir- += efi

This renders the comment stale - please generalize it.

Also, any reason a similar adjustment isn't needed for Arm? Perhaps
this could even move into xen/Makefile:

subdir- += $(wildcard efi/)

> --- /dev/null
> +++ b/xen/common/efi/efi_common.mk
> @@ -0,0 +1,12 @@
> +EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
> +EFIOBJ-$(CONFIG_COMPAT) += compat.o
> +
> +CFLAGS-y += -fshort-wchar
> +CFLAGS-y += -iquote $(srctree)/common/efi
> +
> +$(obj)/%.c: $(abs_srctree)/common/efi/%.c FORCE
> + $(Q)ln -nfs $< $@

Like was the case before, I think it would be better if the links were
relative ones, at least when srctree == objtree (but ideally always).

> +clean-files += $(patsubst %.o,%.c, $(EFIOBJ-y:.init.o=.o) $(EFIOBJ-))

Nit: Please be consistent (at least within a single line) about blanks
following commas.

Jan




Re: Possible bug? DOM-U network stopped working after fatal error reported in DOM0

2021-12-21 Thread Roger Pau Monné
On Tue, Dec 21, 2021 at 01:13:43AM +0800, G.R. wrote:
> First of all, thank you for your quick response, Juergen and Roger.
> I just realized that I run into mail forwarding issue from sourceforge
> mail alias service, and only found the responses when I checked the
> list archive. As a result, I have to manually merge Roger's response
> to reply...
> 
> > > I have to admit that this trial process is blind as I have no idea
> > > which component in the combo is to be blamed. Is it a bug in the
> > > backend-driver, frontend-driver or the hypervisor itself? Or due to
> > > incompatible versions? Any suggestion on other diagnose ideas (e.g.
> > > debug logs) will be welcome, while I work on the planned experiments.
> >
> > This is a bug in FreeBSD netfront, so no matter which Linux or Xen
> > version you use.
> >
> > Does it make a difference if you disable TSO and LRO from netfront?
> >
> > $ ifconfig xn0 -tso -lro
> It does not, the fatal error still show up after this command.

Thanks for testing.

> >
> > Do you have instructions I can follow in order to try to reproduce the
> > issue?
> I don't know if there are any special details in my setup.
> Hopefully I don't miss anything useful:
> 1. Build a TrueNAS 12.0U7 DOM-U by flushing the OS image into a vdisk
> 2. Create / import a zfs pool to the DOM-U
> 3. Create and share some file based iSCSI extents on the pool
> 4. Mount the iSCSI extent through some initiator clients.
> The domU xn0 should be disabled immediately after step #4.
> 
> I omitted all operational details with the assumption that you are familiar
> with TrueNAS and iSCSI setup.

Not really. Ideally I would like a way to reproduce that can be done
using iperf, nc or similar simple command line tool, without requiring
to setup iSCSI.

> For step #4, I can reproduce it with both ipxe initiator and the win7
> built-in client.
> As a result, I assume the client version does not matter.
> For #2, I actually have a physical disk and controller assigned to DOM-U.
> But I suspect this is probably irrelevant.
> For #3, I'm not sure if the content in the extent matters.
> So far I have been testing the same extent, which is formatted as an NTFS 
> disk.

Can you also paste the output of `ifconfig xn0`?

If I provided a patch for the FreeBSD kernel, would you be able to
apply and test it?

Thanks, Roger.



Re: [PATCH] MAINTAINERS: update TXT section

2021-12-21 Thread Roger Pau Monné
On Tue, Dec 21, 2021 at 10:06:37AM +0100, Jan Beulich wrote:
> Since mail to Lukasz'es address has been bouncing, Intel have suggested
> replacement contacts.
> 
> Signed-off-by: Jan Beulich 
> ---
> To be frank, I'm not fully convinced of the M: - I'd instead see us use
> R: for both just like we had it for Lukasz.

I assume it was Intel then to explicitly request for Maintainer
status?

I'm fine with either M or R, albeit I agree R might be more natural
since there's no track of record in xen-devel of any reviews AFAICT.

Acked-by: Roger Pau Monné 

Thanks.



Re: [XEN PATCH v8 26/47] build,x86: remove the need for build32.mk

2021-12-21 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -171,6 +171,10 @@ export LEX = $(if $(FLEX),$(FLEX),flex)
>  # Default file for 'make defconfig'.
>  export KBUILD_DEFCONFIG := $(ARCH)_defconfig
>  
> +# Copy CFLAGS generated by "Config.mk" so they can be reused later without
> +# reparsing Config.mk by e.g. arch/x86/boot/.
> +export XEN_COMMON_CFLAGS := $(CFLAGS)

For my own understanding (it's hard to check being half way through the
series): At this point there are no further adjustments expected to
CFLAGS?

> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -1,25 +1,45 @@
>  obj-bin-y += head.o
> +head-objs := cmdline.S reloc.S

Is "-objs" really a suitable part of the name for a list of *.S?

> -DEFS_H_DEPS = $(abs_srctree)/$(src)/defs.h 
> $(abs_srctree)/include/xen/stdbool.h
> +nocov-y += $(head-objs:.S=.o)
> +noubsan-y += $(head-objs:.S=.o)
> +targets += $(head-objs:.S=.o)
>  
> -CMDLINE_DEPS = $(DEFS_H_DEPS) $(abs_srctree)/$(src)/video.h \
> -$(BASEDIR)/include/xen/kconfig.h \
> -$(BASEDIR)/include/generated/autoconf.h
> +head-objs := $(addprefix $(obj)/, $(head-objs))
>  
> -RELOC_DEPS = $(DEFS_H_DEPS) \
> -  $(BASEDIR)/include/generated/autoconf.h \
> -  $(BASEDIR)/include/xen/kconfig.h \
> -  $(BASEDIR)/include/xen/multiboot.h \
> -  $(BASEDIR)/include/xen/multiboot2.h \
> -  $(BASEDIR)/include/xen/const.h \
> -  $(BASEDIR)/include/public/arch-x86/hvm/start_info.h
> +$(obj)/head.o: $(head-objs)
>  
> -$(obj)/head.o: $(obj)/cmdline.S $(obj)/reloc.S
> +$(head-objs:.S=.lnk): LDFLAGS_DIRECT := $(subst 
> x86_64,i386,$(LDFLAGS_DIRECT))

Considering there's just a single use of LDFLAGS_DIRECT below, wouldn't
it make sense to avoid overriding the variable and doing the $(subst ...)
right at the use site instead?

> -$(obj)/cmdline.S: $(src)/cmdline.c $(CMDLINE_DEPS) $(src)/build32.lds
> - $(MAKE) -f $(abs_srctree)/$(src)/build32.mk -C $(obj) $(@F) 
> CMDLINE_DEPS="$(CMDLINE_DEPS)"
> +CFLAGS_x86_32 := $(subst -m64,-m32 -march=i686,$(XEN_COMMON_CFLAGS))

I can't seem to be able to spot -march=i686 in the old code. Or wait -
Is this duplicating what config/x86_32.mk has?

> +$(call cc-options-add,CFLAGS_x86_32,CC,$(EMBEDDED_EXTRA_CFLAGS))
> +CFLAGS_x86_32 += -Werror -fno-builtin -g0 -msoft-float

You did inherit -Werror and -fno-builtin from $(XEN_COMMON_CFLAGS)
already, so I don't think you need to specify these again?

> +CFLAGS_x86_32 += -I$(srctree)/include

Isn't this present in $(XEN_COMMON_CFLAGS) as well?

> -$(obj)/reloc.S: $(src)/reloc.c $(RELOC_DEPS) $(src)/build32.lds
> - $(MAKE) -f $(abs_srctree)/$(src)/build32.mk -C $(obj) $(@F) 
> RELOC_DEPS="$(RELOC_DEPS)"
> +# override for 32bit binaries
> +$(head-objs:.S=.o): CFLAGS-stack-boundary :=
> +$(head-objs:.S=.o): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic

-fpic should again already be there.

> +$(head-objs): %.S: %.bin
> + (od -v -t x $< | tr -s ' ' | awk 'NR > 1 {print s} {s=$$0}' | \
> + sed 's/ /,0x/g' | sed 's/,0x$$//' | sed 's/^[0-9]*,/ .long /') >$@
> +
> +# Drop .got.plt during conversion to plain binary format.
> +# Please check build32.lds for more details.
> +%.bin: %.lnk
> + $(OBJDUMP) -h $< | sed -n '/[0-9]/{s,00*,0,g;p;}' | \
> + while read idx name sz rest; do \
> + case "$$name" in \
> + .got.plt) \
> + test $$sz != 0c || continue; \
> + echo "Error: non-empty $$name: 0x$$sz" >&2; \
> + exit $$(expr $$idx + 1);; \
> + esac; \
> + done
> + $(OBJCOPY) -O binary -R .got.plt $< $@
> +
> +

Nit: Please no double blank lines.

Jan




Re: [RFC v1 4/5] tools/arm: add "scmi_smc" option to xl.cfg

2021-12-21 Thread Anthony PERARD
On Tue, Dec 14, 2021 at 09:34:28AM +, Oleksii Moisieiev wrote:
> This enumeration sets SCI type for the domain. Currently there is
> two possible options: either 'none' or 'scmi_smc'.
> 
> 'none' is the default value and it disables SCI support at all.
> 
> 'scmi_smc' enables access to the Firmware from the domains using SCMI
> protocol and SMC as transport.
> 
> Signed-off-by: Oleksii Moisieiev 
> ---

Thanks for the patch, it looks good too me.

But it is kind of weird that the manual describes something that isn't
implemented yet. Could you maybe add in the patch description that the
feature isn't implemented yet or that the feature is implemented in
follow-up patches?

Also, about the golang binding thingy, could you add a note after a line
of a three dash "---" that let know the committer to regenerate
everything that needs re-generating due to change in the *.idl file,
just in case?

Thanks,

-- 
Anthony PERARD



[xen-unstable-smoke test] 167503: tolerable all pass - PUSHED

2021-12-21 Thread osstest service owner
flight 167503 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167503/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  f1e268b9fd13647e1f69c8ce0ae7be401d319fc8
baseline version:
 xen  8e3edefb880caeeaaf80123d5599139e8c2c9ecf

Last test of basis   167467  2021-12-17 19:01:56 Z3 days
Testing same since   167503  2021-12-21 10:01:43 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8e3edefb88..f1e268b9fd  f1e268b9fd13647e1f69c8ce0ae7be401d319fc8 -> smoke



Re: [PATCH 6/6] x86/hvm: Support PKS

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> With all infrastructure in place, advertise the PKS CPUID bit to guests, and
> let them set CR4.PKS.
> 
> Experiment with a tweak to the layout of hvm_cr4_guest_valid_bits() so future
> additions will be just a single added line.
> 
> The current context switching behaviour is tied to how VT-x works, so leave a
> safety check in the short term.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

I would like to ask though that you ...

> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -244,7 +244,7 @@ XEN_CPUFEATURE(CLDEMOTE,  6*32+25) /*A  CLDEMOTE 
> instruction */
>  XEN_CPUFEATURE(MOVDIRI,   6*32+27) /*a  MOVDIRI instruction */
>  XEN_CPUFEATURE(MOVDIR64B, 6*32+28) /*a  MOVDIR64B instruction */
>  XEN_CPUFEATURE(ENQCMD,6*32+29) /*   ENQCMD{,S} instructions */
> -XEN_CPUFEATURE(PKS,   6*32+31) /*   Protection Key for Supervisor */
> +XEN_CPUFEATURE(PKS,   6*32+31) /*H  Protection Key for Supervisor */

... clarify this restriction of not covering shadow mode guests by
an adjustment to title or description. Aiui the sole reason for
the restriction is that shadow code doesn't propagate the key bits
from guest to shadow PTEs?

Jan




Re: [PATCH 5/6] x86/pagewalk: Support PKS

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> PKS is incredibly similar to the existing PKU behaviour, operating on
> pagewalks for any supervisor mapping.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




[PATCH v4] is_system_domain: replace open-coded instances

2021-12-21 Thread Daniel P. Smith
From: Christopher Clark 

This is a split out of the hyperlaunch dom0 series.

There were several instances of open-coded domid range checking. This commit
replaces those with the is_system_domain or is_system_domid inline function.

Signed-off-by: Christopher Clark 
Signed-off-by: Daniel P. Smith 
Acked-by: Dario Faggioli 
---
 xen/arch/x86/cpu/mcheck/mce.c | 2 +-
 xen/arch/x86/cpu/vpmu.c   | 2 +-
 xen/common/domain.c   | 2 +-
 xen/common/domctl.c   | 4 ++--
 xen/common/sched/core.c   | 4 ++--
 xen/include/xen/sched.h   | 7 ++-
 6 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 7f433343bc..5c1df39075 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -1518,7 +1518,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
 d = rcu_lock_domain_by_any_id(mc_msrinject->mcinj_domid);
 if ( d == NULL )
 {
-if ( mc_msrinject->mcinj_domid >= DOMID_FIRST_RESERVED )
+if ( is_system_domid(mc_msrinject->mcinj_domid) )
 return x86_mcerr("do_mca inject: incompatible flag "
  "MC_MSRINJ_F_GPADDR with domain %d",
  -EINVAL, domid);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8ec4547bed..c6bfa5a00e 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -188,7 +188,7 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
  * in XENPMU_MODE_ALL, for everyone.
  */
 if ( (vpmu_mode & XENPMU_MODE_ALL) ||
- (sampled->domain->domain_id >= DOMID_FIRST_RESERVED) )
+ is_system_domain(sampled->domain) )
 {
 sampling = choose_hwdom_vcpu();
 if ( !sampling )
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 093bb4403f..347cc073aa 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -583,7 +583,7 @@ struct domain *domain_create(domid_t domid,
 /* Sort out our idea of is_hardware_domain(). */
 if ( domid == 0 || domid == hardware_domid )
 {
-if ( hardware_domid < 0 || hardware_domid >= DOMID_FIRST_RESERVED )
+if ( hardware_domid < 0 || is_system_domid(hardware_domid) )
 panic("The value of hardware_dom must be a valid domain ID\n");
 
 old_hwdom = hardware_domain;
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 879a2adcbe..a3ad1f62b6 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -52,7 +52,7 @@ static inline int is_free_domid(domid_t dom)
 {
 struct domain *d;
 
-if ( dom >= DOMID_FIRST_RESERVED )
+if ( is_system_domid(dom) )
 return 0;
 
 if ( (d = rcu_lock_domain_by_id(dom)) == NULL )
@@ -536,7 +536,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
u_domctl)
 if ( !d )
 {
 ret = -EINVAL;
-if ( op->domain >= DOMID_FIRST_RESERVED )
+if ( is_system_domid(op->domain) )
 break;
 
 rcu_read_lock(&domlist_read_lock);
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8f4b1ca10d..6ea8bcf62f 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -821,7 +821,7 @@ int sched_init_domain(struct domain *d, unsigned int poolid)
 int ret;
 
 ASSERT(d->cpupool == NULL);
-ASSERT(d->domain_id < DOMID_FIRST_RESERVED);
+ASSERT(!is_system_domain(d));
 
 if ( (ret = cpupool_add_domain(d, poolid)) )
 return ret;
@@ -845,7 +845,7 @@ int sched_init_domain(struct domain *d, unsigned int poolid)
 
 void sched_destroy_domain(struct domain *d)
 {
-ASSERT(d->domain_id < DOMID_FIRST_RESERVED);
+ASSERT(!is_system_domain(d));
 
 if ( d->cpupool )
 {
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 28146ee404..cd575d01cf 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -613,9 +613,14 @@ extern struct vcpu *idle_vcpu[NR_CPUS];
 #define is_idle_domain(d) ((d)->domain_id == DOMID_IDLE)
 #define is_idle_vcpu(v)   (is_idle_domain((v)->domain))
 
+static inline bool is_system_domid(domid_t id)
+{
+return id >= DOMID_FIRST_RESERVED;
+}
+
 static inline bool is_system_domain(const struct domain *d)
 {
-return d->domain_id >= DOMID_FIRST_RESERVED;
+return is_system_domid(d->domain_id);
 }
 
 #define DOMAIN_DESTROYED (1u << 31) /* assumes atomic_t is >= 32 bits */
-- 
2.20.1




Re: [PATCH v3] is_system_domain: replace open-coded instances

2021-12-21 Thread Daniel P. Smith

On 12/21/21 4:20 AM, Jan Beulich wrote:

On 20.12.2021 17:28, Daniel P. Smith wrote:

From: Christopher Clark 

This is a split out of the hyperlaunch dom0 series.

There were several instances of open-coded domid range checking. This commit
replaces those with the is_system_domain or is_system_domid inline function.

Signed-off-by: Christopher Clark 
Signed-off-by: Daniel P. Smith 
Acked-by: Dario Faggioli 


While I'm not outright opposed, I'd still like to raise the question whether
we really want to intermix "is system domain" and "is in-range domain ID"
predicates. Personally I'd prefer the latter to remain open-coded range
checks.


--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -613,9 +613,14 @@ extern struct vcpu *idle_vcpu[NR_CPUS];
  #define is_idle_domain(d) ((d)->domain_id == DOMID_IDLE)
  #define is_idle_vcpu(v)   (is_idle_domain((v)->domain))
  
+static inline bool is_system_domid(domid_t id)

+{
+return (id >= DOMID_FIRST_RESERVED);


Nit: Generally we omit parentheses in cases like this, ...


ack


+}
+
  static inline bool is_system_domain(const struct domain *d)
  {
-return d->domain_id >= DOMID_FIRST_RESERVED;


... just like was the case here.

Jan





Re: [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> Have guest_{rd,wr}msr() access either the live register, or stashed state,
> depending on context.  Include MSR_PKRS for migration, and let the guest have
> full access.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




Re: [PATCH 3/6] x86/hvm: Context switch MSR_PKRS

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> Under PKS, MSR_PKRS is available and based on the CPUID policy alone, and
> usable independently of CR4.PKS.  See the large comment in prot-key.h for
> details of the context switching arrangement.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Roger Pau Monné 
> CC: Wei Liu 
> CC: Kevin Tian 
> 
> At a guess, we're likely to see PKS on AMD eventually, hence not putting the
> DEFINE_PER_CPU() in vmx.c, but I'm at a total loss to find anywhere better to
> put it than hvm.c.  Suggestions welcome.

That's fine for now imo. hvm.c is another candidate for splitting up,
at which point a better place may surface. (I would even be willing
to make an attempt in that direction, if only I knew the results
wouldn't end up stuck again, just like appears to have happened for
the p2m / physmap etc splitting.)

> @@ -42,4 +45,49 @@ static inline void wrpkru(uint32_t pkru)
> :: "a" (pkru), "d" (0), "c" (0) );
>  }
>  
> +/*
> + * Xen does not use PKS.
> + *
> + * Guest kernel use is expected to be one default key, except for tiny 
> windows
> + * with a double write to switch to a non-default key in a permitted critical
> + * section.
> + *
> + * As such, we want MSR_PKRS un-intercepted.  Furthermore, as we only need it
> + * in Xen for emulation or migration purposes (i.e. possibly never in a
> + * domain's lifetime), we don't want to re-sync the hardware value on every
> + * vmexit.
> + *
> + * Therefore, we read and cache the guest value in ctxt_switch_from(), in the
> + * expectation that we can short-circuit the write in ctxt_switch_to().
> + * During regular operations in current context, the guest value is in
> + * hardware and the per-cpu cache is stale.
> + */
> +DECLARE_PER_CPU(uint32_t, pkrs);
> +
> +static inline uint32_t rdpkrs(void)
> +{
> +uint32_t pkrs, tmp;
> +
> +rdmsr(MSR_PKRS, pkrs, tmp);
> +
> +return pkrs;
> +}
> +
> +static inline uint32_t rdpkrs_and_cache(void)
> +{
> +return this_cpu(pkrs) = rdpkrs();
> +}
> +
> +static inline void wrpkrs(uint32_t pkrs)
> +{
> +uint32_t *this_pkrs = &this_cpu(pkrs);
> +
> +if ( *this_pkrs != pkrs )

For this to work, I think we need to clear PKRS during CPU init; I
admit I didn't peek ahead in the series to check whether you do so
later on in the series. At least the version of the SDM I'm looking
at doesn't even specify reset state of 0 for this MSR. But even if
it did, it would likely be as for PKRU - unchanged after INIT. Yet
INIT is all that CPUs go through when e.g. parking / unparking them.

Jan




Re: [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> asm/processor.h is in desperate need of splitting up, and protection key
> functionality in only used in the emulator and pagewalk.  Introduce a new
> asm/prot-key.h and move the relevant content over.
> 
> Rename the PKRU_* constants to drop the user part and to use the architectural
> terminology.
> 
> Drop the read_pkru_{ad,wd}() helpers entirely.  The pkru infix is about to
> become wrong, and the sole user is shorter and easier to follow without the
> helpers.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

This looks functionally correct, so in principle
Reviewed-by: Jan Beulich 
But I have two remarks:

> --- /dev/null
> +++ b/xen/arch/x86/include/asm/prot-key.h
> @@ -0,0 +1,45 @@
> +/**
> + * arch/x86/include/asm/spec_ctrl.h
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see .
> + *
> + * Copyright (c) 2021 Citrix Systems Ltd.
> + */
> +#ifndef ASM_PROT_KEY_H
> +#define ASM_PROT_KEY_H
> +
> +#include 
> +
> +#define PKEY_AD 1 /* Access Disable */
> +#define PKEY_WD 2 /* Write Disable */
> +
> +#define PKEY_WIDTH 2 /* Two bits per protection key */
> +
> +static inline uint32_t rdpkru(void)
> +{
> +uint32_t pkru;

I agree this wants to be uint32_t (i.e. unlike the original function),
but I don't see why the function's return type needs to be, the more
that the sole caller also uses unsigned int for the variable to store
the result in.

> +asm volatile ( ".byte 0x0f,0x01,0xee"
> +   : "=a" (pkru) : "c" (0) : "dx" );
> +
> +return pkru;
> +}
> +
> +static inline void wrpkru(uint32_t pkru)

(To avoid an intermediate local variable, I can agree with using
uint32_t for the parameter type directly here.)

> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -26,7 +26,9 @@
>  #include 
>  #include 
>  #include 
> +
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -413,10 +415,11 @@ guest_walk_tables(const struct vcpu *v, struct 
> p2m_domain *p2m,
>   guest_pku_enabled(v) )
>  {
>  unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
> -unsigned int pkru = rdpkru();
> +unsigned int pkr = rdpkru();
> +unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);

This is correct only because below you only inspect the low two bits.
Since I don't think masking off the upper bits is really useful here,
I'd like to suggest to not call the variable "pk_ar". Perhaps
something as generic as "temp"?

Jan




[xen-unstable test] 167497: tolerable FAIL

2021-12-21 Thread osstest service owner
flight 167497 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167497/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 
167485 pass in 167497
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 
167485 pass in 167497
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat  fail pass in 167485

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167485
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167485
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167485
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167485
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167485
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167485
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167485
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167485
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167485
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167485
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167485
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167485
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-chec

Re: [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor

2021-12-21 Thread Jan Beulich
On 16.12.2021 10:54, Andrew Cooper wrote:
> Protection Key Supervisor works in a very similar way to Protection Key User,
> except that instead of a PKRU register used by the {RD,WR}PKRU instructions,
> the supervisor protection settings live in MSR_PKRS and is accessed using
> normal {RD,WR}MSR instructions.
> 
> PKS has the same problematic interactions with PV guests as PKU (more infact,
> given the guest kernel's CPL), so we'll only support this for HVM guests for
> now.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




Re: [PATCH] xenperf: omit meaningless trailing zeroes from output

2021-12-21 Thread Anthony PERARD
On Tue, Dec 21, 2021 at 10:59:27AM +0100, Jan Beulich wrote:
> There's no point producing a long chain of zeroes when the previously
> calculated total value was zero. To guard against mistakenly skipping
> non-zero individual fields, widen "sum" to "unsigned long long".
> 
> Signed-off-by: Jan Beulich 

Acked-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH 3/5] VMX: sync VM-exit perf counters with known VM-exit reasons

2021-12-21 Thread Jan Beulich
On 17.12.2021 16:00, Andrew Cooper wrote:
> On 03/12/2021 12:05, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/hvm/vmx/vmx.h
>> +++ b/xen/include/asm-x86/hvm/vmx/vmx.h
>> @@ -220,6 +220,8 @@ static inline void pi_clear_sn(struct pi
>>  #define EXIT_REASON_XSAVES  63
>>  #define EXIT_REASON_XRSTORS 64
>>  
>> +#define EXIT_REASON_LASTEXIT_REASON_XRSTORS
>> +
> 
> Given the problems with sentinals like this in the domctl/sysctl
> headers, I think this scheme would be less fragile if EXIT_REASON was an
> enum.

Enums have some "downsides", so I'm not sure I'd want to go that route,
at least not right here.

>  (Also, we seriously need to reduce the scope of these headers. 
> It's only just dawned on me why Intel uses EXIT_* and AMD uses VMEXIT_*)

Funny, isn't it? Otoh the PM specifically uses VMEXIT_*, while the SDM
specifically talks about "exit reason" everywhere. So there may be more
to this than just the need to avoid name space collisions.

And yes, I fully agree with the need of scope reduction. These
definitions living in a private header in xen/arch/x86/hvm/vmx/ should
be completely sufficient for the build to continue to work. Question
is if, while doing so, we wouldn't want to alter the name prefixes (but
see above).

> Alternatively, what about simply this:
> 
>  #define EXIT_REASON_XSAVES  63
>  #define EXIT_REASON_XRSTORS 64
> +/* Remember to update VMX_PERF_EXIT_REASON_SIZE too. */
> 
> ?
> 
> This avoids having yet another sentinel in the mix, and will be visible
> in *every* patch review.  It also gets rid of the ifdefary in the vmexit
> handler.

I can do it that way, sure, but then there'll again be no build time
check. As long as exit reasons all get added sequentially here, the
comment in context should raise enough attention, but what if Intel
start a second range like AMD did with NPF?

Jan




[libvirt test] 167499: regressions - FAIL

2021-12-21 Thread osstest service owner
flight 167499 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167499/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  adc0eaead0ebe11f38798e431d2748bfe9b54a30
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  529 days
Failing since151818  2020-07-11 04:18:52 Z  528 days  510 attempts
Testing same since   167488  2021-12-20 04:19:01 Z1 days2 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Praveen K Paladugu 
  Richard W.M. Jones 
  Ricky Tigg 
  Robin Lee 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Chopin 
  Simon Gaiser 
  Simon Rowe 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Stefan Hajnoczi 
  Szymon Scholz 

[PATCH] xenperf: omit meaningless trailing zeroes from output

2021-12-21 Thread Jan Beulich
There's no point producing a long chain of zeroes when the previously
calculated total value was zero. To guard against mistakenly skipping
non-zero individual fields, widen "sum" to "unsigned long long".

Signed-off-by: Jan Beulich 

--- a/tools/misc/xenperf.c
+++ b/tools/misc/xenperf.c
@@ -76,7 +76,7 @@ int main(int argc, char *argv[])
 DECLARE_HYPERCALL_BUFFER(xc_perfc_val_t, pcv);
 xc_perfc_val_t  *val;
 int num_desc, num_val;
-unsigned intsum, reset = 0, full = 0, pretty = 0;
+unsigned int reset = 0, full = 0, pretty = 0;
 char hypercall_name[36];
 
 if ( argc > 1 )
@@ -158,14 +158,15 @@ int main(int argc, char *argv[])
 val = pcv;
 for ( i = 0; i < num_desc; i++ )
 {
+unsigned long long sum = 0;
+
 printf ("%-35s ", pcd[i].name);
 
-sum = 0;
 for ( j = 0; j < pcd[i].nr_vals; j++ )
 sum += val[j];
-printf ("T=%10u ", (unsigned int)sum);
+printf("T=%10llu ", sum);
 
-if ( full || (pcd[i].nr_vals <= 4) )
+if ( sum && (full || (pcd[i].nr_vals <= 4)) )
 {
 if ( pretty && (strcmp(pcd[i].name, "hypercalls") == 0) )
 {




Re: [PATCH] xen/vpci: msix: move x86 specific code to x86 file

2021-12-21 Thread Rahul Singh
Hi Jan

> On 21 Dec 2021, at 7:41 am, Jan Beulich  wrote:
> 
> On 17.12.2021 15:32, Julien Grall wrote:
>> On 16/12/2021 13:37, Jan Beulich wrote:
>>> On 16.12.2021 12:01, Roger Pau Monné wrote:
 On Thu, Dec 16, 2021 at 10:18:32AM +, Rahul Singh wrote:
>> On 14 Dec 2021, at 12:37 pm, Roger Pau Monné  
>> wrote:
>> On Tue, Dec 14, 2021 at 10:45:17AM +, Rahul Singh wrote:
>>> +  unsigned long *data)
>>> {
>>> -const struct domain *d = v->domain;
>>> -struct vpci_msix *msix = msix_find(d, addr);
>>> const struct vpci_msix_entry *entry;
>>> unsigned int offset;
>>> 
>>> *data = ~0ul;
>>> 
>>> if ( !msix )
>>> -return X86EMUL_RETRY;
>>> +return VPCI_EMUL_RETRY;
>>> 
>>> if ( !access_allowed(msix->pdev, addr, len) )
>>> -return X86EMUL_OKAY;
>>> +return VPCI_EMUL_OKAY;
>>> 
>>> if ( VMSIX_ADDR_IN_RANGE(addr, msix->pdev->vpci, VPCI_MSIX_PBA) )
>>> {
>>> @@ -210,11 +194,11 @@ static int msix_read(struct vcpu *v, unsigned 
>>> long addr, unsigned int len,
>>> switch ( len )
>>> {
>>> case 4:
>>> -*data = readl(addr);
>>> +*data = vpci_arch_readl(addr);
>> 
>> Why do you need a vpci wrapper around the read/write handlers? AFAICT
>> arm64 also has {read,write}{l,q}. And you likely want to protect the
>> 64bit read with CONFIG_64BIT if this code is to be made available to
>> arm32.
> 
> I need the wrapper because {read,write}{l,q} function argument is 
> different for ARM and x86.
> ARM {read,wrie}(l,q}  function argument is pointer to the address whereas 
> X86  {read,wrie}(l,q}
> function argument is address itself.
 
 Oh, that's a shame. I don't think there's a need to tag those helpers
 with the vpci_ prefix though. Could we maybe introduce
 bus_{read,write}{b,w,l,q} helpers that take the same parameters on all
 arches?
 
 It would be even better to fix the current ones so they take the same
 parameters on x86 and Arm, but that would mean changing all the call
 places in one of the arches.
>>> 
>>> Yet still: +1 for removing the extra level of indirection. Imo these
>>> trivial helpers should never have diverged between arches; I have
>>> always been under the impression that on Linux they can be used by
>>> arch-independent code (or else drivers would be quite hard to write).
>> 
>> So technically both helpers are able to cope with pointer. The x86 one 
>> is also allowing to pass an address.
>> 
>> From a brief look at the x86, it looks like most of the users are using 
>> a pointer. However, the vPCI msix code is one example where addresses 
>> are passed.
> 
> Okay, first of all I need to clean up some confusion cause by Rahul
> saying "pointer to the address”:

Sorry for the confusion.
> That's where my "extra level of
> indirection" came from. I would really wish one wouldn't need to go
> to the code and verify such basic statements. There's no "pointer
> to the address" here. The question is whether the argument has to
> be a pointer (Arm) or is convertable to a pointer (x86). Therefore
> ...
> 
>> AFAICT, the read*/write* helpers on Linux only works with pointers. So I 
>> think the actions should be:
>>1) Modify the vPCI MSIx code to use pointer
>>2) Modify the x86 read*/write* helpers to forbid any access other 
>> than pointer.
> 
> ... I'd suggest to go with 1), to avoid impacting other x86 code.
> Longer term I wouldn't mind switching to 2) (unless vPCI really is
> the only place using non-pointer arguments, in which case doing
> the 2nd step right away [but still in a separate patch] would seem
> quite reasonable).

I will choose option 1 as of now to avoid any x86 specific change to 
 {read,write}{b,w,l,q}.

Regards,
Rahul
> Jan
> 



Re: [XEN PATCH v8 12/47] build: build everything from the root dir, use obj=$subdir

2021-12-21 Thread Anthony PERARD
On Tue, Dec 21, 2021 at 08:55:26AM +0100, Jan Beulich wrote:
> On 20.12.2021 12:29, Anthony PERARD wrote:
> > On Thu, Dec 16, 2021 at 11:08:47AM +, Anthony PERARD wrote:
> >> On Tue, Dec 07, 2021 at 12:10:34PM +0100, Jan Beulich wrote:
> >>> On 25.11.2021 14:39, Anthony PERARD wrote:
>  --- a/xen/Makefile
>  +++ b/xen/Makefile
>  @@ -22,6 +22,15 @@ export CHECKPOLICY?= checkpolicy
>   export BASEDIR := $(CURDIR)
>   export XEN_ROOT := $(BASEDIR)/..
>   
>  +abs_objtree := $(CURDIR)
>  +abs_srctree := $(CURDIR)
> >>>
> >>> Nit: In line with e.g. obj-y I think these would better be abs-srctree and
> >>> abs-objtree.
> >>
> >> I guess that would be fine, we don't need to keep the same spelling that
> >> Kbuild does.
> > 
> > Actually, those two variables are exported, as it appear in the next two
> > lines. Exporting a variable with a dash doesn't work very well as shells
> > may not be able to use them. When make find a variable with a dash in
> > it in the environment, it unexport them.
> > 
> > So, for those two variable, we need to avoid using a dash, so I think
> > keeping the current name is better. (And now, I've find out that there's
> > an issue in the build system, I'll prepare a patch.)
> 
> Oh, sure - if they have to be exported, the names should remain as they are.
> Question of course why they need exporting when by the end of the conversion
> you don't change directories anymore.

So far, the value of those two variables are simple, but they get a bit
more complicated when we introduce out-of-tree build. I would rather
avoid recalculating those values later one and be sure that the values
are the same on recursion.

Thanks,

-- 
Anthony PERARD



Re: [PATCH v2 03/18] IOMMU: have vendor code announce supported page sizes

2021-12-21 Thread Rahul Singh
Hi Jan,

> On 24 Sep 2021, at 10:43 am, Jan Beulich  wrote:
> 
> Generic code will use this information to determine what order values
> can legitimately be passed to the ->{,un}map_page() hooks. For now all
> ops structures simply get to announce 4k mappings (as base page size),
> and there is (and always has been) an assumption that this matches the
> CPU's MMU base page size (eventually we will want to permit IOMMUs with
> a base page size smaller than the CPU MMU's).
> 
> Signed-off-by: Jan Beulich 
> Reviewed-by: Kevin Tian 

Reviewed-by: Rahul Singh 

Regards,
Rahul
> 
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -629,6 +629,7 @@ static void amd_dump_page_tables(struct
> }
> 
> static const struct iommu_ops __initconstrel _iommu_ops = {
> +.page_sizes = PAGE_SIZE_4K,
> .init = amd_iommu_domain_init,
> .hwdom_init = amd_iommu_hwdom_init,
> .quarantine_init = amd_iommu_quarantine_init,
> --- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> @@ -1298,6 +1298,7 @@ static void ipmmu_iommu_domain_teardown(
> 
> static const struct iommu_ops ipmmu_iommu_ops =
> {
> +.page_sizes  = PAGE_SIZE_4K,
> .init= ipmmu_iommu_domain_init,
> .hwdom_init  = ipmmu_iommu_hwdom_init,
> .teardown= ipmmu_iommu_domain_teardown,
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -2873,6 +2873,7 @@ static void arm_smmu_iommu_domain_teardo
> }
> 
> static const struct iommu_ops arm_smmu_iommu_ops = {
> +.page_sizes = PAGE_SIZE_4K,
> .init = arm_smmu_iommu_domain_init,
> .hwdom_init = arm_smmu_iommu_hwdom_init,
> .add_device = arm_smmu_dt_add_device_generic,
> --- a/xen/drivers/passthrough/arm/smmu-v3.c
> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
> @@ -3426,7 +3426,8 @@ static void arm_smmu_iommu_xen_domain_te
> }
> 
> static const struct iommu_ops arm_smmu_iommu_ops = {
> - .init   = arm_smmu_iommu_xen_domain_init,
> + .page_sizes = PAGE_SIZE_4K,
> + .init   = arm_smmu_iommu_xen_domain_init,
>   .hwdom_init = arm_smmu_iommu_hwdom_init,
>   .teardown   = arm_smmu_iommu_xen_domain_teardown,
>   .iotlb_flush= arm_smmu_iotlb_flush,
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -470,7 +470,17 @@ int __init iommu_setup(void)
> 
> if ( iommu_enable )
> {
> +const struct iommu_ops *ops = NULL;
> +
> rc = iommu_hardware_setup();
> +if ( !rc )
> +ops = iommu_get_ops();
> +if ( ops && (ops->page_sizes & -ops->page_sizes) != PAGE_SIZE )
> +{
> +printk(XENLOG_ERR "IOMMU: page size mask %lx unsupported\n",
> +   ops->page_sizes);
> +rc = ops->page_sizes ? -EPERM : -ENODATA;
> +}
> iommu_enabled = (rc == 0);
> }
> 
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2806,6 +2806,7 @@ static int __init intel_iommu_quarantine
> }
> 
> static struct iommu_ops __initdata vtd_ops = {
> +.page_sizes = PAGE_SIZE_4K,
> .init = intel_iommu_domain_init,
> .hwdom_init = intel_iommu_hwdom_init,
> .quarantine_init = intel_iommu_quarantine_init,
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -231,6 +231,7 @@ struct page_info;
> typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt);
> 
> struct iommu_ops {
> +unsigned long page_sizes;
> int (*init)(struct domain *d);
> void (*hwdom_init)(struct domain *d);
> int (*quarantine_init)(struct domain *d);
> 




Re: [PATCH v3] is_system_domain: replace open-coded instances

2021-12-21 Thread Jan Beulich
On 20.12.2021 17:28, Daniel P. Smith wrote:
> From: Christopher Clark 
> 
> This is a split out of the hyperlaunch dom0 series.
> 
> There were several instances of open-coded domid range checking. This commit
> replaces those with the is_system_domain or is_system_domid inline function.
> 
> Signed-off-by: Christopher Clark 
> Signed-off-by: Daniel P. Smith 
> Acked-by: Dario Faggioli 

While I'm not outright opposed, I'd still like to raise the question whether
we really want to intermix "is system domain" and "is in-range domain ID"
predicates. Personally I'd prefer the latter to remain open-coded range
checks.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -613,9 +613,14 @@ extern struct vcpu *idle_vcpu[NR_CPUS];
>  #define is_idle_domain(d) ((d)->domain_id == DOMID_IDLE)
>  #define is_idle_vcpu(v)   (is_idle_domain((v)->domain))
>  
> +static inline bool is_system_domid(domid_t id)
> +{
> +return (id >= DOMID_FIRST_RESERVED);

Nit: Generally we omit parentheses in cases like this, ...

> +}
> +
>  static inline bool is_system_domain(const struct domain *d)
>  {
> -return d->domain_id >= DOMID_FIRST_RESERVED;

... just like was the case here.

Jan




Re: [PATCH v3 02/13] xen: harmonize return types of hypercall handlers

2021-12-21 Thread Julien Grall

Hi Juergen,

On 21/12/2021 08:45, Juergen Gross wrote:

On 20.12.21 18:14, Julien Grall wrote:

Hi,

On 18/12/2021 00:00, Stefano Stabellini wrote:

On Fri, 17 Dec 2021, Juergen Gross wrote:

On 17.12.21 11:41, Julien Grall wrote:

Hi Juergen,

On 17/12/2021 08:50, Juergen Gross wrote:

On 17.12.21 08:45, Jan Beulich wrote:

On 17.12.2021 06:34, Juergen Gross wrote:

On 16.12.21 22:15, Stefano Stabellini wrote:

On Thu, 16 Dec 2021, Stefano Stabellini wrote:

On Thu, 16 Dec 2021, Juergen Gross wrote:

On 16.12.21 03:10, Stefano Stabellini wrote:

The case of XENMEM_maximum_ram_page is interesting but it is
not a
problem in reality because the max physical address size is
only 40-bit
for aarch32 guests, so 32-bit are always enough to return the
highest
page in memory for 32-bit guests.


You are aware that this isn't the guest's max page, but the
host's?


I can see now that you meant to say that, no matter what is the 
max
pseudo-physical address supported by the VM, 
XENMEM_maximum_ram_page

is
supposed to return the max memory page, which could go above the
addressibility limit of the VM.

So XENMEM_maximum_ram_page should potentially be able to return
(1<<44)
even when called by an aarch32 VM, with max IPA 40-bit.

I would imagine it could be useful if dom0 is 32-bit but domUs are
64-bit on a 64-bit hypervisor (which I think it would be a very 
rare

configuration on ARM.)

Then it looks like XENMEM_maximum_ram_page needs to be able to
return a
value > 32-bit when called by a 32-bit guest.

The hypercall ABI follows the ARM C calling convention, so a 
64-bit

value should be returned using r0 and r1. But looking at
xen/arch/arm/traps.c:do_trap_hypercall, it doesn't seem it ever 
sets

r1
today. Only r0 is set, so effectively we only support 32-bit 
return

values on aarch32 and for aarch32 guests.

In other words, today all hypercalls on ARM return 64-bit to 
64-bit

guests and 32-bit to 32-bit guests. Which in the case of memory_op
is
"technically" the correct thing to do because it matches the C
declaration in xen/include/xen/hypercall.h:

extern long
do_memory_op(
   unsigned long cmd,
   XEN_GUEST_HANDLE_PARAM(void) arg);

So...  I guess the conclusion is that on ARM do_memory_op should
return
"long" although it is not actually enough for a correct
implementation
of XENMEM_maximum_ram_page for aarch32 guests ?



Hence my suggestion to check the return value of _all_ 
hypercalls to

be
proper sign extended int values for 32-bit guests. This would 
fix all

potential issues without silently returning truncated values.


Are we absolutely certain we have no other paths left where a 
possibly

large unsigned values might be returned? In fact while
compat_memory_op() does the necessary saturation, I've never been 
fully
convinced of this being the best way of dealing with things. The 
range

of error indicators is much smaller than [-INT_MIN,-1], so almost
double the range of effectively unsigned values could be passed back
fine. (Obviously we can't change existing interfaces, so this mem-op
will need to remain as is.)


In fact libxenctrl tries do deal with this fact by wrapping a 
memory_op

for a 32-bit environment into a multicall. This will work fine for a
32-bit Arm guest, as xen_ulong_t is a uint64 there.

So do_memory_op should return long on Arm, yes. OTOH doing so will
continue to be a problem in case a 32-bit guest doesn't use the
multicall technique for handling possible 64-bit return values.

So I continue to argue that on Arm the return value of a hypercall
should be tested to fit into 32 bits.


It would make sense. But what would you return if the value doesn't 
fit?


I guess some errno value would be appropriate, like -EDOM, -ERANGE or
-E2BIG.


This seems to be better than the alternative below as it is a lot
simpler.


We would still need to special case XENMEM_maximum_reservation (or 
rework the implementation of the sub-op) because the value returned is 
an unsigned long. So technically, the unsigned value for -EDOM & co 
could be interpreted as the maximum host frame number.


I guess you meant XENMEM_maximum_ram_page.


H yes.



What about setting -EDOM only if the high 32 bits are not all the same?
This would mean to clamp the highest RAM page to -EDOM in case the
caller is interpreting it as an unsigned value. This would still be
better than silently dropping the high bits, which could lead to a
rather low page number instead.


This feels like quite a hack. So I would prefer the low page number. I 
am interested to hear about the others.




I also would like to see the hypercall returning 'int' when they are 
only meant to return 32-bit value. This will make easier to spot 
someone that decide to return a 64-bit value.


I guess this would need to include all hypercalls handled in common
code?


Ideally yes.




The only really clean alternative
would be to have separate hypercall function classes for Arm 32- and
64-bit guests (which still could share most of th

[PATCH] MAINTAINERS: update TXT section

2021-12-21 Thread Jan Beulich
Since mail to Lukasz'es address has been bouncing, Intel have suggested
replacement contacts.

Signed-off-by: Jan Beulich 
---
To be frank, I'm not fully convinced of the M: - I'd instead see us use
R: for both just like we had it for Lukasz.

--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -316,7 +316,8 @@ F:  xen/common/hypfs.c
 F: xen/include/xen/hypfs.h
 
 INTEL(R) TRUSTED EXECUTION TECHNOLOGY (TXT)
-R: Lukasz Hawrylko 
+M: Mateusz Mówka 
+R: Paweł Randzio 
 S: Odd Fixes
 F: xen/arch/x86/include/asm/tboot.h
 F: xen/arch/x86/tboot.c




[PATCH v3 07/10] mini-os: move x86 specific gnttab coding into arch/x86/gnttab.c

2021-12-21 Thread Juergen Gross
Having grant table code in arch/x86/mm.c seems wrong. Move it to the
new file arch/x86/gnttab.c, especially as the amount of code is
expected to grow further.

While doing that replace  type casts to pte_t with the more appropriate
__pte() macro.

No functional change.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/gnttab.c | 78 +++
 arch/x86/mm.c | 47 
 2 files changed, 78 insertions(+), 47 deletions(-)
 create mode 100644 arch/x86/gnttab.c

diff --git a/arch/x86/gnttab.c b/arch/x86/gnttab.c
new file mode 100644
index 000..56e59d7
--- /dev/null
+++ b/arch/x86/gnttab.c
@@ -0,0 +1,78 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ *
+ * (C) 2021 - Juergen Gross, SUSE Software Solutions Germany GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
+{
+struct gnttab_setup_table setup;
+unsigned long frames[nr_grant_frames];
+
+setup.dom = DOMID_SELF;
+setup.nr_frames = nr_grant_frames;
+set_xen_guest_handle(setup.frame_list, frames);
+
+HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
+return map_frames(frames, nr_grant_frames);
+}
+
+void arch_suspend_gnttab(grant_entry_v1_t *gnttab_table, int nr_grant_frames)
+{
+#ifdef CONFIG_PARAVIRT
+int i;
+
+for ( i = 0; i < nr_grant_frames; i++ )
+{
+HYPERVISOR_update_va_mapping((unsigned long)gnttab_table + PAGE_SIZE * 
i,
+__pte(0x0 << PAGE_SHIFT), UVMF_INVLPG);
+}
+#endif
+return;
+}
+
+void arch_resume_gnttab(grant_entry_v1_t *gnttab_table, int nr_grant_frames)
+{
+struct gnttab_setup_table setup;
+unsigned long frames[nr_grant_frames];
+#ifdef CONFIG_PARAVIRT
+int i;
+#endif
+
+setup.dom = DOMID_SELF;
+setup.nr_frames = nr_grant_frames;
+set_xen_guest_handle(setup.frame_list, frames);
+
+HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
+
+#ifdef CONFIG_PARAVIRT
+for ( i = 0; i < nr_grant_frames; i++ )
+{
+HYPERVISOR_update_va_mapping((unsigned long)gnttab_table + PAGE_SIZE * 
i,
+__pte((frames[i] << PAGE_SHIFT) | L1_PROT), UVMF_INVLPG);
+}
+#endif
+}
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index c30d8bc..220c0b4 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -837,53 +837,6 @@ void arch_init_mm(unsigned long* start_pfn_p, unsigned 
long* max_pfn_p)
 #endif
 }
 
-grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
-{
-struct gnttab_setup_table setup;
-unsigned long frames[nr_grant_frames];
-
-setup.dom = DOMID_SELF;
-setup.nr_frames = nr_grant_frames;
-set_xen_guest_handle(setup.frame_list, frames);
-
-HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
-return map_frames(frames, nr_grant_frames);
-}
-
-void arch_suspend_gnttab(grant_entry_v1_t *gnttab_table, int nr_grant_frames)
-{
-#ifdef CONFIG_PARAVIRT
-int i;
-
-for (i = 0; i < nr_grant_frames; i++) {
-HYPERVISOR_update_va_mapping((unsigned long)(((char *)gnttab_table) + 
PAGE_SIZE * i),
-(pte_t){0x0<

[PATCH v3 06/10] mini-os: add memory map service functions

2021-12-21 Thread Juergen Gross
Add two functions for adding reserved areas to the memory map and
for removing them again.

Those will be needed for proper grant table/mapping support in PVH
mode.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- fix e820_put_reserved_pfns() (Samuel Thibault)
---
 e820.c | 50 ++
 include/e820.h |  4 
 2 files changed, 54 insertions(+)

diff --git a/e820.c b/e820.c
index 2a371c7..df29097 100644
--- a/e820.c
+++ b/e820.c
@@ -283,6 +283,56 @@ void arch_print_memmap(void)
 printk("%012lx-%012lx: %s\n", from, to, type);
 }
 }
+
+unsigned long e820_get_reserved_pfns(int pages)
+{
+int i;
+unsigned long last = 0, needed = (long)pages << PAGE_SHIFT;
+
+for ( i = 0; i < e820_entries && e820_map[i].addr < last + needed; i++ )
+last = e820_map[i].addr + e820_map[i].size;
+
+if ( i == 0 || e820_map[i - 1].type != E820_RESERVED )
+e820_insert_entry_at(i, last, needed, E820_RESERVED);
+else
+e820_map[i - 1].size += needed;
+
+return last >> PAGE_SHIFT;
+}
+
+void e820_put_reserved_pfns(unsigned long start_pfn, int pages)
+{
+int i;
+unsigned long addr = start_pfn << PAGE_SHIFT;
+unsigned long size = (long)pages << PAGE_SHIFT;
+
+for ( i = 0;
+  i < e820_entries && addr >= e820_map[i].addr + e820_map[i].size;
+  i++ );
+
+BUG_ON(i == e820_entries || e820_map[i].type != E820_RESERVED ||
+   addr + size > e820_map[i].addr + e820_map[i].size);
+
+if ( addr == e820_map[i].addr )
+{
+e820_map[i].addr += size;
+e820_map[i].size -= size;
+if ( e820_map[i].size == 0 )
+e820_remove_entry(i);
+return;
+}
+
+if ( addr + size == e820_map[i].addr + e820_map[i].size )
+{
+e820_map[i].size -= size;
+return;
+}
+
+e820_insert_entry_at(i + 1, addr + size,
+ e820_map[i].addr + e820_map[i].size - addr - size,
+ E820_RESERVED);
+e820_map[i].size = addr - e820_map[i].addr;
+}
 #endif
 
 unsigned long e820_get_maxpfn(unsigned long pages)
diff --git a/include/e820.h b/include/e820.h
index 8d4d371..aaf2f2c 100644
--- a/include/e820.h
+++ b/include/e820.h
@@ -51,5 +51,9 @@ extern unsigned e820_entries;
 
 unsigned long e820_get_maxpfn(unsigned long pages);
 unsigned long e820_get_max_contig_pages(unsigned long pfn, unsigned long 
pages);
+#ifndef CONFIG_E820_TRIVIAL
+unsigned long e820_get_reserved_pfns(int pages);
+void e820_put_reserved_pfns(unsigned long start_pfn, int pages);
+#endif
 
 #endif /*__E820_HEADER*/
-- 
2.26.2




[PATCH v3 04/10] mini-os: respect memory map when ballooning up

2021-12-21 Thread Juergen Gross
Today Mini-OS won't look at the memory map when ballooning up. This can
result in problems for PVH domains with more than 4 GB of RAM, as
ballooning will happily run into the ACPI area.

Fix that by adding only pages being marked as RAM in the memory map and
by distinguishing between the current number of RAM pages and the first
unallocated page.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- rename and fix e820_get_max_pages() (Samuel Thibault)
V3:
- fix loop condition
---
 arch/arm/mm.c  |  3 +++
 arch/x86/balloon.c |  4 ++--
 arch/x86/mm.c  |  2 ++
 balloon.c  | 33 -
 e820.c | 21 -
 include/balloon.h  |  5 +++--
 include/e820.h |  1 +
 mm.c   |  7 ++-
 8 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/arch/arm/mm.c b/arch/arm/mm.c
index 9068166..11962f8 100644
--- a/arch/arm/mm.c
+++ b/arch/arm/mm.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -70,6 +71,8 @@ void arch_init_mm(unsigned long *start_pfn_p, unsigned long 
*max_pfn_p)
 }
 device_tree = new_device_tree;
 *max_pfn_p = to_phys(new_device_tree) >> PAGE_SHIFT;
+
+balloon_set_nr_pages(*max_pfn_p, *max_pfn_p);
 }
 
 void arch_init_demand_mapping_area(void)
diff --git a/arch/x86/balloon.c b/arch/x86/balloon.c
index 10b440c..fe79644 100644
--- a/arch/x86/balloon.c
+++ b/arch/x86/balloon.c
@@ -61,10 +61,10 @@ void arch_remap_p2m(unsigned long max_pfn)
 p2m_invalidate(l2_list, L2_P2M_IDX(max_pfn - 1) + 1);
 p2m_invalidate(l1_list, L1_P2M_IDX(max_pfn - 1) + 1);
 
-if ( p2m_pages(nr_max_pages) <= p2m_pages(max_pfn) )
+if ( p2m_pages(nr_max_pfn) <= p2m_pages(max_pfn) )
 return;
 
-new_p2m = alloc_virt_kernel(p2m_pages(nr_max_pages));
+new_p2m = alloc_virt_kernel(p2m_pages(nr_max_pfn));
 for ( pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES )
 {
 map_frame_rw(new_p2m + PAGE_SIZE * (pfn / P2M_ENTRIES),
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 3bf6170..c30d8bc 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -72,6 +72,7 @@ void arch_mm_preinit(void *p)
 pt_base = (pgentry_t *)si->pt_base;
 first_free_pfn = PFN_UP(to_phys(pt_base)) + si->nr_pt_frames;
 last_free_pfn = si->nr_pages;
+balloon_set_nr_pages(last_free_pfn, last_free_pfn);
 }
 #else
 #include 
@@ -118,6 +119,7 @@ void arch_mm_preinit(void *p)
 }
 
 last_free_pfn = e820_get_maxpfn(ret);
+balloon_set_nr_pages(ret, last_free_pfn);
 }
 #endif
 
diff --git a/balloon.c b/balloon.c
index 5676d3b..9dc77c5 100644
--- a/balloon.c
+++ b/balloon.c
@@ -23,14 +23,24 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
-unsigned long nr_max_pages;
-unsigned long nr_mem_pages;
+unsigned long nr_max_pfn;
+
+static unsigned long nr_max_pages;
+static unsigned long nr_mem_pfn;
+static unsigned long nr_mem_pages;
+
+void balloon_set_nr_pages(unsigned long pages, unsigned long pfn)
+{
+nr_mem_pages = pages;
+nr_mem_pfn = pfn;
+}
 
 void get_max_pages(void)
 {
@@ -46,16 +56,18 @@ void get_max_pages(void)
 
 nr_max_pages = ret;
 printk("Maximum memory size: %ld pages\n", nr_max_pages);
+
+nr_max_pfn = e820_get_maxpfn(nr_max_pages);
 }
 
 void mm_alloc_bitmap_remap(void)
 {
 unsigned long i, new_bitmap;
 
-if ( mm_alloc_bitmap_size >= ((nr_max_pages + 1) >> 3) )
+if ( mm_alloc_bitmap_size >= ((nr_max_pfn + 1) >> 3) )
 return;
 
-new_bitmap = alloc_virt_kernel(PFN_UP((nr_max_pages + 1) >> 3));
+new_bitmap = alloc_virt_kernel(PFN_UP((nr_max_pfn + 1) >> 3));
 for ( i = 0; i < mm_alloc_bitmap_size; i += PAGE_SIZE )
 {
 map_frame_rw(new_bitmap + i,
@@ -70,7 +82,7 @@ static unsigned long balloon_frames[N_BALLOON_FRAMES];
 
 int balloon_up(unsigned long n_pages)
 {
-unsigned long page, pfn;
+unsigned long page, pfn, start_pfn;
 int rc;
 struct xen_memory_reservation reservation = {
 .domid= DOMID_SELF
@@ -81,8 +93,11 @@ int balloon_up(unsigned long n_pages)
 if ( n_pages > N_BALLOON_FRAMES )
 n_pages = N_BALLOON_FRAMES;
 
+start_pfn = e820_get_maxpfn(nr_mem_pages + 1) - 1;
+n_pages = e820_get_max_contig_pages(start_pfn, n_pages);
+
 /* Resize alloc_bitmap if necessary. */
-while ( mm_alloc_bitmap_size * 8 < nr_mem_pages + n_pages )
+while ( mm_alloc_bitmap_size * 8 < start_pfn + n_pages )
 {
 page = alloc_page();
 if ( !page )
@@ -99,14 +114,14 @@ int balloon_up(unsigned long n_pages)
 mm_alloc_bitmap_size += PAGE_SIZE;
 }
 
-rc = arch_expand_p2m(nr_mem_pages + n_pages);
+rc = arch_expand_p2m(start_pfn + n_pages);
 if ( rc )
 return rc;
 
 /* Get new memory from hypervisor. */
 for ( pfn = 0; pfn < n_pages; pfn++ )
 {
-balloon_frames[pfn] = nr_mem_pages + pfn;
+balloon_frames[pfn] = start_pfn + pfn;
 }
 

[PATCH v3 10/10] mini-os: modify grant mappings to work in PVH mode

2021-12-21 Thread Juergen Gross
For being able to use the grant mapping interface in PVH mode some
changes are required, as the guest needs to specify a physical address
in the hypercall interface.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 gntmap.c | 81 ++--
 include/gntmap.h |  1 +
 2 files changed, 59 insertions(+), 23 deletions(-)

diff --git a/gntmap.c b/gntmap.c
index 7ae8fe6..126b04f 100644
--- a/gntmap.c
+++ b/gntmap.c
@@ -32,6 +32,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -97,11 +98,42 @@ gntmap_set_max_grants(struct gntmap *map, int count)
 if (map->entries == NULL)
 return -ENOMEM;
 
+#ifndef CONFIG_PARAVIRT
+map->start_pfn = e820_get_reserved_pfns(count);
+#endif
+
 memset(map->entries, 0, sizeof(struct gntmap_entry) * count);
 map->nentries = count;
 return 0;
 }
 
+static int
+_gntmap_unmap_grant_ref(struct gntmap *map, int idx)
+{
+struct gntmap_entry *entry = map->entries + idx;
+struct gnttab_unmap_grant_ref op;
+int rc;
+
+#ifdef CONFIG_PARAVIRT
+op.host_addr= (uint64_t) entry->host_addr;
+#else
+op.host_addr= (uint64_t)(map->start_pfn + idx) << PAGE_SHIFT;
+#endif
+op.dev_bus_addr = 0;
+op.handle   = entry->handle;
+
+rc = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1);
+if (rc != 0 || op.status != GNTST_okay) {
+printk("GNTTABOP_unmap_grant_ref failed: "
+   "returned %d, status %" PRId16 "\n",
+   rc, op.status);
+return rc != 0 ? rc : op.status;
+}
+
+entry->host_addr = 0;
+return 0;
+}
+
 static int
 _gntmap_map_grant_ref(struct gntmap *map, int idx,
   unsigned long host_addr,
@@ -112,10 +144,17 @@ _gntmap_map_grant_ref(struct gntmap *map, int idx,
 struct gntmap_entry *entry = map->entries + idx;
 struct gnttab_map_grant_ref op;
 int rc;
+#ifndef CONFIG_PARAVIRT
+unsigned long pfn = map->start_pfn + idx;
+#endif
 
 op.ref = (grant_ref_t) ref;
 op.dom = (domid_t) domid;
+#ifdef CONFIG_PARAVIRT
 op.host_addr = (uint64_t) host_addr;
+#else
+op.host_addr = (uint64_t)pfn << PAGE_SHIFT; 
+#endif
 op.flags = GNTMAP_host_map;
 if (!writable)
 op.flags |= GNTMAP_readonly;
@@ -128,31 +167,18 @@ _gntmap_map_grant_ref(struct gntmap *map, int idx,
 return rc != 0 ? rc : op.status;
 }
 
-entry->host_addr = host_addr;
-entry->handle = op.handle;
-return 0;
-}
-
-static int
-_gntmap_unmap_grant_ref(struct gntmap *map, int idx)
-{
-struct gntmap_entry *entry = map->entries + idx;
-struct gnttab_unmap_grant_ref op;
-int rc;
-
-op.host_addr= (uint64_t) entry->host_addr;
-op.dev_bus_addr = 0;
-op.handle   = entry->handle;
-
-rc = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, &op, 1);
-if (rc != 0 || op.status != GNTST_okay) {
-printk("GNTTABOP_unmap_grant_ref failed: "
-   "returned %d, status %" PRId16 "\n",
-   rc, op.status);
-return rc != 0 ? rc : op.status;
+#ifndef CONFIG_PARAVIRT
+rc = do_map_frames(host_addr, &pfn, 1, 0, 0, DOMID_SELF, NULL,
+   writable ? L1_PROT : L1_PROT_RO);
+if ( rc )
+{
+_gntmap_unmap_grant_ref(map, idx);
+return rc;
 }
+#endif
 
-entry->host_addr = 0;
+entry->host_addr = host_addr;
+entry->handle = op.handle;
 return 0;
 }
 
@@ -165,6 +191,10 @@ gntmap_munmap(struct gntmap *map, unsigned long 
start_address, int count)
 DEBUG("(map=%p, start_address=%lx, count=%d)",
map, start_address, count);
 
+#ifndef CONFIG_PARAVIRT
+unmap_frames(start_address, count);
+#endif
+
 for (i = 0; i < count; i++) {
 idx = gntmap_find_entry(map, start_address + PAGE_SIZE * i);
 if (idx < 0) {
@@ -242,6 +272,11 @@ gntmap_fini(struct gntmap *map)
 (void) _gntmap_unmap_grant_ref(map, i);
 }
 
+#ifndef CONFIG_PARAVIRT
+e820_put_reserved_pfns(map->start_pfn, map->nentries);
+map->start_pfn = 0;
+#endif
+
 xfree(map->entries);
 map->entries = NULL;
 map->nentries = 0;
diff --git a/include/gntmap.h b/include/gntmap.h
index fde53f3..d3d7e88 100644
--- a/include/gntmap.h
+++ b/include/gntmap.h
@@ -10,6 +10,7 @@
 struct gntmap {
 int nentries;
 struct gntmap_entry *entries;
+unsigned long start_pfn;
 };
 
 int
-- 
2.26.2




[PATCH v3 05/10] mini-os: don't repeat definition available via header file

2021-12-21 Thread Juergen Gross
arch/x86/setup.c is repeating the definition of __pte() instead using
the appropriate header. Fix that.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/setup.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 1ec68d3..b27bbed 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -29,6 +29,7 @@
 #include 
 #include  /* for printk, memcpy */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -61,13 +62,6 @@ char stack[2*STACK_SIZE];
 
 extern char shared_info[PAGE_SIZE];
 
-#if defined(__x86_64__)
-#define __pte(x) ((pte_t) { (x) } )
-#else
-#define __pte(x) ({ unsigned long long _x = (x);\
-((pte_t) {(unsigned long)(_x), (unsigned long)(_x>>32)}); })
-#endif
-
 static inline void fpu_init(void) {
asm volatile("fninit");
 }
-- 
2.26.2




[PATCH v3 09/10] mini-os: prepare grantmap entry interface for use by PVH mode

2021-12-21 Thread Juergen Gross
Instead of passing the pointer of a grantmap entry to the
_gntmap_[un]map_grant_ref() sub-functions use the map pointer and the
entry index instead. This will be needed for PVH mode usage.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 gntmap.c | 48 +++-
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/gntmap.c b/gntmap.c
index f6ab3ad..7ae8fe6 100644
--- a/gntmap.c
+++ b/gntmap.c
@@ -55,36 +55,34 @@ struct gntmap_entry {
 };
 
 static inline int
-gntmap_entry_used(struct gntmap_entry *entry)
+gntmap_entry_used(struct gntmap *map, int idx)
 {
-return entry->host_addr != 0;
+return map->entries[idx].host_addr != 0;
 }
 
-static struct gntmap_entry*
-gntmap_find_free_entry(struct gntmap *map)
+static int gntmap_find_free_entry(struct gntmap *map)
 {
 int i;
 
 for (i = 0; i < map->nentries; i++) {
-if (!gntmap_entry_used(&map->entries[i]))
-return &map->entries[i];
+if (!gntmap_entry_used(map, i))
+return i;
 }
 
 DEBUG("(map=%p): all %d entries full",
map, map->nentries);
-return NULL;
+return -1;
 }
 
-static struct gntmap_entry*
-gntmap_find_entry(struct gntmap *map, unsigned long addr)
+static int gntmap_find_entry(struct gntmap *map, unsigned long addr)
 {
 int i;
 
 for (i = 0; i < map->nentries; i++) {
 if (map->entries[i].host_addr == addr)
-return &map->entries[i];
+return i;
 }
-return NULL;
+return -1;
 }
 
 int
@@ -105,12 +103,13 @@ gntmap_set_max_grants(struct gntmap *map, int count)
 }
 
 static int
-_gntmap_map_grant_ref(struct gntmap_entry *entry, 
+_gntmap_map_grant_ref(struct gntmap *map, int idx,
   unsigned long host_addr,
   uint32_t domid,
   uint32_t ref,
   int writable)
 {
+struct gntmap_entry *entry = map->entries + idx;
 struct gnttab_map_grant_ref op;
 int rc;
 
@@ -135,8 +134,9 @@ _gntmap_map_grant_ref(struct gntmap_entry *entry,
 }
 
 static int
-_gntmap_unmap_grant_ref(struct gntmap_entry *entry)
+_gntmap_unmap_grant_ref(struct gntmap *map, int idx)
 {
+struct gntmap_entry *entry = map->entries + idx;
 struct gnttab_unmap_grant_ref op;
 int rc;
 
@@ -160,19 +160,19 @@ int
 gntmap_munmap(struct gntmap *map, unsigned long start_address, int count)
 {
 int i, rc;
-struct gntmap_entry *ent;
+int idx;
 
 DEBUG("(map=%p, start_address=%lx, count=%d)",
map, start_address, count);
 
 for (i = 0; i < count; i++) {
-ent = gntmap_find_entry(map, start_address + PAGE_SIZE * i);
-if (ent == NULL) {
+idx = gntmap_find_entry(map, start_address + PAGE_SIZE * i);
+if (idx < 0) {
 printk("gntmap: tried to munmap unknown page\n");
 return -EINVAL;
 }
 
-rc = _gntmap_unmap_grant_ref(ent);
+rc = _gntmap_unmap_grant_ref(map, idx);
 if (rc != 0)
 return rc;
 }
@@ -189,7 +189,7 @@ gntmap_map_grant_refs(struct gntmap *map,
   int writable)
 {
 unsigned long addr;
-struct gntmap_entry *ent;
+int idx;
 int i;
 
 DEBUG("(map=%p, count=%" PRIu32 ", "
@@ -206,9 +206,9 @@ gntmap_map_grant_refs(struct gntmap *map,
 return NULL;
 
 for (i = 0; i < count; i++) {
-ent = gntmap_find_free_entry(map);
-if (ent == NULL ||
-_gntmap_map_grant_ref(ent,
+idx = gntmap_find_free_entry(map);
+if (idx < 0 ||
+_gntmap_map_grant_ref(map, idx,
   addr + PAGE_SIZE * i,
   domids[i * domids_stride],
   refs[i],
@@ -233,15 +233,13 @@ gntmap_init(struct gntmap *map)
 void
 gntmap_fini(struct gntmap *map)
 {
-struct gntmap_entry *ent;
 int i;
 
 DEBUG("(map=%p)", map);
 
 for (i = 0; i < map->nentries; i++) {
-ent = &map->entries[i];
-if (gntmap_entry_used(ent))
-(void) _gntmap_unmap_grant_ref(ent);
+if (gntmap_entry_used(map, i))
+(void) _gntmap_unmap_grant_ref(map, i);
 }
 
 xfree(map->entries);
-- 
2.26.2




[PATCH v3 01/10] mini-os: split e820 map handling into new source file

2021-12-21 Thread Juergen Gross
Introduce e820.c containing all the E820 memory map handling.

No functional change.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 Makefile   |   1 +
 arch/arm/mm.c  |   8 
 arch/x86/mm.c  |  70 +
 e820.c | 119 +
 include/e820.h |   6 +++
 5 files changed, 128 insertions(+), 76 deletions(-)
 create mode 100644 e820.c

diff --git a/Makefile b/Makefile
index 4b76b55..06b60fc 100644
--- a/Makefile
+++ b/Makefile
@@ -41,6 +41,7 @@ src-$(CONFIG_TPMFRONT) += tpmfront.c
 src-$(CONFIG_TPM_TIS) += tpm_tis.c
 src-$(CONFIG_TPMBACK) += tpmback.c
 src-y += daytime.c
+src-y += e820.c
 src-y += events.c
 src-$(CONFIG_FBFRONT) += fbfront.c
 src-y += gntmap.c
diff --git a/arch/arm/mm.c b/arch/arm/mm.c
index f806c9f..9068166 100644
--- a/arch/arm/mm.c
+++ b/arch/arm/mm.c
@@ -7,14 +7,6 @@
 #include 
 
 uint32_t physical_address_offset;
-struct e820entry e820_map[1] = {
-{
-.addr = 0,
-.size = ULONG_MAX - 1,
-.type = E820_RAM
-}
-};
-unsigned e820_entries = 1;
 
 unsigned long allocate_ondemand(unsigned long n, unsigned long alignment)
 {
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 8ba14a5..8df93da 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -64,15 +64,6 @@ extern char stack[];
 extern void page_walk(unsigned long va);
 
 #ifdef CONFIG_PARAVIRT
-struct e820entry e820_map[1] = {
-{
-.addr = 0,
-.size = ULONG_MAX - 1,
-.type = E820_RAM
-}
-};
-unsigned e820_entries = 1;
-
 void arch_mm_preinit(void *p)
 {
 start_info_t *si = p;
@@ -112,25 +103,11 @@ desc_ptr idt_ptr =
 .base = (unsigned long)&idt,
 };
 
-struct e820entry e820_map[E820_MAX];
-unsigned e820_entries;
-
-static char *e820_types[E820_TYPES] = {
-[E820_RAM]  = "RAM",
-[E820_RESERVED] = "Reserved",
-[E820_ACPI] = "ACPI",
-[E820_NVS]  = "NVS",
-[E820_UNUSABLE] = "Unusable",
-[E820_PMEM] = "PMEM"
-};
-
 void arch_mm_preinit(void *p)
 {
 long ret;
 domid_t domid = DOMID_SELF;
-struct xen_memory_map memmap;
-int i;
-unsigned long pfn, max = 0;
+unsigned long max;
 
 pt_base = page_table_base;
 first_free_pfn = PFN_UP(to_phys(&_end));
@@ -142,53 +119,10 @@ void arch_mm_preinit(void *p)
 }
 last_free_pfn = ret;
 
-memmap.nr_entries = E820_MAX;
-set_xen_guest_handle(memmap.buffer, e820_map);
-ret = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
-if ( ret < 0 )
-{
-xprintk("could not get memory map\n");
-do_exit();
-}
-e820_entries = memmap.nr_entries;
-
-for ( i = 0; i < e820_entries; i++ )
-{
-if ( e820_map[i].type != E820_RAM )
-continue;
-pfn = (e820_map[i].addr + e820_map[i].size) >> PAGE_SHIFT;
-if ( pfn > max )
-max = pfn;
-}
-
+max = e820_get_maxpfn();
 if ( max < last_free_pfn )
 last_free_pfn = max;
 }
-
-void arch_print_memmap(void)
-{
-int i;
-unsigned long from, to;
-char *type;
-char buf[12];
-
-printk("Memory map:\n");
-for ( i = 0; i < e820_entries; i++ )
-{
-if ( e820_map[i].type >= E820_TYPES || !e820_types[e820_map[i].type] )
-{
-snprintf(buf, sizeof(buf), "%8x", e820_map[i].type);
-type = buf;
-}
-else
-{
-type = e820_types[e820_map[i].type];
-}
-from = e820_map[i].addr;
-to = from + e820_map[i].size - 1;
-printk("%012lx-%012lx: %s\n", from, to, type);
-}
-}
 #endif
 
 /*
diff --git a/e820.c b/e820.c
new file mode 100644
index 000..2165280
--- /dev/null
+++ b/e820.c
@@ -0,0 +1,119 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ *
+ * (C) 2021 - Juergen Gross, SUSE Software Solutions Germany GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 

[PATCH v3 00/10] mini-os: add missing PVH features

2021-12-21 Thread Juergen Gross
Mini-OS in PVH mode is missing some features, especially in the areas
of ballooning and grant tables.

With this series I am able to run Xenstore stubdom in PVH mode.

Changes in V3:
- two minor fixes

Changes in V2:
- multiple comments addressed

Juergen Gross (10):
  mini-os: split e820 map handling into new source file
  mini-os: sort and sanitize e820 memory map
  mini-os: don't assume contiguous RAM when initializing in PVH mode
  mini-os: respect memory map when ballooning up
  mini-os: don't repeat definition available via header file
  mini-os: add memory map service functions
  mini-os: move x86 specific gnttab coding into arch/x86/gnttab.c
  mini-os: add proper pvh grant table handling
  mini-os: prepare grantmap entry interface for use by PVH mode
  mini-os: modify grant mappings to work in PVH mode

 Makefile   |   1 +
 arch/arm/mm.c  |  11 +-
 arch/x86/balloon.c |   4 +-
 arch/x86/gnttab.c  | 109 +
 arch/x86/mm.c  | 121 +--
 arch/x86/setup.c   |   8 +-
 balloon.c  |  33 ++--
 e820.c | 376 +
 gntmap.c   | 125 +--
 include/balloon.h  |   5 +-
 include/e820.h |  11 ++
 include/gntmap.h   |   1 +
 mm.c   |   7 +-
 13 files changed, 615 insertions(+), 197 deletions(-)
 create mode 100644 arch/x86/gnttab.c
 create mode 100644 e820.c

-- 
2.26.2




[PATCH v3 03/10] mini-os: don't assume contiguous RAM when initializing in PVH mode

2021-12-21 Thread Juergen Gross
Sizing the available memory should respect memory holes, so look at
the memory map when setting the boundary for the memory allocator.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- rename "max" to "start" (Samuel Thibault)
---
 arch/x86/mm.c  |  6 +-
 e820.c | 14 --
 include/e820.h |  2 +-
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 8df93da..3bf6170 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -107,7 +107,6 @@ void arch_mm_preinit(void *p)
 {
 long ret;
 domid_t domid = DOMID_SELF;
-unsigned long max;
 
 pt_base = page_table_base;
 first_free_pfn = PFN_UP(to_phys(&_end));
@@ -117,11 +116,8 @@ void arch_mm_preinit(void *p)
 xprintk("could not get memory size\n");
 do_exit();
 }
-last_free_pfn = ret;
 
-max = e820_get_maxpfn();
-if ( max < last_free_pfn )
-last_free_pfn = max;
+last_free_pfn = e820_get_maxpfn(ret);
 }
 #endif
 
diff --git a/e820.c b/e820.c
index 70286cb..8030f43 100644
--- a/e820.c
+++ b/e820.c
@@ -285,10 +285,10 @@ void arch_print_memmap(void)
 }
 #endif
 
-unsigned long e820_get_maxpfn(void)
+unsigned long e820_get_maxpfn(unsigned long pages)
 {
 int i;
-unsigned long pfn, max = 0;
+unsigned long pfns, start = 0;
 
 e820_get_memmap();
 
@@ -296,10 +296,12 @@ unsigned long e820_get_maxpfn(void)
 {
 if ( e820_map[i].type != E820_RAM )
 continue;
-pfn = (e820_map[i].addr + e820_map[i].size) >> PAGE_SHIFT;
-if ( pfn > max )
-max = pfn;
+pfns = e820_map[i].size >> PAGE_SHIFT;
+start = e820_map[i].addr >> PAGE_SHIFT;
+if ( pages <= pfns )
+return start + pages;
+pages -= pfns;
 }
 
-return max;
+return start + pfns;
 }
diff --git a/include/e820.h b/include/e820.h
index af2129f..6a57f05 100644
--- a/include/e820.h
+++ b/include/e820.h
@@ -49,6 +49,6 @@ struct __packed e820entry {
 extern struct e820entry e820_map[];
 extern unsigned e820_entries;
 
-unsigned long e820_get_maxpfn(void);
+unsigned long e820_get_maxpfn(unsigned long pages);
 
 #endif /*__E820_HEADER*/
-- 
2.26.2




[PATCH v3 08/10] mini-os: add proper pvh grant table handling

2021-12-21 Thread Juergen Gross
Grant table initialization for PVH requires some additional actions
compared to PV mode. Add those.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/gnttab.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/gnttab.c b/arch/x86/gnttab.c
index 56e59d7..281c207 100644
--- a/arch/x86/gnttab.c
+++ b/arch/x86/gnttab.c
@@ -22,11 +22,15 @@
  */
 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
+#ifdef CONFIG_PARAVIRT
 grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
 {
 struct gnttab_setup_table setup;
@@ -39,6 +43,33 @@ grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
 HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1);
 return map_frames(frames, nr_grant_frames);
 }
+#else
+grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
+{
+int i, rc;
+struct xen_add_to_physmap xatp;
+unsigned long pfn;
+unsigned long frames[nr_grant_frames];
+
+pfn = e820_get_reserved_pfns(nr_grant_frames);
+for ( i = 0; i < nr_grant_frames; i++ )
+{
+xatp.domid = DOMID_SELF;
+xatp.idx = i;
+xatp.space = XENMAPSPACE_grant_table;
+xatp.gpfn = pfn + i;
+rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
+if ( rc )
+{
+xprintk("could not init grant table\n");
+do_exit();
+}
+frames[i] = pfn + i;
+}
+
+return map_frames(frames, nr_grant_frames);
+}
+#endif
 
 void arch_suspend_gnttab(grant_entry_v1_t *gnttab_table, int nr_grant_frames)
 {
-- 
2.26.2




[PATCH v3 02/10] mini-os: sort and sanitize e820 memory map

2021-12-21 Thread Juergen Gross
Do some processing of the E820 memory map obtained from the hypervisor:

- align the entries to page boundaries
- sort the entries by their start address
- merge adjacent entries of same type

This is relevant for PVH mode only.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- correct page boundary rounding
- handle overlaps after rounding (Samuel Thibault)
- improve sorting (Samuel Thibault)
V3:
- small optimization in e820_sanitize() (Samuel Thibault)
---
 e820.c | 186 +
 1 file changed, 186 insertions(+)

diff --git a/e820.c b/e820.c
index 2165280..70286cb 100644
--- a/e820.c
+++ b/e820.c
@@ -57,6 +57,190 @@ static char *e820_types[E820_TYPES] = {
 [E820_PMEM] = "PMEM"
 };
 
+/*
+ * E820 type based bitmask for deciding how to round entries to page
+ * boundaries: A set bit means the type relates to a resource managed by
+ * Mini-OS (e.g. RAM), so rounding needs to be done to only include pages
+ * completely of the related type (narrowing). All other types need to be
+ * rounded to include all pages with parts of that type (widening).
+ */
+#define E820_NARROW ((1U << E820_RAM) | (1U << E820_NVS) | (1 << E820_PMEM))
+
+/* Private type used to mark a range temporarily as reserved (lowest prio). */
+#define E820_TMP_RESERVED0
+
+static void e820_remove_entry(int idx)
+{
+int i;
+
+e820_entries--;
+for ( i = idx; i < e820_entries; i++ )
+e820_map[i] = e820_map[i + 1];
+}
+
+static void e820_insert_entry_at(int idx, unsigned long addr,
+ unsigned long size, unsigned int type)
+{
+int i;
+
+if ( e820_entries == E820_MAX )
+{
+xprintk("E820 memory map overflow\n");
+do_exit();
+}
+
+e820_entries++;
+for ( i = e820_entries - 1; i > idx; i-- )
+e820_map[i] = e820_map[i - 1];
+
+e820_map[idx].addr = addr;
+e820_map[idx].size = size;
+e820_map[idx].type = type;
+}
+
+static void e820_insert_entry(unsigned long addr, unsigned long size,
+  unsigned int type)
+{
+int i;
+
+for ( i = 0; i < e820_entries && addr > e820_map[i].addr; i++ );
+
+e820_insert_entry_at(i, addr, size, type);
+}
+
+static void e820_swap_entries(int idx1, int idx2)
+{
+struct e820entry entry;
+
+entry = e820_map[idx1];
+e820_map[idx1] = e820_map[idx2];
+e820_map[idx2] = entry;
+}
+
+/*
+ * Do a memory map sanitizing sweep:
+ * - sort the entries by start address
+ * - remove overlaps of entries (higher type value wins)
+ * - merge adjacent entries of same type
+ */
+static void e820_process_entries(void)
+{
+int i, j;
+unsigned long end, start;
+unsigned int type;
+
+/* Sort entries. */
+for ( i = 1; i < e820_entries; i++ )
+for ( j = i; j > 0 && e820_map[j - 1].addr > e820_map[j].addr; j-- )
+e820_swap_entries(j - 1, j);
+
+/* Handle overlapping entries (higher type values win). */
+for ( i = 1; i < e820_entries; i++ )
+{
+if ( e820_map[i - 1].addr + e820_map[i - 1].size <= e820_map[i].addr )
+continue;
+if ( e820_map[i - 1].addr < e820_map[i].addr )
+{
+e820_insert_entry_at(i - 1, e820_map[i - 1].addr,
+ e820_map[i].addr - e820_map[i - 1].addr,
+ e820_map[i - 1].type);
+e820_map[i].addr += e820_map[i - 1].size;
+e820_map[i].size -= e820_map[i - 1].size;
+i++;
+}
+if ( e820_map[i - 1].type < e820_map[i].type )
+e820_swap_entries(i - 1, i);
+if ( e820_map[i - 1].size >= e820_map[i].size )
+{
+e820_remove_entry(i);
+i--;
+}
+else
+{
+start = e820_map[i].addr + e820_map[i - 1].size;
+end = e820_map[i].addr + e820_map[i].size;
+type = e820_map[i].type;
+e820_remove_entry(i);
+e820_insert_entry(start, end - start, type);
+}
+}
+
+/* Merge adjacent entries. */
+for ( i = 0; i < e820_entries - 1; i++ )
+{
+if ( e820_map[i].type == e820_map[i + 1].type &&
+ e820_map[i].addr + e820_map[i].size >= e820_map[i + 1].addr )
+{
+if ( e820_map[i].addr + e820_map[i].size <
+ e820_map[i + 1].addr + e820_map[i + 1].size )
+{
+e820_map[i].size = e820_map[i + 1].addr - e820_map[i].addr +
+   e820_map[i + 1].size;
+}
+e820_remove_entry(i + 1);
+i--;
+}
+}
+}
+
+/*
+ * Transform memory map into a well sorted map without any overlaps.
+ * - sort map entries by start address
+ * - handle overlaps
+ * - merge adjacent entries of same type (possibly removing boundary in the
+ *   middle of a page)
+ * - trim entries to page boundaries (depending on type either expanding
+ *   the entry 

Re: [PATCH v2 04/10] mini-os: respect memory map when ballooning up

2021-12-21 Thread Samuel Thibault
Juergen Gross, le mar. 21 déc. 2021 07:16:49 +0100, a ecrit:
> On 21.12.21 00:22, Samuel Thibault wrote:
> > Juergen Gross, le lun. 20 déc. 2021 17:07:10 +0100, a ecrit:
> > > +unsigned long e820_get_max_contig_pages(unsigned long pfn, unsigned long 
> > > pages)
> > > +{
> > > +int i;
> > > +unsigned long end;
> > > +
> > > +for ( i = 0; i < e820_entries && e820_map[i].addr < (pfn << 
> > > PAGE_SHIFT);
> > 
> > Shouldn't that be addr+size? Otherwise if pfn is in the middle of an
> > e820 entry, we will miss picking up that.
> 
> No, we want to check all map entries starting below or at the given pfn.
> The test should be "e820_map[i].addr <= (pfn << PAGE_SHIFT)", of course.

Ah, yes, due to the "<" I mistook it for a loop that skips the entries
before the targetted pfn :)

With <=, 

Reviewed-by: Samuel Thibault 

Samuel



Re: [PATCH v2 16/18] x86: introduce helper for recording degree of contiguity in page tables

2021-12-21 Thread Jan Beulich
On 20.12.2021 16:25, Roger Pau Monné wrote:
> I think it might be interesting to add some kind of unit testing to
> this code in tools/tests. It's a standalone piece of code that could
> be easily tested for correct functionality. Not that you should do it
> here, in fact it might be interesting for me to do so in order to
> better understand the code.

Actually I developed this by first having a user space app where I could
control insertions / removals from the command line. Only once I had it
working that way was when I converted the helper function to what's now
in this header. But that user space app wouldn't directly lend itself to
become an element under tools/tests/, I'm afraid.

Jan




Re: [PATCH v2 14/18] IOMMU: fold flush-all hook into "flush one"

2021-12-21 Thread Jan Beulich
On 16.12.2021 12:30, Rahul Singh wrote:
>> On 24 Sep 2021, at 10:53 am, Jan Beulich  wrote:
>>
>> Having a separate flush-all hook has always been puzzling me some. We
>> will want to be able to force a full flush via accumulated flush flags
>> from the map/unmap functions. Introduce a respective new flag and fold
>> all flush handling to use the single remaining hook.
>>
>> Note that because of the respective comments in SMMU and IPMMU-VMSA
>> code, I've folded the two prior hook functions into one. For SMMU-v3,
>> which lacks a comment towards incapable hardware, I've left both
>> functions in place on the assumption that selective and full flushes
>> will eventually want separating.
> 
> 
> For SMMUv3 related Changs:
> Reviewed-by: Rahul Singh 

Thanks. Any chance of an ack / R-b also for patch 3?

Jan