date:20210126

Re: [PATCH v4 01/10] evtchn: use per-channel lock where possible

2021-01-26 Thread Jan Beulich

On 11.01.2021 12:06, Julien Grall wrote:
> On 11/01/2021 10:14, Jan Beulich wrote:
>> On 08.01.2021 21:32, Julien Grall wrote:
>>> On 05/01/2021 13:09, Jan Beulich wrote:
 Neither evtchn_status() nor domain_dump_evtchn_info() nor
 flask_get_peer_sid() need to hold the per-domain lock - they all only
 read a single channel's state (at a time, in the dump case).

 Signed-off-by: Jan Beulich 
 ---
 v4: New.

 --- a/xen/common/event_channel.c
 +++ b/xen/common/event_channel.c
 @@ -968,15 +968,16 @@ int evtchn_status(evtchn_status_t *statu
if ( d == NULL )
return -ESRCH;

 -spin_lock(>event_lock);
 -
if ( !port_is_valid(d, port) )
>>>
>>> There is one issue that is now becoming more apparent. To be clear, the
>>> problem is not in this patch, but I think it is the best place to
>>> discuss it as d->event_lock may be part of the solution.
>>>
>>> After XSA-344, evtchn_destroy() will end up to decrement d->valid_evtchns.
>>>
>>> Given that evtchn_status() can work on the non-current domain, it would
>>> be possible to run it concurrently with evtchn_destroy(). As a
>>> consequence, port_is_valid() will be unstable as a valid event channel
>>> may turn invalid.
>>>
>>> AFAICT, we are getting away so far, as the memory is not freed until the
>>> domain is fully destroyed. However, we re-introduced XSA-338 in a
>>> different way.
>>>
>>> To be clear this is not the fault of this patch. But I don't think this
>>> is sane to re-introduce a behavior that lead us to an XSA.
>>
>> I'm getting confused, I'm afraid, from the varying statements above:
>> Are you suggesting this patch does re-introduce bad behavior or not?
> 
> No. I am pointing out that this is widening the bad behavior (again).

Since I'd really like to get in some more of this series before
the full freeze, and hence I want (need) to re-post, I thought
I'd reply here despite (or in light of) your request for input
from others not having been met.

I don't view this as "bad" behaviour, btw. The situation is
quite different to that which had led to the XSA: Here we
only deal with the "illusion" of a port having become invalid.
IOW yes, ...

>> Yes, the decrementing of ->valid_evtchns has a similar effect, but
>> I'm not convinced it gets us into XSA territory again. The problem
>> wasn't the reducing of ->max_evtchns as such, but the derived
>> assumptions elsewhere in the code. If there were any such again, I
>> suppose we'd have reason to issue another XSA.
> 
> I don't think it get us to the XSA territory yet. However, the 
> locking/interaction in the event channel code is quite complex.
> 
> To give a concrete example, below the current implementation of 
> free_xen_event_channel():
> 
>  if ( !port_is_valid(d, port) )
>  {
>  /*
>   * Make sure ->is_dying is read /after/ ->valid_evtchns, pairing
>   * with the spin_barrier() and BUG_ON() in evtchn_destroy().
>   */
>  smp_rmb();
>  BUG_ON(!d->is_dying);
>  return;
>  }
> 
>  evtchn_close(d, port, 0);
> 
> It would be fair for a developer to assume that after the check above, 
> port_is_valid() would return true. However, this is not the case...

... there needs to be awareness that putting e.g.

ASSERT(port_is_valid(d, port));

anywhere past the if() cannot be done without considering domain
cleanup logic.

> I am not aware of any issue so far... But I am not ready to be this is 
> not going to be missed out. How about you?

There is a risk of this being overlooked, yes. But I'm unconvinced
this absolutely requires measures to be taken beyond, maybe, the
addition of a comment somewhere. I do, in particular, not think
this should stand in the way of the locking relaxation done by
this patch, even more so that (just to repeat) it merely introduces
more instances of a pattern found elsewhere already.

Jan

Re: [PATCH] xen: Fix XenStore initialisation for XS_LOCAL

2021-01-26 Thread Jürgen Groß


On 26.01.21 22:36, Boris Ostrovsky wrote:



On 1/26/21 12:01 PM, David Woodhouse wrote:

From: David Woodhouse 

In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
I reworked the triggering of xenbus_probe().

I tried to simplify things by taking out the workqueue based startup
triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
handler.

I missed the fact that in the XS_LOCAL case (Dom0 starting its own
xenstored or xenstore-stubdom, which happens after the kernel is booted
completely), that IRQ-based trigger is still actually needed.

So... put it back, except more cleanly. By just spawning a xenbus_probe
thread which waits on xb_waitq and runs the probe the first time it
gets woken, just as the workqueue-based hack did.

This is actually a nicer approach for *all* the back ends with different
interrupt methods, and we can switch them all over to that without the
complex conditions for when to trigger it. But not in -rc6. This is
the minimal fix for the regression, although it's a step in the right
direction instead of doing a partial revert and actually putting the
workqueue back. It's also simpler than the workqueue.



Wouldn't the minimal fix be to restore wake_waiting() to its previous

 if (unlikely(xenstored_ready == 0)) {
 xenstored_ready = 1;
 schedule_work(_work);
 }

(And to avoid changing xenbus_probe()'s signature just create a wrapper)


David and I had a longer chat on IRC regarding this fix.

The long term idea is to have just his current thread based variant
for all cases calling xenbus_probe() (so no call of xenbus_probe() at
any other place).

We agreed that this approach would be too risky now, but we wanted to
go in the right direction with the current fix. This is the outcome.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: application/pgp-keys


OpenPGP_signature
Description: OpenPGP digital signature

[linux-linus test] 158631: regressions - FAIL

2021-01-26 Thread osstest service owner

flight 158631 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158631/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-amd64-xl-multivcpu 14 guest-start fail REGR. vs. 152332
 test-amd64-amd64-xl-pvshim   14 guest-start  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-amd64-xl  14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-xl-pvhv2-intel 14 guest-start   fail REGR. vs. 152332
 test-amd64-amd64-xl-credit2  14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start   fail REGR. vs. 152332
 test-amd64-amd64-qemuu-freebsd12-amd64 13 guest-startfail REGR. vs. 152332
 test-amd64-amd64-xl-shadow   14 guest-start  fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-coresched-amd64-xl 14 guest-start fail REGR. vs. 152332
 test-amd64-amd64-xl-pvhv2-amd 14 guest-start fail REGR. vs. 152332
 test-amd64-amd64-qemuu-freebsd11-amd64 13 guest-startfail REGR. vs. 152332
 test-amd64-amd64-libvirt-xsm 14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-libvirt 14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-xl-credit1  14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-xl-xsm  14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 152332
 test-amd64-amd64-libvirt-pair 25 guest-start/debian  fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm  8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2   8 xen-boot fail REGR. vs. 152332
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
152332
 test-amd64-amd64-qemuu-nested-intel 12 debian-hvm-install fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 152332
 test-amd64-amd64-xl-qemut-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
152332
 test-amd64-amd64-pygrub  12 debian-di-installfail REGR. vs. 152332
 test-amd64-amd64-pair25 guest-start/debian   fail REGR. vs. 152332
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 debian-hvm-install 
fail REGR. vs. 152332

[xen-unstable test] 158628: tolerable FAIL - PUSHED

2021-01-26 Thread osstest service owner

flight 158628 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158628/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 158607

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 158607
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 158607
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 158607
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 158607
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 158607
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 158607
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 158607
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 158607
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 158607
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 158607
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 158607
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  25fcedefaa9fcbd20203202aa1b73eef051a5fa9
baseline version:
 xen  452ddbe3592b141b05a7e0676f09c8ae07f98fdd

Last test of basis   158607  2021-01-25 01:53:39 Z2 days
Failing since158617  2021-01-25 20:36:33 Z1 days2 attempts
Testing same since   158628  2021-01-26 11:49:57 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Juergen Gross 
  Paul Durrant 
  Wei Liu 

jobs:

[PATCH v2] libs/light: pass some infos to qemu

2021-01-26 Thread Manuel Bouyer

Pass bridge name to qemu as command line option
When starting qemu, set an environnement variable XEN_DOMAIN_ID,
to be used by qemu helper scripts
The only functional difference of using the br parameter is that the
bridge name gets passed to the QEMU script.
NetBSD doesn't have the ioctl to rename network interfaces implemented, and
thus cannot rename the interface from tapX to vifX.Y-emu. Only qemu knowns
the tap interface name, so we need to use the qemu script from qemu itself.

Signed-off-by: Manuel Bouyer 
---
 tools/libs/light/libxl_dm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 3da83259c0..13f79ec471 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -761,6 +761,8 @@ static int libxl__build_device_model_args_old(libxl__gc *gc,
 int nr_set_cpus = 0;
 char *s;
 
+flexarray_append_pair(dm_envs, "XEN_DOMAIN_ID", GCSPRINTF("%d", 
domid));
+
 if (b_info->kernel) {
 LOGD(ERROR, domid, "HVM direct kernel boot is not supported by "
  "qemu-xen-traditional");
@@ -1547,8 +1549,10 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 flexarray_append(dm_args, "-netdev");
 flexarray_append(dm_args,
  GCSPRINTF("type=tap,id=net%d,ifname=%s,"
+   "br=%s,"
"script=%s,downscript=%s",
nics[i].devid, ifname,
+   nics[i].bridge,
libxl_tapif_script(gc),
libxl_tapif_script(gc)));
 
@@ -1825,6 +1829,8 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 flexarray_append(dm_args, GCSPRINTF("%"PRId64, ram_size));
 
 if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
+flexarray_append_pair(dm_envs, "XEN_DOMAIN_ID", GCSPRINTF("%d", 
guest_domid));
+
 if (b_info->u.hvm.hdtype == LIBXL_HDTYPE_AHCI)
 flexarray_append_pair(dm_args, "-device", "ahci,id=ahci0");
 for (i = 0; i < num_disks; i++) {
-- 
2.29.2

[PATCH v2] libs/light: Switch NetBSD to QEMU_XEN

2021-01-26 Thread Manuel Bouyer

Switch NetBSD to QEMU_XEN.
All 3 versions of libxl__default_device_model() now return
LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN, so remove it and just set
b_info->device_model_version to LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN in
libxl__domain_build_info_setdefault().

Signed-off-by: Manuel Bouyer 
Reviewed-by: Roger Pau Monné 
---
 tools/libs/light/libxl_create.c   | 2 +-
 tools/libs/light/libxl_freebsd.c  | 5 -
 tools/libs/light/libxl_internal.h | 2 --
 tools/libs/light/libxl_linux.c| 5 -
 tools/libs/light/libxl_netbsd.c   | 5 -
 5 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 86f4a8369d..8616113e72 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -102,7 +102,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 b_info->device_model_version =
 LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
 } else {
-b_info->device_model_version = libxl__default_device_model(gc);
+b_info->device_model_version = 
LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
 }
 } else {
 b_info->device_model_version =
diff --git a/tools/libs/light/libxl_freebsd.c b/tools/libs/light/libxl_freebsd.c
index f7ef4a8910..422c6b3b79 100644
--- a/tools/libs/light/libxl_freebsd.c
+++ b/tools/libs/light/libxl_freebsd.c
@@ -229,11 +229,6 @@ out:
 return rc;
 }
 
-libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
-{
-return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
-}
-
 int libxl__pci_numdevs(libxl__gc *gc)
 {
 return ERROR_NI;
diff --git a/tools/libs/light/libxl_internal.h 
b/tools/libs/light/libxl_internal.h
index c79523ba92..6c8b7d71a9 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -2309,8 +2309,6 @@ _hidden char *libxl__json_object_to_json(libxl__gc *gc,
   /* Based on /local/domain/$domid/dm-version xenstore key
* default is qemu xen traditional */
 _hidden int libxl__device_model_version_running(libxl__gc *gc, uint32_t domid);
-  /* Return the system-wide default device model */
-_hidden libxl_device_model_version libxl__default_device_model(libxl__gc *gc);
 
 static inline
 bool libxl__stubdomain_is_linux_running(libxl__gc *gc, uint32_t domid)
diff --git a/tools/libs/light/libxl_linux.c b/tools/libs/light/libxl_linux.c
index 873b0271af..8d62dfd255 100644
--- a/tools/libs/light/libxl_linux.c
+++ b/tools/libs/light/libxl_linux.c
@@ -241,11 +241,6 @@ out:
 return rc;
 }
 
-libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
-{
-return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
-}
-
 int libxl__pci_numdevs(libxl__gc *gc)
 {
 DIR *dir;
diff --git a/tools/libs/light/libxl_netbsd.c b/tools/libs/light/libxl_netbsd.c
index e66a393d7f..6ad4ed34c2 100644
--- a/tools/libs/light/libxl_netbsd.c
+++ b/tools/libs/light/libxl_netbsd.c
@@ -108,11 +108,6 @@ out:
 return rc;
 }
 
-libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
-{
-return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
-}
-
 int libxl__pci_numdevs(libxl__gc *gc)
 {
 return ERROR_NI;
-- 
2.29.2

[PATCH v2] libs/call: fix build on NetBSD

2021-01-26 Thread Manuel Bouyer

Define PAGE_* if not already defined
Catch up with osdep interface change.

Signed-off-by: Manuel Bouyer 
Reviewed-by: Roger Pau Monné 
---
 tools/libs/call/netbsd.c  | 19 +++
 tools/libs/call/private.h |  4 +++-
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/tools/libs/call/netbsd.c b/tools/libs/call/netbsd.c
index a5502da377..4dcc2919ba 100644
--- a/tools/libs/call/netbsd.c
+++ b/tools/libs/call/netbsd.c
@@ -19,12 +19,15 @@
  * Split from xc_netbsd.c
  */
 
-#include "xc_private.h"
 
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
+
+#include "private.h"
 
 int osdep_xencall_open(xencall_handle *xcall)
 {
@@ -69,12 +72,13 @@ int osdep_xencall_close(xencall_handle *xcall)
 return close(fd);
 }
 
-void *osdep_alloc_hypercall_buffer(xencall_handle *xcall, size_t npages)
+void *osdep_alloc_pages(xencall_handle *xcall, size_t npages)
 {
-size_t size = npages * XC_PAGE_SIZE;
+size_t size = npages * PAGE_SIZE;
 void *p;
+int ret;
 
-ret = posix_memalign(, XC_PAGE_SIZE, size);
+ret = posix_memalign(, PAGE_SIZE, size);
 if ( ret != 0 || !p )
 return NULL;
 
@@ -86,14 +90,13 @@ void *osdep_alloc_hypercall_buffer(xencall_handle *xcall, 
size_t npages)
 return p;
 }
 
-void osdep_free_hypercall_buffer(xencall_handle *xcall, void *ptr,
- size_t npages)
+void osdep_free_pages(xencall_handle *xcall, void *ptr, size_t npages)
 {
-(void) munlock(ptr, npages * XC_PAGE_SIZE);
+munlock(ptr, npages * PAGE_SIZE);
 free(ptr);
 }
 
-int do_xen_hypercall(xencall_handle *xcall, privcmd_hypercall_t *hypercall)
+int osdep_hypercall(xencall_handle *xcall, privcmd_hypercall_t *hypercall)
 {
 int fd = xcall->fd;
 int error = ioctl(fd, IOCTL_PRIVCMD_HYPERCALL, hypercall);
diff --git a/tools/libs/call/private.h b/tools/libs/call/private.h
index 21f992b37e..7944ac5baf 100644
--- a/tools/libs/call/private.h
+++ b/tools/libs/call/private.h
@@ -12,8 +12,10 @@
 #ifndef PAGE_SHIFT /* Mini-os, Yukk */
 #define PAGE_SHIFT   12
 #endif
-#ifndef __MINIOS__ /* Yukk */
+#ifndef PAGE_SIZE
 #define PAGE_SIZE(1UL << PAGE_SHIFT)
+#endif
+#ifndef PAGE_MASK
 #define PAGE_MASK(~(PAGE_SIZE-1))
 #endif
 
-- 
2.29.2

[PATCH v3 2/3] arm,smmu: restructure code in preparation to new bindings support

2021-01-26 Thread Stefano Stabellini

From: Brian Woods 

Restructure some of the code and add supporting functions for adding
generic device tree (DT) binding support.  This will allow for using
current Linux device trees with just modifying the chosen field to
enable Xen.

Signed-off-by: Brian Woods 
Signed-off-by: Stefano Stabellini 
---
Changes in v3:
- split patch
---
 xen/drivers/passthrough/arm/smmu.c | 60 +-
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 3898d1d737..9687762283 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -782,50 +782,36 @@ static int insert_smmu_master(struct arm_smmu_device 
*smmu,
return 0;
 }
 
-static int register_smmu_master(struct arm_smmu_device *smmu,
-   struct device *dev,
-   struct of_phandle_args *masterspec)
+static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu,
+struct device *dev,
+struct iommu_fwspec *fwspec)
 {
-   int i, ret = 0;
+   int i;
struct arm_smmu_master *master;
-   struct iommu_fwspec *fwspec;
+   struct device_node *dev_node = dev_get_dev_node(dev);
 
-   master = find_smmu_master(smmu, masterspec->np);
+   master = find_smmu_master(smmu, dev_node);
if (master) {
dev_err(dev,
"rejecting multiple registrations for master device 
%s\n",
-   masterspec->np->name);
+   dev_node->name);
return -EBUSY;
}
 
master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
if (!master)
return -ENOMEM;
-   master->of_node = masterspec->np;
-
-   ret = iommu_fwspec_init(>of_node->dev, smmu->dev);
-   if (ret) {
-   kfree(master);
-   return ret;
-   }
-   fwspec = dev_iommu_fwspec_get(dev);
-
-   /* adding the ids here */
-   ret = iommu_fwspec_add_ids(>np->dev,
-  masterspec->args,
-  masterspec->args_count);
-   if (ret)
-   return ret;
+   master->of_node = dev_node;
 
/* Xen: Let Xen know that the device is protected by an SMMU */
-   dt_device_set_protected(masterspec->np);
+   dt_device_set_protected(dev_node);
 
if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH)) {
for (i = 0; i < fwspec->num_ids; ++i) {
-   if (masterspec->args[i] >= smmu->num_mapping_groups) {
+   if (fwspec->ids[i] >= smmu->num_mapping_groups) {
dev_err(dev,
"stream ID for master device %s greater 
than maximum allowed (%d)\n",
-   masterspec->np->name, 
smmu->num_mapping_groups);
+   dev_node->name, 
smmu->num_mapping_groups);
return -ERANGE;
}
}
@@ -833,6 +819,30 @@ static int register_smmu_master(struct arm_smmu_device 
*smmu,
return insert_smmu_master(smmu, master);
 }
 
+static int register_smmu_master(struct arm_smmu_device *smmu,
+   struct device *dev,
+   struct of_phandle_args *masterspec)
+{
+   int ret = 0;
+   struct iommu_fwspec *fwspec;
+
+   ret = iommu_fwspec_init(>np->dev, smmu->dev);
+   if (ret)
+   return ret;
+
+   fwspec = dev_iommu_fwspec_get(>np->dev);
+
+   ret = iommu_fwspec_add_ids(>np->dev,
+  masterspec->args,
+  masterspec->args_count);
+   if (ret)
+   return ret;
+
+   return arm_smmu_dt_add_device_legacy(smmu,
+>np->dev,
+fwspec);
+}
+
 static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 {
struct arm_smmu_device *smmu;
-- 
2.17.1

[PATCH v3 3/3] arm,smmu: add support for generic DT bindings. Implement add_device and dt_xlate.

2021-01-26 Thread Stefano Stabellini

From: Brian Woods 

Now that all arm iommu drivers support generic bindings we can remove
the workaround from iommu_add_dt_device().

Note that if both legacy bindings and generic bindings are present in
device tree, the legacy bindings are the ones that are used.

Signed-off-by: Brian Woods 
Signed-off-by: Stefano Stabellini 
---
Changes in v3:
- split patch
- make find_smmu return non-const so that we can use it in 
arm_smmu_dt_add_device_generic
- use dt_phandle_args
- update commit message
---
 xen/drivers/passthrough/arm/smmu.c| 40 ++-
 xen/drivers/passthrough/device_tree.c | 17 +---
 2 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 9687762283..620ba5a4b5 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -254,6 +254,8 @@ struct iommu_group
atomic_t ref;
 };
 
+static struct arm_smmu_device *find_smmu(const struct device *dev);
+
 static struct iommu_group *iommu_group_alloc(void)
 {
struct iommu_group *group = xzalloc(struct iommu_group);
@@ -843,6 +845,40 @@ static int register_smmu_master(struct arm_smmu_device 
*smmu,
 fwspec);
 }
 
+static int arm_smmu_dt_add_device_generic(u8 devfn, struct device *dev)
+{
+   struct arm_smmu_device *smmu;
+   struct iommu_fwspec *fwspec;
+
+   fwspec = dev_iommu_fwspec_get(dev);
+   if (fwspec == NULL)
+   return -ENXIO;
+
+   smmu = find_smmu(fwspec->iommu_dev);
+   if (smmu == NULL)
+   return -ENXIO;
+
+   return arm_smmu_dt_add_device_legacy(smmu, dev, fwspec);
+}
+
+static int arm_smmu_dt_xlate_generic(struct device *dev,
+   const struct dt_phandle_args *spec)
+{
+   uint32_t mask, fwid = 0;
+
+   if (spec->args_count > 0)
+   fwid |= (SMR_ID_MASK & spec->args[0]) << SMR_ID_SHIFT;
+
+   if (spec->args_count > 1)
+   fwid |= (SMR_MASK_MASK & spec->args[1]) << SMR_MASK_SHIFT;
+   else if (!of_property_read_u32(spec->np, "stream-match-mask", ))
+   fwid |= (SMR_MASK_MASK & mask) << SMR_MASK_SHIFT;
+
+   return iommu_fwspec_add_ids(dev,
+   ,
+   1);
+}
+
 static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 {
struct arm_smmu_device *smmu;
@@ -2766,6 +2802,7 @@ static void arm_smmu_iommu_domain_teardown(struct domain 
*d)
 static const struct iommu_ops arm_smmu_iommu_ops = {
 .init = arm_smmu_iommu_domain_init,
 .hwdom_init = arm_smmu_iommu_hwdom_init,
+.add_device = arm_smmu_dt_add_device_generic,
 .teardown = arm_smmu_iommu_domain_teardown,
 .iotlb_flush = arm_smmu_iotlb_flush,
 .iotlb_flush_all = arm_smmu_iotlb_flush_all,
@@ -2773,9 +2810,10 @@ static const struct iommu_ops arm_smmu_iommu_ops = {
 .reassign_device = arm_smmu_reassign_dev,
 .map_page = arm_iommu_map_page,
 .unmap_page = arm_iommu_unmap_page,
+.dt_xlate = arm_smmu_dt_xlate_generic,
 };
 
-static __init const struct arm_smmu_device *find_smmu(const struct device *dev)
+static struct arm_smmu_device *find_smmu(const struct device *dev)
 {
struct arm_smmu_device *smmu;
bool found = false;
diff --git a/xen/drivers/passthrough/device_tree.c 
b/xen/drivers/passthrough/device_tree.c
index a51ae3c9c3..ae07f272e1 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -162,22 +162,7 @@ int iommu_add_dt_device(struct dt_device_node *np)
  * these callback implemented.
  */
 if ( !ops->add_device || !ops->dt_xlate )
-{
-/*
- * Some Device Trees may expose both legacy SMMU and generic
- * IOMMU bindings together. However, the SMMU driver is only
- * supporting the former and will protect them during the
- * initialization. So we need to skip them and not return
- * error here.
- *
- * XXX: This can be dropped when the SMMU is able to deal
- * with generic bindings.
- */
-if ( dt_device_is_protected(np) )
-return 0;
-else
-return -EINVAL;
-}
+return -EINVAL;
 
 if ( !dt_device_is_available(iommu_spec.np) )
 break;
-- 
2.17.1

[PATCH v3 1/3] arm,smmu: switch to using iommu_fwspec functions

2021-01-26 Thread Stefano Stabellini

From: Brian Woods 

Modify the smmu driver so that it uses the iommu_fwspec helper
functions.  This means both ARM IOMMU drivers will both use the
iommu_fwspec helper functions, making enabling generic device tree
bindings in the SMMU driver much cleaner.

Signed-off-by: Brian Woods 
Signed-off-by: Stefano Stabellini 
---
Changes in v3:
- add a comment in iommu_add_dt_device
- don't allocate fwspec twice in arm_smmu_add_device
- reuse existing fwspec pointer, don't add a second one
- add comment about supporting fwspec at the top of the file
---
 xen/drivers/passthrough/arm/smmu.c| 98 ---
 xen/drivers/passthrough/device_tree.c |  7 ++
 2 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 3e8aa37866..3898d1d737 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -32,6 +32,9 @@
  * - 4k and 64k pages, with contiguous pte hints.
  * - Up to 48-bit addressing (dependent on VA_BITS)
  * - Context fault reporting
+ *
+ * Changes compared to Linux driver:
+ * - support for fwspec
  */
 
 
@@ -49,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /* Xen: The below defines are redefined within the file. Undef it */
@@ -302,9 +306,6 @@ static struct iommu_group *iommu_group_get(struct device 
*dev)
 
 /* Start of Linux SMMU code */
 
-/* Maximum number of stream IDs assigned to a single device */
-#define MAX_MASTER_STREAMIDS   MAX_PHANDLE_ARGS
-
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS   128
 
@@ -597,8 +598,6 @@ struct arm_smmu_smr {
 };
 
 struct arm_smmu_master_cfg {
-   int num_streamids;
-   u16 streamids[MAX_MASTER_STREAMIDS];
struct arm_smmu_smr *smrs;
 };
 
@@ -686,6 +685,14 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
{ 0, NULL},
 };
 
+static inline struct iommu_fwspec *
+arm_smmu_get_fwspec(struct arm_smmu_master_cfg *cfg)
+{
+   struct arm_smmu_master *master = container_of(cfg,
+ struct 
arm_smmu_master, cfg);
+   return dev_iommu_fwspec_get(>of_node->dev);
+}
+
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
int i = 0;
@@ -779,8 +786,9 @@ static int register_smmu_master(struct arm_smmu_device 
*smmu,
struct device *dev,
struct of_phandle_args *masterspec)
 {
-   int i;
+   int i, ret = 0;
struct arm_smmu_master *master;
+   struct iommu_fwspec *fwspec;
 
master = find_smmu_master(smmu, masterspec->np);
if (master) {
@@ -790,34 +798,37 @@ static int register_smmu_master(struct arm_smmu_device 
*smmu,
return -EBUSY;
}
 
-   if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
-   dev_err(dev,
-   "reached maximum number (%d) of stream IDs for master 
device %s\n",
-   MAX_MASTER_STREAMIDS, masterspec->np->name);
-   return -ENOSPC;
-   }
-
master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
if (!master)
return -ENOMEM;
+   master->of_node = masterspec->np;
 
-   master->of_node = masterspec->np;
-   master->cfg.num_streamids   = masterspec->args_count;
+   ret = iommu_fwspec_init(>of_node->dev, smmu->dev);
+   if (ret) {
+   kfree(master);
+   return ret;
+   }
+   fwspec = dev_iommu_fwspec_get(dev);
+
+   /* adding the ids here */
+   ret = iommu_fwspec_add_ids(>np->dev,
+  masterspec->args,
+  masterspec->args_count);
+   if (ret)
+   return ret;
 
/* Xen: Let Xen know that the device is protected by an SMMU */
dt_device_set_protected(masterspec->np);
 
-   for (i = 0; i < master->cfg.num_streamids; ++i) {
-   u16 streamid = masterspec->args[i];
-
-   if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH) &&
-(streamid >= smmu->num_mapping_groups)) {
-   dev_err(dev,
-   "stream ID for master device %s greater than 
maximum allowed (%d)\n",
-   masterspec->np->name, smmu->num_mapping_groups);
-   return -ERANGE;
+   if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH)) {
+   for (i = 0; i < fwspec->num_ids; ++i) {
+   if (masterspec->args[i] >= smmu->num_mapping_groups) {
+   dev_err(dev,
+   "stream ID for master device %s greater 
than maximum allowed (%d)\n",
+

[PATCH v3 0/3] Generic SMMU Bindings

2021-01-26 Thread Stefano Stabellini

Hi all,

This series introduces support for the generic SMMU bindings to
xen/drivers/passthrough/arm/smmu.c.

The last version of the series was
https://marc.info/?l=xen-devel=159539053406643

I realize that it is late for 4.15 -- I think it is OK if this series
goes in afterwards.

Cheers,

Stefano


Brian Woods (3):
  arm,smmu: switch to using iommu_fwspec functions
  arm,smmu: restructure code in preparation to new bindings support
  arm,smmu: add support for generic DT bindings. Implement add_device and 
dt_xlate.

 xen/drivers/passthrough/arm/smmu.c| 162 --
 xen/drivers/passthrough/device_tree.c |  24 ++---
 2 files changed, 123 insertions(+), 63 deletions(-)

[linux-5.4 test] 158624: regressions - FAIL

2021-01-26 Thread osstest service owner

flight 158624 linux-5.4 real [real]
flight 158672 linux-5.4 real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/158624/
http://logs.test-lab.xenproject.org/osstest/logs/158672/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-dom0pvh-xl-intel  8 xen-bootfail REGR. vs. 158387

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 158387
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 158387
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 158387
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 158387
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 158387
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 158387
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 158387
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 158387
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 158387
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 158387
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 158387
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass

version targeted for testing:
 linux09f983f0c7fc0db79a5f6c883ec3510d424c369c
baseline version:
 linuxa829146c3fdcf6d0b76d9c54556a223820f1f73b

Last test of basis   158387  2021-01-12 19:40:06 Z   14 days
Failing since158473  2021-01-17 13:42:20 Z9 days   15 attempts
Testing same since   158593  2021-01-23 21:09:36 Z3 days5 attempts


People who

[PATCH v2] NetBSD hotplug: fix block unconfigure on destroy

2021-01-26 Thread Manuel Bouyer

When a domain is destroyed, xparams may not be available any more when
the block script is called to unconfigure the vnd.
Check xparam only at configure time, and just unconfigure any vnd present
in the xenstore.

Signed-off-by: Manuel Bouyer 
Reviewed-by: Roger Pau Monné 
---
 tools/hotplug/NetBSD/block | 37 ++---
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/tools/hotplug/NetBSD/block b/tools/hotplug/NetBSD/block
index 2a0516f436..c8b31a7b2b 100644
--- a/tools/hotplug/NetBSD/block
+++ b/tools/hotplug/NetBSD/block
@@ -20,37 +20,28 @@ error() {
 xpath=$1
 xstatus=$2
 xparams=$(xenstore-read "$xpath/params")
-if [ -b "$xparams" ]; then
-   xtype="phy"
-elif [ -f "$xparams" ]; then
-   xtype="file"
-elif [ -z "$xparams" ]; then
-   error "$xpath/params is empty, unable to attach block device."
-else
-   error "$xparams is not a valid file type to use as block device." \
- "Only block and regular image files accepted."
-fi
 
 case $xstatus in
 6)
# device removed
-   case $xtype in
-   file)
-   vnd=$(xenstore-read "$xpath/vnd" || echo none)
-   if [ $vnd != none ]; then
-   vnconfig -u $vnd
-   fi
-   ;;
-   phy)
-   ;;
-   *)
-   echo "unknown type $xtype" >&2
-   ;;
-   esac
+   vnd=$(xenstore-read "$xpath/vnd" || echo none)
+   if [ $vnd != none ]; then
+   vnconfig -u $vnd
+   fi
xenstore-rm $xpath
exit 0
;;
 2)
+   if [ -b "$xparams" ]; then
+   xtype="phy"
+   elif [ -f "$xparams" ]; then
+   xtype="file"
+   elif [ -z "$xparams" ]; then
+   error "$xpath/params is empty, unable to attach block device."
+   else
+   error "$xparams is not a valid file type to use as block 
device." \
+ "Only block and regular image files accepted."
+   fi
case $xtype in
file)
# Store the list of available vnd(4) devices in
-- 
2.29.2

[PATCH v2] libs/light: make it build without setresuid()

2021-01-26 Thread Manuel Bouyer

NetBSD doesn't have setresuid(). introcuce libxl__setresuid(),
which on NetBSD assert() that it's never called (it should not be called when
dm restriction is off, and NetBSD doesn't support dm restriction at
this time).
On linux and FreeBSD it just calls setresuid().

Signed-off-by: Manuel Bouyer 
---
 tools/libs/light/Makefile  |  4 ++--
 tools/libs/light/libxl_dm.c|  2 +-
 tools/libs/light/libxl_internal.h  |  3 +++
 tools/libs/light/libxl_netbsd.c|  5 +
 tools/libs/light/libxl_setresuid.c | 23 +++
 5 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 tools/libs/light/libxl_setresuid.c

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 68f6fa315f..d62ca6e477 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -64,8 +64,8 @@ SRCS-$(CONFIG_ARM) += libxl_arm_no_acpi.c
 endif
 
 SRCS-OS-$(CONFIG_NetBSD) = libxl_netbsd.c
-SRCS-OS-$(CONFIG_Linux) = libxl_linux.c
-SRCS-OS-$(CONFIG_FreeBSD) = libxl_freebsd.c
+SRCS-OS-$(CONFIG_Linux) = libxl_linux.c libxl_setresuid.c
+SRCS-OS-$(CONFIG_FreeBSD) = libxl_freebsd.c libxl_setresuid.c
 ifeq ($(SRCS-OS-y),)
 $(error Your Operating System is not supported by libxenlight, \
 please check libxl_linux.c and libxl_netbsd.c to see how to get it ported)
diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 13f79ec471..291dee9b3f 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -3655,7 +3655,7 @@ static int 
kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
 
 LOGD(DEBUG, domid, "DM reaper: calling setresuid(%d, %d, 0)",
  reaper_uid, dm_kill_uid);
-r = setresuid(reaper_uid, dm_kill_uid, 0);
+r = libxl__setresuid(reaper_uid, dm_kill_uid, 0);
 if (r) {
 LOGED(ERROR, domid, "setresuid to (%d, %d, 0)",
   reaper_uid, dm_kill_uid);
diff --git a/tools/libs/light/libxl_internal.h 
b/tools/libs/light/libxl_internal.h
index 6c8b7d71a9..028bc013d9 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -4845,6 +4845,9 @@ _hidden int libxl__domain_pvcontrol(libxl__egc *egc,
 /* Check whether a domid is recent */
 int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent);
 
+/* os-specific implementation of setresuid() */
+int libxl__setresuid(uid_t ruid, uid_t euid, uid_t suid);
+
 #endif
 
 /*
diff --git a/tools/libs/light/libxl_netbsd.c b/tools/libs/light/libxl_netbsd.c
index 6ad4ed34c2..67caafab9e 100644
--- a/tools/libs/light/libxl_netbsd.c
+++ b/tools/libs/light/libxl_netbsd.c
@@ -124,3 +124,8 @@ int libxl__local_dm_preexec_restrict(libxl__gc *gc)
 {
 return 0;
 }
+
+int libxl__setresuid(uid_t ruid, uid_t euid, uid_t suid)
+{
+assert(!"setresuid is not available on NetBSD, and dm restrction is not 
supported, so this code path should not have been reached");
+}
diff --git a/tools/libs/light/libxl_setresuid.c 
b/tools/libs/light/libxl_setresuid.c
new file mode 100644
index 00..ac5cb5db53
--- /dev/null
+++ b/tools/libs/light/libxl_setresuid.c
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2021
+ * Author Manuel Bouyer 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+ 
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+int libxl__setresuid(uid_t ruid, uid_t euid, uid_t suid)
+{
+setresuid(ruid, euid, suid);
+}
-- 
2.29.2

[PATCH v2] libs/light: fix tv_sec printf format

2021-01-26 Thread Manuel Bouyer

Don't assume tv_sec is a unsigned long, it is 64 bits on NetBSD 32 bits.
Use %jd and cast to (intmax_t) instead

Signed-off-by: Manuel Bouyer 
Reviewed-by: Roger Pau Monné 
---
 tools/libs/light/libxl_create.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 8616113e72..9848d65f36 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -496,7 +496,7 @@ int libxl__domain_build(libxl__gc *gc,
 vments[2] = "image/ostype";
 vments[3] = "hvm";
 vments[4] = "start_time";
-vments[5] = GCSPRINTF("%lu.%02d", 
start_time.tv_sec,(int)start_time.tv_usec/1);
+vments[5] = GCSPRINTF("%jd.%02d", 
(intmax_t)start_time.tv_sec,(int)start_time.tv_usec/1);
 
 localents = libxl__calloc(gc, 13, sizeof(char *));
 i = 0;
@@ -535,7 +535,7 @@ int libxl__domain_build(libxl__gc *gc,
 vments[i++] = "image/kernel";
 vments[i++] = (char *) state->pv_kernel.path;
 vments[i++] = "start_time";
-vments[i++] = GCSPRINTF("%lu.%02d", 
start_time.tv_sec,(int)start_time.tv_usec/1);
+vments[i++] = GCSPRINTF("%jd.%02d", 
(intmax_t)start_time.tv_sec,(int)start_time.tv_usec/1);
 if (state->pv_ramdisk.path) {
 vments[i++] = "image/ramdisk";
 vments[i++] = (char *) state->pv_ramdisk.path;
@@ -1502,7 +1502,7 @@ static void domcreate_stream_done(libxl__egc *egc,
 vments[2] = "image/ostype";
 vments[3] = "hvm";
 vments[4] = "start_time";
-vments[5] = GCSPRINTF("%lu.%02d", 
start_time.tv_sec,(int)start_time.tv_usec/1);
+vments[5] = GCSPRINTF("%jd.%02d", 
(intmax_t)start_time.tv_sec,(int)start_time.tv_usec/1);
 break;
 case LIBXL_DOMAIN_TYPE_PV:
 vments = libxl__calloc(gc, 11, sizeof(char *));
@@ -1512,7 +1512,7 @@ static void domcreate_stream_done(libxl__egc *egc,
 vments[i++] = "image/kernel";
 vments[i++] = (char *) state->pv_kernel.path;
 vments[i++] = "start_time";
-vments[i++] = GCSPRINTF("%lu.%02d", 
start_time.tv_sec,(int)start_time.tv_usec/1);
+vments[i++] = GCSPRINTF("%jd.%02d", 
(intmax_t)start_time.tv_sec,(int)start_time.tv_usec/1);
 if (state->pv_ramdisk.path) {
 vments[i++] = "image/ramdisk";
 vments[i++] = (char *) state->pv_ramdisk.path;
-- 
2.29.2

[PATCH v2] libs/light: fix uuid on NetBSD

2021-01-26 Thread Manuel Bouyer

NetBSD uses the same uuid library as FreeBSD. As this is in a
__FreeBSD__ || __NetBSD__ block, just drop the #ifdef __FreeBSD__
and dead code.

Signed-off-by: Manuel Bouyer 
Reviewed-by: Roger Pau Monné 
---
 tools/libs/light/libxl_uuid.c | 21 -
 1 file changed, 21 deletions(-)

diff --git a/tools/libs/light/libxl_uuid.c b/tools/libs/light/libxl_uuid.c
index dadb79bad8..7b68270a33 100644
--- a/tools/libs/light/libxl_uuid.c
+++ b/tools/libs/light/libxl_uuid.c
@@ -82,7 +82,6 @@ void libxl_uuid_generate(libxl_uuid *uuid)
 uuid_enc_be(uuid->uuid, _uuid);
 }
 
-#ifdef __FreeBSD__
 int libxl_uuid_from_string(libxl_uuid *uuid, const char *in)
 {
 uint32_t status;
@@ -95,19 +94,6 @@ int libxl_uuid_from_string(libxl_uuid *uuid, const char *in)
 
 return 0;
 }
-#else
-#define LIBXL__UUID_PTRS(uuid) [0], [1], [2], [3], \
-   [4], [5], [6], [7], \
-   [8], [9], [10],[11], \
-   [12],[13],[14],[15]
-int libxl_uuid_from_string(libxl_uuid *uuid, const char *in)
-{
-if ( sscanf(in, LIBXL_UUID_FMT, LIBXL__UUID_PTRS(uuid->uuid)) != 
sizeof(uuid->uuid) )
-return -1;
-return 0;
-}
-#undef LIBXL__UUID_PTRS
-#endif
 
 void libxl_uuid_copy(libxl_ctx *ctx_opt, libxl_uuid *dst,
  const libxl_uuid *src)
@@ -120,7 +106,6 @@ void libxl_uuid_clear(libxl_uuid *uuid)
 memset(>uuid, 0, sizeof(uuid->uuid));
 }
 
-#ifdef __FreeBSD__
 int libxl_uuid_compare(const libxl_uuid *uuid1, const libxl_uuid *uuid2)
 {
 uuid_t nat_uuid1, nat_uuid2;
@@ -130,12 +115,6 @@ int libxl_uuid_compare(const libxl_uuid *uuid1, const 
libxl_uuid *uuid2)
 
 return uuid_compare(_uuid1, _uuid2, NULL);
 }
-#else
-int libxl_uuid_compare(const libxl_uuid *uuid1, const libxl_uuid *uuid2)
-{
- return memcmp(uuid1->uuid, uuid2->uuid, sizeof(uuid1->uuid));
-}
-#endif
 
 const uint8_t *libxl_uuid_bytearray_const(const libxl_uuid *uuid)
 {
-- 
2.29.2

[PATCH v2] libs/gnttab: implement on NetBSD

2021-01-26 Thread Manuel Bouyer

Implement gnttab interface on NetBSD.
The kernel interface is different from FreeBSD so we can't use the FreeBSD
version

Signed-off-by: Manuel Bouyer 
---
 tools/libs/gnttab/Makefile |   2 +-
 tools/libs/gnttab/netbsd.c | 267 +
 2 files changed, 268 insertions(+), 1 deletion(-)
 create mode 100644 tools/libs/gnttab/netbsd.c

diff --git a/tools/libs/gnttab/Makefile b/tools/libs/gnttab/Makefile
index d86c49d243..ae390ce60f 100644
--- a/tools/libs/gnttab/Makefile
+++ b/tools/libs/gnttab/Makefile
@@ -10,7 +10,7 @@ SRCS-GNTSHR+= gntshr_core.c
 SRCS-$(CONFIG_Linux)   += $(SRCS-GNTTAB) $(SRCS-GNTSHR) linux.c
 SRCS-$(CONFIG_MiniOS)  += $(SRCS-GNTTAB) gntshr_unimp.c minios.c
 SRCS-$(CONFIG_FreeBSD) += $(SRCS-GNTTAB) $(SRCS-GNTSHR) freebsd.c
+SRCS-$(CONFIG_NetBSD)  += $(SRCS-GNTTAB) $(SRCS-GNTSHR) netbsd.c
 SRCS-$(CONFIG_SunOS)   += gnttab_unimp.c gntshr_unimp.c
-SRCS-$(CONFIG_NetBSD)  += gnttab_unimp.c gntshr_unimp.c
 
 include $(XEN_ROOT)/tools/libs/libs.mk
diff --git a/tools/libs/gnttab/netbsd.c b/tools/libs/gnttab/netbsd.c
new file mode 100644
index 00..272cbd8961
--- /dev/null
+++ b/tools/libs/gnttab/netbsd.c
@@ -0,0 +1,267 @@
+/*
+ * Copyright (c) 2007-2008, D G Murray 
+ * Copyright (c) 2016-2017, Akshay Jaggi 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see .
+ *
+ * Split out from linux.c
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+
+#include "private.h"
+
+#define PAGE_SHIFT   12
+#define PAGE_SIZE(1UL << PAGE_SHIFT)
+#define PAGE_MASK(~(PAGE_SIZE-1))
+
+#define DEVXEN "/kern/xen/privcmd"
+
+int osdep_gnttab_open(xengnttab_handle *xgt)
+{
+int fd = open(DEVXEN, O_RDWR|O_CLOEXEC);
+
+if ( fd == -1 )
+return -1;
+xgt->fd = fd;
+
+return 0;
+}
+
+int osdep_gnttab_close(xengnttab_handle *xgt)
+{
+if ( xgt->fd == -1 )
+return 0;
+
+return close(xgt->fd);
+}
+
+int osdep_gnttab_set_max_grants(xengnttab_handle *xgt, uint32_t count)
+{
+return 0;
+}
+
+void *osdep_gnttab_grant_map(xengnttab_handle *xgt,
+ uint32_t count, int flags, int prot,
+ uint32_t *domids, uint32_t *refs,
+ uint32_t notify_offset,
+ evtchn_port_t notify_port)
+{
+uint32_t i;
+int fd = xgt->fd;
+struct ioctl_gntdev_mmap_grant_ref map;
+void *addr = NULL;
+int domids_stride;
+unsigned int refs_size = count * sizeof(struct ioctl_gntdev_grant_ref);
+int rv;
+
+domids_stride = (flags & XENGNTTAB_GRANT_MAP_SINGLE_DOMAIN) ? 0 : 1;
+map.refs = malloc(refs_size);
+
+for ( i = 0; i < count; i++ )
+{
+map.refs[i].domid = domids[i * domids_stride];
+map.refs[i].ref = refs[i];
+}
+
+map.count = count;
+addr = mmap(NULL, count * PAGE_SIZE,
+prot, flags | MAP_ANON | MAP_SHARED, -1, 0);
+
+if (map.va == MAP_FAILED) {
+GTERROR(xgt->logger, "osdep_gnttab_grant_map: mmap failed");
+munmap((void *)map.va, count * PAGE_SIZE);
+addr = MAP_FAILED;
+}
+map.va = addr;
+
+map.notify.offset = 0;
+map.notify.action = 0;
+if ( notify_offset < PAGE_SIZE * count )
+{
+map.notify.offset = notify_offset;
+map.notify.action |= UNMAP_NOTIFY_CLEAR_BYTE;
+}
+if ( notify_port != -1 )
+{
+   map.notify.event_channel_port = notify_port;
+   map.notify.action |= UNMAP_NOTIFY_SEND_EVENT;
+}
+
+rv = ioctl(fd, IOCTL_GNTDEV_MMAP_GRANT_REF, );
+if ( rv )
+{
+GTERROR(xgt->logger,
+"ioctl IOCTL_GNTDEV_MMAP_GRANT_REF failed: %d", rv);
+munmap(addr, count * PAGE_SIZE);
+addr = MAP_FAILED;
+}
+free(map.refs);
+return addr;
+}
+
+int osdep_gnttab_unmap(xengnttab_handle *xgt,
+   void *start_address,
+   uint32_t count)
+{
+int rc;
+if ( start_address == NULL )
+{
+errno = EINVAL;
+return -1;
+}
+
+/* Next, unmap the memory. */
+rc = munmap(start_address, count * PAGE_SIZE);
+
+return rc;
+}
+
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+uint32_t count,
+xengnttab_grant_copy_segment_t *segs)
+{
+errno = ENOSYS;
+return

[PATCH v2] libs/store: make build without PTHREAD_STACK_MIN

2021-01-26 Thread Manuel Bouyer

On NetBSD, PTHREAD_STACK_MIN is not available.
If PTHREAD_STACK_MIN is not defined, define it to 0 so that we fallback to
DEFAULT_THREAD_STACKSIZE

Signed-off-by: Manuel Bouyer 
---
 tools/libs/store/xs.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/libs/store/xs.c b/tools/libs/store/xs.c
index 4ac73ec317..b6ecbd787e 100644
--- a/tools/libs/store/xs.c
+++ b/tools/libs/store/xs.c
@@ -811,6 +811,11 @@ bool xs_watch(struct xs_handle *h, const char *path, const 
char *token)
 
 #ifdef USE_PTHREAD
 #define DEFAULT_THREAD_STACKSIZE (16 * 1024)
+/* NetBSD doesn't have PTHREAD_STACK_MIN. */
+#ifndef PTHREAD_STACK_MIN
+# define PTHREAD_STACK_MIN 0
+#endif
+
 #define READ_THREAD_STACKSIZE  \
((DEFAULT_THREAD_STACKSIZE < PTHREAD_STACK_MIN) ?   \
PTHREAD_STACK_MIN : DEFAULT_THREAD_STACKSIZE)
-- 
2.29.2

[PATCH v2] NetBSD: use system-provided headers

2021-01-26 Thread Manuel Bouyer

On NetBSD use the system-provided headers for ioctl and related definitions,
they are up to date and have more chances to match the kernel's idea of
the ioctls and structures.
Remove now-unused NetBSD/evtchn.h and NetBSD/privcmd.h.
Don't fail install if xen/sys/*.h are not present.

Signed-off-by: Manuel Bouyer 
---
 tools/debugger/gdbsx/xg/xg_main.c  |   4 +
 tools/include/Makefile |   2 +
 tools/include/xen-sys/NetBSD/evtchn.h  |  86 
 tools/include/xen-sys/NetBSD/privcmd.h | 106 -
 tools/libs/call/private.h  |   4 +
 tools/libs/ctrl/xc_private.h   |   4 +
 tools/libs/foreignmemory/private.h |   6 ++
 7 files changed, 20 insertions(+), 192 deletions(-)
 delete mode 100644 tools/include/xen-sys/NetBSD/evtchn.h
 delete mode 100644 tools/include/xen-sys/NetBSD/privcmd.h

diff --git a/tools/debugger/gdbsx/xg/xg_main.c 
b/tools/debugger/gdbsx/xg/xg_main.c
index 4576c762af..903d60baed 100644
--- a/tools/debugger/gdbsx/xg/xg_main.c
+++ b/tools/debugger/gdbsx/xg/xg_main.c
@@ -49,7 +49,11 @@
 #include "xg_public.h"
 #include 
 #include 
+#ifdef __NetBSD__
+#include 
+#else
 #include 
+#endif
 #include 
 #include 
 
diff --git a/tools/include/Makefile b/tools/include/Makefile
index 4d4ec5f974..04902397b7 100644
--- a/tools/include/Makefile
+++ b/tools/include/Makefile
@@ -68,7 +68,9 @@ install: all
$(INSTALL_DATA) xen/foreign/*.h $(DESTDIR)$(includedir)/xen/foreign
$(INSTALL_DATA) xen/hvm/*.h $(DESTDIR)$(includedir)/xen/hvm
$(INSTALL_DATA) xen/io/*.h $(DESTDIR)$(includedir)/xen/io
+ifeq ($(wildcard xen/sys/.),)
$(INSTALL_DATA) xen/sys/*.h $(DESTDIR)$(includedir)/xen/sys
+endif
$(INSTALL_DATA) xen/xsm/*.h $(DESTDIR)$(includedir)/xen/xsm
 
 .PHONY: uninstall
diff --git a/tools/include/xen-sys/NetBSD/evtchn.h 
b/tools/include/xen-sys/NetBSD/evtchn.h
deleted file mode 100644
index 2d8a1f9164..00
--- a/tools/include/xen-sys/NetBSD/evtchn.h
+++ /dev/null
@@ -1,86 +0,0 @@
-/* $NetBSD: evtchn.h,v 1.1.1.1 2007/06/14 19:39:45 bouyer Exp $ */
-/**
- * evtchn.h
- * 
- * Interface to /dev/xen/evtchn.
- * 
- * Copyright (c) 2003-2005, K A Fraser
- * 
- * This file may be distributed separately from the Linux kernel, or
- * incorporated into other software packages, subject to the following license:
- * 
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this source file (the "Software"), to deal in the Software without
- * restriction, including without limitation the rights to use, copy, modify,
- * merge, publish, distribute, sublicense, and/or sell copies of the Software,
- * and to permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- * 
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- * 
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-#ifndef __NetBSD_EVTCHN_H__
-#define __NetBSD_EVTCHN_H__
-
-/*
- * Bind a fresh port to VIRQ @virq.
- */
-#define IOCTL_EVTCHN_BIND_VIRQ \
-   _IOWR('E', 4, struct ioctl_evtchn_bind_virq)
-struct ioctl_evtchn_bind_virq {
-   unsigned int virq;
-   unsigned int port;
-};
-
-/*
- * Bind a fresh port to remote <@remote_domain, @remote_port>.
- */
-#define IOCTL_EVTCHN_BIND_INTERDOMAIN  \
-   _IOWR('E', 5, struct ioctl_evtchn_bind_interdomain)
-struct ioctl_evtchn_bind_interdomain {
-   unsigned int remote_domain, remote_port;
-   unsigned int port;
-};
-
-/*
- * Allocate a fresh port for binding to @remote_domain.
- */
-#define IOCTL_EVTCHN_BIND_UNBOUND_PORT \
-   _IOWR('E', 6, struct ioctl_evtchn_bind_unbound_port)
-struct ioctl_evtchn_bind_unbound_port {
-   unsigned int remote_domain;
-   unsigned int port;
-};
-
-/*
- * Unbind previously allocated @port.
- */
-#define IOCTL_EVTCHN_UNBIND\
-   _IOW('E', 7, struct ioctl_evtchn_unbind)
-struct ioctl_evtchn_unbind {
-   unsigned int port;
-};
-
-/*
- * Send event to previously allocated @port.
- */
-#define IOCTL_EVTCHN_NOTIFY\
-   _IOW('E', 8, struct ioctl_evtchn_notify)
-struct ioctl_evtchn_notify {
-   unsigned int port;
-};
-
-/* Clear and reinitialise the event buffer. Clear error condition. */
-#define IOCTL_EVTCHN_RESET \
-

[PATCH v2] xenpmd.c: use dynamic allocation

2021-01-26 Thread Manuel Bouyer

On NetBSD, d_name is larger than 256, so file_name[284] may not be large
enough (and gcc emits a format-truncation error).
Use asprintf() instead of snprintf() on a static on-stack buffer.

Signed-off-by: Manuel Bouyer 
---
 tools/xenpmd/xenpmd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/xenpmd/xenpmd.c b/tools/xenpmd/xenpmd.c
index 12b82cf43e..e432aad856 100644
--- a/tools/xenpmd/xenpmd.c
+++ b/tools/xenpmd/xenpmd.c
@@ -101,7 +101,7 @@ FILE *get_next_battery_file(DIR *battery_dir,
 {
 FILE *file = 0;
 struct dirent *dir_entries;
-char file_name[284];
+char *file_name;
 int ret;
 
 do 
@@ -112,16 +112,16 @@ FILE *get_next_battery_file(DIR *battery_dir,
 if ( strlen(dir_entries->d_name) < 4 )
 continue;
 if ( battery_info_type == BIF ) 
-ret = snprintf(file_name, sizeof(file_name), 
BATTERY_INFO_FILE_PATH,
+ret = asprintf(_name, BATTERY_INFO_FILE_PATH,
  dir_entries->d_name);
 else 
-ret = snprintf(file_name, sizeof(file_name), 
BATTERY_STATE_FILE_PATH,
+ret = asprintf(_name, BATTERY_STATE_FILE_PATH,
  dir_entries->d_name);
 /* This should not happen but is needed to pass gcc checks */
 if (ret < 0)
 continue;
-file_name[sizeof(file_name) - 1] = '\0';
 file = fopen(file_name, "r");
+   free(file_name);
 } while ( !file );
 
 return file;
-- 
2.29.2

[PATCH v2] libs/foreignmemory: Implement on NetBSD

2021-01-26 Thread Manuel Bouyer

Implement foreignmemory interface on NetBSD. The compat interface is now used
only on __sun__

Signed-off-by: Manuel Bouyer 
---
 tools/libs/foreignmemory/Makefile  |  2 +-
 tools/libs/foreignmemory/netbsd.c  | 66 +-
 tools/libs/foreignmemory/private.h |  6 +--
 3 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/tools/libs/foreignmemory/Makefile 
b/tools/libs/foreignmemory/Makefile
index 13850f7988..f191cdbed0 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -8,7 +8,7 @@ SRCS-y += core.c
 SRCS-$(CONFIG_Linux)   += linux.c
 SRCS-$(CONFIG_FreeBSD) += freebsd.c
 SRCS-$(CONFIG_SunOS)   += compat.c solaris.c
-SRCS-$(CONFIG_NetBSD)  += compat.c netbsd.c
+SRCS-$(CONFIG_NetBSD)  += netbsd.c
 SRCS-$(CONFIG_MiniOS)  += minios.c
 
 include $(XEN_ROOT)/tools/libs/libs.mk
diff --git a/tools/libs/foreignmemory/netbsd.c 
b/tools/libs/foreignmemory/netbsd.c
index 54a418ebd6..a7e1d72ffc 100644
--- a/tools/libs/foreignmemory/netbsd.c
+++ b/tools/libs/foreignmemory/netbsd.c
@@ -19,7 +19,9 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #include "private.h"
 
@@ -66,15 +68,17 @@ int osdep_xenforeignmemory_close(xenforeignmemory_handle 
*fmem)
 return close(fd);
 }
 
-void *osdep_map_foreign_batch(xenforeignmem_handle *fmem, uint32_t dom,
-  void *addr, int prot, int flags,
-  xen_pfn_t *arr, int num)
+void *osdep_xenforeignmemory_map(xenforeignmemory_handle *fmem,
+ uint32_t dom, void *addr,
+ int prot, int flags, size_t num,
+ const xen_pfn_t arr[/*num*/], int 
err[/*num*/])
+
 {
 int fd = fmem->fd;
-privcmd_mmapbatch_t ioctlx;
-addr = mmap(addr, num*XC_PAGE_SIZE, prot, flags | MAP_ANON | MAP_SHARED, 
-1, 0);
+privcmd_mmapbatch_v2_t ioctlx;
+addr = mmap(addr, num*PAGE_SIZE, prot, flags | MAP_ANON | MAP_SHARED, -1, 
0);
 if ( addr == MAP_FAILED ) {
-PERROR("osdep_map_foreign_batch: mmap failed");
+PERROR("osdep_xenforeignmemory_map: mmap failed");
 return NULL;
 }
 
@@ -82,11 +86,12 @@ void *osdep_map_foreign_batch(xenforeignmem_handle *fmem, 
uint32_t dom,
 ioctlx.dom=dom;
 ioctlx.addr=(unsigned long)addr;
 ioctlx.arr=arr;
-if ( ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ) < 0 )
+ioctlx.err=err;
+if ( ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, ) < 0 )
 {
 int saved_errno = errno;
-PERROR("osdep_map_foreign_batch: ioctl failed");
-(void)munmap(addr, num*XC_PAGE_SIZE);
+PERROR("osdep_xenforeignmemory_map: ioctl failed");
+(void)munmap(addr, num*PAGE_SIZE);
 errno = saved_errno;
 return NULL;
 }
@@ -97,7 +102,48 @@ void *osdep_map_foreign_batch(xenforeignmem_handle *fmem, 
uint32_t dom,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
  void *addr, size_t num)
 {
-return munmap(addr, num*XC_PAGE_SIZE);
+return munmap(addr, num*PAGE_SIZE);
+}
+
+int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+domid_t domid)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
+int osdep_xenforeignmemory_unmap_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+return fres ? munmap(fres->addr, fres->nr_frames << PAGE_SHIFT) : 0;
+}
+
+int osdep_xenforeignmemory_map_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+privcmd_mmap_resource_t mr = {
+.dom = fres->domid,
+.type = fres->type,
+.id = fres->id,
+.idx = fres->frame,
+.num = fres->nr_frames,
+};
+int rc;
+
+fres->addr = mmap(fres->addr, fres->nr_frames << PAGE_SHIFT,
+  fres->prot, fres->flags | MAP_ANON | MAP_SHARED, -1, 0);
+if ( fres->addr == MAP_FAILED )
+return -1;
+
+mr.addr = (uintptr_t)fres->addr;
+
+rc = ioctl(fmem->fd, IOCTL_PRIVCMD_MMAP_RESOURCE, );
+if ( rc )
+{
+PERROR("ioctl failed");
+}
+
+return 0;
 }
 
 /*
diff --git a/tools/libs/foreignmemory/private.h 
b/tools/libs/foreignmemory/private.h
index ebd45c4785..7e734ac61e 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -36,9 +36,9 @@ void *osdep_xenforeignmemory_map(xenforeignmemory_handle 
*fmem,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
  void *addr, size_t num);
 
-#if defined(__NetBSD__) || defined(__sun__)
+#if defined(__sun__)
 /* Strictly compat for those two only only */
-void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
+void *osdep_map_foreign_batch(xenforeignmemory_handle *fmem, uint32_t dom,
   void *addr, int prot, int flags,
   xen_pfn_t *arr,

[PATCH v2] NetBSD hotplug: Introduce locking functions

2021-01-26 Thread Manuel Bouyer

On NetBSD, some block device configuration requires serialisation.
Introcuce locking functions (derived from the Linux version), and use them
in the block script where appropriate.

Signed-off-by: Manuel Bouyer 
---
 tools/hotplug/NetBSD/Makefile   |   1 +
 tools/hotplug/NetBSD/block  |   5 +-
 tools/hotplug/NetBSD/locking.sh | 121 
 3 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 tools/hotplug/NetBSD/locking.sh

diff --git a/tools/hotplug/NetBSD/Makefile b/tools/hotplug/NetBSD/Makefile
index 6926885ab8..114b223207 100644
--- a/tools/hotplug/NetBSD/Makefile
+++ b/tools/hotplug/NetBSD/Makefile
@@ -3,6 +3,7 @@ include $(XEN_ROOT)/tools/Rules.mk
 
 # Xen script dir and scripts to go there.
 XEN_SCRIPTS =
+XEN_SCRIPTS += locking.sh
 XEN_SCRIPTS += block
 XEN_SCRIPTS += vif-bridge
 XEN_SCRIPTS += vif-ip
diff --git a/tools/hotplug/NetBSD/block b/tools/hotplug/NetBSD/block
index c8b31a7b2b..eb5e80d640 100644
--- a/tools/hotplug/NetBSD/block
+++ b/tools/hotplug/NetBSD/block
@@ -5,6 +5,7 @@
 
 DIR=$(dirname "$0")
 . "${DIR}/hotplugpath.sh"
+. "${DIR}/locking.sh"
 
 PATH=${bindir}:${sbindir}:${LIBEXEC_BIN}:/bin:/usr/bin:/sbin:/usr/sbin
 export PATH
@@ -52,6 +53,7 @@ case $xstatus in
available_disks="$available_disks $disk"
eval $disk=free
done
+   claim_lock block
# Mark the used vnd(4) devices as ``used''.
for disk in `sysctl hw.disknames`; do
case $disk in
@@ -67,6 +69,7 @@ case $xstatus in
break   
fi
done
+   release_lock block
if [ x$device = x ] ; then
error "no available vnd device"
fi
@@ -76,7 +79,7 @@ case $xstatus in
device=$xparams
;;
esac
-   physical_device=$(stat -f '%r' "$device")
+   physical_device=$(stat -L -f '%r' "$device")
xenstore-write $xpath/physical-device $physical_device
xenstore-write $xpath/hotplug-status connected
exit 0
diff --git a/tools/hotplug/NetBSD/locking.sh b/tools/hotplug/NetBSD/locking.sh
new file mode 100644
index 00..2098d0e3ab
--- /dev/null
+++ b/tools/hotplug/NetBSD/locking.sh
@@ -0,0 +1,121 @@
+#
+# Copyright (c) 2005 XenSource Ltd.
+# Copyright (c) 2007 Red Hat
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of version 2.1 of the GNU Lesser General Public
+# License as published by the Free Software Foundation.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; If not, see .
+#
+
+#
+# Serialisation
+#
+
+LOCK_BASEDIR=$XEN_LOCK_DIR/xen-hotplug
+
+_setlockfd()
+{
+_lockfd=9
+_lockfile="$LOCK_BASEDIR/$1"
+}
+
+
+claim_lock()
+{
+mkdir -p "$LOCK_BASEDIR"
+_setlockfd $1
+# The locking strategy is identical to that from with-lock-ex(1)
+# from chiark-utils, except using flock.  It has the benefit of
+# it being possible to safely remove the lockfile when done.
+# See below for a correctness proof.
+local stat
+while true; do
+eval "exec $_lockfd<> $_lockfile"
+   # we can't flock $_lockfd here, as the shell closes it on exec.
+   # Workaround by redirecting to 0 for the command, and flock 0 instead.
+flock -v -x 0  0<& $_lockfd|| exit 1
+local file_stat
+local fd_stat
+if fd_stat=$(stat -f '%d.%i' 0<&$_lockfd 2>/dev/null) && 
file_stat=$(stat -f '%d.%i' $_lockfile 2>/dev/null )
+then
+if [ "$fd_stat" = "$file_stat" ] ; then break; fi
+fi
+# Some versions of bash appear to be buggy if the same
+# $_lockfile is opened repeatedly. Close the current fd here.
+eval "exec $_lockfd<&-"
+done
+}
+
+
+release_lock()
+{
+_setlockfd $1
+rm "$_lockfile"
+}
+
+# Protocol and correctness proof:
+#
+# * The lock is owned not by a process but by an open-file (informally
+#   an fd).  Any process with an fd onto this open-file is a
+#   lockholder and may perform the various operations; such a process
+#   should only do so when its co-lockholder processes expect.  Ie, we
+#   will treat all processes holding fds onto the open-file as acting
+#   in concert and not distinguish between them.
+#
+# * You are a lockholder if
+# - You have an fd onto an open-file which
+#   currently holds an exclusive flock lock on its inum
+# - and that inum is currently linked at the lockfile path
+#
+# * The rules are:
+# - No-one but a lockholder may unlink

[PATCH v2] Fix error: array subscript has type 'char'

2021-01-26 Thread Manuel Bouyer

Use unsigned char variable, or cast to (unsigned char), for
tolower()/islower() and friends. Fix compiler error
array subscript has type 'char' [-Werror=char-subscripts]

Signed-off-by: Manuel Bouyer 
Reviewed-by: Ian Jackson 
Release-Acked-by: Ian Jackson 
---
 tools/libs/light/libxl_qmp.c | 2 +-
 tools/xentrace/xentrace.c| 2 +-
 xen/tools/symbols.c  | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/libs/light/libxl_qmp.c b/tools/libs/light/libxl_qmp.c
index c394000ea9..9b638e6f54 100644
--- a/tools/libs/light/libxl_qmp.c
+++ b/tools/libs/light/libxl_qmp.c
@@ -1249,7 +1249,7 @@ static int qmp_error_class_to_libxl_error_code(libxl__gc 
*gc,
 se++;
 continue;
 }
-if (tolower(*s) != tolower(*se))
+if (tolower((unsigned char)*s) != tolower((unsigned char)*se))
 break;
 s++, se++;
 }
diff --git a/tools/xentrace/xentrace.c b/tools/xentrace/xentrace.c
index 4b50b8a53e..a8903ebf46 100644
--- a/tools/xentrace/xentrace.c
+++ b/tools/xentrace/xentrace.c
@@ -957,7 +957,7 @@ static int parse_cpumask_range(const char *mask_str, 
xc_cpumap_t map)
 {
 unsigned int a, b;
 int nmaskbits;
-char c;
+unsigned char c;
 int in_range;
 const char *s;
 
diff --git a/xen/tools/symbols.c b/xen/tools/symbols.c
index 9f9e2c9900..0b12452616 100644
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -173,11 +173,11 @@ static int read_symbol(FILE *in, struct sym_entry *s)
/* include the type field in the symbol name, so that it gets
 * compressed together */
s->len = strlen(str) + 1;
-   if (islower(stype) && filename)
+   if (islower((unsigned char)stype) && filename)
s->len += strlen(filename) + 1;
s->sym = malloc(s->len + 1);
sym = SYMBOL_NAME(s);
-   if (islower(stype) && filename) {
+   if (islower((unsigned char)stype) && filename) {
sym = stpcpy(sym, filename);
*sym++ = '#';
}
-- 
2.29.2

Re: [PATCH] libs/light: pass some infos to qemu

2021-01-26 Thread Manuel Bouyer

On Mon, Jan 18, 2021 at 09:36:42AM +0100, Roger Pau Monné wrote:
> [...]
> 
> Note also that there are networking modes that don't use a bridge, so
> you could likely end up with nics[i].bridge == NULL?

I couldn't cause this to happen. If no bridge is specified in the
domU's config file, qemu-ifup is called with xenbr0 as bridge name.

I tried this:
vif = [ 'mac=00:16:3e:00:10:54, gatewaydev=wm0 type=ioemu, model=e1000' ]

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: Null scheduler and vwfi native problem

2021-01-26 Thread Dario Faggioli

On Tue, 2021-01-26 at 18:03 +0100, Anders Törnqvist wrote:
> On 1/25/21 5:11 PM, Dario Faggioli wrote:
> > On Fri, 2021-01-22 at 14:26 +, Julien Grall wrote:
> > > Hi Anders,
> > > 
> > > On 22/01/2021 08:06, Anders Törnqvist wrote:
> > > > On 1/22/21 12:35 AM, Dario Faggioli wrote:
> > > > > On Thu, 2021-01-21 at 19:40 +, Julien Grall wrote:
> > > > - booting with "sched=null vwfi=native" but not doing the IRQ
> > > > passthrough that you mentioned above
> > > > "xl destroy" gives
> > > > (XEN) End of domain_destroy function
> > > > 
> > > > Then a "xl create" says nothing but the domain has not started
> > > > correct.
> > > > "xl list" look like this for the domain:
> > > > mydomu   2   512 1 --
> > > > 0.0
> > > This is odd. I would have expected ``xl create`` to fail if
> > > something
> > > went wrong with the domain creation.
> > > 
> > So, Anders, would it be possible to issue a:
> > 
> > # xl debug-keys r
> > # xl dmesg
> > 
> > And send it to us ?
> > 
> > Ideally, you'd do it:
> >   - with Julien's patch (the one he sent the other day, and that
> > you
> >     have already given a try to) applied
> >   - while you are in the state above, i.e., after having tried to
> >     destroy a domain and failing
> >   - and maybe again after having tried to start a new domain
> Here are some logs.
> 
Great, thanks a lot!

> The system is booted as before with the patch and the domu config
> does 
> not have the IRQs.
> 
Ok.

> # xl list
> Name    ID   Mem VCPUs State   
> Time(s)
> Domain-0 0  3000 5 r-
> 820.1
> mydomu   1   511 1 r-
> 157.0
> 
> # xl debug-keys r
> (XEN) sched_smt_power_savings: disabled
> (XEN) NOW=191793008000
> (XEN) Online Cpus: 0-5
> (XEN) Cpupool 0:
> (XEN) Cpus: 0-5
> (XEN) Scheduler: null Scheduler (null)
> (XEN)     cpus_free =
> (XEN) Domain info:
> (XEN)     Domain: 0
> (XEN)       1: [0.0] pcpu=0
> (XEN)       2: [0.1] pcpu=1
> (XEN)       3: [0.2] pcpu=2
> (XEN)       4: [0.3] pcpu=3
> (XEN)       5: [0.4] pcpu=4
> (XEN)     Domain: 1
> (XEN)       6: [1.0] pcpu=5
> (XEN) Waitqueue:
>
So far, so good. All vCPUs are running on their assigned pCPU, and
there is no vCPU wanting to run but not having a vCPU where to do so.

> (XEN) Command line: console=dtuart dtuart=/serial@5a06 
> dom0_mem=3000M dom0_max_vcpus=5 hmp-unsafe=true dom0_vcpus_pin 
> sched=null vwfi=native
>
Oh, just as a side note (and most likely unrelated to the problem we're
discussing), you should be able to get rid of dom0_vcpus_pin.

The NULL scheduler will do something similar to what that option itself
does anyway. And with the benefit that, if you want, you can actually
change to what pCPUs the dom0's vCPU are pinned. While, if you use
dom0_vcpus_pin, you can't.

So it using it has only downsides (and that's true in general, if you
ask me, but particularly so if using NULL).

> # xl destroy mydomu
> (XEN) End of domain_destroy function
> 
> # xl list
> Name    ID   Mem VCPUs State   
> Time(s)
> Domain-0 0  3000 5 r-   
> 1057.9
> 
> # xl debug-keys r
> (XEN) sched_smt_power_savings: disabled
> (XEN) NOW=223871439875
> (XEN) Online Cpus: 0-5
> (XEN) Cpupool 0:
> (XEN) Cpus: 0-5
> (XEN) Scheduler: null Scheduler (null)
> (XEN)     cpus_free =
> (XEN) Domain info:
> (XEN)     Domain: 0
> (XEN)       1: [0.0] pcpu=0
> (XEN)       2: [0.1] pcpu=1
> (XEN)       3: [0.2] pcpu=2
> (XEN)       4: [0.3] pcpu=3
> (XEN)       5: [0.4] pcpu=4
> (XEN)     Domain: 1
> (XEN)       6: [1.0] pcpu=5
>
Right. And from the fact that: 1) we only see the "End of
domain_destroy function" line in the logs, and 2) we see that the vCPU
is still listed here, we have our confirmation (like there wase the
need for it :-/) that domain destruction is done only partially.

> # xl create mydomu.cfg
> Parsing config from mydomu.cfg
> (XEN) Power on resource 215
> 
> # xl list
> Name    ID   Mem VCPUs State   
> Time(s)
> Domain-0 0  3000 5 r-   
> 1152.1
> mydomu   2   512 1 --
>    0.0
> 
> # xl debug-keys r
> (XEN) sched_smt_power_savings: disabled
> (XEN) NOW=241210530250
> (XEN) Online Cpus: 0-5
> (XEN) Cpupool 0:
> (XEN) Cpus: 0-5
> (XEN) Scheduler: null Scheduler (null)
> (XEN)     cpus_free =
> (XEN) Domain info:
> (XEN)     Domain: 0
> (XEN)       1: [0.0] pcpu=0
> (XEN)       2: [0.1] pcpu=1
> (XEN)       3: [0.2] pcpu=2
> (XEN)       4: [0.3] pcpu=3
> (XEN)       5: [0.4] pcpu=4
> (XEN)     Domain: 1
> (XEN)       6: [1.0] pcpu=5
> (XEN)     Domain: 2
> (XEN)       7: [2.0] pcpu=-1
> (XEN) Waitqueue: d2v0
>
Yep, so, as we were suspecting, domain 1 was not destroyed properly.
Specifically, we did not get to the point where the vCPU is

Xen Security Advisory 360 v2 (CVE-2021-3308) - IRQ vector leak on x86

2021-01-26 Thread Xen . org security team

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory CVE-2021-3308 / XSA-360
  version 2

IRQ vector leak on x86

UPDATES IN VERSION 2


CVE assigned.

ISSUE DESCRIPTION
=

An x86 HVM guest with PCI pass through devices can force the allocation
of all IDT vectors on the system by rebooting itself with MSI or MSI-X
capabilities enabled and entries setup.

Such reboots will leak any vectors used by the MSI(-X) entries that the
guest might had enabled, and hence will lead to vector exhaustion on the
system, not allowing further PCI pass through devices to work properly.

IMPACT
==

HVM guests with PCI pass through devices can mount a Denial of Service (DoS)
attack affecting the pass through of PCI devices to other guests or the
hardware domain.  In the latter case this would affect the entire host.

VULNERABLE SYSTEMS
==

Xen versions 4.12.3, 4.12.4, and all versions from 4.13.1 onwards are
vulnerable.  Xen version 4.13.0 and all versions up to 4.12.2 are not
affected.

Only x86 systems running HVM guests with PCI pass through devices are
vulnerable.

MITIGATION
==

Not running HVM guests with PCI pass through devices will avoid the
vulnerability.  Note that even non-malicious guests can trigger this
vulnerability as part of normal operation.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa360.patch   xen-unstable
xsa360-4.14.patch  Xen 4.14 - 4.12

$ sha256sum xsa360*
c874ad2b9edb0791ac975735306d055b1916f4acbc59e6f1550fbf33223d6106  xsa360.meta
592f3afda63777d31844e0e34d85fbe387a62d59fa7903ee19b22a98fba68894  xsa360.patch
809515011efb781a2a8742e9acfd76412d3920c2d4142bb187588cd36f77383e  
xsa360-4.14.patch
$

CREDITS
===

This issue was discovered by James McCoy, debugged in combination with
Samuel Verschelde of Vates, and recognised as a security issue by Roger
Pau Monné of Citrix.

NOTE REGARDING LACK OF EMBARGO
==

This was reported and debugged publicly, before the security
implications were apparent.
-BEGIN PGP SIGNATURE-

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmAQkcMMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZCnkIAL4JBZ19GKWeLyjZSYJxMR7y677B0CQ627Swmu0L
UoCk6VhVmwNuqgU12yEiE8fgUA1sx2WIHcc4ZLBSA6RmaWLy21SKpDywNk1bDuGu
aAYqzgWg4ESaEt22khvOdqvWYVn7N6Ferg7Xeaf+w8MJo5qwwAqnbn2sO432uWga
rSeOBMnmrNsgWkoCNmcTVzFjhxHKz94mReGFGStN96zQuI2DedkKzWHS6YcDydAw
qyRmO3D+2RJGwTIAYQqKvT/wBtTLI1uCp2DOYEDS8A8zkMy88k9+1703N/BxfB31
Ax04vEHoJj0EaLV4dyqRaVDcW9iZSpgvMQGB/x2Jp6knrG8=
=Dr9U
-END PGP SIGNATURE-


xsa360.meta
Description: Binary data


xsa360.patch
Description: Binary data


xsa360-4.14.patch
Description: Binary data

Re: [PATCH] xen: Fix XenStore initialisation for XS_LOCAL

2021-01-26 Thread Boris Ostrovsky




On 1/26/21 12:01 PM, David Woodhouse wrote:
> From: David Woodhouse 
> 
> In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
> I reworked the triggering of xenbus_probe().
> 
> I tried to simplify things by taking out the workqueue based startup
> triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
> handler.
> 
> I missed the fact that in the XS_LOCAL case (Dom0 starting its own
> xenstored or xenstore-stubdom, which happens after the kernel is booted
> completely), that IRQ-based trigger is still actually needed.
> 
> So... put it back, except more cleanly. By just spawning a xenbus_probe
> thread which waits on xb_waitq and runs the probe the first time it
> gets woken, just as the workqueue-based hack did.
> 
> This is actually a nicer approach for *all* the back ends with different
> interrupt methods, and we can switch them all over to that without the
> complex conditions for when to trigger it. But not in -rc6. This is
> the minimal fix for the regression, although it's a step in the right
> direction instead of doing a partial revert and actually putting the
> workqueue back. It's also simpler than the workqueue.


Wouldn't the minimal fix be to restore wake_waiting() to its previous 

if (unlikely(xenstored_ready == 0)) {
xenstored_ready = 1;
schedule_work(_work);
}

(And to avoid changing xenbus_probe()'s signature just create a wrapper)

-boris

[PATCH] xen-blkfront: Fix 'physical' typos

2021-01-26 Thread Bjorn Helgaas

From: Bjorn Helgaas 

Fix misspelling of "physical".

Signed-off-by: Bjorn Helgaas 
---
 drivers/block/xen-blkfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5265975b3fba..876db9fcf388 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2403,7 +2403,7 @@ static void blkfront_connect(struct blkfront_info *info)
}
 
/*
-* physcial-sector-size is a newer field, so old backends may not
+* physical-sector-size is a newer field, so old backends may not
 * provide this. Assume physical sector size to be the same as
 * sector_size in that case.
 */
-- 
2.25.1

Re: Question about xen and Rasp 4B

2021-01-26 Thread Stefano Stabellini

On Tue, 26 Jan 2021, Jukka Kaartinen wrote:
> On Tue, Jan 26, 2021 at 2:54 AM Stefano Stabellini  
> wrote:
>   On Sat, 23 Jan 2021, Jukka Kaartinen wrote:
>   > Thanks for the response!
>   >
>   > On Sat, Jan 23, 2021 at 2:27 AM Stefano Stabellini 
>  wrote:
>   >       + xen-devel, Roman,
>   >
>   >
>   >       On Fri, 22 Jan 2021, Jukka Kaartinen wrote:
>   >       > Hi Stefano,
>   >       > I'm Jukka Kaartinen a SW developer working on enabling 
> hypervisors on mobile platforms. One of our HW that we use on
>   >       development is
>   >       > Raspberry Pi 4B. I wonder if you could help me a bit :).
>   >       >
>   >       > I'm trying to enable the GPU with Xen + Raspberry Pi for
>   >       dom0. 
> https://www.raspberrypi.org/forums/viewtopic.php?f=63=232323#p1797605
>   >       >
>   >       > I got so far that GPU drivers are loaded (v3d & vc4) without 
> errors. But now Xen returns error when X is starting:
>   >       > (XEN) traps.c:1986:d0v1 HSR=0x93880045 pc=0x7f97b14e70 
> gva=0x7f7f817000 gpa=0x401315d000
>   >       >  I tried to debug what causes this and looks like 
> find_mmio_handler cannot find handler.
>   >       > (See more here: 
> https://www.raspberrypi.org/forums/viewtopic.php?f=63=232323=25#p1801691
>  )
>   >       >
>   >       > Any ideas why the handler is not found?
>   >
>   >
>   >       Hi Jukka,
>   >
>   >       I am glad to hear that you are interested in Xen on RaspberryPi 
> :-)  I
>   >       haven't tried the GPU yet, I have been using the serial only.
>   >       Roman, did you ever get the GPU working?
>   >
>   >
>   >       The error is a data abort error: Linux is trying to access an 
> address
>   >       which is not mapped to dom0. The address seems to be 
> 0x401315d000. It is
>   >       a pretty high address; I looked in device tree but couldn't 
> spot it.
>   >
>   >       >From the HSR (the syndrom register) it looks like it is a 
> translation
>   >       fault at EL1 on stage1. As if the Linux address mapping was 
> wrong.
>   >       Anyone has any ideas how this could happen? Maybe a 
> reserved-memory
>   >       misconfiguration?
>   >
>   > I had issues with loading the driver in the first place. Apparently 
> swiotlb is used, maybe it can cause this. I also tried to
>   enable CMA.
>   > config.txt:
>   > dtoverlay=vc4-fkms-v3d,cma=320M@0x0-0x4000
>   > gpu_mem=128
> 
>   Also looking at your other reply and the implementation of
>   vc4_bo_create, it looks like this is a CMA problem.
> 
>   It would be good to run a test with the swiotlb-xen disabled:
> 
>   diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
>   index 467fa225c3d0..2bdd12785d14 100644
>   --- a/arch/arm/xen/mm.c
>   +++ b/arch/arm/xen/mm.c
>   @@ -138,8 +138,7 @@ void xen_destroy_contiguous_region(phys_addr_t 
> pstart, unsigned int order)
>    static int __init xen_mm_init(void)
>    {
>           struct gnttab_cache_flush cflush;
>   -       if (!xen_initial_domain())
>   -               return 0;
>   +       return 0;
>           xen_swiotlb_init(1, false);
> 
>           cflush.op = 0;
> 
> With this change the kernel is not booting up. (btw. I'm using USB SSD for my 
> OS.)
> [    0.071081] bcm2835-dma fe007000.dma: Unable to set DMA mask
> [    0.076277] bcm2835-dma fe007b00.dma: Unable to set DMA mask
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=25: not implemented
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
> [    0.592695] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
> [    0.606819] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X 
> might fail!
> [    1.212820] usb 1-1: device descriptor read/64, error 18
> [    1.452815] usb 1-1: device descriptor read/64, error 18
> [    1.820813] usb 1-1: device descriptor read/64, error 18
> [    2.060815] usb 1-1: device descriptor read/64, error 18
> [    2.845548] usb 1-1: device descriptor read/8, error -61
> [    2.977603] usb 1-1: device descriptor read/8, error -61
> [    3.237530] usb 1-1: device descriptor read/8, error -61
> [    3.369585] usb 1-1: device descriptor read/8, error -61
> [    3.480765] usb usb1-port1: unable to enumerate USB device
> 
> Traces stop here. I could try with a memory card. Maybe it makes a difference.

This is very surprising. Disabling swiotlb-xen should make things better
not worse. The only reason I can think of why it could make things worse
is if Linux runs out of low memory. Julien's patch
437b0aa06a014ce174e24c0d3530b3e9ab19b18b for Xen should have addressed
that issue though. Julien, any ideas?

 
>   It is going to be fine just to boot Dom0 and DomUs without PV drivers.
>   Also, can you post the

[ovmf test] 158626: all pass - PUSHED

2021-01-26 Thread osstest service owner

flight 158626 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158626/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 2d6fc9d36fd5ff15972bedab919f37bb4ee951d0
baseline version:
 ovmf 3a3501862f73095059bb05cc28147c8e899488f2

Last test of basis   158620  2021-01-26 00:41:48 Z0 days
Testing same since   158626  2021-01-26 10:52:49 Z0 days1 attempts


People who touched revisions under test:
  Jason Lou 
  Lou, Yun 
  Michael D Kinney 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   3a3501862f..2d6fc9d36f  2d6fc9d36fd5ff15972bedab919f37bb4ee951d0 -> 
xen-tested-master

[xen-unstable-smoke test] 158653: tolerable all pass - PUSHED

2021-01-26 Thread osstest service owner

flight 158653 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158653/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  07edcd17fa2dce80250b3dd31e561268bc4663a9
baseline version:
 xen  c7b0f25e8f86373ed54e1c446f8e67ce25ac6819

Last test of basis   158639  2021-01-26 14:00:28 Z0 days
Testing same since   158653  2021-01-26 17:00:27 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   c7b0f25e8f..07edcd17fa  07edcd17fa2dce80250b3dd31e561268bc4663a9 -> smoke

[PATCH v5 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Stefano Stabellini

From: Stefano Stabellini 

A recent thread [1] has exposed a couple of issues with our current way
of handling EXPERT.

1) It is not obvious that "Configure standard Xen features (expert
users)" is actually the famous EXPERT we keep talking about on xen-devel

2) It is not obvious when we need to enable EXPERT to get a specific
feature

In particular if you want to enable ACPI support so that you can boot
Xen on an ACPI platform, you have to enable EXPERT first. But searching
through the kconfig menu it is really not clear (type '/' and "ACPI"):
nothing in the description tells you that you need to enable EXPERT to
get the option.

So this patch makes things easier by doing two things:

- introduce a new kconfig option UNSUPPORTED which is clearly to enable
  UNSUPPORTED features as defined by SUPPORT.md

- change EXPERT options to UNSUPPORTED where it makes sense: keep
  depending on EXPERT for features made for experts

- tag unsupported features by adding (UNSUPPORTED) to the one-line
  description

- clarify the EXPERT one-line description

[1] https://marc.info/?l=xen-devel=160333101228981

Signed-off-by: Stefano Stabellini 
Reviewed-by: Jan Beulich  [x86,common]
Reviewed-by: Bertrand Marquis 
CC: andrew.coop...@citrix.com
CC: george.dun...@citrix.com
CC: i...@xenproject.org
CC: jbeul...@suse.com
CC: jul...@xen.org
CC: w...@xen.org

---
Changes in v5:
- add reviwed-by
- remove changes to ARM_SSBD and HARDEN_BRANCH_PREDICTOR

Changes in v4:
- clarify support statement of UNSUPPORTED
- move UNSUPPORTED past EXPERT
- add default EXPERT to UNSUPPORTED

Changes in v3:
- improve UNSUPPORTED text description
- avoid changing XEN_SHSTK and EFI_SET_VIRTUAL_ADDRESS_MAP
- update HVM_FEP to be UNSUPPORTED

Changes in v2:
- introduce UNSUPPORTED
- don't switch all EXPERT options to UNSUPPORTED

See as reference the v2 thread here:
https://marc.info/?l=xen-devel=160566066013723
---
 xen/Kconfig  | 11 ++-
 xen/arch/arm/Kconfig |  6 +++---
 xen/arch/x86/Kconfig |  6 +++---
 xen/common/Kconfig   |  2 +-
 xen/common/sched/Kconfig |  6 +++---
 5 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/xen/Kconfig b/xen/Kconfig
index 34c318bfa2..bcbd2758e5 100644
--- a/xen/Kconfig
+++ b/xen/Kconfig
@@ -35,7 +35,7 @@ config DEFCONFIG_LIST
default ARCH_DEFCONFIG
 
 config EXPERT
-   bool "Configure standard Xen features (expert users)"
+   bool "Configure EXPERT features"
help
  This option allows certain base Xen options and settings
  to be disabled or tweaked. This is for specialized environments
@@ -45,6 +45,15 @@ config EXPERT
  supported.
default n
 
+config UNSUPPORTED
+   bool "Configure UNSUPPORTED features"
+   default EXPERT
+   help
+ This option allows certain unsupported Xen options to be changed,
+ which includes non-security-supported, experimental, and tech
+ preview features as defined by SUPPORT.md. (Note that if an option
+ doesn't depend on UNSUPPORTED it doesn't imply that is supported.)
+
 config LTO
bool "Link Time Optimisation"
depends on BROKEN
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index c3eb13ea73..330bbf6232 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -32,7 +32,7 @@ menu "Architecture Features"
 source "arch/Kconfig"
 
 config ACPI
-   bool "ACPI (Advanced Configuration and Power Interface) Support" if 
EXPERT
+   bool "ACPI (Advanced Configuration and Power Interface) Support 
(UNSUPPORTED)" if UNSUPPORTED
depends on ARM_64
---help---
 
@@ -49,7 +49,7 @@ config GICV3
  If unsure, say Y
 
 config HAS_ITS
-bool "GICv3 ITS MSI controller support" if EXPERT
+bool "GICv3 ITS MSI controller support (UNSUPPORTED)" if UNSUPPORTED
 depends on GICV3 && !NEW_VGIC
 
 config HVM
@@ -104,7 +104,7 @@ config HARDEN_BRANCH_PREDICTOR
  If unsure, say Y.
 
 config TEE
-   bool "Enable TEE mediators support" if EXPERT
+   bool "Enable TEE mediators support (UNSUPPORTED)" if UNSUPPORTED
default n
help
  This option enables generic TEE mediators support. It allows guests
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 78f351f94b..302334d3e4 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config BIGMEM
  If unsure, say N.
 
 config HVM_FEP
-   bool "HVM Forced Emulation Prefix support" if EXPERT
+   bool "HVM Forced Emulation Prefix support (UNSUPPORTED)" if UNSUPPORTED
default DEBUG
depends on HVM
---help---
@@ -166,7 +166,7 @@ config HVM_FEP
  If unsure, say N.
 
 config TBOOT
-   bool "Xen tboot support" if EXPERT
+   bool "Xen tboot support (UNSUPPORTED)" if UNSUPPORTED
default y if !PV_SHIM_EXCLUSIVE
select CRYPTO
---help---
@@ -252,7 +252,7 @@ config HYPERV_GUEST
 endif
 
 config MEM_SHARING
-

[PATCH v5 2/2] xen: add (EXPERT) to one-line description of XEN_SHSTK

2021-01-26 Thread Stefano Stabellini

From: Stefano Stabellini 

Add an "(EXPERT)" tag to the one-line description of Kconfig options
that depend on EXPERT. (Not where just the prompt depends on EXPERT.)

Today we only have one such option: XEN_SHSTK.

Signed-off-by: Stefano Stabellini 
CC: andrew.coop...@citrix.com
CC: george.dun...@citrix.com
CC: i...@xenproject.org
CC: jbeul...@suse.com
CC: jul...@xen.org
CC: w...@xen.org

---
Changes in v5:
- actually, only change one-line description of options that depends on
  EXPERT (and not just the prompt)

Changes in v4:
- new patch
---
 xen/arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 302334d3e4..3f630b89e8 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -103,7 +103,7 @@ config HVM
  If unsure, say Y.
 
 config XEN_SHSTK
-   bool "Supervisor Shadow Stacks"
+   bool "Supervisor Shadow Stacks (EXPERT)"
depends on HAS_AS_CET_SS && EXPERT
default y
---help---
-- 
2.17.1

[PATCH v5 0/2] introduce UNSUPPORTED

2021-01-26 Thread Stefano Stabellini

Hi all,

A recent thread [1] has exposed a couple of issues with our current way
of handling EXPERT.

1) It is not obvious that "Configure standard Xen features (expert
users)" is actually the famous EXPERT we keep talking about on xen-devel

2) It is not obvious when we need to enable EXPERT to get a specific
feature

In particular if you want to enable ACPI support so that you can boot
Xen on an ACPI platform, you have to enable EXPERT first. But searching
through the kconfig menu it is really not clear (type '/' and "ACPI"):
nothing in the description tells you that you need to enable EXPERT to
get the option.

This series makes things easier by doing the following:

- introduce a new kconfig option UNSUPPORTED which is clearly to enable
  UNSUPPORTED features as defined by SUPPORT.md

- change EXPERT options to UNSUPPORTED where it makes sense: keep
  depending on EXPERT for features made for experts

- tag unsupported features by adding (UNSUPPORTED) to the one-line
  description

- clarify the EXPERT one-line description

[1] https://marc.info/?l=xen-devel=160333101228981

Cheers,

Stefano


Stefano Stabellini (2):
  xen: EXPERT clean-up and introduce UNSUPPORTED
  xen: add (EXPERT) to one-line description of XEN_SHSTK

 xen/Kconfig  | 11 ++-
 xen/arch/arm/Kconfig |  6 +++---
 xen/arch/x86/Kconfig |  8 
 xen/common/Kconfig   |  2 +-
 xen/common/sched/Kconfig |  6 +++---
 5 files changed, 21 insertions(+), 12 deletions(-)

[qemu-mainline test] 158622: regressions - FAIL

2021-01-26 Thread osstest service owner

flight 158622 qemu-mainline real [real]
flight 158651 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/158622/
http://logs.test-lab.xenproject.org/osstest/logs/158651/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2   21 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd 17 guest-start/debian.repeat fail REGR. vs. 152631

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 
158651-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152631
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152631
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152631
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail  like 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu31ee895047bdcf7387e3570cbd2a473c6f744b08
baseline version:
 qemuu1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z  159 days
Failing since152659  2020-08-21 14:07:39 Z  158 days  323 attempts
Testing same since   158622  2021-01-26 03:12:32 Z0 days1

Re: [PATCH v4 2/2] xen: add (EXPERT) to one-line descriptions when appropriate

2021-01-26 Thread Stefano Stabellini

On Tue, 26 Jan 2021, Jan Beulich wrote:
> On 25.01.2021 22:27, Stefano Stabellini wrote:
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -103,7 +103,7 @@ config HVM
> >   If unsure, say Y.
> >  
> >  config XEN_SHSTK
> > -   bool "Supervisor Shadow Stacks"
> > +   bool "Supervisor Shadow Stacks (EXPERT)"
> > depends on HAS_AS_CET_SS && EXPERT
> > default y
> > ---help---
> 
> I agree with this addition, but I'm afraid I'm at least uncertain
> about all the other ones made here, where ...
> 
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -12,7 +12,7 @@ config CORE_PARKING
> > bool
> >  
> >  config GRANT_TABLE
> > -   bool "Grant table support" if EXPERT
> > +   bool "Grant table support (EXPERT)" if EXPERT
> > default y
> 
> ... like e.g. here, it's only the prompt that's conditional. IOW
> like with the respective uses of UNSUPPORTED in some of the Arm
> changes in patch 1, especially when the option's default is "yes"
> it's not the feature that is an expert one, but its turning off
> is considered an expert change. Which I don't think should be
> expressed this way.

That's a good point actually. It makes sense to add the EXPERT tag to
the one-line description of options that depends on EXPERT. Not where
only the prompt depends on EXPERT and the option is actually default y.

Which really narrows it down to XEN_SHSTK only. I'll reduce the patch
to do just that.

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Stefano Stabellini

On Tue, 26 Jan 2021, Jan Beulich wrote:
> On 26.01.2021 12:17, Bertrand Marquis wrote:
> > 
> > 
> >> On 26 Jan 2021, at 11:11, Jan Beulich  wrote:
> >>
> >> On 26.01.2021 12:06, Bertrand Marquis wrote:
>  On 26 Jan 2021, at 09:22, Jan Beulich  wrote:
>  On 25.01.2021 22:27, Stefano Stabellini wrote:
> > @@ -77,7 +77,7 @@ config SBSA_VUART_CONSOLE
> >   SBSA Generic UART implements a subset of ARM PL011 UART.
> >
> > config ARM_SSBD
> > -   bool "Speculative Store Bypass Disable" if EXPERT
> > +   bool "Speculative Store Bypass Disable (UNSUPPORTED)" if 
> > UNSUPPORTED
> > depends on HAS_ALTERNATIVE
> > default y
> > help
> > @@ -87,7 +87,7 @@ config ARM_SSBD
> >   If unsure, say Y.
> >
> > config HARDEN_BRANCH_PREDICTOR
> > -   bool "Harden the branch predictor against aliasing attacks" if 
> > EXPERT
> > +   bool "Harden the branch predictor against aliasing attacks 
> > (UNSUPPORTED)" if UNSUPPORTED
> > default y
> > help
> >   Speculation attacks against some high-performance processors 
> > rely on
> 
>  Both of these default to y and have their _prompt_
>  conditional upon EXPERT. Which means only an expert can turn them
>  _off_. Which doesn't make it look like these are unsupported? Or
>  if anything, turning them off is unsupported?
> >>>
> >>> ...You could see that as EXPERT/UNSUPPORTED options can only be
> >>> “modified” from their default value if EXPERT/UNSUPPORTED is activated.
> >>
> >> But this is nothing you can recognize when configuring Xen
> >> (i.e. seeing just these prompts).
> > 
> > Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
> > config parameters instead ?
> > We could also make that more clear in the help of such parameters directly.
> > 
> > I do not see how we could make that more clear directly in the prompt (as
> > making it too long is not a good solution).
> 
> My main request is that such tags be added only if there's
> absolutely no ambiguity. Anything else (requiring longer
> explanations in many cases) should be expressed in the help
> text of the option, or in yet other ways (a referral to
> SUPPORT.md comes to mind).

I actually agree completely with you. In the case of ARM_SSBD and
HARDEN_BRANCH_PREDICTOR, they should remain as they are today I think.

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Stefano Stabellini

On Tue, 26 Jan 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 25/01/2021 21:27, Stefano Stabellini wrote:
> >   config ARM_SSBD
> > -   bool "Speculative Store Bypass Disable" if EXPERT
> > +   bool "Speculative Store Bypass Disable (UNSUPPORTED)" if UNSUPPORTED
> > depends on HAS_ALTERNATIVE
> > default y
> > help
> > @@ -87,7 +87,7 @@ config ARM_SSBD
> >   If unsure, say Y.
> > config HARDEN_BRANCH_PREDICTOR
> > -   bool "Harden the branch predictor against aliasing attacks" if EXPERT
> > +   bool "Harden the branch predictor against aliasing attacks
> > (UNSUPPORTED)" if UNSUPPORTED
> > default y
> > help
> >   Speculation attacks against some high-performance processors rely on
> 
> I read through the back and forth between Bertrand and Jan about
> "UNSUPPORTED". However, I still don't understand why those two options are
> moved to UNSUPPORTED.
> 
> Both options will only build the code to enable the mitigation. The decision
> is still based on the processor you are running on.
> 
> In addition to that, ARM_SSBD can also be forced enabled/disabled on the
> command line.

Yes, you are right. HARDEN_BRANCH_PREDICTOR and ARM_SSBD should remain
EXPERT as they are today. It was a mistake to change them to
UNSUPPORTED.


> A user may want to compile out the code if the target processor is not the
> affected by the two issues. This wouldn't be much different to Xen deciding to
> not enabling the mitigation.
> 
> I would view the two options as supported but not security supported. So this
> seems to fit exactly in the definition of EXPERT rather than UNSUPPORTED.

Yes, I'll leave them unmodified.

Re: [PATCH] Fix error: array subscript has type 'char'

2021-01-26 Thread Ian Jackson

Manuel Bouyer writes ("Re: [PATCH] Fix error: array subscript has type 'char'"):
> On Thu, Jan 14, 2021 at 03:16:15PM +0100, Manuel Bouyer wrote:
> > On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
> > > Which means you're shifting the undefined-ness from the implementation

The undefined-ness is in the *specification*. [1]

> > > Isn't this something that wants changing in your ctype.h instead?
> > > the functions (or macros), as per the C standard

Jan, can you please check your facts before asserting the existence
of a pretty blatant bug in a platform's implementation of basic C
functions ?

>From my copy of C99+TC1+TC2, para 7.4:

 1   In all cases the argument is an int, the value of which shall be
 representable as an unsigned char or shall equal the value of the
 macro EOF.  If the argument has any other value, the behavior is
 undefined. [...]

If char is signed, then it can contain -ve values.  Those values are
promoted to int by the usual integer promotions.  Naturally such
negative values are not representable as unsigned char.  Passing them
to ctype macros is UB.

So Manuel's ctypes.h is conforming and the compiler warning (which I
have seen on Linux too) is correct.

> I guess I'm going to give up on this one. We'll keep it as a local patch.

Manuel, your original patch:

Reviewed-by: Ian Jackson 
Release-Acked-by: Ian Jackson 

(The R-A is FTAOD since IMO this is a bugfix.)

Ian.

[1] I agree that "fixing" ctypes.h everywhere might be nice (eg the
way glibc does it) but that has other implications and it is not
reasonable to expect platforms supporting Xen to do that.

Re: [PATCH] x86/pod: Do not fragment PoD memory allocations

2021-01-26 Thread Elliott Mitchell

On Tue, Jan 26, 2021 at 12:08:15PM +0100, Jan Beulich wrote:
> On 25.01.2021 18:46, Elliott Mitchell wrote:
> > On Mon, Jan 25, 2021 at 10:56:25AM +0100, Jan Beulich wrote:
> >> On 24.01.2021 05:47, Elliott Mitchell wrote:
> >>>
> >>> ---
> >>> Changes in v2:
> >>> - Include the obvious removal of the goto target.  Always realize you're
> >>>   at the wrong place when you press "send".
> >>
> >> Please could you also label the submission then accordingly? I
> >> got puzzled by two identically titled messages side by side,
> >> until I noticed the difference.
> > 
> > Sorry about that.  Would you have preferred a third message mentioning
> > this mistake?
> 
> No. But I'd have expected v2 to have its subject start with
> "[PATCH v2] ...", making it relatively clear that one might
> save looking at the one labeled just "[PATCH] ...".

Yes, in fact I spotted this just after.  I was in a situation of, "does
this deserve sending an additional message out?"  (ugh, yet more damage
from that issue...)


> >>> I'm not including a separate cover message since this is a single hunk.
> >>> This really needs some checking in `xl`.  If one has a domain which
> >>> sometimes gets started on different hosts and is sometimes modified with
> >>> slightly differing settings, one can run into trouble.
> >>>
> >>> In this case most of the time the particular domain is most often used
> >>> PV/PVH, but every so often is used as a template for HVM.  Starting it
> >>> HVM will trigger PoD mode.  If it is started on a machine with less
> >>> memory than others, PoD may well exhaust all memory and then trigger a
> >>> panic.
> >>>
> >>> `xl` should likely fail HVM domain creation when the maximum memory
> >>> exceeds available memory (never mind total memory).
> >>
> >> I don't think so, no - it's the purpose of PoD to allow starting
> >> a guest despite there not being enough memory available to
> >> satisfy its "max", as such guests are expected to balloon down
> >> immediately, rather than triggering an oom condition.
> > 
> > Even Qemu/OVMF is expected to handle ballooning for a *HVM* domain?
> 
> No idea how qemu comes into play here. Any preboot environment
> aware of possibly running under Xen of course is expected to
> tolerate running with maxmem > memory (or clearly document its
> inability, in which case it may not be suitable for certain
> use cases). For example, I don't see why a preboot environment
> would need to touch all of the memory in a VM, except maybe
> for the purpose of zeroing it (which PoD can deal with fine).

I'm reading that as your answer to the above question is "yes".


> >>> For example try a domain with the following settings:
> >>>
> >>> memory = 8192
> >>> maxmem = 2147483648
> >>>
> >>> If type is PV or PVH, it will likely boot successfully.  Change type to
> >>> HVM and unless your hardware budget is impressive, Xen will soon panic.
> >>
> >> Xen will panic? That would need fixing if so. Also I'd consider
> >> an excessively high maxmem (compared to memory) a configuration
> >> error. According to my experiments long, long ago I seem to
> >> recall that a factor beyond 32 is almost never going to lead to
> >> anything good, irrespective of guest type. (But as said, badness
> >> here should be restricted to the guest; Xen itself should limp
> >> on fine.)
> > 
> > I'll confess I haven't confirmed the panic is in Xen itself.  Problem is
> > when this gets triggered, by the time the situation is clear and I can
> > get to the console the computer is already restarting, thus no error
> > message has been observed.
> 
> If the panic isn't in Xen itself, why would the computer be
> restarting?

I was thinking there was a possibility it is actually Domain 0 which is
panicing.


> > This is most certainly a configuration error.  Problem is this is a very
> > small delta between a perfectly valid configuration and the one which
> > reliably triggers a panic.
> > 
> > The memory:maxmem ratio isn't the problem.  My example had a maxmem of
> > 2147483648 since that is enough to exceed the memory of sub-$100K
> > computers.  The crucial features are maxmem >= machine memory,
> > memory < free memory (thus potentially bootable PV/PVH) and type = "hvm".
> > 
> > When was the last time you tried running a Xen machine with near zero
> > free memory?  Perhaps in the past Xen kept the promise of never panicing
> > on memory exhaustion, but this feels like this hasn't held for some time.
> 
> Such bugs needs fixing. Which first of all requires properly
> pointing them out. A PoD guest misbehaving when there's not
> enough memory to fill its pages (i.e. before it manages to
> balloon down) is expected behavior. If you can't guarantee the
> guest ballooning down quickly enough, don't configure it to
> use PoD. A PoD guest causing a Xen crash, otoh, is very likely
> even a security issue. Which needs to be treated as such: It
> needs fixing, not avoiding by "curing" one of perhaps many
> possible sources.

Okay,

Re: [PATCH] Fix error: array subscript has type 'char'

2021-01-26 Thread Manuel Bouyer

On Thu, Jan 14, 2021 at 03:16:15PM +0100, Manuel Bouyer wrote:
> On Thu, Jan 14, 2021 at 02:25:05PM +0100, Jan Beulich wrote:
> > On 14.01.2021 13:29, Manuel Bouyer wrote:
> > > On Thu, Jan 14, 2021 at 11:53:20AM +0100, Jan Beulich wrote:
> > >> On 12.01.2021 19:12, Manuel Bouyer wrote:
> > >>> From: Manuel Bouyer 
> > >>>
> > >>> Use unsigned char variable, or cast to (unsigned char), for
> > >>> tolower()/islower() and friends. Fix compiler error
> > >>> array subscript has type 'char' [-Werror=char-subscripts]
> > >>
> > >> Isn't this something that wants changing in your ctype.h instead?
> > >> the functions (or macros), as per the C standard, ought to accept
> > >> plain char aiui, and if they use the input as an array subscript,
> > >> it should be their implementation suitably converting type first.
> > > 
> > > I asked for inputs from NetBSD developers familiar with this part.
> > > 
> > > Although the parameter is an int, only a subset of values is valid,
> > > as stated in ISO C 2018 (Section 7.4 paragrah 1):
> > >> In all cases the argument is an int, the value of which shall be
> > >> representable as an unsigned char or shall equal the value of the
> > >> macro EOF.  If the argument has any other value, the behavior is 
> > >> undefined.   
> > 
> > Which means you're shifting the undefined-ness from the implementation
> > (using the value as array index) to the callers (truncating values, or
> > converting value range). In particular I do not think that the
> > intention behind the standard's wording is for every caller to need to
> > cast to unsigned char, when e.g. character literals are of type char
> > and string literals are of type const char[].
> 
> If you don't cast you fall into the undefined behavior case for non-ascii
> characters. For example, "é" in iso-latin-1 is 0xe9. In a signed char, this is
> -23 (decimal). without the cast, tolower() will see -23.
> If it is casted to (unsigned char) first, tolower() will see 233, as expected.
> The following test program illustrates this:
> armandeche:/tmp>cat toto.c
> #include 
> 
> int
> main(int _c, const char *_v[])
> {
> char c = 0xe9;
>   printf("%d %d\n", (int)c, (int)(unsigned char)c);
> }
> armandeche:/tmp>./toto 
> -23 233
> 
> 
> 
> > 
> > > As stated by NetBSD's ctype(3) manual page, NetBSD and glibc took 
> > > different
> > > approach. NetBSD emits a compile-time warning if the input may lead to
> > > undefined behavior. quoting the man page:
> > >  Some implementations of libc, such as glibc as of 2018, attempt to 
> > > avoid
> > >  the worst of the undefined behavior by defining the functions to 
> > > work for
> > >  all integer inputs representable by either unsigned char or char, and
> > >  suppress the warning.  However, this is not an excuse for avoiding
> > >  conversion to unsigned char: if EOF coincides with any such value, 
> > > as it
> > >  does when it is -1 on platforms with signed char, programs that pass 
> > > char
> > >  will still necessarily confuse the classification and mapping of EOF 
> > > with
> > >  the classification and mapping of some non-EOF inputs.
> > > 
> > > 
> > > So, although no warning is emmited on linux, it looks like to me that the
> > > cast to unsigned char is needed anyway, and relying on glibc's behavior
> > > is not portable.
> > 
> > I fully agree we shouldn't rely on glibc's behavior (I'm sure
> > you've looked at xen/include/xen/ctype.h to see how we address
> > this it Xen itself, but I will admit this is to a degree comparing
> > apples and oranges, not the least because the lack of a need to
> > consider EOF in Xen). At least in xen/tools/symbols.c I don't
> > think we're at risk of running into "undefined" space. Casts are
> 
> as long as there's only ascii characters.
> 
> Anyway NetBSD won't change its ctype.h

I guess I'm going to give up on this one. We'll keep it as a local patch.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: [PATCH] x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating

2021-01-26 Thread Andrew Cooper

On 07/01/2021 13:53, Jan Beulich wrote:
>> + !((hpet_cfg = hpet_read32(HPET_CFG)) & HPET_CFG_LEGACY) )
>> +{
>> +unsigned int c0_cfg, ticks, count;
>> +
>> +/* Stop the main counter. */
>> +hpet_write32(hpet_cfg & ~HPET_CFG_ENABLE, HPET_CFG);
>> +
>> +/* Reconfigure channel 0 to be 32bit periodic. */
>> +c0_cfg = hpet_read32(HPET_Tn_CFG(0));
>> +c0_cfg |= (HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL |
>> +   HPET_TN_32BIT);
>> +hpet_write32(c0_cfg, HPET_Tn_CFG(0));
>> +
>> +/*
>> + * The exact period doesn't have to match a legacy PIT.  All we need
>> + * is an interrupt queued up via the IO-APIC to check routing.
>> + *
>> + * Use HZ as the frequency.
>> + */
>> +ticks = (SECONDS(1) / HZ) * div_sc(hpet_rate, SECONDS(1), 32)) >> 
>> 32;
>> +
>> +count = hpet_read32(HPET_COUNTER);
>> +
>> +/*
>> + * HPET_TN_SETVAL above is atrociously documented in the spec.
>> + *
>> + * Periodic HPET channels have a main comparator register, and cache
>> + * the "last write to cmp", as a hidden register.
>> + *
>> + * The semantics on generating a periodic interrupt is:
>> + *   cmp += "last value written to the cmp register"
>> + * which will reload a new period.
>> + *
>> + * Normally, writes to cmp update the main comparator as well as 
>> being
>> + * cached for the reload value.  However, under these semantics, the
>> + * HPET main counter needs resetting to 0 to be able to configure 
>> the
>> + * period correctly.
>> + *
>> + * Instead, HPET_TN_SETVAL is a self-clearing control bit which we 
>> can
>> + * use for periodic timers to mean that the second write to the
>> + * comparator updates only the "last written" cache, and not the
>> + * absolute comparator value.
>> + *
>> + * This lets us set a period when the main counter isn't at 0.
>> + */
>> +hpet_write32(count + ticks, HPET_Tn_CMP(0));
>> +hpet_write32(ticks, HPET_Tn_CMP(0));
> As you say, documentation is poor here. In fact the wording in the
> HPET spec talks about updating of the "accumulator" instead;
> perhaps just an unhelpful use of a different term. (Respective
> Linux code has a comment indicating this is needed on a specific
> AMD chipset only.)

I'm fairly certain that Linux's comment is wrong.  This behaviour is
described in the HPET spec, which is an Intel spec.

It smells very much like a bug discovered during software bringup on
early alpha platforms with an HPET, and fixed at v0.9 of the spec (or
thereabouts).  I can entirely believe that it is the kind of thing which
would have been fixed in beta silicon before the spec was formally updated.

> It would seem more natural to me if things were explained a little
> differently: Writes to the comparator with SETVAL clear always
> update the "last written" value, i.e. the increment to be used
> once the main counter equals the comparator. Writes with SETVAL set
> update the comparator itself. (Assuming that's how it really is, of
> course, but that's what I infer from the doc available.)

By default, all writes update both the accumulator and comparator registers.

Except for the 2nd write of CMP following a write of SETVAL, where only
the accumulator is updated, and the comparator retains its old value.


I can only guess at the horrors for the internals of the HPET to make
this work.

SETVAL is RAZ so can't be observed in the CFG register.  To get the
observed semantics, every write to CMG will have to copy the SEVAL bit
from CFG to a local flipflop, and on the falling edge of bit, forgo the
comparator update.

This also smells of being a "least invasive fix" at a late point in
development.

> Linux additionally puts udelay(1) between the two writes. Do you
> think we're fine without such, on all platforms?

HPETs substantially predate 64bit capable systems.

There is no spec requirement for a pause, and there is a good chance
that implementation bugs like that were shaken out back in the 32bit days.

>> +/* Restart the main counter, and legacy mode. */
>> +hpet_write32(hpet_cfg | HPET_CFG_ENABLE | HPET_CFG_LEGACY, 
>> HPET_CFG);
> This isn't necessarily "restart" - you may start the counter for
> the first time. Hence in the comment maybe "(Re)start ..."?

Well - it is strictly a restart because of how the logic is laid out,
and even if that weren't the case, "restart" is fine to use in this context.

> As to the spurious IRQs, does it perhaps matter at which point
> CFG_LEGACY gets set? We could try setting it when clearing
> CFG_ENABLE further up, or we could also try two separate writes
> each setting just one of the bits. (At least I can't deduce
> from the spec that we ought to be prepared to observe spurious
> IRQs

Re: [PATCH for <= 5.4] xen/privcmd: allow fetching resource sizes

2021-01-26 Thread Andrew Cooper

On 20/01/2021 11:03, Greg KH wrote:
> On Mon, Jan 18, 2021 at 03:04:26PM +0100, Roger Pau Monne wrote:
>> commit ef3a575baf53571dc405ee4028e26f50856898e7 upstream
>>
>> Allow issuing an IOCTL_PRIVCMD_MMAP_RESOURCE ioctl with num = 0 and
>> addr = 0 in order to fetch the size of a specific resource.
>>
>> Add a shortcut to the default map resource path, since fetching the
>> size requires no address to be passed in, and thus no VMA to setup.
>>
>> This is missing from the initial implementation, and causes issues
>> when mapping resources that don't have fixed or known sizes.
>>
>> Signed-off-by: Roger Pau Monné 
>> Reviewed-by: Juergen Gross 
>> Tested-by: Andrew Cooper 
>> Cc: sta...@vger.kernel.org # >= 4.18
>> Link: https://lore.kernel.org/r/20210112115358.23346-1-roger@citrix.com
>> Signed-off-by: Juergen Gross 
>> ---
>> Cc: Boris Ostrovsky 
>> Cc: xen-devel@lists.xenproject.org
>> ---
>>  drivers/xen/privcmd.c | 25 +++--
>>  1 file changed, 19 insertions(+), 6 deletions(-)
> Now queued up, thanks.

Hello,

The upstream version of this patch was queued against 5.4 and 4.19, both
of which suffered a patch conflict, and are fixed by this version.

Was it an oversight that this version didn't get queued for 4.19?

Thanks,

~Andrew

Re: [PATCH v3 0/6] x86/intr: HVM guest interrupt handling fixes/cleanup

2021-01-26 Thread Jan Beulich

Ian,

On 26.01.2021 14:45, Roger Pau Monne wrote:
> The following series aims to fix some shortcomings of guest interrupt
> handling when using passthrough devices. The first 5 patches are
> bugfixes or cleanups, which I think should solve the issue(s) that
> caused the dpci EOI timer to be introduced. However neither me nor
> others seem to be able to reproduce the original issue, so it's hard to
> tell.
> 
> It's my opinion that we should remove the timer and see what explodes
> (if anything). That's the only way we will be able to figure out what
> the original issue was, and how to fix it without introducing yet
> another per-guest-irq related timer.
> 
> Thanks, Roger.
> 
> Roger Pau Monne (6):
>   x86/vioapic: top word redir entry writes don't trigger interrupts
>   x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
>   x86/vpic: force int output to low when in init mode
>   x86/vpic: don't trigger unmask event until end of init
>   x86/vpic: issue dpci EOI for cleared pins at ICW1
>   x86/dpci: remove the dpci EOI timer

while half of this series was still submitted in time, I'd still
like to raise the question of including part or all of it in
4.15. In particular the last change is one which I would prefer
to see happen early in a release cycle. Risk assessment is
pretty difficult, I'm afraid (Roger can correct me here), as at
least some of what gets adjusted are cases we don't normally
expect to be exercised. (FAOD patch 5 is still pending a R-b
tag.)

Roger, if you could give your own judgement on which of the
changes you would view as more or less clear 4.15 candidates,
this may help Ian take a decision.

Jan

Re: [PATCH] libs/gnttab: implement on NetBSD

2021-01-26 Thread Manuel Bouyer

On Mon, Jan 18, 2021 at 06:54:11PM +0100, Roger Pau Monné wrote:
> On Tue, Jan 12, 2021 at 07:12:32PM +0100, Manuel Bouyer wrote:
> > From: Manuel Bouyer 
> > 
> > Implement gnttab interface on NetBSD.
> > The kernel interface is different from FreeBSD so we can't use the FreeBSD
> > version
> 
> Since I'm not familiar with the NetBSD interface I can provide much
> feedback, but you have some hard tabs in the code below which should
> be removed.
> 
> Maybe you would like to be added as a maintainer for the tools NetBSD
> files?

Yes, please.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: Null scheduler and vwfi native problem

2021-01-26 Thread Anders Törnqvist


On 1/25/21 5:11 PM, Dario Faggioli wrote:

On Fri, 2021-01-22 at 14:26 +, Julien Grall wrote:

Hi Anders,

On 22/01/2021 08:06, Anders Törnqvist wrote:

On 1/22/21 12:35 AM, Dario Faggioli wrote:

On Thu, 2021-01-21 at 19:40 +, Julien Grall wrote:

- booting with "sched=null vwfi=native" but not doing the IRQ
passthrough that you mentioned above
"xl destroy" gives
(XEN) End of domain_destroy function

Then a "xl create" says nothing but the domain has not started
correct.
"xl list" look like this for the domain:
mydomu   2   512 1 --
0.0

This is odd. I would have expected ``xl create`` to fail if something
went wrong with the domain creation.


So, Anders, would it be possible to issue a:

# xl debug-keys r
# xl dmesg

And send it to us ?

Ideally, you'd do it:
  - with Julien's patch (the one he sent the other day, and that you
have already given a try to) applied
  - while you are in the state above, i.e., after having tried to
destroy a domain and failing
  - and maybe again after having tried to start a new domain

Here are some logs.

The system is booted as before with the patch and the domu config does 
not have the IRQs.



# xl list
Name    ID   Mem VCPUs State    Time(s)
Domain-0 0  3000 5 r- 820.1
mydomu   1   511 1 r- 157.0

# xl debug-keys r
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=191793008000
(XEN) Online Cpus: 0-5
(XEN) Cpupool 0:
(XEN) Cpus: 0-5
(XEN) Scheduler: null Scheduler (null)
(XEN)     cpus_free =
(XEN) Domain info:
(XEN)     Domain: 0
(XEN)       1: [0.0] pcpu=0
(XEN)       2: [0.1] pcpu=1
(XEN)       3: [0.2] pcpu=2
(XEN)       4: [0.3] pcpu=3
(XEN)       5: [0.4] pcpu=4
(XEN)     Domain: 1
(XEN)       6: [1.0] pcpu=5
(XEN) Waitqueue:
(XEN) CPUs info:
(XEN) CPU[00] sibling={0}, core={0}, unit=d0v0
(XEN)     run: [0.0] pcpu=0
(XEN) CPU[01] sibling={1}, core={1}, unit=d0v1
(XEN)     run: [0.1] pcpu=1
(XEN) CPU[02] sibling={2}, core={2}, unit=d0v2
(XEN)     run: [0.2] pcpu=2
(XEN) CPU[03] sibling={3}, core={3}, unit=d0v3
(XEN)     run: [0.3] pcpu=3
(XEN) CPU[04] sibling={4}, core={4}, unit=d0v4
(XEN)     run: [0.4] pcpu=4
(XEN) CPU[05] sibling={5}, core={5}, unit=d1v0
(XEN)     run: [1.0] pcpu=5

# xl dmesg
(XEN) Checking for initrd in /chosen
(XEN) RAM: 8020 - 
(XEN) RAM: 00088000 - 0008
(XEN)
(XEN) MODULE[0]: 8040 - 8054d848 Xen
(XEN) MODULE[1]: 8300 - 83018000 Device Tree
(XEN) MODULE[2]: 8800 - 89701200 Kernel
(XEN)  RESVD[0]: 8800 - 9000
(XEN)  RESVD[1]: 8300 - 83018000
(XEN)  RESVD[2]: 8400 - 85ff
(XEN)  RESVD[3]: 8600 - 863f
(XEN)  RESVD[4]: 9000 - 903f
(XEN)  RESVD[5]: 9040 - 91ff
(XEN)  RESVD[6]: 9200 - 921f
(XEN)  RESVD[7]: 9220 - 923f
(XEN)  RESVD[8]: 9240 - 943f
(XEN)  RESVD[9]: 9440 - 94bf
(XEN)
(XEN) CMDLINE[8800]:chosen console=hvc0 earlycon=xen 
root=/dev/mmcblk0p3 mem=3000M hostname=myhost 
video=HDMI-A-1:1920x1080@60 imxdrm.legacyfb_depth=32   quiet loglevel=3 
logo.nologo vt.global_cursor_default=0

(XEN)
(XEN) Command line: console=dtuart dtuart=/serial@5a06 
dom0_mem=3000M dom0_max_vcpus=5 hmp-unsafe=true dom0_vcpus_pin 
sched=null vwfi=native

(XEN) Domain heap initialised
(XEN) Booting using Device Tree
(XEN) partition id 4
(XEN) Domain name mydomu
(XEN) *Initialized MU
(XEN) Looking for dtuart at "/serial@5a06", options ""
 Xen 4.13.1-pre
(XEN) Xen version 4.13.1-pre (anders@builder.local) 
(aarch64-poky-linux-gcc (GCC) 8.3.0) debug=n  Fri Jan 22 17:32:33 UTC 2021

(XEN) Latest ChangeSet: Wed Feb 27 17:56:28 2019 +0800 git:b64b8df-dirty
(XEN) Processor: 410fd034: "ARM Limited", variant: 0x0, part 0xd03, rev 0x4
(XEN) 64-bit Execution:
(XEN)   Processor Features: 0100 
(XEN) Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN) Extensions: FloatingPoint AdvancedSIMD GICv3-SysReg
(XEN)   Debug Features: 10305106 
(XEN)   Auxiliary Features:  
(XEN)   Memory Model Features: 1122 
(XEN)   ISA Features:  00011120 
(XEN) 32-bit Execution:
(XEN)   Processor Features: 0131:10011011
(XEN) Instruction Sets: AArch32 A32 Thumb Thumb-2 Jazelle
(XEN) Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 
(XEN)   Memory Model Features: 10201105 4000 0126 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 8000 KHz

[PATCH] xen: Fix XenStore initialisation for XS_LOCAL

2021-01-26 Thread David Woodhouse

From: David Woodhouse 

In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
I reworked the triggering of xenbus_probe().

I tried to simplify things by taking out the workqueue based startup
triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
handler.

I missed the fact that in the XS_LOCAL case (Dom0 starting its own
xenstored or xenstore-stubdom, which happens after the kernel is booted
completely), that IRQ-based trigger is still actually needed.

So... put it back, except more cleanly. By just spawning a xenbus_probe
thread which waits on xb_waitq and runs the probe the first time it
gets woken, just as the workqueue-based hack did.

This is actually a nicer approach for *all* the back ends with different
interrupt methods, and we can switch them all over to that without the
complex conditions for when to trigger it. But not in -rc6. This is
the minimal fix for the regression, although it's a step in the right
direction instead of doing a partial revert and actually putting the
workqueue back. It's also simpler than the workqueue.

Fixes: 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
Signed-off-by: David Woodhouse 
---
 drivers/xen/xenbus/xenbus_probe.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index c8f0282bb649..18ffd0551b54 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -714,6 +714,23 @@ static bool xs_hvm_defer_init_for_callback(void)
 #endif
 }
 
+static int xenbus_probe_thread(void *unused)
+{
+   DEFINE_WAIT(w);
+
+   /*
+* We actually just want to wait for *any* trigger of xb_waitq,
+* and run xenbus_probe() the moment it occurs.
+*/
+   prepare_to_wait(_waitq, , TASK_INTERRUPTIBLE);
+   schedule();
+   finish_wait(_waitq, );
+
+   DPRINTK("probing");
+   xenbus_probe();
+   return 0;
+}
+
 static int __init xenbus_probe_initcall(void)
 {
/*
@@ -725,6 +742,20 @@ static int __init xenbus_probe_initcall(void)
 !xs_hvm_defer_init_for_callback()))
xenbus_probe();
 
+   /*
+* For XS_LOCAL, spawn a thread which will wait for xenstored
+* or a xenstore-stubdom to be started, then probe. It will be
+* triggered when communication starts happening, by waiting
+* on xb_waitq.
+*/
+   if (xen_store_domain_type == XS_LOCAL) {
+   struct task_struct *probe_task;
+
+   probe_task = kthread_run(xenbus_probe_thread, NULL,
+"xenbus_probe");
+   if (IS_ERR(probe_task))
+   return PTR_ERR(probe_task);
+   }
return 0;
 }
 device_initcall(xenbus_probe_initcall);
-- 
2.29.2



smime.p7s
Description: S/MIME cryptographic signature

Re: [PATCH v3 5/6] x86/vpic: issue dpci EOI for cleared pins at ICW1

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:45, Roger Pau Monne wrote:
> When pins are cleared from either ISR or IRR as part of the
> initialization sequence forward the clearing of those pins to the dpci
> EOI handler, as it is equivalent to an EOI. Not doing so can bring the
> interrupt controller state out of sync with the dpci handling logic,
> that expects a notification when a pin has been EOI'ed.
> 
> Fixes: 7b3cb5e5416 ('IRQ injection changes for HVM PCI passthru.')
> Signed-off-by: Roger Pau Monné 

As said before, under the assumption that the clearing of IRR
and ISR that we do is correct, I agree with the change. I'd
like to give it some time though before giving my R-b here, for
the inquiry to hopefully get answered. After all there's still
the possibility of us needing to instead squash that clearing
(which then would seem to result in getting things in sync the
other way around).

Jan

> --- a/xen/arch/x86/hvm/vpic.c
> +++ b/xen/arch/x86/hvm/vpic.c
> @@ -197,6 +197,8 @@ static void vpic_ioport_write(
>  {
>  if ( val & 0x10 )
>  {
> +unsigned int pending = vpic->isr | (vpic->irr & ~vpic->elcr);
> +
>  /* ICW1 */
>  /* Clear edge-sensing logic. */
>  vpic->irr &= vpic->elcr;
> @@ -220,6 +222,24 @@ static void vpic_ioport_write(
>  }
>  
>  vpic->init_state = ((val & 3) << 2) | 1;
> +vpic_update_int_output(vpic);
> +vpic_unlock(vpic);
> +
> +/*
> + * Forward the EOI of any pending or in service interrupt that 
> has
> + * been cleared from IRR or ISR, or else the dpci logic will get
> + * out of sync with the state of the interrupt controller.
> + */
> +while ( pending )
> +{
> +unsigned int pin = __scanbit(pending, 8);
> +
> +ASSERT(pin < 8);
> +hvm_dpci_eoi(current->domain,
> + hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : 
> pin));
> +__clear_bit(pin, );
> +}
> +return;
>  }
>  else if ( val & 0x08 )
>  {
>

Re: [PATCH] x86/PV: use 64-bit subtract to adjust guest RIP upon missing SYSCALL callbacks

2021-01-26 Thread Andrew Cooper

On 26/01/2021 16:31, Jan Beulich wrote:
> When discussing the shrunk down version of the commit in question it
> was said (in reply to my conditional choosing of the width):
>
> "However, the 32bit case isn't actually interesting here.  A
>  guest can't execute a SYSCALL instruction on/across the 4G->0 boundary
>  because the M2P is mapped NX up to the 4G boundary, so we can never
>  reach this point with %eip < 2.
>
>  Therefore, the 64bit-only form is the appropriate one to use, which
>  solves any question of cleverness, or potential decode stalls it
>  causes."
>
> Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks")
> Signed-off-by: Jan Beulich 

Crap.  I folded the fix into my wrong tree.  Sorry.

Reviewed-by: Andrew Cooper 

~Andrew

Re: [PATCH v3 4/6] x86/vpic: don't trigger unmask event until end of init

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:45, Roger Pau Monne wrote:
> Wait until the end of the init sequence to trigger the unmask event.
> Note that it will be unconditionally triggered, but that's harmless if
> not unmask actually happened.
> 
> While there change the variable type to bool.
> 
> Requested-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 

Acked-by: Jan Beulich

Re: [PATCH v3 3/6] x86/vpic: force int output to low when in init mode

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:45, Roger Pau Monne wrote:
> When the PIC is on the init sequence prevent interrupt delivery. The
> state of the registers is in the process of being set during the init
> phase, so it makes sense to prevent any int line changes during that
> process.
> 
> Requested-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 

Acked-by: Jan Beulich 

I guess I'll change the other tag in this case to Suggested-by
though, should I end up committing this.

Jan

Re: [PATCH] NetBSD hotplug: handle case where vifname is not present

2021-01-26 Thread Manuel Bouyer

On Fri, Jan 15, 2021 at 05:06:59PM +0100, Roger Pau Monné wrote:
> On Tue, Jan 12, 2021 at 07:12:25PM +0100, Manuel Bouyer wrote:
> > From: Manuel Bouyer 
> > 
> > Some Xen version didn't set the vifname in xenstore; just build one if
> > not present.
> 
> I think the current version (what's going to become 4.15) should write
> the vifname in all cases? If not that's likely an error that we should
> fix elsewhere IMO.
> 
> Can you check whether the current version has this error present? And
> whether it affects PV or HVM guests?

Yes, in recent Xen version this doesn't seem to be needed any more.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: [PATCH] NetBSD hotplug: fix block unconfigure on destroy

2021-01-26 Thread Manuel Bouyer

On Fri, Jan 15, 2021 at 04:27:12PM +0100, Roger Pau Monné wrote:
> On Tue, Jan 12, 2021 at 07:12:24PM +0100, Manuel Bouyer wrote:
> > From: Manuel Bouyer 
> > 
> > When a domain is destroyed, xparams may not be available any more when
> > the block script is called to unconfigure the vnd.
> > Check xparam only at configure time, and just unconfigure any vnd present
> > in the xenstore.
> 
> The patch itself seems fine to me, there's no need to fetch params
> for unplug, you can just reply on the vnd node.
> 
> I'm however not sure why params would be deleted from xenstore but not
> vnd, do you know why?

I'm not sure, it happens on xl destroy, when the kernel in the
domU is stuck (so it won't shutdown properly).

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this

2021-01-26 Thread Jan Beulich

On 26.01.2021 17:02, Boris Ostrovsky wrote:
> On 21-01-26 10:05:47, Jan Beulich wrote:
>> On 25.01.2021 19:42, Boris Ostrovsky wrote:
>>> On 21-01-25 11:22:08, Jan Beulich wrote:
 On 22.01.2021 20:52, Boris Ostrovsky wrote:
> "Sibling" in terms of name (yes, it would be) or something else?

 Name and (possible) purpose - a validate hook could want to
 make use of this, for example.
>>>
>>> A validate hook? 
>>
>> Quoting from struct x86_emulate_ops:
>>
>> /*
>>  * validate: Post-decode, pre-emulate hook to allow caller controlled
>>  * filtering.
>>  */
>> int (*validate)(
>> const struct x86_emulate_state *state,
>> struct x86_emulate_ctxt *ctxt);
>>
>> Granted to be directly usable the function would need to have a
>> "state" parameter. As that's unused, having it have one and
>> passing NULL in your case might be acceptable. But I also could
>> see arguments towards this not being a good idea.
> 
> I see. Where would we need to call this hook though? We are never directly
> emulating MSR access (compared to, for example, CR accesses where
> x86_insn_is_cr_access is used). And for PV we already validate it in
> emul-priv-op.c():validate().

If you look at some of the functions used for this hook, you may
realize that what your function does could also fit a hypothetical
future hook. Hence I was suggesting to make the new function
suitable for such use right away (and in particular fit their
naming scheme). But it looks like this has ended up more confusing
than anything else, so never mind.

Jan

[PATCH] x86/PV: use 64-bit subtract to adjust guest RIP upon missing SYSCALL callbacks

2021-01-26 Thread Jan Beulich

When discussing the shrunk down version of the commit in question it
was said (in reply to my conditional choosing of the width):

"However, the 32bit case isn't actually interesting here.  A
 guest can't execute a SYSCALL instruction on/across the 4G->0 boundary
 because the M2P is mapped NX up to the 4G boundary, so we can never
 reach this point with %eip < 2.

 Therefore, the 64bit-only form is the appropriate one to use, which
 solves any question of cleverness, or potential decode stalls it
 causes."

Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks")
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -42,7 +42,7 @@ ENTRY(switch_to_kernel)
 UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */
 mov   VCPU_trap_ctxt(%rbx), %rdi
 movl  $X86_EXC_UD, UREGS_entry_vector(%rsp)
-subl  $2, UREGS_rip(%rsp)
+subq  $2, UREGS_rip(%rsp)
 mov   X86_EXC_UD * TRAPINFO_sizeof + TRAPINFO_eip(%rdi), %rax
 testb $4, X86_EXC_UD * TRAPINFO_sizeof + TRAPINFO_flags(%rdi)
 setnz %cl

Re: [PATCH 2/3] x86/irq: don't disable domain MSI IRQ on force unbind

2021-01-26 Thread Roger Pau Monné

On Tue, Jan 26, 2021 at 03:52:54PM +0100, Jan Beulich wrote:
> On 26.01.2021 12:06, Roger Pau Monne wrote:
> > When force unbinding a MSI the used IRQ would get added to the domain
> > irq-pirq tree as -irq -pirq,
> 
> I think it's -pirq at index irq, i.e. I don't think IRQ gets
> negated as far as the radix tree goes. info->arch.irq gets a
> negative value stored, yes.

Right, and this then prevents the IRQ to be used at all by the domain.
Doiong a domain_irq_to_pirq with that IRQ will get -pirq, but that
seems pretty arbitrary for MSI IRQs, that get allocated on demand.

At the end of unmap_domain_pirq the IRQ will get freed if it was
assigned to an MSI source, and hence it seem pointless to add irq ->
-pirq to the domain irq tree.

> > thus preventing the same IRQ to be used by the domain.
> 
> Iirc this (answering your post-commit-message question here)
> is for cleaning up _after_ the domain, i.e. there's no goal
> to allow re-use of this IRQ. The comment ahead of
> unmap_domain_pirq() validly says "The pirq should have been
> unbound before this call." The only time we can't make
> ourselves dependent upon this is when the guest is being
> cleaned up. During normal operation I think we actually
> _want_ to enforce correct behavior of the guest here.

OK, so that might be fine for legacy PCI IRQs, that are fixed, but
quite pointless for allocated on demand MSI IRQs that can change
between allocations.

> > It's not clear to me why we add the irq as -irq -pirq to the irq-pirq
> > domain tree on forced unbind, as allowing to bind the irq again should
> > just work regardless of whether it has been previously forcefully
> > unbound?
> 
> To continue from the above, see pirq_guest_unbind() where
> we have
> 
> if ( desc == NULL )
> {
> irq = -pirq->arch.irq;
> BUG_ON(irq <= 0);
> desc = irq_to_desc(irq);
> spin_lock_irq(>lock);
> clear_domain_irq_pirq(d, irq, pirq);
> }
> 
> as the alternative to going the normal path through
> __pirq_guest_unbind(). Again iirc that's to cover for the
> unbind request arriving after the unmap one (i.e. having
> caused the unmap to force-unbind the IRQ).

Oh, so that's the point. Do you think you could add some comments to
explain the indented behaviour? I think I get it now, but it's hard to
follow without some pointers.

Thanks, Roger.

Re: [PATCH 1/3] x86/irq: remove duplicated clear_domain_irq_pirq calls

2021-01-26 Thread Jan Beulich

On 26.01.2021 17:08, Roger Pau Monné wrote:
> On Tue, Jan 26, 2021 at 03:38:27PM +0100, Jan Beulich wrote:
>> On 26.01.2021 12:06, Roger Pau Monne wrote:
>>> There are two duplicated calls to cleanup_domain_irq_pirq in
>>> unmap_domain_pirq.
>>>
>>> The first one in the for loop will be called with exactly the same
>>> arguments as the call placed closer to the loop start.
>>
>> I'm having trouble figuring out which two instances you refer
>> to: To me "first one" and "closer to the start" are two ways
>> of expressing the same thing. You don't refer to the call to
>> clear_domain_irq_pirq(), do you, when the two calls you
>> remove are to cleanup_domain_irq_pirq()? Admittedly quite
>> similar names, but entirely different functions.
> 
> Urg, yes, that's impossible to parse sensibly, sorry.
> 
> Also the subject is wrong, should be cleanup_domain_irq_pirq, not
> clear_domain_irq_pirq.
> 
> This should instead be:
> 
> "There are two duplicated calls to cleanup_domain_irq_pirq in
> unmap_domain_pirq.
> 
> The first removed call to cleanup_domain_irq_pirq will be called with
> exactly the same arguments as the previous call placed above it."

And which one is "the previous call"? I'm still getting the
impression you're mixing up cleanup_domain_irq_pirq() and
clear_domain_irq_pirq(). (I guess we need to resort to line
numbers in current staging or master, to avoid
misunderstandings. Not for the commit message [if any], but
for the discussion.)

> It's hard to explain this in a commit message.
> 
> Do you agree that the removed calls are duplicated though? I might have
> as well missed part of the logic here and be wrong about this.

No, for the moment I don't agree yet, because I don't see
the redundancy so far.

>>> The logic used in the loop seems extremely complex to follow IMO,
>>> there are several breaks for exiting the loop, and the index (i) is
>>> also updated in different places.
>>
>> Indeed, and it didn't feel well already back at the time when
>> I much extended this to support multi-vector MSI. I simply
>> didn't have any good idea how to streamline all of this
>> (short of rewriting it altogether, which would have made
>> patch review quite difficult). If you have thoughts, I'm all
>> ears.
> 
> I would just rewrite it altogether. Code like this is very prone to
> cause mistakes in the future IMO. If you want I can try to.

I wouldn't mind, but yes, besides review difficulties ...

> I guess the problem with this is that we would lose the history of the
> code for no functional change.

... this also wouldn't be overly nice (but can be dealt with
via a few more steps wading through git history).

Jan

[xen-unstable-smoke test] 158639: tolerable all pass - PUSHED

2021-01-26 Thread osstest service owner

flight 158639 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158639/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c7b0f25e8f86373ed54e1c446f8e67ce25ac6819
baseline version:
 xen  ca6fcf4321b31df0b50720fa817e727b16e34f76

Last test of basis   158627  2021-01-26 11:00:29 Z0 days
Testing same since   158639  2021-01-26 14:00:28 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monne 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   ca6fcf4321..c7b0f25e8f  c7b0f25e8f86373ed54e1c446f8e67ce25ac6819 -> smoke

Re: [PATCH 1/3] x86/irq: remove duplicated clear_domain_irq_pirq calls

2021-01-26 Thread Roger Pau Monné

On Tue, Jan 26, 2021 at 03:38:27PM +0100, Jan Beulich wrote:
> On 26.01.2021 12:06, Roger Pau Monne wrote:
> > There are two duplicated calls to cleanup_domain_irq_pirq in
> > unmap_domain_pirq.
> > 
> > The first one in the for loop will be called with exactly the same
> > arguments as the call placed closer to the loop start.
> 
> I'm having trouble figuring out which two instances you refer
> to: To me "first one" and "closer to the start" are two ways
> of expressing the same thing. You don't refer to the call to
> clear_domain_irq_pirq(), do you, when the two calls you
> remove are to cleanup_domain_irq_pirq()? Admittedly quite
> similar names, but entirely different functions.

Urg, yes, that's impossible to parse sensibly, sorry.

Also the subject is wrong, should be cleanup_domain_irq_pirq, not
clear_domain_irq_pirq.

This should instead be:

"There are two duplicated calls to cleanup_domain_irq_pirq in
unmap_domain_pirq.

The first removed call to cleanup_domain_irq_pirq will be called with
exactly the same arguments as the previous call placed above it."

It's hard to explain this in a commit message.

Do you agree that the removed calls are duplicated though? I might have
as well missed part of the logic here and be wrong about this.

> > The logic used in the loop seems extremely complex to follow IMO,
> > there are several breaks for exiting the loop, and the index (i) is
> > also updated in different places.
> 
> Indeed, and it didn't feel well already back at the time when
> I much extended this to support multi-vector MSI. I simply
> didn't have any good idea how to streamline all of this
> (short of rewriting it altogether, which would have made
> patch review quite difficult). If you have thoughts, I'm all
> ears.

I would just rewrite it altogether. Code like this is very prone to
cause mistakes in the future IMO. If you want I can try to.

I guess the problem with this is that we would lose the history of the
code for no functional change.

Thanks, Roger.

Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this

2021-01-26 Thread Boris Ostrovsky

On 21-01-26 10:05:47, Jan Beulich wrote:
> On 25.01.2021 19:42, Boris Ostrovsky wrote:
> > On 21-01-25 11:22:08, Jan Beulich wrote:
> >> On 22.01.2021 20:52, Boris Ostrovsky wrote:
> >>> On 1/22/21 7:51 AM, Jan Beulich wrote:
>  On 20.01.2021 23:49, Boris Ostrovsky wrote:
> > +
> > +/*
> > + * Accesses to unimplemented MSRs as part of emulation of 
> > instructions
> > + * other than guest's RDMSR/WRMSR should never succeed.
> > + */
> > +if ( !is_guest_msr_access )
> > +ignore_msrs = MSR_UNHANDLED_NEVER;
> 
>  Wouldn't you better "return true" here? Such accesses also
>  shouldn't be logged imo (albeit I agree that's a change from
>  current behavior).
> >>>
> >>>
> >>> Yes, that's why I didn't return here. We will be here in 
> >>> !is_guest_msr_access case most likely due to a bug in the emulator so I 
> >>> think we do want to see the error logged.
> >>
> >> Why "most likely"?
> > 
> > 
> > OK, definitely ;-)
> 
> Oops - I was thinking the other way around, considering such
> to possibly be legitimate. It just so happens that curently
> we have no such path.

I was imagining a case where we'd add have another "ops->read_msr(_regs.ecx, 
...)"
in the emulator and some values of ecx may not be tested. (Or even passing
an explicit index that is not handled, although that is highly unlikely to
pass code review).

Yes, these cases do not currently exist. 

> 
> > But I still think logging these accesses would be helpful.
> 
> Because of the above I continue to question this.
> 
> > +if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
> > +*val = 0;
> 
>  I don't understand the conditional here, even more so with
>  the respective changelog entry. In any event you don't
>  want to clobber the value ahead of ...
> 
> > +if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
> > +{
> > +if ( is_write )
> > +gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
> > +" unimplemented\n", msr, *val);
> 
>  ... logging it.
> >>>
> >>>
> >>> True. I dropped !is_write from v1 without considering this.
> >>>
> >>> As far as the conditional --- dropping it too would be a behavior change. 
> >>
> >> Albeit an intentional one then? Plus I think I have trouble
> >> seeing what behavior it would be that would change.
> > 
> > 
> > Currently callers of, say, read_msr() don't expect the argument that they 
> > pass in to change. Granted, they shouldn't (and AFAICS don't) look at it 
> > but it's a change nonetheless.
> 
> Hmm, I'm confused: The purpose of read_msr() is to change the
> value pointed at by the passed in argument.

Only if MSR exists/handled. We add parameters that allow it to be set
to zero for unhandled case but default (which is existing behavior, i.e.
MSR_UNHANDLED_NEVER) would leave the (pointed to) argument unchanged.

>  And for write_msr()
> the users of the hook pass the argument by value, i.e. wouldn't
> observe the changed value (it would only possibly be
> intermediate layers which might observe the change, but those
> ought to not care).

Yes, this would only be relevant to reads but since I dropped is_write check
the conditional covers both reads and writes.

In any case, this (and the one above) are minor issues and I don't mind
making the change.

> 
> > --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> > +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> > @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct 
> > x86_emulate_ctxt *ctxt)
> >  ctxt->event = (struct x86_event){};
> >  }
> >  
> > +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt 
> > *ctxt)
> 
>  The parameter wants to be pointer-to-const. In addition I wonder
>  whether this wouldn't better be a sibling to
>  x86_insn_is_cr_access() (without a "state" parameter, which
>  would be unused and unavailable to the callers), which may end
>  up finding further uses down the road.
> >>>
> >>>
> >>> "Sibling" in terms of name (yes, it would be) or something else?
> >>
> >> Name and (possible) purpose - a validate hook could want to
> >> make use of this, for example.
> > 
> > A validate hook? 
> 
> Quoting from struct x86_emulate_ops:
> 
> /*
>  * validate: Post-decode, pre-emulate hook to allow caller controlled
>  * filtering.
>  */
> int (*validate)(
> const struct x86_emulate_state *state,
> struct x86_emulate_ctxt *ctxt);
> 
> Granted to be directly usable the function would need to have a
> "state" parameter. As that's unused, having it have one and
> passing NULL in your case might be acceptable. But I also could
> see arguments towards this not being a good idea.

I see. Where would we need to call this hook though? We are never directly
emulating MSR access (compared to, for example, CR accesses where

Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool

2021-01-26 Thread Andrew Cooper

On 26/01/2021 13:32, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace 
> tool"):
>> On 26/01/2021 11:59, Ian Jackson wrote:
>>> [Andrew:]
 This is example code, not a production utility.
>>> Perhaps the utility could print some kind of health warning about it
>>> possibly leaving this perf-impacting feature enabled, and how to clean
>>> up ?
>> Why?  The theoretical fallout here is not conceptually different to
>> libxl or qemu segfaulting, or any of the myriad other random utilities
>> we have.
>>
>> Printing "Warning - this program, just like everything else in the Xen
>> tree, might in exceptional circumstances segfault and leave the domain
>> in a weird state" is obvious, and doesn't need stating.
>>
>> The domain is stuffed. `xl destroy` may or may not make the problem go away.
> Firstly, I don't agree with this pessimistic analysis of our current
> tooling.  Secondly, I would consider many such behaviours bugs;
> certainly we have bugs but we shouldn't introduce more of them.
>
> You are justifying the poor robustness of this tool on the grounds
> that it's "example code, not a production utility".
>
> But we are shipping it to bin/ and there is nothing telling anyone
> that trying to use it (perhaps wrapped in something of their own
> devising) is a bad idea.
>
> Either this is code users might be expected to run in production in
> which we need to make it at least have a minimal level of engineering
> robustness (which is perhaps too difficult at this stage), or we need
> to communicate to our users that it's a programming example, not a
> useful utility.
>
> Note that *even if it is a programming example*, we should highlight
> its most important deficiencies.  Otherwise it is a hazardously
> misleading example.
>
> I hope that makes sense.

First of all - I'm not the author of this code.  I'm merely the person
who did the latest round of cleanup, and sent the series.

This code is a damn sight more robust than the other utilities, because
in the case that something does go wrong, the domain will still function
correctly.  Almost everything else, qemu included, will leave the VM
totally broken, as it becomes paused waiting on a request which has
escaped into the ether.


I also don't feel that now is an appropriate time to be insisting that
the goalpost's move when it comes to submissions into tools/,
particularly as you were happy (well - didn't comment on) with this
pattern back in v3.

What exact colour do you want this bikeshed?  Anything non-trivial is
going to miss 4.15.

~Andrew

Re: [PATCH 3/3] x86/vmsi: tolerate unsupported MSI address/data fields

2021-01-26 Thread Roger Pau Monné

On Tue, Jan 26, 2021 at 04:17:59PM +0100, Jan Beulich wrote:
> On 26.01.2021 12:06, Roger Pau Monne wrote:
> > Plain MSI doesn't allow caching the MSI address and data fields while
> > the capability is enabled and not masked, hence we need to allow any
> > changes to those fields to update the binding of the interrupt. For
> > reference, the same doesn't apply to MSI-X that is allowed to cache
> > the data and address fields while the entry is unmasked, see section
> > 6.8.3.5 of the PCI Local Bus Specification 3.0.
> > 
> > Allowing such updates means that a guest can write an invalid address
> > (ie: all zeros) and then a valid one, so the PIRQs shouldn't be
> > unmapped when the interrupt cannot be bound to the guest, since
> > further updates to the address or data fields can result in the
> > binding succeeding.
> 
> IOW the breakage from the other patch was because rubbish was
> written first, and suitable data was written later on? I didn't
> think core PCI code in Linux would do such, which would make me
> suspect a driver having custom MSI handling code ...

So it seems that specific Linux driver will write 0s to the address
field at some point during initialization, but it also doesn't end up
using MSI interrupts anyway, so I assume it's somehow broken. FTR it's
the snd_hda_intel driver.

However it seems like Linux likes to zero all addresses fields on
shutdown for MSI (not MSI-X) with the capability enabled, and I do
see:

vmsi.c:688:d0v2 :00:1c.3: PIRQ 643: unsupported address 0
vmsi.c:688:d0v2 :00:1c.3: PIRQ 643: unsupported address 0
vmsi.c:688:d0v2 :00:1c.0: PIRQ 644: unsupported address 0
vmsi.c:688:d0v2 :00:1c.0: PIRQ 644: unsupported address 0
vmsi.c:688:d0v2 :00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 :00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 :00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 :00:01.2: PIRQ 645: unsupported address 0
vmsi.c:688:d0v2 :00:01.2: PIRQ 645: unsupported address 0
vmsi.c:688:d0v2 :00:01.1: PIRQ 646: unsupported address 0
vmsi.c:688:d0v2 :00:01.1: PIRQ 646: unsupported address 0
vmsi.c:688:d0v2 :00:01.0: PIRQ 647: unsupported address 0
vmsi.c:688:d0v2 :00:01.0: PIRQ 647: unsupported address 0

When dom0 is shutting down. That's with the 5.4 kernel, maybe other
versions won't do it.

> > Modify the vPCI MSI arch helpers to track whether the interrupt is
> > bound, and make failures in vpci_msi_update not unmap the PIRQ, so
> > that further calls can attempt to bind the PIRQ again.
> > 
> > Note this requires some modifications to the MSI-X handlers, but there
> > shouldn't be any functional changes in that area.
> > 
> > Signed-off-by: Roger Pau Monné 
> 
> Reviewed-by: Jan Beulich 
> 
> Am I understanding correctly that this change is independent of
> the initial 2 patches (where I have reservations), and hence
> it could go in ahead of them (or all alone)?

Yes, it's fully independent.

Thanks, Roger.

Re: [PATCH] tools/libs: honor build dependencies for recently moved subdirs

2021-01-26 Thread Wei Liu

On Fri, Jan 22, 2021 at 04:14:30PM +0100, Jan Beulich wrote:
> While the lack of proper dependency tracking of #include-d files is
> wider than just the libs/ subtree, dealing with the problem universally
> there or  in tools/Rules.mk is too much of a risk at this point in the
> release cycle. Add the missing inclusion of $(DEPS_INCLUDE) only in the
> specific Makefile-s, after having checked that their prior Makefile-s
> had such includes.
> 
> Interestingly the $(DEPS_RM) use is present in tools/libs/libs.mk's
> clean target, so doesn't need taking care of in individual Makefile-s.
> 
> Signed-off-by: Jan Beulich 
> Release-acked-by: Ian Jackson 

Acked-by: Wei Liu

Re: [PATCH v3 2/6] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:45, Roger Pau Monne wrote:
> When an IO-APIC pin is switched from level to edge trigger mode the
> IRR bit is cleared, so it can be used as a way to EOI an interrupt at
> the IO-APIC level.
> 
> Such EOI however does not get forwarded to the dpci code like it's
> done for the local APIC initiated EOI. This change adds the code in
> order to notify dpci of such EOI, so that dpci and the interrupt
> controller are in sync.
> 
> Signed-off-by: Roger Pau Monné 

Reviewed-by: Jan Beulich

Re: [PATCH v3 1/6] x86/vioapic: top word redir entry writes don't trigger interrupts

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:45, Roger Pau Monne wrote:
> Top word writes just update the destination of the interrupt, but
> since there's no change on the masking or the triggering mode no
> guest interrupt injection can result of such write. Add an assert to
> that effect.
> 
> Requested-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 

Acked-by: Jan Beulich

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:17, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce 
> UNSUPPORTED"):
>> On 26.01.2021 12:17, Bertrand Marquis wrote:
>>> Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
>>> config parameters instead ?
>>> We could also make that more clear in the help of such parameters directly.
>>>
>>> I do not see how we could make that more clear directly in the prompt (as
>>> making it too long is not a good solution).
>>
>> My main request is that such tags be added only if there's
>> absolutely no ambiguity. Anything else (requiring longer
>> explanations in many cases) should be expressed in the help
>> text of the option, or in yet other ways (a referral to
>> SUPPORT.md comes to mind).
> 
> Is
> 
>> +bool "Harden the branch predictor against aliasing attacks 
>> (disabling UNSUPPORTED)" if UNSUPPORTED
> 
> too long ?

One more consideration, beyond the "too long" aspect. To me (as a
non-native speaker) this wording might allow inferring (by people
not knowing kconfig sufficiently well) that selecting this option
will turn off UNSUPPORTED. IOW if anything I'd see us use the yet
slightly longer "... (UNSUPPORTED if disabled)" or some such.

Jan

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Marek Marczykowski-Górecki

On Tue, Jan 26, 2021 at 01:20:14PM +, Bertrand Marquis wrote:
> 
> 
> > On 26 Jan 2021, at 13:19, Jan Beulich  wrote:
> > 
> > On 26.01.2021 14:17, Ian Jackson wrote:
> >> Jan Beulich writes ("Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce 
> >> UNSUPPORTED"):
> >>> On 26.01.2021 12:17, Bertrand Marquis wrote:
>  Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
>  config parameters instead ?
>  We could also make that more clear in the help of such parameters 
>  directly.
>  
>  I do not see how we could make that more clear directly in the prompt (as
>  making it too long is not a good solution).
> >>> 
> >>> My main request is that such tags be added only if there's
> >>> absolutely no ambiguity. Anything else (requiring longer
> >>> explanations in many cases) should be expressed in the help
> >>> text of the option, or in yet other ways (a referral to
> >>> SUPPORT.md comes to mind).
> >> 
> >> Is
> >> 
> >>> + bool "Harden the branch predictor against aliasing attacks 
> >>> (disabling UNSUPPORTED)" if UNSUPPORTED
> >> 
> >> too long ?
> > 
> > It may be tolerable, but it is getting longish imo.
> 
> I am also not quite sure this is making things clearer.

IMO it the original version strongly suggested that _enabling_ the
option is unsupported (and quite confusing why it was enabled by
default). When browsing the menu, it isn't really clear what is the
default value, and even if it would be, it's totally not obvious that the
meaning of "(UNSUPPORTED)" depends on default option value.

So, yes, I think "(disabling UNSUPPORTED)" is significantly better for
such cases.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature

Re: [PATCH 3/3] x86/vmsi: tolerate unsupported MSI address/data fields

2021-01-26 Thread Jan Beulich

On 26.01.2021 12:06, Roger Pau Monne wrote:
> Plain MSI doesn't allow caching the MSI address and data fields while
> the capability is enabled and not masked, hence we need to allow any
> changes to those fields to update the binding of the interrupt. For
> reference, the same doesn't apply to MSI-X that is allowed to cache
> the data and address fields while the entry is unmasked, see section
> 6.8.3.5 of the PCI Local Bus Specification 3.0.
> 
> Allowing such updates means that a guest can write an invalid address
> (ie: all zeros) and then a valid one, so the PIRQs shouldn't be
> unmapped when the interrupt cannot be bound to the guest, since
> further updates to the address or data fields can result in the
> binding succeeding.

IOW the breakage from the other patch was because rubbish was
written first, and suitable data was written later on? I didn't
think core PCI code in Linux would do such, which would make me
suspect a driver having custom MSI handling code ...

> Modify the vPCI MSI arch helpers to track whether the interrupt is
> bound, and make failures in vpci_msi_update not unmap the PIRQ, so
> that further calls can attempt to bind the PIRQ again.
> 
> Note this requires some modifications to the MSI-X handlers, but there
> shouldn't be any functional changes in that area.
> 
> Signed-off-by: Roger Pau Monné 

Reviewed-by: Jan Beulich 

Am I understanding correctly that this change is independent of
the initial 2 patches (where I have reservations), and hence
it could go in ahead of them (or all alone)?

Jan

Re: [PATCH 3/3] x86: Support booting under Secure Startup via SKINIT

2021-01-26 Thread Roger Pau Monné

On Tue, Jan 19, 2021 at 05:23:25PM +, Andrew Cooper wrote:
> On 19/01/2021 15:48, Roger Pau Monné wrote:
> > On Fri, Jan 15, 2021 at 11:10:46PM +, Andrew Cooper wrote:
> >> From: Norbert Kamiński 
> >>
> >> For now, this is simply enough logic to let Xen come up after the 
> >> bootloader
> >> has executed an SKINIT instruction to begin a Secure Startup.
> > Since I know very little about this, I might as well ask. Reading the
> > PM, SKINIT requires a payload, which is an image to boot. That image
> > must be <= 64KB and needs a special header.
> >
> > In case of booting Xen using SKINIT, what would be that payload?
> > (called SLB in the PM).
> 
> The SK/SLB is implemented by https://github.com/TrenchBoot/landing-zone/
> which does all the low level platform stuff.  It is the logical
> equivalent of the Intel-provided TXT ACM which is a blob as far as the
> world is concerned.
> 
> From a "system boot" perspective, the plan is something like this:
> 
> * Grub puts xen/kernel/initrd in memory
> * A final stanza line of "secure_launch" changes the exit of grub to be
> a DRTM DLE (Dynamic Launch Event).
> ** For Intel TXT, this is the SENTER instruction, provided that the
> firmware loaded the TXT ACM properly
> ** For AMD/Hygon, this is additionally loading landing-zone into memory,
> and using the SKINIT instruction.
> * TXT blob, or Landing Zone, do low level things.
> * They enter xen/Linux/other via a common protocol.
> 
> With this patch series in place, Xen's Multiboot2 entrypoint works just
> fine as far as "invoke the next thing" goes.  Linux has a
> work-in-progress extension to their zero-page protocol.
> 
> >> During a Secure Startup, the BSP operates with the GIF clear (blocks all
> >> external interrupts, even SMI/NMI), and INIT_REDIRECTION active (converts 
> >> INIT
> >> IPIs to #SX exceptions, if e.g. the platform needs to scrub secrets before
> >> resetting).  To afford APs the same Secure Startup protections as the BSP, 
> >> the
> >> INIT IPI must be skipped, and SIPI must be the first interrupt seen.
> >>
> >> Full details are available in AMD APM Vol2 15.27 "Secure Startup with 
> >> SKINIT"
> >>
> >> Introduce skinit_enable_intr() and call it from cpu_init(), next to the
> >> enable_nmis() which performs a related function for tboot startups.
> >>
> >> Also introduce ap_boot_method to control the sequence of actions for AP 
> >> boot.
> >>
> >> Signed-off-by: Marek Kasiewicz 
> >> Signed-off-by: Norbert Kamiński 
> >> Signed-off-by: Andrew Cooper 
> >> ---
> >> CC: Jan Beulich 
> >> CC: Roger Pau Monné 
> >> CC: Wei Liu 
> >> CC: Marek Kasiewicz 
> >> CC: Norbert Kamiński 
> >> CC: Michal Zygowski 
> >> CC: Piotr Krol 
> >> CC: Krystian Hebel 
> >> CC: Daniel P. Smith 
> >> CC: Rich Persaud 
> >> CC: Christopher Clark 
> >> ---
> >>  xen/arch/x86/cpu/common.c| 32 
> >>  xen/arch/x86/smpboot.c   | 12 +++-
> >>  xen/include/asm-x86/cpufeature.h |  1 +
> >>  xen/include/asm-x86/msr-index.h  |  1 +
> >>  xen/include/asm-x86/processor.h  |  6 ++
> >>  5 files changed, 51 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> >> index a684519a20..d9a103e721 100644
> >> --- a/xen/arch/x86/cpu/common.c
> >> +++ b/xen/arch/x86/cpu/common.c
> >> @@ -834,6 +834,29 @@ void load_system_tables(void)
> >>BUG_ON(system_state != SYS_STATE_early_boot && (stack_bottom & 0xf));
> >>  }
> >>  
> >> +static void skinit_enable_intr(void)
> > Since this is AFAICT AMD specific code, shouldn't it better be in
> > cpu/amd.c instead of cpu/common.c?
> 
> Hygon will get sad...
> 
> It's deliberately not in AMD-specific code.

Hygon already calls into AMD specific functions in amd.c
(early_init_amd, amd_log_freq), so it wouldn't seem that weird to
place it in amd.c and share with other AMD derivatives. Likely not
worth arguing about.

> >> +{
> >> +  uint64_t val;
> >> +
> >> +  /*
> >> +   * If the platform is performing a Secure Launch via SKINIT
> >> +   * INIT_REDIRECTION flag will be active.
> >> +   */
> >> +  if ( !cpu_has_skinit || rdmsr_safe(MSR_K8_VM_CR, val) ||
> >> +   !(val & VM_CR_INIT_REDIRECTION) )
> > I would add some kind of check here to assert that APs are started in
> > the correct state, ie:
> >
> > BUG_ON(ap_boot_method == AP_BOOT_SKINIT);
> 
> At the moment, it really doesn't matter.  This change is to simply let
> Xen boot.
> 
> Later changes which do the TPM and attestation work is going to need to
> know details like this, but it will be elsewhere on boot, and one
> recovery option is going to be "please just boot and let be get back
> into the system", in which case a crosscheck is not wanted.

I still think printing a message might be helpful to know the AP has
been started in an unexpected state.

> >> +
> >> +  /*
> >> +   * We don't yet handle #SX.  Disable INIT_REDIRECTION first, before
> >> +   * enabling GIF, so a pending INIT resets us, rather than

Re: XSA-332 kernel patch - huge network performance on pfSense VMs

2021-01-26 Thread Samuel Verschelde


Le 18/01/2021 à 11:03, Roger Pau Monné a écrit :

On Fri, Jan 15, 2021 at 03:03:26PM +, Samuel Verschelde wrote:

Hi list,

Another "popular" thread on XCP-ng forum [1], started in october 2020,
allowed us to detect that patch 12 from the XSA-332 advisory [2] had a very
significant impact on network performance in the case of pfSense VMs.

We reproduced the issue internally (well, we reproduced "something". The
user setups in this thread are diverse) and our findings seem to confirm
what the users reported. Running iperf3 from the pfSense VM to a debian VM
gives results around 5 times slower than before. Reverting this single patch
brings the performance back. On the debian to pfSense direction, the drop is
about 25%.


pfSense is based on FreeBSD, so I would bet that whatever performance
degradation you are seeing would also happen with plain FreeBSD. I
would assume netfront in FreeBSD is triggering the ratelimit on Linux,
and hence it gets throttled.

Do you think you have the bandwidth to look into the FreeBSD side and
try to provide a fix? I'm happy to review and commit in upstream
FreeBSD, but would be nice to have someone else also in the loop as
ATM I'm the only one doing FreeBSD/Xen development AFAIK.

Thanks, Roger.



(sorry about the previous email, looks like my mail client hates me)

I would personnally not be able to hack into either Xen, the linux
kernel or FreeBSD in any efficient way. My role here is limited to
packaging, testing and acting as a relay between users and developers.
We currently don't have anyone at Vates who would be able to hack into
FreeBSD either.

What currently put FreeBSD on our radar is the large amount of users who
use FreeNAS/TrueNAS or pfSense VMs, and the recent bugs they detected
(XSA-360 and this performance drop).

Additionnally, regarding this performance issue, some users report an
impact of that same patch 12 on the network performance of their non-BSD
VMs [1][2], so I think the FreeBSD case might be helpful to help
identify what in that patch caused throttling (if that's what happens),
because it's easier to reproduce, but I'm not sure fixes would only need
to be made in FreeBSD.

Best regards,

Samuel Verschelde

[1] https://xcp-ng.org/forum/post/35521 mentions debian based Untangle
OS and inter-VLAN traffic
[2] https://xcp-ng.org/forum/post/35476 general slowdown affecting all
VMs (VM to workstation traffic), from the first user who identified
patch 12 as the cause.

Re: [PATCH v1] mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE

2021-01-26 Thread Wei Liu

On Tue, Jan 26, 2021 at 12:58:29PM +0100, David Hildenbrand wrote:
> Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and
> "mhp_flags". As discussed recently [1], "mhp" is our internal
> acronym for memory hotplug now.
> 
> [1] 
> https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c3...@redhat.com/
> 
[...]
> Signed-off-by: David Hildenbrand 

Acked-by: Wei Liu

Re: [PATCH 2/3] x86/irq: don't disable domain MSI IRQ on force unbind

2021-01-26 Thread Jan Beulich

On 26.01.2021 12:06, Roger Pau Monne wrote:
> When force unbinding a MSI the used IRQ would get added to the domain
> irq-pirq tree as -irq -pirq,

I think it's -pirq at index irq, i.e. I don't think IRQ gets
negated as far as the radix tree goes. info->arch.irq gets a
negative value stored, yes.

> thus preventing the same IRQ to be used by the domain.

Iirc this (answering your post-commit-message question here)
is for cleaning up _after_ the domain, i.e. there's no goal
to allow re-use of this IRQ. The comment ahead of
unmap_domain_pirq() validly says "The pirq should have been
unbound before this call." The only time we can't make
ourselves dependent upon this is when the guest is being
cleaned up. During normal operation I think we actually
_want_ to enforce correct behavior of the guest here.

> It's not clear to me why we add the irq as -irq -pirq to the irq-pirq
> domain tree on forced unbind, as allowing to bind the irq again should
> just work regardless of whether it has been previously forcefully
> unbound?

To continue from the above, see pirq_guest_unbind() where
we have

if ( desc == NULL )
{
irq = -pirq->arch.irq;
BUG_ON(irq <= 0);
desc = irq_to_desc(irq);
spin_lock_irq(>lock);
clear_domain_irq_pirq(d, irq, pirq);
}

as the alternative to going the normal path through
__pirq_guest_unbind(). Again iirc that's to cover for the
unbind request arriving after the unmap one (i.e. having
caused the unmap to force-unbind the IRQ).

Jan

Re: XSA-332 kernel patch - huge network performance on pfSense VMs

2021-01-26 Thread Samuel Verschelde


Le 18/01/2021 à 11:03, Roger Pau Monné a écrit :

On Fri, Jan 15, 2021 at 03:03:26PM +, Samuel Verschelde wrote:  >> Hi list, >> >> 
Another "popular" thread on XCP-ng forum [1],

started in october >> 2020, allowed us to detect that patch 12 from the
XSA-332 advisory >> [2] had a very significant impact on network
performance in the >> case of pfSense VMs. >> >> We reproduced the issue
internally (well, we reproduced >> "something". The user setups in this
thread are diverse) and our >> findings seem to confirm what the users
reported. Running iperf3 >> from the pfSense VM to a debian VM gives
results around 5 times >> slower than before. Reverting this single
patch brings the >> performance back. On the debian to pfSense
direction, the drop is >> about 25%. > > pfSense is based on FreeBSD, so
I would bet that whatever performance > degradation you are seeing would
also happen with plain FreeBSD. I > would assume netfront in FreeBSD is
triggering the ratelimit on > Linux, and hence it gets throttled. > > Do
you think you have the bandwidth to look into the FreeBSD side and > try
to provide a fix? I'm happy to review and commit in upstream > FreeBSD,
but would be nice to have someone else also in the loop as > ATM I'm the
only one doing FreeBSD/Xen development AFAIK. >
I would personnally not be able to hack into either Xen, the linux
kernel or FreeBSD in any efficient way. My role here is limited to
packaging, testing and acting as a relay between users and developers.
We currently don't have anyone at Vates who would be able to hack into
FreeBSD either.

What currently puts FreeBSD into our radar is the large amount of users
who use FreeNAS/TrueNAS or pfSense VMs, and the recent bugs they
detected (XSA-360 and this performance drop).

Additionnally, regarding this performance issue, some users report an
impact of that same patch 12 on the network performance of their non-BSD
VMs [1][2], so I think the FreeBSD case might be helpful to help
identify what in that patch caused throttling (if that's what happens),
because it's easier to reproduce, but I'm not sure fixes would only need
to be made in FreeBSD.

Best regards,

Samuel Verschelde

[1] https://xcp-ng.org/forum/post/35521 mentions debian based Untangle
OS and inter-VLAN traffic
[2] https://xcp-ng.org/forum/post/35476 general slowdown affecting all
VMs (VM to workstation traffic), from the first user who identified
patch 12 as the cause.

Re: [PATCH v1] mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE

2021-01-26 Thread Oscar Salvador

On Tue, Jan 26, 2021 at 12:58:29PM +0100, David Hildenbrand wrote:
> Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and
> "mhp_flags". As discussed recently [1], "mhp" is our internal
> acronym for memory hotplug now.
> 
> [1] 
> https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c3...@redhat.com/
> 
> Cc: Andrew Morton 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Cc: Wei Liu 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Cc: Stefano Stabellini 
> Cc: Pankaj Gupta 
> Cc: Michal Hocko 
> Cc: Oscar Salvador 
> Cc: Anshuman Khandual 
> Cc: Wei Yang 
> Cc: linux-hyp...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 

-- 
Oscar Salvador
SUSE L3

Re: [PATCH v1] mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE

2021-01-26 Thread Michael S. Tsirkin

On Tue, Jan 26, 2021 at 12:58:29PM +0100, David Hildenbrand wrote:
> Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and
> "mhp_flags". As discussed recently [1], "mhp" is our internal
> acronym for memory hotplug now.
> 
> [1] 
> https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c3...@redhat.com/
> 
> Cc: Andrew Morton 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Cc: Wei Liu 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Boris Ostrovsky 
> Cc: Juergen Gross 
> Cc: Stefano Stabellini 
> Cc: Pankaj Gupta 
> Cc: Michal Hocko 
> Cc: Oscar Salvador 
> Cc: Anshuman Khandual 
> Cc: Wei Yang 
> Cc: linux-hyp...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: David Hildenbrand 

Acked-by: Michael S. Tsirkin 

> ---
>  drivers/hv/hv_balloon.c| 2 +-
>  drivers/virtio/virtio_mem.c| 2 +-
>  drivers/xen/balloon.c  | 2 +-
>  include/linux/memory_hotplug.h | 2 +-
>  mm/memory_hotplug.c| 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 8c471823a5af..2f776d78e3c1 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -726,7 +726,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned 
> long size,
>  
>   nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
>   ret = add_memory(nid, PFN_PHYS((start_pfn)),
> - (HA_CHUNK << PAGE_SHIFT), MEMHP_MERGE_RESOURCE);
> + (HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE);
>  
>   if (ret) {
>   pr_err("hot_add memory failed error is %d\n", ret);
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index 85a272c9978e..148bea39b09a 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -623,7 +623,7 @@ static int virtio_mem_add_memory(struct virtio_mem *vm, 
> uint64_t addr,
>   /* Memory might get onlined immediately. */
>   atomic64_add(size, >offline_size);
>   rc = add_memory_driver_managed(vm->nid, addr, size, vm->resource_name,
> -MEMHP_MERGE_RESOURCE);
> +MHP_MERGE_RESOURCE);
>   if (rc) {
>   atomic64_sub(size, >offline_size);
>   dev_warn(>vdev->dev, "adding memory failed: %d\n", rc);
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index b57b2067ecbf..671c71245a7b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void)
>   mutex_unlock(_mutex);
>   /* add_memory_resource() requires the device_hotplug lock */
>   lock_device_hotplug();
> - rc = add_memory_resource(nid, resource, MEMHP_MERGE_RESOURCE);
> + rc = add_memory_resource(nid, resource, MHP_MERGE_RESOURCE);
>   unlock_device_hotplug();
>   mutex_lock(_mutex);
>  
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3d99de0db2dd..4b834f5d032e 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -53,7 +53,7 @@ typedef int __bitwise mhp_t;
>   * with this flag set, the resource pointer must no longer be used as it
>   * might be stale, or the resource might have changed.
>   */
> -#define MEMHP_MERGE_RESOURCE ((__force mhp_t)BIT(0))
> +#define MHP_MERGE_RESOURCE   ((__force mhp_t)BIT(0))
>  
>  /*
>   * Extended parameters for memory hotplug:
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 710e469fb3a1..ae497e3ff77c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1153,7 +1153,7 @@ int __ref add_memory_resource(int nid, struct resource 
> *res, mhp_t mhp_flags)
>* In case we're allowed to merge the resource, flag it and trigger
>* merging now that adding succeeded.
>*/
> - if (mhp_flags & MEMHP_MERGE_RESOURCE)
> + if (mhp_flags & MHP_MERGE_RESOURCE)
>   merge_system_ram_resource(res);
>  
>   /* online pages if requested */
> -- 
> 2.29.2

Re: [PATCH 1/3] x86/irq: remove duplicated clear_domain_irq_pirq calls

2021-01-26 Thread Jan Beulich

On 26.01.2021 12:06, Roger Pau Monne wrote:
> There are two duplicated calls to cleanup_domain_irq_pirq in
> unmap_domain_pirq.
> 
> The first one in the for loop will be called with exactly the same
> arguments as the call placed closer to the loop start.

I'm having trouble figuring out which two instances you refer
to: To me "first one" and "closer to the start" are two ways
of expressing the same thing. You don't refer to the call to
clear_domain_irq_pirq(), do you, when the two calls you
remove are to cleanup_domain_irq_pirq()? Admittedly quite
similar names, but entirely different functions.

> The logic used in the loop seems extremely complex to follow IMO,
> there are several breaks for exiting the loop, and the index (i) is
> also updated in different places.

Indeed, and it didn't feel well already back at the time when
I much extended this to support multi-vector MSI. I simply
didn't have any good idea how to streamline all of this
(short of rewriting it altogether, which would have made
patch review quite difficult). If you have thoughts, I'm all
ears.

Jan

Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event

2021-01-26 Thread Jan Beulich

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/arch/x86/vm_event.c
> +++ b/xen/arch/x86/vm_event.c
> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
>  
>  req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
>  req->data.regs.x86.dr6 = ctxt.dr6;
> +
> +if ( hvm_vmtrace_output_position(curr, >data.regs.x86.pt_offset) != 
> 1 )
> +req->data.regs.x86.pt_offset = ~0;

Ah. (Regarding my earlier question about this returning -errno or
boolean).

> --- a/xen/include/public/vm_event.h
> +++ b/xen/include/public/vm_event.h
> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
>   */
>  uint64_t npt_base;
>  
> +/*
> + * Current offset in the Processor Trace buffer. For Intel Processor 
> Trace
> + * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is 
> active.
> + */
> +uint64_t pt_offset;

According to vmtrace_output_position() the value is only one half
of what the named MSR contains. Perhaps "... this is from MSR_..."?
Not sure whether, despite this, there still is a reason to have
this 64-bit wide.

Jan

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Bertrand Marquis

Hi Julien,

> On 26 Jan 2021, at 13:51, Julien Grall  wrote:
> 
> Hi Stefano,
> 
> On 25/01/2021 21:27, Stefano Stabellini wrote:
>>  config ARM_SSBD
>> -bool "Speculative Store Bypass Disable" if EXPERT
>> +bool "Speculative Store Bypass Disable (UNSUPPORTED)" if UNSUPPORTED
>>  depends on HAS_ALTERNATIVE
>>  default y
>>  help
>> @@ -87,7 +87,7 @@ config ARM_SSBD
>>If unsure, say Y.
>>config HARDEN_BRANCH_PREDICTOR
>> -bool "Harden the branch predictor against aliasing attacks" if EXPERT
>> +bool "Harden the branch predictor against aliasing attacks 
>> (UNSUPPORTED)" if UNSUPPORTED
>>  default y
>>  help
>>Speculation attacks against some high-performance processors rely on
> 
> I read through the back and forth between Bertrand and Jan about 
> "UNSUPPORTED". However, I still don't understand why those two options are 
> moved to UNSUPPORTED.

Discussion was more on what to do for options which have a default y and can 
only be turned off with UNSUPPORTED or EXPERT selected.

> 
> Both options will only build the code to enable the mitigation. The decision 
> is still based on the processor you are running on.
> 
> In addition to that, ARM_SSBD can also be forced enabled/disabled on the 
> command line.
> 
> A user may want to compile out the code if the target processor is not the 
> affected by the two issues. This wouldn't be much different to Xen deciding 
> to not enabling the mitigation.
> 
> I would view the two options as supported but not security supported. So this 
> seems to fit exactly in the definition of EXPERT rather than UNSUPPORTED.

I think you are right here, not security supported should be only available to 
EXPERT.

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall
>

Re: [PATCH v7 09/10] xen/vmtrace: support for VM forks

2021-01-26 Thread Jan Beulich

On 21.01.2021 22:27, Andrew Cooper wrote:
> From: Tamas K Lengyel 
> 
> Implement vmtrace_reset_pt function. Properly set IPT
> state for VM forks.
> 
> Signed-off-by: Tamas K Lengyel 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 
albeit it strikes me as somewhat odd that ...

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2408,6 +2408,16 @@ static int vmtrace_output_position(struct vcpu *v, 
> uint64_t *pos)
>  return v->arch.hvm.vmx.ipt_active;
>  }
>  
> +static int vmtrace_reset(struct vcpu *v)
> +{
> +if ( !v->arch.hvm.vmx.ipt_active )
> +return -EINVAL;
> +
> +v->arch.msrs->rtit.output_offset = 0;
> +v->arch.msrs->rtit.status = 0;
> +return 0;
> +}

... the function/hook return non-void, yet ...

> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1632,6 +1632,8 @@ static int copy_vcpu_settings(struct domain *cd, const 
> struct domain *d)
>  copy_domain_page(new_vcpu_info_mfn, vcpu_info_mfn);
>  }
>  
> +hvm_vmtrace_reset(cd_vcpu);

... the only caller doesn't care.

Jan

Re: [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op

2021-01-26 Thread Jan Beulich

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -155,6 +155,55 @@ void arch_get_domain_info(const struct domain *d,
>  info->arch_config.emulation_flags = d->arch.emulation_flags;
>  }
>  
> +static int do_vmtrace_op(struct domain *d, struct xen_domctl_vmtrace_op *op,
> + XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> +{
> +struct vcpu *v;
> +int rc;
> +
> +if ( !d->vmtrace_frames || d == current->domain /* No vcpu_pause() */ )
> +return -EINVAL;
> +
> +ASSERT(is_hvm_domain(d)); /* Restricted by domain creation logic. */
> +
> +v = domain_vcpu(d, op->vcpu);
> +if ( !v )
> +return -ENOENT;
> +
> +vcpu_pause(v);
> +switch ( op->cmd )
> +{
> +case XEN_DOMCTL_vmtrace_enable:
> +case XEN_DOMCTL_vmtrace_disable:
> +case XEN_DOMCTL_vmtrace_reset_and_enable:
> +rc = hvm_vmtrace_control(
> +v, op->cmd != XEN_DOMCTL_vmtrace_disable,
> +op->cmd == XEN_DOMCTL_vmtrace_reset_and_enable);
> +break;
> +
> +case XEN_DOMCTL_vmtrace_output_position:
> +rc = hvm_vmtrace_output_position(v, >value);
> +if ( rc >= 0 )
> +rc = 0;

So vmtrace_output_position() effectively returns a boolean, and
there is no other caller of it afaics. I understand the hook and
function return int to allow for error indicators. But what's
the purpose of returning ipt_active when the only caller doesn't
care?

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2261,6 +2261,153 @@ static bool vmx_get_pending_event(struct vcpu *v, 
> struct x86_event *info)
>  return true;
>  }
>  
> +static int vmtrace_get_option(struct vcpu *v, uint64_t key, uint64_t *output)
> +{
> +const struct vcpu_msrs *msrs = v->arch.msrs;
> +
> +/*
> + * We only let vmtrace agents see and modify a subset of bits in
> + * MSR_RTIT_CTL.  These all pertain to date emitted into the trace

s/date/data/ ?

> + * buffer(s).  Must not include controls pertaining to the
> + * structure/position of the trace buffer(s).
> + */
> +#define RTIT_CTL_MASK   \
> +(RTIT_CTL_TRACE_EN | RTIT_CTL_OS | RTIT_CTL_USR | RTIT_CTL_TSC_EN | \
> + RTIT_CTL_DIS_RETC | RTIT_CTL_BRANCH_EN)
> +
> +/*
> + * Status bits restricted to the first-gen subset (i.e. no further CPUID
> + * requirements.)
> + */
> +#define RTIT_STATUS_MASK \
> +(RTIT_STATUS_FILTER_EN | RTIT_STATUS_CONTEXT_EN | RTIT_STATUS_TRIGGER_EN 
> | \
> + RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED)

The placement of these two #define-s kind of suggests they're
intended for this function only, but the next one (at least)
also uses them. May I suggest to move these ahead of this
function?

> +static int vmtrace_set_option(struct vcpu *v, uint64_t key, uint64_t value)
> +{
> +struct vcpu_msrs *msrs = v->arch.msrs;
> +bool new_en, old_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
> +
> +switch ( key )
> +{
> +case MSR_RTIT_OUTPUT_MASK:
> +/*
> + * MSR_RTIT_OUTPUT_MASK, when using Single Output mode, has a limit
> + * field in the lower 32 bits, and an offset in the upper 32 bits.
> + *
> + * Limit is fixed by the vmtrace buffer size and must not be
> + * controlled by userspace, while offset must be within the limit.
> + *
> + * Drop writes to the limit field to simply userspace wanting to 
> reset
> + * the offset by writing 0.
> + */
> +if ( (value >> 32) > msrs->rtit.output_limit )
> +return -EINVAL;
> +msrs->rtit.output_offset = value >> 32;
> +break;
> +
> +case MSR_RTIT_CTL:
> +if ( value & ~RTIT_CTL_MASK )
> +return -EINVAL;
> +
> +msrs->rtit.ctl &= ~RTIT_CTL_MASK;
> +msrs->rtit.ctl |= (value & RTIT_CTL_MASK);
> +break;
> +
> +case MSR_RTIT_STATUS:
> +if ( value & ~RTIT_STATUS_MASK )
> +return -EINVAL;
> +
> +msrs->rtit.status &= ~RTIT_STATUS_MASK;
> +msrs->rtit.status |= (value & RTIT_STATUS_MASK);
> +break;
> +
> +default:
> +return -EINVAL;
> +}
> +
> +new_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
> +
> +/* ctl.trace_en changed => update MSR load/save lists appropriately. */
> +if ( !old_en && new_en )
> +{
> +if ( vmx_add_guest_msr(v, MSR_RTIT_CTL, msrs->rtit.ctl) ||
> + vmx_add_host_load_msr(v, MSR_RTIT_CTL, 0) )
> +{
> +/*
> + * The only failure cases here are failing the
> + * singleton-per-domain memory allocation, or exceeding the space
> + * in the allocation.  We could unwind in principle, but there is
> + * nothing userspace can usefully do to continue using this VM.
> + */
> +domain_crash(v->domain);
> +

Re: Question about xen and Rasp 4B

2021-01-26 Thread Jukka Kaartinen

On Tue, Jan 26, 2021 at 12:59 PM Jukka Kaartinen 
wrote:

>
>
> On Tue, Jan 26, 2021 at 2:54 AM Stefano Stabellini 
> wrote:
>
>> On Sat, 23 Jan 2021, Jukka Kaartinen wrote:
>> > Thanks for the response!
>> >
>> > On Sat, Jan 23, 2021 at 2:27 AM Stefano Stabellini <
>> sstabell...@kernel.org> wrote:
>> >   + xen-devel, Roman,
>> >
>> >
>> >   On Fri, 22 Jan 2021, Jukka Kaartinen wrote:
>> >   > Hi Stefano,
>> >   > I'm Jukka Kaartinen a SW developer working on enabling
>> hypervisors on mobile platforms. One of our HW that we use on
>> >   development is
>> >   > Raspberry Pi 4B. I wonder if you could help me a bit :).
>> >   >
>> >   > I'm trying to enable the GPU with Xen + Raspberry Pi for
>> >   dom0.
>> https://www.raspberrypi.org/forums/viewtopic.php?f=63=232323#p1797605
>> >   >
>> >   > I got so far that GPU drivers are loaded (v3d & vc4) without
>> errors. But now Xen returns error when X is starting:
>> >   > (XEN) traps.c:1986:d0v1 HSR=0x93880045 pc=0x7f97b14e70
>> gva=0x7f7f817000 gpa=0x401315d000
>> >   >  I tried to debug what causes this and looks
>> like find_mmio_handler cannot find handler.
>> >   > (See more here:
>> https://www.raspberrypi.org/forums/viewtopic.php?f=63=232323=25#p1801691
>> )
>> >   >
>> >   > Any ideas why the handler is not found?
>> >
>> >
>> >   Hi Jukka,
>> >
>> >   I am glad to hear that you are interested in Xen on RaspberryPi
>> :-)  I
>> >   haven't tried the GPU yet, I have been using the serial only.
>> >   Roman, did you ever get the GPU working?
>> >
>> >
>> >   The error is a data abort error: Linux is trying to access an
>> address
>> >   which is not mapped to dom0. The address seems to be
>> 0x401315d000. It is
>> >   a pretty high address; I looked in device tree but couldn't spot
>> it.
>> >
>> >   >From the HSR (the syndrom register) it looks like it is a
>> translation
>> >   fault at EL1 on stage1. As if the Linux address mapping was wrong.
>> >   Anyone has any ideas how this could happen? Maybe a
>> reserved-memory
>> >   misconfiguration?
>> >
>> > I had issues with loading the driver in the first place. Apparently
>> swiotlb is used, maybe it can cause this. I also tried to enable CMA.
>> > config.txt:
>> > dtoverlay=vc4-fkms-v3d,cma=320M@0x0-0x4000
>> > gpu_mem=128
>>
>> Also looking at your other reply and the implementation of
>> vc4_bo_create, it looks like this is a CMA problem.
>>
>> It would be good to run a test with the swiotlb-xen disabled:
>>
>> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
>> index 467fa225c3d0..2bdd12785d14 100644
>> --- a/arch/arm/xen/mm.c
>> +++ b/arch/arm/xen/mm.c
>> @@ -138,8 +138,7 @@ void xen_destroy_contiguous_region(phys_addr_t
>> pstart, unsigned int order)
>>  static int __init xen_mm_init(void)
>>  {
>> struct gnttab_cache_flush cflush;
>> -   if (!xen_initial_domain())
>> -   return 0;
>> +   return 0;
>> xen_swiotlb_init(1, false);
>>
>> cflush.op = 0;
>>
> With this change the kernel is not booting up. (btw. I'm using USB SSD for
> my OS.)
> [0.071081] bcm2835-dma fe007000.dma: Unable to set DMA mask
> [0.076277] bcm2835-dma fe007b00.dma: Unable to set DMA mask
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=25: not implemented
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
> [0.592695] pci :00:00.0: Failed to add - passthrough or MSI/MSI-X
> might fail!
> (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
> [0.606819] pci :01:00.0: Failed to add - passthrough or MSI/MSI-X
> might fail!
> [1.212820] usb 1-1: device descriptor read/64, error 18
> [1.452815] usb 1-1: device descriptor read/64, error 18
> [1.820813] usb 1-1: device descriptor read/64, error 18
> [2.060815] usb 1-1: device descriptor read/64, error 18
> [2.845548] usb 1-1: device descriptor read/8, error -61
> [2.977603] usb 1-1: device descriptor read/8, error -61
> [3.237530] usb 1-1: device descriptor read/8, error -61
> [3.369585] usb 1-1: device descriptor read/8, error -61
> [3.480765] usb usb1-port1: unable to enumerate USB device
>
> Traces stop here. I could try with a memory card. Maybe it makes a
> difference.
>
>
>>
>> It is going to be fine just to boot Dom0 and DomUs without PV drivers.
>> Also, can you post the device tree that you are using here? Just in case
>> there is an issue with Xen parsing any possible /reserved-memory nodes
>> with CMA info that need to be passed on to Dom0.
>
> Here is the device dumped from command line:
> dtc -I fs /proc/device-tree
>
> https://drive.google.com/file/d/17u18dJHxRfbGZMtRXIwtLVZZfMj9KwN-/view?usp=sharing
>
>
Here is log from XEN when it goes through the device tree:
https://drive.google.com/file/d/1jvT3ZNJeXHfxPOffiaRSRg8FPwQHS1mb/view?usp=sharing



>
>>
>> >   > p.s.
>> >   > While testing I found issue with Xen master branch and your

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Julien Grall


Hi Stefano,

On 25/01/2021 21:27, Stefano Stabellini wrote:

  config ARM_SSBD
-   bool "Speculative Store Bypass Disable" if EXPERT
+   bool "Speculative Store Bypass Disable (UNSUPPORTED)" if UNSUPPORTED
depends on HAS_ALTERNATIVE
default y
help
@@ -87,7 +87,7 @@ config ARM_SSBD
  If unsure, say Y.
  
  config HARDEN_BRANCH_PREDICTOR

-   bool "Harden the branch predictor against aliasing attacks" if EXPERT
+   bool "Harden the branch predictor against aliasing attacks 
(UNSUPPORTED)" if UNSUPPORTED
default y
help
  Speculation attacks against some high-performance processors rely on


I read through the back and forth between Bertrand and Jan about 
"UNSUPPORTED". However, I still don't understand why those two options 
are moved to UNSUPPORTED.


Both options will only build the code to enable the mitigation. The 
decision is still based on the processor you are running on.


In addition to that, ARM_SSBD can also be forced enabled/disabled on the 
command line.


A user may want to compile out the code if the target processor is not 
the affected by the two issues. This wouldn't be much different to Xen 
deciding to not enabling the mitigation.


I would view the two options as supported but not security supported. So 
this seems to fit exactly in the definition of EXPERT rather than 
UNSUPPORTED.


Cheers,

--
Julien Grall

Re: [PATCH v3 00/15] zstd decompression for DomU-s + fallout / consolidation

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:25, Ian Jackson wrote:
> I hope this explanation makes some kind of sense and sorry for the
> confusion.

It does and no, I don't think there was any confusion caused. Some
thinks merely needed clarifying, on my end at least. Thanks for
doing so.

Jan

[PATCH v3 4/6] x86/vpic: don't trigger unmask event until end of init

2021-01-26 Thread Roger Pau Monne

Wait until the end of the init sequence to trigger the unmask event.
Note that it will be unconditionally triggered, but that's harmless if
not unmask actually happened.

While there change the variable type to bool.

Requested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - New in this version.
---
 xen/arch/x86/hvm/vpic.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 9195155ff0..795a76768d 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -188,7 +188,8 @@ static void vpic_ioport_write(
 struct hvm_hw_vpic *vpic, uint32_t addr, uint32_t val)
 {
 int priority, cmd;
-uint8_t mask, unmasked = 0;
+uint8_t mask;
+bool unmasked = false;
 
 vpic_lock(vpic);
 
@@ -200,7 +201,6 @@ static void vpic_ioport_write(
 /* Clear edge-sensing logic. */
 vpic->irr &= vpic->elcr;
 
-unmasked = vpic->imr;
 /* No interrupts masked or in service. */
 vpic->imr = vpic->isr = 0;
 
@@ -294,13 +294,17 @@ static void vpic_ioport_write(
 /* ICW3 */
 vpic->init_state++;
 if ( !(vpic->init_state & 4) )
+{
 vpic->init_state = 0; /* No ICW4: init done */
+unmasked = true;
+}
 break;
 case 3:
 /* ICW4 */
 vpic->special_fully_nested_mode = (val >> 4) & 1;
 vpic->auto_eoi = (val >> 1) & 1;
 vpic->init_state = 0;
+unmasked = true;
 break;
 }
 }
-- 
2.29.2

[PATCH v3 6/6] x86/dpci: remove the dpci EOI timer

2021-01-26 Thread Roger Pau Monne

Current interrupt pass though code will setup a timer for each
interrupt injected to the guest that requires an EOI from the guest.
Such timer would perform two actions if the guest doesn't EOI the
interrupt before a given period of time. The first one is deasserting
the virtual line, the second is perform an EOI of the physical
interrupt source if it requires such.

The deasserting of the guest virtual line is wrong, since it messes
with the interrupt status of the guest. This seems to have been done
in order to compensate for missing deasserts when certain interrupt
controller actions are performed. The original motivation of the
introduction of the timer was to fix issues when a GSI was shared
between different guests. We believe that other changes in the
interrupt handling code (ie: proper propagation of EOI related actions
to dpci) will have fixed such errors now.

Performing an EOI of the physical interrupt source is redundant, since
there's already a timer that takes care of this for all interrupts,
not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
field.

Since both of the actions performed by the dpci timer are not
required, remove it altogether.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Jan Beulich 
---
Changes since v1:
 - Add parentheses.
---
 xen/drivers/passthrough/vtd/x86/hvm.c |  3 -
 xen/drivers/passthrough/x86/hvm.c | 95 +--
 xen/include/asm-x86/hvm/irq.h |  3 -
 xen/include/xen/iommu.h   |  5 --
 4 files changed, 2 insertions(+), 104 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/x86/hvm.c 
b/xen/drivers/passthrough/vtd/x86/hvm.c
index f77b35815c..b531fe907a 100644
--- a/xen/drivers/passthrough/vtd/x86/hvm.c
+++ b/xen/drivers/passthrough/vtd/x86/hvm.c
@@ -36,10 +36,7 @@ static int _hvm_dpci_isairq_eoi(struct domain *d,
 {
 hvm_pci_intx_deassert(d, digl->device, digl->intx);
 if ( --pirq_dpci->pending == 0 )
-{
-stop_timer(_dpci->timer);
 pirq_guest_eoi(dpci_pirq(pirq_dpci));
-}
 }
 }
 
diff --git a/xen/drivers/passthrough/x86/hvm.c 
b/xen/drivers/passthrough/x86/hvm.c
index a6e2863c14..351daafdc9 100644
--- a/xen/drivers/passthrough/x86/hvm.c
+++ b/xen/drivers/passthrough/x86/hvm.c
@@ -136,77 +136,6 @@ static void pt_pirq_softirq_reset(struct hvm_pirq_dpci 
*pirq_dpci)
 pirq_dpci->masked = 0;
 }
 
-bool pt_irq_need_timer(uint32_t flags)
-{
-return !(flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_TRANSLATE |
-  HVM_IRQ_DPCI_NO_EOI));
-}
-
-static int pt_irq_guest_eoi(struct domain *d, struct hvm_pirq_dpci *pirq_dpci,
-void *arg)
-{
-if ( __test_and_clear_bit(_HVM_IRQ_DPCI_EOI_LATCH_SHIFT,
-  _dpci->flags) )
-{
-pirq_dpci->masked = 0;
-pirq_dpci->pending = 0;
-pirq_guest_eoi(dpci_pirq(pirq_dpci));
-}
-
-return 0;
-}
-
-static void pt_irq_time_out(void *data)
-{
-struct hvm_pirq_dpci *irq_map = data;
-const struct hvm_irq_dpci *dpci;
-const struct dev_intx_gsi_link *digl;
-
-spin_lock(_map->dom->event_lock);
-
-if ( irq_map->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
-{
-ASSERT(is_hardware_domain(irq_map->dom));
-/*
- * Identity mapped, no need to iterate over the guest GSI list to find
- * other pirqs sharing the same guest GSI.
- *
- * In the identity mapped case the EOI can also be done now, this way
- * the iteration over the list of domain pirqs is avoided.
- */
-hvm_gsi_deassert(irq_map->dom, dpci_pirq(irq_map)->pirq);
-irq_map->flags |= HVM_IRQ_DPCI_EOI_LATCH;
-pt_irq_guest_eoi(irq_map->dom, irq_map, NULL);
-spin_unlock(_map->dom->event_lock);
-return;
-}
-
-dpci = domain_get_irq_dpci(irq_map->dom);
-if ( unlikely(!dpci) )
-{
-ASSERT_UNREACHABLE();
-spin_unlock(_map->dom->event_lock);
-return;
-}
-list_for_each_entry ( digl, _map->digl_list, list )
-{
-unsigned int guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
-const struct hvm_girq_dpci_mapping *girq;
-
-list_for_each_entry ( girq, >girq[guest_gsi], list )
-{
-struct pirq *pirq = pirq_info(irq_map->dom, girq->machine_gsi);
-
-pirq_dpci(pirq)->flags |= HVM_IRQ_DPCI_EOI_LATCH;
-}
-hvm_pci_intx_deassert(irq_map->dom, digl->device, digl->intx);
-}
-
-pt_pirq_iterate(irq_map->dom, pt_irq_guest_eoi, NULL);
-
-spin_unlock(_map->dom->event_lock);
-}
-
 struct hvm_irq_dpci *domain_get_irq_dpci(const struct domain *d)
 {
 if ( !d || !is_hvm_domain(d) )
@@ -568,15 +497,10 @@ int pt_irq_create_bind(
 }
 }
 
-/* Init timer before binding */
-if ( pt_irq_need_timer(pirq_dpci->flags) )
-init_timer(_dpci->timer,

[PATCH v3 5/6] x86/vpic: issue dpci EOI for cleared pins at ICW1

2021-01-26 Thread Roger Pau Monne

When pins are cleared from either ISR or IRR as part of the
initialization sequence forward the clearing of those pins to the dpci
EOI handler, as it is equivalent to an EOI. Not doing so can bring the
interrupt controller state out of sync with the dpci handling logic,
that expects a notification when a pin has been EOI'ed.

Fixes: 7b3cb5e5416 ('IRQ injection changes for HVM PCI passthru.')
Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - Remove the unmask label.
---
 xen/arch/x86/hvm/vpic.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 795a76768d..f465b7f997 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -197,6 +197,8 @@ static void vpic_ioport_write(
 {
 if ( val & 0x10 )
 {
+unsigned int pending = vpic->isr | (vpic->irr & ~vpic->elcr);
+
 /* ICW1 */
 /* Clear edge-sensing logic. */
 vpic->irr &= vpic->elcr;
@@ -220,6 +222,24 @@ static void vpic_ioport_write(
 }
 
 vpic->init_state = ((val & 3) << 2) | 1;
+vpic_update_int_output(vpic);
+vpic_unlock(vpic);
+
+/*
+ * Forward the EOI of any pending or in service interrupt that has
+ * been cleared from IRR or ISR, or else the dpci logic will get
+ * out of sync with the state of the interrupt controller.
+ */
+while ( pending )
+{
+unsigned int pin = __scanbit(pending, 8);
+
+ASSERT(pin < 8);
+hvm_dpci_eoi(current->domain,
+ hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : 
pin));
+__clear_bit(pin, );
+}
+return;
 }
 else if ( val & 0x08 )
 {
-- 
2.29.2

[PATCH v3 3/6] x86/vpic: force int output to low when in init mode

2021-01-26 Thread Roger Pau Monne

When the PIC is on the init sequence prevent interrupt delivery. The
state of the registers is in the process of being set during the init
phase, so it makes sense to prevent any int line changes during that
process.

Requested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - New in this version.
---
 xen/arch/x86/hvm/vpic.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index c1c1de7fd0..9195155ff0 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -101,11 +101,14 @@ static void vpic_update_int_output(struct hvm_hw_vpic 
*vpic)
 irq = vpic_get_highest_priority_irq(vpic);
 TRACE_3D(TRC_HVM_EMUL_PIC_INT_OUTPUT, vpic->int_output, vpic->is_master,
  irq);
-if ( vpic->int_output == (irq >= 0) )
+if ( vpic->int_output == (!vpic->init_state && irq >= 0) )
 return;
 
-/* INT line transition L->H or H->L. */
-vpic->int_output = !vpic->int_output;
+/*
+ * INT line transition L->H or H->L.
+ * Force line status to L when in init mode.
+ */
+vpic->int_output = !vpic->init_state && !vpic->int_output;
 
 if ( vpic->int_output )
 {
-- 
2.29.2

[PATCH v3 2/6] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode

2021-01-26 Thread Roger Pau Monne

When an IO-APIC pin is switched from level to edge trigger mode the
IRR bit is cleared, so it can be used as a way to EOI an interrupt at
the IO-APIC level.

Such EOI however does not get forwarded to the dpci code like it's
done for the local APIC initiated EOI. This change adds the code in
order to notify dpci of such EOI, so that dpci and the interrupt
controller are in sync.

Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - Fix comment message missing 'edge'.
 - Add asserts that previous triggering mode was level and it's not a
   top word write.
---
 xen/arch/x86/hvm/vioapic.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index e3ee747b7d..87370dd417 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -219,6 +219,7 @@ static void vioapic_write_redirent(
 struct domain *d = vioapic_domain(vioapic);
 struct hvm_irq *hvm_irq = hvm_domain_irq(d);
 union vioapic_redir_entry *pent, ent;
+bool prev_level;
 int unmasked = 0;
 unsigned int gsi;
 
@@ -234,6 +235,7 @@ static void vioapic_write_redirent(
 
 pent = >redirtbl[idx];
 ent  = *pent;
+prev_level = ent.fields.trig_mode == VIOAPIC_LEVEL_TRIG;
 
 if ( top_word )
 {
@@ -270,6 +272,21 @@ static void vioapic_write_redirent(
 
 spin_unlock(>arch.hvm.irq_lock);
 
+if ( ent.fields.trig_mode == VIOAPIC_EDGE_TRIG &&
+ ent.fields.remote_irr && is_iommu_enabled(d) )
+{
+/*
+ * Since IRR has been cleared and further interrupts can be
+ * injected also attempt to deassert any virtual line of passed
+ * through devices using this pin. Switching a pin from level to
+ * edge trigger mode can be used as a way to EOI an interrupt at
+ * the IO-APIC level.
+ */
+ASSERT(prev_level);
+ASSERT(!top_word);
+hvm_dpci_eoi(d, gsi);
+}
+
 if ( is_hardware_domain(d) && unmasked )
 {
 /*
-- 
2.29.2

[PATCH v3 1/6] x86/vioapic: top word redir entry writes don't trigger interrupts

2021-01-26 Thread Roger Pau Monne

Top word writes just update the destination of the interrupt, but
since there's no change on the masking or the triggering mode no
guest interrupt injection can result of such write. Add an assert to
that effect.

Requested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - New in this version.
---
 xen/arch/x86/hvm/vioapic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 804bc77279..e3ee747b7d 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -262,6 +262,8 @@ static void vioapic_write_redirent(
   !ent.fields.remote_irr &&
   hvm_irq->gsi_assert_count[gsi] )
 {
+/* A top word write should never trigger an interrupt injection. */
+ASSERT(!top_word);
 pent->fields.remote_irr = 1;
 vioapic_deliver(vioapic, idx);
 }
-- 
2.29.2

[PATCH v3 0/6] x86/intr: HVM guest interrupt handling fixes/cleanup

2021-01-26 Thread Roger Pau Monne

Hello,

The following series aims to fix some shortcomings of guest interrupt
handling when using passthrough devices. The first 5 patches are
bugfixes or cleanups, which I think should solve the issue(s) that
caused the dpci EOI timer to be introduced. However neither me nor
others seem to be able to reproduce the original issue, so it's hard to
tell.

It's my opinion that we should remove the timer and see what explodes
(if anything). That's the only way we will be able to figure out what
the original issue was, and how to fix it without introducing yet
another per-guest-irq related timer.

Thanks, Roger.

Roger Pau Monne (6):
  x86/vioapic: top word redir entry writes don't trigger interrupts
  x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  x86/vpic: force int output to low when in init mode
  x86/vpic: don't trigger unmask event until end of init
  x86/vpic: issue dpci EOI for cleared pins at ICW1
  x86/dpci: remove the dpci EOI timer

 xen/arch/x86/hvm/vioapic.c| 19 ++
 xen/arch/x86/hvm/vpic.c   | 36 --
 xen/drivers/passthrough/vtd/x86/hvm.c |  3 -
 xen/drivers/passthrough/x86/hvm.c | 95 +--
 xen/include/asm-x86/hvm/irq.h |  3 -
 xen/include/xen/iommu.h   |  5 --
 6 files changed, 52 insertions(+), 109 deletions(-)

-- 
2.29.2

Re: [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support

2021-01-26 Thread Jan Beulich

On 21.01.2021 22:27, Andrew Cooper wrote:
> From: Michał Leszczyński 
> 
> Add CPUID/MSR enumeration details for Processor Trace.  For now, we will only
> support its use inside VMX operation.  Fill in the vmtrace_available boolean
> to activate the newly introduced common infrastructure for allocating trace
> buffers.
> 
> For now, Processor Trace is going to be operated in Single Output mode behind
> the guests back.  Add the MSRs to struct vcpu_msrs, and set up the buffer
> limit in vmx_init_pt() as it is fixed for the lifetime of the domain.
> 
> Context switch the most of the MSRs in and out of vCPU context switch, but the
> main control register needs to reside in the MSR load/save lists.  Explicitly
> pull the msrs pointer out into a local variable, because the optimiser cannot
> keep it live across the memory clobbers in the MSR accesses.
> 
> Signed-off-by: Michał Leszczyński 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 
albeit with a few things to consider and with a question to
confirm my understanding.

> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -291,6 +291,20 @@ static int vmx_init_vmcs_config(void)
>  _vmx_cpu_based_exec_control &=
>  ~(CPU_BASED_CR8_LOAD_EXITING | CPU_BASED_CR8_STORE_EXITING);
>  
> +rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
> +
> +/* Check whether IPT is supported in VMX operation. */
> +if ( !smp_processor_id() )

I'm not happy to see another one of these appear. I'd prefer
if the caller passed its "bsp" boolean into here, or if this
was made use system_state.

> +vmtrace_available = cpu_has_proc_trace &&
> +(_vmx_misc_cap & VMX_MISC_PROC_TRACE);
> +else if ( vmtrace_available &&
> +  !(_vmx_misc_cap & VMX_MISC_PROC_TRACE) )
> +{
> +printk("VMX: IPT capabilities differ between CPU%u and CPU0\n",

As a minor follow-on, instead of "CPU0" this may then want
to say "rest of the system" or "BSP", but I can see that at
least the former is also making the message longer (which
may not be desirable).

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -428,6 +428,20 @@ static void vmx_domain_relinquish_resources(struct 
> domain *d)
>  vmx_free_vlapic_mapping(d);
>  }
>  
> +static void vmx_init_pt(struct vcpu *v)

We use "pt" already as an acronym for pass-through. Could we
use "ipt" here, for example (following the new "ipt_active"
field)?

> --- a/xen/include/asm-x86/msr.h
> +++ b/xen/include/asm-x86/msr.h
> @@ -306,6 +306,38 @@ struct vcpu_msrs
>  };
>  } misc_features_enables;
>  
> +/*
> + * 0x0560 ... 57x - MSR_RTIT_*
> + *
> + * "Real Time Instruction Trace", now called Processor Trace.
> + *
> + * These MSRs are not exposed to guests.  They are controlled by Xen
> + * behind the scenes, when vmtrace is enabled for the domain.
> + *
> + * MSR_RTIT_OUTPUT_BASE not stored here.  It is fixed per vcpu, and
> + * derived from v->vmtrace.buf.
> + */
> +struct {
> +/*
> + * Placed in the MSR load/save lists.  Only modified by hypercall in
> + * the common case.
> + */
> +uint64_t ctl;

IIUC this field will be used by subsequent patches only?

Jan

Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool

2021-01-26 Thread Ian Jackson

Andrew Cooper writes ("Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
> On 26/01/2021 11:59, Ian Jackson wrote:
> >[Andrew:]
> >> This is example code, not a production utility.
> > Perhaps the utility could print some kind of health warning about it
> > possibly leaving this perf-impacting feature enabled, and how to clean
> > up ?
> 
> Why?  The theoretical fallout here is not conceptually different to
> libxl or qemu segfaulting, or any of the myriad other random utilities
> we have.
> 
> Printing "Warning - this program, just like everything else in the Xen
> tree, might in exceptional circumstances segfault and leave the domain
> in a weird state" is obvious, and doesn't need stating.
> 
> The domain is stuffed. `xl destroy` may or may not make the problem go away.

Firstly, I don't agree with this pessimistic analysis of our current
tooling.  Secondly, I would consider many such behaviours bugs;
certainly we have bugs but we shouldn't introduce more of them.

You are justifying the poor robustness of this tool on the grounds
that it's "example code, not a production utility".

But we are shipping it to bin/ and there is nothing telling anyone
that trying to use it (perhaps wrapped in something of their own
devising) is a bad idea.

Either this is code users might be expected to run in production in
which we need to make it at least have a minimal level of engineering
robustness (which is perhaps too difficult at this stage), or we need
to communicate to our users that it's a programming example, not a
useful utility.

Note that *even if it is a programming example*, we should highlight
its most important deficiencies.  Otherwise it is a hazardously
misleading example.

I hope that makes sense.

Thanks,
Ian.

RE: [PATCH V5 09/22] xen/ioreq: Make x86's IOREQ related dm-op handling common

2021-01-26 Thread Paul Durrant

> -Original Message-
> From: Oleksandr Tyshchenko 
> Sent: 25 January 2021 19:08
> To: xen-devel@lists.xenproject.org
> Cc: Julien Grall ; Jan Beulich ; 
> Andrew Cooper
> ; Roger Pau Monné ; Wei Liu 
> ; George
> Dunlap ; Ian Jackson ; Julien 
> Grall ;
> Stefano Stabellini ; Paul Durrant ; 
> Daniel De Graaf
> ; Oleksandr Tyshchenko 
> Subject: [PATCH V5 09/22] xen/ioreq: Make x86's IOREQ related dm-op handling 
> common
> 
> From: Julien Grall 
> 
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves the IOREQ related dm-op handling to the common code.
> 
> The idea is to have the top level dm-op handling arch-specific
> and call into ioreq_server_dm_op() for otherwise unhandled ops.
> Pros:
> - More natural than doing it other way around (top level dm-op
> handling common).
> - Leave compat_dm_op() in x86 code.
> Cons:
> - Code duplication. Both arches have to duplicate dm_op(), etc.
> 
> Make the corresponding functions static and rename them according
> to the new naming scheme (including dropping the "hvm" prefixes).
> 
> Introduce common dm.c file as a resting place for the do_dm_op()
> (which is identical for both Arm and x86) to minimize code duplication.
> The common DM feature is supposed to be built with IOREQ_SERVER
> option enabled (as well as the IOREQ feature), which is selected
> for x86's config HVM for now.
> 
> Also update XSM code a bit to let dm-op be used on Arm.
> 
> This support is going to be used on Arm to be able run device
> emulator outside of Xen hypervisor.
> 
> Signed-off-by: Julien Grall 
> Signed-off-by: Oleksandr Tyshchenko 
> Acked-by: Jan Beulich 
> [On Arm only]
> Tested-by: Wei Chen 
> 

Reviewed-by: Paul Durrant

Re: [PATCH v3 00/15] zstd decompression for DomU-s + fallout / consolidation

2021-01-26 Thread Ian Jackson

Jan Beulich writes ("Re: [PATCH v3 00/15] zstd decompression for DomU-s + 
fallout / consolidation"):
> On 26.01.2021 13:05, Ian Jackson wrote:
> > I approve of cleanups of course.  But:
> > 
> > Which of these cleanups were posted before the LPD ?  I'm not
> > currently aware of any reason for a freeze exception here, so I think
> > those patches which didn't meet the LPD should wait.  Ones which *did*
> > meet the LPD should be considered on their merits.
> > 
> > If you could direct me to which those are I would be happy to review
> > them.
> 
> All of them were posted after that date; only the Dom0 zstd
> decompression part was ready in time. I view this entire
> series as a logical extension to the earlier patches though.
> I'm unsure anyway how new patches in a previously submitted
> series would be treated in general; so far I've been under
> the impression that if in doubt the series as a whole would
> count, not every individual patch.

I think as a general rule, I don't think "logical extensions" to
things posted before the LPD count.  But bugfixes, and smallish
changes necessary to make the rest of a previously-posted series are
OK.

> As said (still visible above) I'm not meaning to insist on
> patches 3 and onwards to be taken. In fact it was you to
> ask whether patch 3 would possibly want a freeze exception.
> And you did give patch 4 a release ack already, based on it
> containing a bug fix. (The latter is true of patch 6 as
> well, btw.) Are you implying you withdraw that release ack
> now?

I'm afraid that I don't keep all this state in my head.  I rely on
written materials.  Where the written materials in front of me seem
self-sufficient and don't contain unresolved references or things that
are confusing, I rely on what is in front of me being sufficient.
ISTM that the point of a summary mail like yours is to save us going
and rereading all the background each time.

None of the subject lines in your mail mentioned that these were
bugfixes.  So I'm afraid I had forgotten that.

I'm not withdrawing my release ack for anything.  But whena commit is
being justified for 4.15 because it is (or contains) a bugfix then it
would be really good if the bugfix were mentioned in the summary line.

> While we're at this - how are bug fixes to be treated in
> this two week window? Do they need a release ack too if
> they did miss the LPD?

No.

>  I'm asking with specifically
> "xen/include: compat/xlat.h may change with .config
> changes" in mind, but there may be others, like Roger's
> 3-patch series posted today, which I intend to look at
> rather sooner than later.

Things that are just bugfixes do not need a release ack at this stage.
Things that are a mixture should either get a release ack, or be
separated out in which case the bugfix part does not need a release
ack.  I am happy to give out (or decline) acks in cases of doubt.

I appreciate your looking at bugfixes as a priority, so thanks.

> >> 07: gunzip: drop INIT{,DATA} and STATIC
> > 
> > I release-nacked this because I saw you posted it with this Subject
> >   Subject: [PATCH v3 01/15] gunzip: drop INIT{,DATA} and STATIC
> > which made me think you were targeting it for 4.15.  If not then fine.
> 
> FAOD - indeed it was a mistake of mine and was meant to
> be 07/15.

NP.

I hope this explanation makes some kind of sense and sorry for the
confusion.

Ian.

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Bertrand Marquis




> On 26 Jan 2021, at 13:19, Jan Beulich  wrote:
> 
> On 26.01.2021 14:17, Ian Jackson wrote:
>> Jan Beulich writes ("Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce 
>> UNSUPPORTED"):
>>> On 26.01.2021 12:17, Bertrand Marquis wrote:
 Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
 config parameters instead ?
 We could also make that more clear in the help of such parameters directly.
 
 I do not see how we could make that more clear directly in the prompt (as
 making it too long is not a good solution).
>>> 
>>> My main request is that such tags be added only if there's
>>> absolutely no ambiguity. Anything else (requiring longer
>>> explanations in many cases) should be expressed in the help
>>> text of the option, or in yet other ways (a referral to
>>> SUPPORT.md comes to mind).
>> 
>> Is
>> 
>>> +   bool "Harden the branch predictor against aliasing attacks 
>>> (disabling UNSUPPORTED)" if UNSUPPORTED
>> 
>> too long ?
> 
> It may be tolerable, but it is getting longish imo.

I am also not quite sure this is making things clearer.

Bertrand

> 
> Jan

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Jan Beulich

On 26.01.2021 14:17, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce 
> UNSUPPORTED"):
>> On 26.01.2021 12:17, Bertrand Marquis wrote:
>>> Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
>>> config parameters instead ?
>>> We could also make that more clear in the help of such parameters directly.
>>>
>>> I do not see how we could make that more clear directly in the prompt (as
>>> making it too long is not a good solution).
>>
>> My main request is that such tags be added only if there's
>> absolutely no ambiguity. Anything else (requiring longer
>> explanations in many cases) should be expressed in the help
>> text of the option, or in yet other ways (a referral to
>> SUPPORT.md comes to mind).
> 
> Is
> 
>> +bool "Harden the branch predictor against aliasing attacks 
>> (disabling UNSUPPORTED)" if UNSUPPORTED
> 
> too long ?

It may be tolerable, but it is getting longish imo.

Jan

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Ian Jackson

Jan Beulich writes ("Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce 
UNSUPPORTED"):
> On 26.01.2021 12:17, Bertrand Marquis wrote:
> > Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
> > config parameters instead ?
> > We could also make that more clear in the help of such parameters directly.
> > 
> > I do not see how we could make that more clear directly in the prompt (as
> > making it too long is not a good solution).
> 
> My main request is that such tags be added only if there's
> absolutely no ambiguity. Anything else (requiring longer
> explanations in many cases) should be expressed in the help
> text of the option, or in yet other ways (a referral to
> SUPPORT.md comes to mind).

Is

> + bool "Harden the branch predictor against aliasing attacks (disabling 
> UNSUPPORTED)" if UNSUPPORTED

too long ?

Ian.

[xen-unstable-smoke test] 158627: tolerable all pass - PUSHED

2021-01-26 Thread osstest service owner

flight 158627 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/158627/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  ca6fcf4321b31df0b50720fa817e727b16e34f76
baseline version:
 xen  25fcedefaa9fcbd20203202aa1b73eef051a5fa9

Last test of basis   158618  2021-01-25 21:01:28 Z0 days
Testing same since   158627  2021-01-26 11:00:29 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Roger Pau Monné 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   25fcedefaa..ca6fcf4321  ca6fcf4321b31df0b50720fa817e727b16e34f76 -> smoke

Re: [PATCH v4 1/2] xen: EXPERT clean-up and introduce UNSUPPORTED

2021-01-26 Thread Jan Beulich

On 26.01.2021 12:17, Bertrand Marquis wrote:
> 
> 
>> On 26 Jan 2021, at 11:11, Jan Beulich  wrote:
>>
>> On 26.01.2021 12:06, Bertrand Marquis wrote:
 On 26 Jan 2021, at 09:22, Jan Beulich  wrote:
 On 25.01.2021 22:27, Stefano Stabellini wrote:
> @@ -77,7 +77,7 @@ config SBSA_VUART_CONSOLE
> SBSA Generic UART implements a subset of ARM PL011 UART.
>
> config ARM_SSBD
> - bool "Speculative Store Bypass Disable" if EXPERT
> + bool "Speculative Store Bypass Disable (UNSUPPORTED)" if UNSUPPORTED
>   depends on HAS_ALTERNATIVE
>   default y
>   help
> @@ -87,7 +87,7 @@ config ARM_SSBD
> If unsure, say Y.
>
> config HARDEN_BRANCH_PREDICTOR
> - bool "Harden the branch predictor against aliasing attacks" if EXPERT
> + bool "Harden the branch predictor against aliasing attacks 
> (UNSUPPORTED)" if UNSUPPORTED
>   default y
>   help
> Speculation attacks against some high-performance processors rely on

 Both of these default to y and have their _prompt_
 conditional upon EXPERT. Which means only an expert can turn them
 _off_. Which doesn't make it look like these are unsupported? Or
 if anything, turning them off is unsupported?
>>>
>>> ...You could see that as EXPERT/UNSUPPORTED options can only be
>>> “modified” from their default value if EXPERT/UNSUPPORTED is activated.
>>
>> But this is nothing you can recognize when configuring Xen
>> (i.e. seeing just these prompts).
> 
> Maybe something we could explain more clearly in the UNSUPPORTED/EXPERT
> config parameters instead ?
> We could also make that more clear in the help of such parameters directly.
> 
> I do not see how we could make that more clear directly in the prompt (as
> making it too long is not a good solution).

My main request is that such tags be added only if there's
absolutely no ambiguity. Anything else (requiring longer
explanations in many cases) should be expressed in the help
text of the option, or in yet other ways (a referral to
SUPPORT.md comes to mind).

Jan

Re: [PATCH v3 00/15] zstd decompression for DomU-s + fallout / consolidation

2021-01-26 Thread Jan Beulich

On 26.01.2021 13:05, Ian Jackson wrote:
> Jan Beulich writes ("[PATCH v3 00/15] zstd decompression for DomU-s + fallout 
> / consolidation"):
>> Only patches 1 and 2 are strictly intended for 4.15, paralleling
>> the recent Dom0 side work (and re-using many of the files
>> introduced there, for the stubdom build), but ones up to at least
>> patch 6 may still want considering (and 4 already has a release
>> ack).
> 
> Thanks.
> 
>> 01: libxenguest: add get_unaligned_le32()
>> 02: libxenguest: support zstd compressed kernels
> 
> So these two are fine.
> 
>> 03: xen/decompress: make helper symbols static
>> 04: libxenguest: "standardize" LZO kernel decompression code
>> 05: libxenguest: drop redundant decompression declarations
>> 06: libxenguest: simplify kernel decompression
> 
> I approve of cleanups of course.  But:
> 
> Which of these cleanups were posted before the LPD ?  I'm not
> currently aware of any reason for a freeze exception here, so I think
> those patches which didn't meet the LPD should wait.  Ones which *did*
> meet the LPD should be considered on their merits.
> 
> If you could direct me to which those are I would be happy to review
> them.

All of them were posted after that date; only the Dom0 zstd
decompression part was ready in time. I view this entire
series as a logical extension to the earlier patches though.
I'm unsure anyway how new patches in a previously submitted
series would be treated in general; so far I've been under
the impression that if in doubt the series as a whole would
count, not every individual patch.

As said (still visible above) I'm not meaning to insist on
patches 3 and onwards to be taken. In fact it was you to
ask whether patch 3 would possibly want a freeze exception.
And you did give patch 4 a release ack already, based on it
containing a bug fix. (The latter is true of patch 6 as
well, btw.) Are you implying you withdraw that release ack
now?

While we're at this - how are bug fixes to be treated in
this two week window? Do they need a release ack too if
they did miss the LPD? I'm asking with specifically
"xen/include: compat/xlat.h may change with .config
changes" in mind, but there may be others, like Roger's
3-patch series posted today, which I intend to look at
rather sooner than later.

>> 07: gunzip: drop INIT{,DATA} and STATIC
> 
> I release-nacked this because I saw you posted it with this Subject
>   Subject: [PATCH v3 01/15] gunzip: drop INIT{,DATA} and STATIC
> which made me think you were targeting it for 4.15.  If not then fine.

FAOD - indeed it was a mistake of mine and was meant to
be 07/15.

Jan

1 2 >

1 - 100 of 154 matches

Mail list logo