[ovmf test] 167940: all pass - PUSHED

2022-01-28 Thread osstest service owner
flight 167940 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167940/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 8542fc5f956821841154d4c11851c5484847ac0d
baseline version:
 ovmf f4b7b473b4afd0093768905529bfae09a2061d41

Last test of basis   167933  2022-01-28 10:10:35 Z  0 days
Testing same since   167940  2022-01-29 01:41:58 Z  0 days  1 attempts


People who touched revisions under test:
  Liming Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   f4b7b473b4..8542fc5f95  8542fc5f956821841154d4c11851c5484847ac0d -> xen-tested-master



[linux-linus test] 167937: tolerable FAIL - PUSHED

2022-01-28 Thread osstest service owner
flight 167937 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167937/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail blocked in 167917
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 167917
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 167917
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167917
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 167917
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 167917
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 167917
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 167917
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass

version targeted for testing:
 linux  145d9b498fc827b79c1260b4caa29a8e59d4c2b9
baseline version:
 linux  626b2dda7651a7c766108db4cdc0825db05b980d

Last test of basis   167917  2022-01-27 10:12:35 Z  1 days
Failing since        167927  2022-01-28 01:57:27 Z  1 days  2 attempts
Testing same since   167937  2022-01-28 15:11:08 Z  0 days  1 attempts


People who touched revisions under test:
  Alex Deucher 
  Angelo Dureghello 
  Arkadiusz Kubalewski 
  Balbir Singh 
  Bas Nieuwenhuizen 
  Bhawanpreet Lakha 
  caixf 
  Catherine Sullivan 
  Christian König 
  Christophe JAILLET 
  Colin Foster 
  Congyu Liu 
  Damien Le Moal 
  Dan Carpenter 
  Daniel Vetter 
  Daniel Wheeler 
  Dave Airlie 
  

Re: [PATCH v2 4/9] x86/spec-ctrl: Don't use spec_ctrl_{enter,exit}_idle() for S3

2022-01-28 Thread Andrew Cooper
On 28/01/2022 13:29, Andrew Cooper wrote:
> 'idle' here refers to hlt/mwait.  The S3 path isn't an idle path - it is a
> platform reset.
>
> Conditionally clearing IBRS and flushing the store buffers on the way down is
> a waste of time.
>
> Furthermore, we want to load default_xen_mcu_opt_ctrl unilaterally on the way
> back up.  Currently it happens as a side effect of X86_FEATURE_SC_MSR_IDLE or
> the next return-to-guest, but that's fragile behaviour.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Roger Pau Monné 
> CC: Wei Liu 
>
> v2:
>  * New
> ---
>  xen/arch/x86/acpi/power.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 31a56f02d083..ea2bd8bbfe93 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -248,7 +248,6 @@ static int enter_state(u32 state)
>  error = 0;
>  
>  ci = get_cpu_info();
> -spec_ctrl_enter_idle(ci);
>  /* Avoid NMI/#MC using MSR_SPEC_CTRL until we've reloaded microcode. */
>  ci->spec_ctrl_flags &= ~SCF_ist_wrmsr;
>  
> @@ -295,7 +294,9 @@ static int enter_state(u32 state)
>  
>  /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
>  ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
> -spec_ctrl_exit_idle(ci);
> +
> +if ( boot_cpu_has(X86_FEATURE_IBRSB) )
> +wrmsrl(MSR_SPEC_CTRL, default_xen_mcu_opt_ctrl);

This logic works rather better when it gets the right variable:
default_xen_spec_ctrl.

~Andrew

>  
>  if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
>  wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);



[qemu-mainline test] 167936: tolerable FAIL - PUSHED

2022-01-28 Thread osstest service owner
flight 167936 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167936/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail REGR. vs. 167922

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 167922
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167922
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail like 167922
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 167922
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167922
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 167922
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167922
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 167922
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167922
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass

version targeted for testing:
 qemuu  b367db48126d4ee14579af6cf5cdbffeb9496627
baseline version:
 qemuu  cfe63e46be0a1f8a7fd2fd5547222f8344a43279

Last test of basis   167922  2022-01-27 17:38:15 Z  1 days

[PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers

2022-01-28 Thread Stefano Stabellini
From: Luca Miccio 

Add an example application that can be run in dom0 to complete the
dom0less domains initialization so that they can get access to xenstore
and use PV drivers.

Signed-off-by: Luca Miccio 
Signed-off-by: Stefano Stabellini 
CC: Wei Liu 
CC: Anthony PERARD 
CC: Juergen Gross 
---
Changes in v3:
- handle xenstore errors
- add an in-code comment about xenstore entries
- less verbose output
- clean-up error path in main

Changes in v2:
- do not set HVM_PARAM_STORE_EVTCHN twice
- rename restore_xenstore to create_xenstore
- increase maxmem
---
 tools/helpers/Makefile|  13 ++
 tools/helpers/init-dom0less.c | 269 ++
 2 files changed, 282 insertions(+)
 create mode 100644 tools/helpers/init-dom0less.c

diff --git a/tools/helpers/Makefile b/tools/helpers/Makefile
index 7f6c422440..8e42997052 100644
--- a/tools/helpers/Makefile
+++ b/tools/helpers/Makefile
@@ -10,6 +10,9 @@ ifeq ($(CONFIG_Linux),y)
 ifeq ($(CONFIG_X86),y)
 PROGS += init-xenstore-domain
 endif
+ifeq ($(CONFIG_ARM),y)
+PROGS += init-dom0less
+endif
 endif
 
 XEN_INIT_DOM0_OBJS = xen-init-dom0.o init-dom-json.o
@@ -26,6 +29,13 @@ $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenstore)
 $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenlight)
 $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += -include $(XEN_ROOT)/tools/config.h
 
+INIT_DOM0LESS_OBJS = init-dom0less.o init-dom-json.o
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenstore)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenlight)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenevtchn)
+
 .PHONY: all
 all: $(PROGS)
 
@@ -35,6 +45,9 @@ xen-init-dom0: $(XEN_INIT_DOM0_OBJS)
 init-xenstore-domain: $(INIT_XENSTORE_DOMAIN_OBJS)
$(CC) $(LDFLAGS) -o $@ $(INIT_XENSTORE_DOMAIN_OBJS) 
$(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) 
$(LDLIBS_libxenguest) $(LDLIBS_libxenlight) $(APPEND_LDFLAGS)
 
+init-dom0less: $(INIT_DOM0LESS_OBJS)
+	$(CC) $(LDFLAGS) -o $@ $(INIT_DOM0LESS_OBJS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenevtchn) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenlight) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
+
 .PHONY: install
 install: all
$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
new file mode 100644
index 00..b6a3831cb5
--- /dev/null
+++ b/tools/helpers/init-dom0less.c
@@ -0,0 +1,269 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "init-dom-json.h"
+
+#define NR_MAGIC_PAGES 4
+#define CONSOLE_PFN_OFFSET 0
+#define XENSTORE_PFN_OFFSET 1
+#define STR_MAX_LENGTH 64
+
+static int alloc_magic_pages(libxl_dominfo *info, struct xc_dom_image *dom)
+{
+int rc, i;
+const xen_pfn_t base = GUEST_MAGIC_BASE >> XC_PAGE_SHIFT;
+xen_pfn_t p2m[NR_MAGIC_PAGES];
+
+rc = xc_domain_setmaxmem(dom->xch, dom->guest_domid,
+ info->max_memkb + NR_MAGIC_PAGES * 4);
+if (rc < 0)
+return rc;
+
+for (i = 0; i < NR_MAGIC_PAGES; i++)
+p2m[i] = base + i;
+
+rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
+  NR_MAGIC_PAGES, 0, 0, p2m);
+if (rc < 0)
+return rc;
+
+dom->xenstore_pfn = base + XENSTORE_PFN_OFFSET;
+
+xc_clear_domain_page(dom->xch, dom->guest_domid, dom->xenstore_pfn);
+
+xc_hvm_param_set(dom->xch, dom->guest_domid, HVM_PARAM_STORE_PFN,
+ dom->xenstore_pfn);
+return 0;
+}
+
+static bool do_xs_write_dom(struct xs_handle *xsh, xs_transaction_t t,
+domid_t domid, char *path, char *val)
+{
+char full_path[STR_MAX_LENGTH];
+
+snprintf(full_path, STR_MAX_LENGTH,
+ "/local/domain/%d/%s", domid, path);
+return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+static bool do_xs_write_libxl(struct xs_handle *xsh, xs_transaction_t t,
+  domid_t domid, char *path, char *val)
+{
+char full_path[STR_MAX_LENGTH];
+
+snprintf(full_path, STR_MAX_LENGTH,
+ "/libxl/%d/%s", domid, path);
+return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+static bool do_xs_write_vm(struct xs_handle *xsh, xs_transaction_t t,
+   libxl_uuid uuid, char *path, char *val)
+{
+char full_path[STR_MAX_LENGTH];
+
+snprintf(full_path, STR_MAX_LENGTH,
+ "/vm/" LIBXL_UUID_FMT "/%s", LIBXL_UUID_BYTES(uuid), path);
+return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+/*
+ * The xenstore nodes are the xenstore nodes libxl writes at domain
+ * creation.
+ *
+ * The list was retrieved by running xenstore-ls on a corresponding
+ * domain started by xl/libxl.
+ */
+static int restore_xenstore(struct xs_handle *xsh,
+

[PATCH v3 4/5] xenstored: send an evtchn notification on introduce_domain

2022-01-28 Thread Stefano Stabellini
From: Luca Miccio 

When xs_introduce_domain is called, send out a notification on the
xenstore event channel so that any (dom0less) domain waiting for the
xenstore interface to be ready can continue with the initialization.

The extra notification is harmless for domains that don't require it.

Signed-off-by: Luca Miccio 
Signed-off-by: Stefano Stabellini 
Reviewed-by: Bertrand Marquis 
---
Changes in v2:
- drop the new late_init parameter
---
 tools/xenstore/xenstored_domain.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index d03c7d93a9..7487129e47 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -461,6 +461,9 @@ static struct domain *introduce_domain(const void *ctx,
/* Now domain belongs to its connection. */
talloc_steal(domain->conn, domain);
 
+   /* Notify the domain that xenstore is available */
+   xenevtchn_notify(xce_handle, domain->port);
+
if (!is_master_domain && !restore)
fire_watches(NULL, ctx, "@introduceDomain", NULL,
 false, NULL);
-- 
2.25.1




[PATCH v3 1/5] xen: introduce xen,enhanced dom0less property

2022-01-28 Thread Stefano Stabellini
From: Stefano Stabellini 

Introduce a new "xen,enhanced" dom0less property to enable/disable PV
driver interfaces for dom0less guests. Currently only "enabled" and
"disabled" are supported property values (and empty). Leave the option
open to implement further possible values in the future (e.g.
"xenstore" to enable only xenstore.)

The configurable option is for domUs only. For dom0 we always set the
corresponding property in the Xen code to true (PV interfaces enabled.)

This patch only parses the property. Next patches will make use of it.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Bertrand Marquis 
CC: Julien Grall 
CC: Volodymyr Babchuk 
CC: Bertrand Marquis 
---
Changes in v3:
- improve commit message

Changes in v2:
- rename kinfo.enhanced to kinfo.dom0less_enhanced
- set kinfo.dom0less_enhanced to true for dom0
- handle -ENODATA in addition to -EILSEQ
---
 docs/misc/arm/device-tree/booting.txt | 18 ++
 xen/arch/arm/domain_build.c   |  8 
 xen/arch/arm/include/asm/kernel.h |  3 +++
 3 files changed, 29 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 71895663a4..38c29fb3d8 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -169,6 +169,24 @@ with the following properties:
 Please note that the SPI used for the virtual pl011 could clash with the
 physical SPI of a physical device assigned to the guest.
 
+- xen,enhanced
+
+A string property. Possible property values are:
+
+- "enabled" (or missing property value)
+Xen PV interfaces, including grant-table and xenstore, will be
+enabled for the VM.
+
+- "disabled"
+Xen PV interfaces are disabled.
+
+If the xen,enhanced property is present with no value, it defaults
+to "enabled". If the xen,enhanced property is not present, PV
+interfaces are disabled.
+
+In the future other possible property values might be added to
+enable only selected interfaces.
+
 - nr_spis
 
 Optional. A 32-bit integer specifying the number of SPIs (Shared
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 6931c022a2..9144d6c0b6 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
  const struct dt_device_node *node)
 {
 struct kernel_info kinfo = {};
+const char *dom0less_enhanced;
 int rc;
 u64 mem;
 
@@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain *d,
 
 kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
 
+rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
+if ( rc == -EILSEQ ||
+ rc == -ENODATA ||
+ (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
+kinfo.dom0less_enhanced = true;
+
 if ( vcpu_create(d, 0) == NULL )
 return -ENOMEM;
 
@@ -3095,6 +3102,7 @@ static int __init construct_dom0(struct domain *d)
 
 kinfo.unassigned_mem = dom0_mem;
 kinfo.d = d;
+kinfo.dom0less_enhanced = true;
 
 rc = kernel_probe(, NULL);
 if ( rc < 0 )
diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
index 874aa108a7..c4dc039b54 100644
--- a/xen/arch/arm/include/asm/kernel.h
+++ b/xen/arch/arm/include/asm/kernel.h
@@ -36,6 +36,9 @@ struct kernel_info {
 /* Enable pl011 emulation */
 bool vpl011;
 
+/* Enable PV drivers */
+bool dom0less_enhanced;
+
 /* GIC phandle */
 uint32_t phandle_gic;
 
-- 
2.25.1




[PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot

2022-01-28 Thread Stefano Stabellini
From: Luca Miccio 

If "xen,enhanced" is enabled, then add to dom0less domains:

- the hypervisor node in device tree
- the xenstore event channel

The xenstore event channel is also used for the first notification to
let the guest know that xenstore has become available.

Signed-off-by: Luca Miccio 
Signed-off-by: Stefano Stabellini 
Reviewed-by: Bertrand Marquis 
CC: Julien Grall 
CC: Volodymyr Babchuk 
CC: Bertrand Marquis 

---
Changes in v3:
- use evtchn_alloc_unbound

Changes in v2:
- set HVM_PARAM_STORE_PFN to ~0ULL at domain creation
- in alloc_xenstore_evtchn do not call _evtchn_alloc_unbound
---
 xen/arch/arm/domain_build.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9144d6c0b6..8e030a7f05 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -2619,6 +2620,8 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
 int ret;
 
 kinfo->phandle_gic = GUEST_PHANDLE_GIC;
+kinfo->gnttab_start = GUEST_GNTTAB_BASE;
+kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
 
 addrcells = GUEST_ROOT_ADDRESS_CELLS;
 sizecells = GUEST_ROOT_SIZE_CELLS;
@@ -2693,6 +2696,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
 goto err;
 }
 
+if ( kinfo->dom0less_enhanced )
+{
+ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
+if ( ret )
+goto err;
+}
+
 ret = fdt_end_node(kinfo->fdt);
 if ( ret < 0 )
 goto err;
@@ -2959,6 +2969,25 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
 return 0;
 }
 
+static int __init alloc_xenstore_evtchn(struct domain *d)
+{
+evtchn_alloc_unbound_t alloc;
+int rc;
+
+alloc.dom = d->domain_id;
+alloc.remote_dom = hardware_domain->domain_id;
+rc = evtchn_alloc_unbound(&alloc, true);
+if ( rc )
+{
+printk("Failed allocating event channel for domain\n");
+return rc;
+}
+
+d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
+
+return 0;
+}
+
 static int __init construct_domU(struct domain *d,
  const struct dt_device_node *node)
 {
@@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain *d,
 return rc;
 
 if ( kinfo.vpl011 )
+{
 rc = domain_vpl011_init(d, NULL);
+if ( rc < 0 )
+return rc;
+}
+
+if ( kinfo.dom0less_enhanced )
+{
+rc = alloc_xenstore_evtchn(d);
+if ( rc < 0 )
+return rc;
+d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
+}
 
 return rc;
 }
-- 
2.25.1




[PATCH v3 2/5] xen: make evtchn_alloc_unbound public

2022-01-28 Thread Stefano Stabellini
From: Luca Miccio 

The xenstore event channel will be allocated for dom0less domains. It is
necessary to have access to the evtchn_alloc_unbound function to do
that, so make evtchn_alloc_unbound public.

Add a skip_xsm parameter to allow disabling the XSM check in
evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
originated from Xen before running any domains.)

Signed-off-by: Luca Miccio 
Signed-off-by: Stefano Stabellini 
CC: Julien Grall 
CC: Volodymyr Babchuk 
CC: Bertrand Marquis 
CC: Andrew Cooper 
CC: George Dunlap 
CC: Jan Beulich 
CC: Wei Liu 
---
Changes v3:
- expose evtchn_alloc_unbound, adding a skip_xsm parameter
---
 xen/common/event_channel.c | 13 -
 xen/include/xen/event.h|  3 +++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index da88ad141a..be57d00a15 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
 xsm_evtchn_close_post(chn);
 }
 
-static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
 {
 struct evtchn *chn;
 struct domain *d;
@@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 ERROR_EXIT_DOM(port, d);
 chn = evtchn_from_port(d, port);
 
-rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
-if ( rc )
-goto out;
+if ( !skip_xsm )
+{
+rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
+if ( rc )
+goto out;
+}
 
 evtchn_write_lock(chn);
 
@@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 struct evtchn_alloc_unbound alloc_unbound;
if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
 return -EFAULT;
-rc = evtchn_alloc_unbound(&alloc_unbound);
+rc = evtchn_alloc_unbound(&alloc_unbound, false);
if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
 rc = -EFAULT; /* Cleaning up here would be a mess! */
 break;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 21c95e14fd..0a2cdedf7d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
 /* Free an event channel. */
 void evtchn_free(struct domain *d, struct evtchn *chn);
 
+/* Create a new event channel port */
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
+
 /* Allocate a specific event channel port. */
 int evtchn_allocate_port(struct domain *d, unsigned int port);
 
-- 
2.25.1




[PATCH v3 0/5] dom0less PV drivers

2022-01-28 Thread Stefano Stabellini
Hi all,

Currently dom0less guests cannot use PV drivers because they don't have
access to xenstore. Also, the hypervisor node in device tree is missing
so they don't detect that they are running on Xen (thus, they don't try
to enable PV interfaces.)

This patch series enables dom0less guests (on ARM) to use PV drivers.

Instead of initializing xenstore immediately at boot, dom0less guests
get access to xenstore later. They delay the initialization until they
receive a notification via the xenstore event channel (which is
available at boot.)

An example workflow is as follows:
- all domains start in parallel, dom0less guests are immediately running
- when dom0 is up and running, the init-dom0less application is called
- dom0less guests receive the notification and initialize xenstore
- now xl network-attach/disk-attach works as expected for dom0less domUs

The patch series introduces a new dom0less device tree option
"xen,enhanced" (in the Xen device tree) to specify whether PV interfaces
should be enabled/disabled for the dom0less guest.

Cheers,

Stefano


Luca Miccio (4):
  xen: make evtchn_alloc_unbound public
  xen/arm: configure dom0less domain for enabling xenstore after boot
  xenstored: send an evtchn notification on introduce_domain
  tools: add example application to initialize dom0less PV drivers

Stefano Stabellini (1):
  xen: introduce xen,enhanced dom0less property

 docs/misc/arm/device-tree/booting.txt |  18 +++
 tools/helpers/Makefile|  13 ++
 tools/helpers/init-dom0less.c | 269 ++
 tools/xenstore/xenstored_domain.c |   3 +
 xen/arch/arm/domain_build.c   |  49 +++
 xen/arch/arm/include/asm/kernel.h |   3 +
 xen/common/event_channel.c|  13 +-
 xen/include/xen/event.h   |   3 +
 8 files changed, 366 insertions(+), 5 deletions(-)
 create mode 100644 tools/helpers/init-dom0less.c



Re: [XEN v1] xen/arm: io: Check ESR_EL2.ISV != 0 before searching for a MMIO handler

2022-01-28 Thread Stefano Stabellini
On Fri, 28 Jan 2022, Ayan Kumar Halder wrote:
> Hi Julien/Stefano,
> 
> Good discussion to learn about Xen (from a newbie's perspective). :)
> 
> I am trying to clarify my understanding. Some queries as below :-
> 
> On 28/01/2022 09:46, Julien Grall wrote:
> > 
> > 
> > On 28/01/2022 01:20, Stefano Stabellini wrote:
> > > On Thu, 27 Jan 2022, Julien Grall wrote:
> > > > On Thu, 27 Jan 2022 at 23:05, Julien Grall 
> > > > wrote:
> > > > > 
> > > > > On Thu, 27 Jan 2022 at 22:40, Stefano Stabellini
> > > > >  wrote:
> > > > > > I am with you on both points.
> > > > > > 
> > > > > > One thing I noticed is that the code today is not able to deal with
> > > > > > IO_UNHANDLED for MMIO regions handled by IOREQ servers or Xen MMIO
> > > > > > emulator handlers. p2m_resolve_translation_fault and try_map_mmio
> > > > > > are
> > > > > > called after try_handle_mmio returns IO_UNHANDLED but
> > > > > > try_handle_mmio is
> > > > > > not called a second time (or am I mistaken?)
> > > > > 
> > > > > Why would you need it? If try_mmio_fault() doesn't work the first
> > > > > time, then
> > > > 
> > > > Sorry I meant try_handle_mmio().
> > > > 
> > > > > it will not work the second time.
> > > 
> > > I think I explained myself badly, I'll try again below.
> > > 
> > > 
> > > > > > Another thing I noticed is that currently find_mmio_handler and
> > > > > > try_fwd_ioserv expect dabt to be already populated and valid so it
> > > > > > would
> > > > > > be better if we could get there only when dabt.valid.
> > > > > > 
> > > > > > With these two things in mind, I think maybe the best thing to do is
> > > > > > to
> > > > > > change the code in do_trap_stage2_abort_guest slightly so that
> > > > > > p2m_resolve_translation_fault and try_map_mmio are called first when
> > > > > > !dabt.valid.
> > > > > 
> > > > > An abort will mostly likely happen because of emulated I/O. If we call
> > > > > p2m_resolve_translation_fault() and try_map_mmio() first, then it
> > > > > means
> > > > > the processing will take longer than necessary for the common case.
> > > > > 
> > > > > So I think we want to keep the order as it is. I.e first trying the
> > > > > MMIO
> > > > > and then falling back to the less likely reason for a trap.
> > > 
> > > Yeah I thought about it as well. The idea would be that if dabt.valid is
> > > set then we leave things as they are (we call try_handle_mmio first) but
> > > if dabt.valid is not set (it is not valid) then we skip the
> > > try_handle_mmio() call because it wouldn't succeed anyway and go
> > > directly to p2m_resolve_translation_fault() and try_map_mmio().
> > > 
> > > If either of them work (also reading what you wrote about it) then we
> > > return immediately.
> > 
> > Ok. So the assumption is data abort with invalid syndrome would mostly
> > likely be because of a fault handled by p2m_resolve_translation_fault().
> > 
> > I think this makes sense. However, I am not convinced we can currently
> > safely call try_map_mmio() before try_handle_mmio(). This is because the
> > logic in try_map_mmio() is quite fragile and we may mistakenly map an
> > emulated region.
> 
> By emulated region, you mean vgic.dbase (Refer
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/vgic-v2.c;h=589c033eda8f5e11af33c868eae2c159f985eac9;hb=0bdc43c8dec993258e930b34855853c22b917519#l702,
> which has not been mapped to the guest) and thus requires an MMIO handler.
> 
> Is my understanding correct?
 
I'll try to answer for Julien but yes.


> If so, can Xen maintain a table of such emulated regions? I am guessing that
> all emulated regions will have an mmio_handler. Then, before invoking
> try_map_mmio(), it can check the table.

Today we keep those as a list, see find_mmio_handler (for regions
emulated in Xen) and also ioreq_server_select (for regions emulated by
QEMU or other external emulators.)

But I think there might be a simpler way: if you look at try_map_mmio,
you'll notice that there is iomem_access_permitted check. I don't think
that check can succeed for an emulated region. (I'd love for Julien and
others to confirm this observation though.)

 
> > Similarly, we can't call try_map_mmio() before
> > p2m_resolve_translation_fault() because a transient fault may be
> > misinterpreted.
> > 
> > I think we may be able to harden try_map_mmio() by checking if the I/O
> > region is emulated. But this will need to be fully thought through first.
> > 
> > > 
> > > If not, then we call decode_instruction from do_trap_stage2_abort_guest
> > > and try again. The second time dabt.valid is set so we end up calling
> > > try_handle_mmio() as usual.
> > 
> > With the approach below, you will also end up calling
> > p2m_resolve_translation_fault() and try_map_mmio() a second time if
> > try_handle_mmio() fails.
> > 
> > > 
> > > Just for clarity let me copy/paste the relevant code, apologies if it
> > > was already obvious to you -- I got the impression my suggestion wasn't
> > > very clear.
> > > 
> > > 
> > 

Re: [XEN v1] xen/arm: io: Check ESR_EL2.ISV != 0 before searching for a MMIO handler

2022-01-28 Thread Stefano Stabellini
On Fri, 28 Jan 2022, Julien Grall wrote:
> On 28/01/2022 01:20, Stefano Stabellini wrote:
> > On Thu, 27 Jan 2022, Julien Grall wrote:
> > > On Thu, 27 Jan 2022 at 23:05, Julien Grall 
> > > wrote:
> > > > 
> > > > On Thu, 27 Jan 2022 at 22:40, Stefano Stabellini
> > > >  wrote:
> > > > > I am with you on both points.
> > > > > 
> > > > > One thing I noticed is that the code today is not able to deal with
> > > > > IO_UNHANDLED for MMIO regions handled by IOREQ servers or Xen MMIO
> > > > > emulator handlers. p2m_resolve_translation_fault and try_map_mmio are
> > > > > called after try_handle_mmio returns IO_UNHANDLED but try_handle_mmio
> > > > > is
> > > > > not called a second time (or am I mistaken?)
> > > > 
> > > > Why would you need it? If try_mmio_fault() doesn't work the first time,
> > > > then
> > > 
> > > Sorry I meant try_handle_mmio().
> > > 
> > > > it will not work the second time either.
> > 
> > I think I explained myself badly, I'll try again below.
> > 
> > 
> > > > > Another thing I noticed is that currently find_mmio_handler and
> > > > > try_fwd_ioserv expect dabt to be already populated and valid so it
> > > > > would
> > > > > be better if we could get there only when dabt.valid.
> > > > > 
> > > > > With these two things in mind, I think maybe the best thing to do is
> > > > > to
> > > > > change the code in do_trap_stage2_abort_guest slightly so that
> > > > > p2m_resolve_translation_fault and try_map_mmio are called first when
> > > > > !dabt.valid.
> > > > 
> > > > An abort will most likely happen because of emulated I/O. If we call
> > > > p2m_resolve_translation_fault() and try_map_mmio() first, then it means
> > > > the processing will take longer than necessary for the common case.
> > > > 
> > > > So I think we want to keep the order as it is. I.e first trying the MMIO
> > > > and then falling back to the less likely reason for a trap.
> > 
> > Yeah I thought about it as well. The idea would be that if dabt.valid is
> > set then we leave things as they are (we call try_handle_mmio first) but
> > if dabt.valid is not set (it is not valid) then we skip the
> > try_handle_mmio() call because it wouldn't succeed anyway and go
> > directly to p2m_resolve_translation_fault() and try_map_mmio().
> > 
> > If either of them work (also reading what you wrote about it) then we
> > return immediately.
> 
> Ok. So the assumption is that a data abort with an invalid syndrome would most
> likely be because of a fault handled by p2m_resolve_translation_fault().
> 
> I think this makes sense. However, I am not convinced we can currently safely
> call try_map_mmio() before try_handle_mmio(). This is because the logic in
> try_map_mmio() is quite fragile and we may mistakenly map an emulated region.
> 
> Similarly, we can't call try_map_mmio() before p2m_resolve_translation_fault()
> because a transient fault may be
> misinterpreted.
> 
> I think we may be able to harden try_map_mmio() by checking if the I/O region
> is emulated. But this will need to be fully thought through first.

That's a good point. I wonder if it could be as simple as making sure
that iomem_access_permitted returns false for all emulated regions?
Looking at the code, it looks like it is already the case today. Is that
right?


> > If not, then we call decode_instruction from do_trap_stage2_abort_guest
> > and try again. The second time dabt.valid is set so we end up calling
> > try_handle_mmio() as usual.
> 
> With the approach below, you will also end up calling
> p2m_resolve_translation_fault() and try_map_mmio() a second time if
> try_handle_mmio() fails.
 
Yeah... we can find a way to avoid it.



[xen-unstable test] 167931: tolerable FAIL - PUSHED

2022-01-28 Thread osstest service owner
flight 167931 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167931/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167921
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167921
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167921
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167921
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167921
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167921
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167921
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167921
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167921
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167921
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167921
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167921
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 xen  2a565f9b40db981a860574be26d86a8665e71c38
baseline version:
 xen  

Re: [PATCH V3 2/2] iommu/arm: Remove code duplication in all IOMMU drivers

2022-01-28 Thread Volodymyr Babchuk


Hi Julien,

Julien Grall  writes:

> Hi,
>
> On 27/01/2022 19:55, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>> All IOMMU drivers on Arm perform almost the same generic actions in
>> hwdom_init callback. Move this code to common arch_iommu_hwdom_init()
>> in order to get rid of code duplication.
>> Signed-off-by: Oleksandr Tyshchenko 
>> Reviewed-by: Volodymyr Babchuk 
>> Reviewed-by: Yoshihiro Shimoda 
>
> IMO, the reviewed-by tags should have been dropped with the changes
> you made. So I would like both reviewer to confirm they are happy with
> the change.

Yep, I'm still fine with this:

Reviewed-by: Volodymyr Babchuk 


-- 
Volodymyr Babchuk at EPAM


Re: [PATCH 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-01-28 Thread Anthony PERARD
On Thu, Jan 27, 2022 at 04:01:32PM +, Jane Malalane wrote:
> Add XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_xapic and
> XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_x2apic to report accelerated xapic
> and x2apic, on x86 hardware.
> No such features are currently implemented on AMD hardware.
> 
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
> 
> Signed-off-by: Jane Malalane 
> Suggested-by: Andrew Cooper 
> ---
>  tools/golang/xenlight/helpers.gen.go |  4 
>  tools/golang/xenlight/types.gen.go   |  6 ++

Note for committers: Please regenerate the go bindings, there are
out-of-sync with libxl_types.idl at the moment.

> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 1feadebb18..33da51fe89 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -866,6 +866,17 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc 
> *gc,
>  return rc;
>  }
>  
> +void libxl__arch_get_physinfo(libxl_physinfo *physinfo,
> +  xc_physinfo_t xcphysinfo)

It might be better to pass "xcphysinfo" as a pointer, otherwise I think
a copy of the whole struct is made when calling this function.


In any case, the tool part of the patch looks good:
Acked-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v3] x86/altp2m: p2m_altp2m_propagate_change() should honor present page order

2022-01-28 Thread Tamas K Lengyel
On Thu, Jan 27, 2022 at 10:07 AM Jan Beulich  wrote:
>
> For higher order mappings the comparison against p2m->min_remapped_gfn
> needs to take the upper bound of the covered GFN range into account, not
> just the base GFN. Otherwise, i.e. when dropping a mapping overlapping
> the remapped range but the base GFN outside of that range, an altp2m may
> wrongly not get reset.
>
> Note that there's no need to call get_gfn_type_access() ahead of the
> check against the remapped range boundaries: None of its outputs are
> needed earlier, and p2m_reset_altp2m() doesn't require the lock to be
> held. In fact this avoids a latent lock order violation: With per-GFN
> locking p2m_reset_altp2m() not only doesn't require the GFN lock to be
> held, but holding such a lock would actually not be allowed, as the
> function acquires a P2M lock.
>
> Local variables are moved into the more narrow scope (one is deleted
> altogether) to help see their actual life ranges.
>
> Signed-off-by: Jan Beulich 
> ---
> Note that this addresses only half of the problem: get_gfn_type_access()
> would also need invoking for all of the involved GFNs, not just the 1st
> one.
> ---
> v3: Don't pass the minimum of both orders to p2m_set_entry() (as was the
> case in the original code). Restore get_gfn_type_access() return
> value checking.
>
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -2534,9 +2534,6 @@ int p2m_altp2m_propagate_change(struct d
>  p2m_type_t p2mt, p2m_access_t p2ma)
>  {
>  struct p2m_domain *p2m;
> -p2m_access_t a;
> -p2m_type_t t;
> -mfn_t m;
>  unsigned int i;
>  unsigned int reset_count = 0;
>  unsigned int last_reset_idx = ~0;
> @@ -2549,15 +2546,16 @@ int p2m_altp2m_propagate_change(struct d
>
>  for ( i = 0; i < MAX_ALTP2M; i++ )
>  {
> +p2m_type_t t;
> +p2m_access_t a;
> +
>  if ( d->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) )
>  continue;
>
>  p2m = d->arch.altp2m_p2m[i];
> > -m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>
>  /* Check for a dropped page that may impact this altp2m */
> -if ( mfn_eq(mfn, INVALID_MFN) &&
> - gfn_x(gfn) >= p2m->min_remapped_gfn &&
> +if ( gfn_x(gfn) + (1UL << page_order) > p2m->min_remapped_gfn &&
>   gfn_x(gfn) <= p2m->max_remapped_gfn )

Why are you dropping the mfn_eq(mfn, INVALID_MFN) check here?

Tamas



Re: [PATCH v3 1/2] IOMMU/x86: disallow device assignment to PoD guests

2022-01-28 Thread Anthony PERARD
On Tue, Jan 04, 2022 at 10:41:32AM +0100, Jan Beulich wrote:
> While it is okay for IOMMU page tables to get set up for guests starting
> in PoD mode, actual device assignment may only occur once all PoD
> entries have been removed from the P2M. So far this was enforced only
> for boot-time assignment, and only in the tool stack.
> 
> Also use the new function to replace p2m_pod_entry_count(): Its unlocked
> access to p2m->pod.entry_count wasn't really okay (irrespective of the
> result being stale by the time the caller gets to see it).
> 
> To allow the tool stack to see a consistent snapshot of PoD state, move
> the tail of XENMEM_{get,set}_pod_target handling into a function, adding
> proper locking there.
> 
> In libxl take the liberty to use the new local variable r also for a
> pre-existing call into libxc.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH v9 08/30] build: fix enforce unique symbols for recent clang version

2022-01-28 Thread Anthony PERARD
On Fri, Jan 28, 2022 at 01:43:38PM +0100, Jan Beulich wrote:
> On 28.01.2022 13:03, Anthony PERARD wrote:
> > On Thu, Jan 27, 2022 at 04:57:20PM +0100, Jan Beulich wrote:
> >> On 25.01.2022 12:00, Anthony PERARD wrote:
> >>> clang 6.0 and newer behave like gcc in regard to the FILE symbol, storing
> >>> only the filename rather than the full path to the source file.
> >>>
> >>> clang 3.8.1-24 (in our debian:stretch container) and 3.5.0-10
> >>> (in our debian:jessie container) do store the full path to the source
> >>> file in the FILE symbol.
> >>>
> >>> Also, based on commit 81ecb38b83 ("build: provide option to
> >>> disambiguate symbol names"), which were using clang 5, the change of
> >>> behavior likely happened in clang 6.0.
> >>>
> >>> This means that we also need to check clang version to figure out
> >>> which command we need to use to redefine symbol.
> >>>
> >>> Signed-off-by: Anthony PERARD 
> >>
> >> The "likely" in the description still worries me some. Roger, would
> >> you happen to know, or know of a way to find out for sure ("sure"
> >> not meaning to exclude the usual risk associated with version
> >> number checks)?
> > 
> > I found f5040b9685a7 ("Make .file directive to have basename only") as
> > part of LLVM's "release/6.x" branch (and "llvmorg-6.0.0" tag), but not
> > in "release/5.x".
> > 
> > https://github.com/llvm/llvm-project/commit/f5040b9685a760e584c576e9185295e54635d51e
> > 
> > This patch would seems to be the one changing the behavior. This still
> > suggest clang 6.0.
> 
> Oh, thanks for digging this out. May I suggest to replace (or delete)
> "likely" then in the description?

Maybe something like this? Or just deleting the word might be enough.

Also we have commit 81ecb38b83 ("build: provide option to
disambiguate symbol names") which were using clang 5, and LLVM's
commit f5040b9685a7 [1] ("Make .file directive to have basename
only") which is part of "llvmorg-6.0.0" tag but not "release/5.x"
branch. Both suggest that clang change of behavior happened with
clang 6.0.

[1] 
https://github.com/llvm/llvm-project/commit/f5040b9685a760e584c576e9185295e54635d51e

> Acked-by: Jan Beulich 

Thanks,

-- 
Anthony PERARD



[ovmf test] 167933: all pass - PUSHED

2022-01-28 Thread osstest service owner
flight 167933 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167933/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf f4b7b473b4afd0093768905529bfae09a2061d41
baseline version:
 ovmf a867f3a7042ae0b4c1189bececbe72aa8a8a8e27

Last test of basis   167929  2022-01-28 02:41:45 Z0 days
Testing same since   167933  2022-01-28 10:10:35 Z0 days1 attempts


People who touched revisions under test:
  Rebecca Cran 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   a867f3a704..f4b7b473b4  f4b7b473b4afd0093768905529bfae09a2061d41 -> 
xen-tested-master



Re: [XEN PATCH v9 03/30] build: fix exported variable name CFLAGS_stack_boundary

2022-01-28 Thread Anthony PERARD
On Fri, Jan 28, 2022 at 12:14:58PM +0100, Jan Beulich wrote:
> On 25.01.2022 12:00, Anthony PERARD wrote:
> > Exporting a variable with a dash doesn't work reliably, they may be
> > striped from the environment when calling a sub-make or sub-shell.
> > 
> > CFLAGS-stack-boundary start to be removed from env in patch "build:
> > set ALL_OBJS in main Makefile; move prelink.o to main Makefile" when
> > running `make "ALL_OBJS=.."` due to the addition of the quote. At
> > least in my empirical tests.
> > 
> > Fixes: 2740d96efd ("xen/build: have the root Makefile generates the CFLAGS")
> > Signed-off-by: Anthony PERARD 
> 
> While I did commit this, I'm still somewhat confused. How would quoting
> of elements on a make command line make a difference to which variables
> get exported?

I don't know; maybe without quotes, make runs the sub-make directly, but with
double quotes, a shell is used.

But after reading the manual, I didn't expect the variable to be passed
down sub-make at all:

5.7.2 Communicating Variables to a Sub-make

Except by explicit request, make exports a variable only if it is
either defined in the environment initially or set on the command
line, and if its name consists only of letters, numbers, and
underscores.

But somehow, sub-makes also export the non-shell-exportable variables,
unless the command line in a recipe invoking make has double-quotes.


I've tried again, and checked the process list:
  - when running `$(MAKE) -C foo`; make just execute make
  - when running `$(MAKE) -C 'foo'`; make just execute make
  - when running `$(MAKE) -C "foo"`; make execute /bin/sh -c ...
  - when running `$(MAKE) -C foo | tee`; make execute /bin/sh -c ...

So, CFLAGS-stack-boundary disappears when /bin/sh is involved.

> In any event I understand the description that prior to the subsequent
> change there's not actually any issue. Hence I'm not going to queue
> this for backporting despite the Fixes: tag. Unless of course I'm told
> otherwise (with justification).

Justification would be that it's not supposed to work, based on
information from make's manual.

Thanks,

-- 
Anthony PERARD



[xen-unstable-smoke test] 167934: tolerable all pass - PUSHED

2022-01-28 Thread osstest service owner
flight 167934 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167934/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  21170a738c11b24815b4afab2151bd3aa2a29acc
baseline version:
 xen  2a565f9b40db981a860574be26d86a8665e71c38

Last test of basis   167928  2022-01-28 02:00:32 Z0 days
Testing same since   167934  2022-01-28 11:01:44 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Jan Beulich 
  Juergen Gross 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   2a565f9b40..21170a738c  21170a738c11b24815b4afab2151bd3aa2a29acc -> smoke



[linux-linus test] 167927: regressions - FAIL

2022-01-28 Thread osstest service owner
flight 167927 linux-linus real [real]
flight 167935 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167927/
http://logs.test-lab.xenproject.org/osstest/logs/167935/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. 
vs. 167917

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail blocked in 
167917
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167917
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167917
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167917
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167917
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167917
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167917
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167917
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 linux 23a46422c56144939c091c76cf389aa863ce9c18
baseline version:
 linux 626b2dda7651a7c766108db4cdc0825db05b980d

Last test of basis   167917  2022-01-27 10:12:35 Z1 days
Testing same since   167927  2022-01-28 01:57:27 Z0 days1 attempts


People who touched revisions under test:
  Angelo Dureghello 
  Arkadiusz Kubalewski 
  Balbir Singh 
  caixf 
  Catherine Sullivan 
  Christophe JAILLET 
  

Re: [PATCH v2 1/9] x86/cpuid: Advertise SSB_NO to guests by default

2022-01-28 Thread Jan Beulich
On 28.01.2022 14:29, Andrew Cooper wrote:
> This is a statement of hardware behaviour, and not related to controls for the
> guest kernel to use.  Pass it straight through from hardware.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




Re: [PATCH v5 03/14] vpci: move lock outside of struct vpci

2022-01-28 Thread Oleksandr Andrushchenko
Hi, Jan!

On 12.01.22 16:57, Jan Beulich wrote:
> On 25.11.2021 12:02, Oleksandr Andrushchenko wrote:
>> @@ -68,12 +84,13 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>   /* We should not get here twice for the same device. */
>>   ASSERT(!pdev->vpci);
>>   
>> -pdev->vpci = xzalloc(struct vpci);
>> -if ( !pdev->vpci )
>> +vpci = xzalloc(struct vpci);
>> +if ( !vpci )
>>   return -ENOMEM;
>>   
>> +spin_lock(&pdev->vpci_lock);
>> +pdev->vpci = vpci;
>>   INIT_LIST_HEAD(&pdev->vpci->handlers);
>> -spin_lock_init(&pdev->vpci->lock);
> INIT_LIST_HEAD() can occur ahead of taking the lock, and can also act
> on &vpci->handlers rather than &pdev->vpci->handlers.
Yes, I will move it, good catch
>>   for ( i = 0; i < NUM_VPCI_INIT; i++ )
>>   {
>> @@ -83,7 +100,8 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>   }
> This loop wants to live in the locked region because you need to install
> vpci into pdev->vpci up front, afaict. I wonder whether that couldn't
> be relaxed, but perhaps that's an improvement that can be thought about
> later.
Ok, so I'll leave it as is
>
> The main reason I'm looking at this closely is because from the patch
> title I didn't expect new locking regions to be introduced right here;
> instead I did expect strictly a mechanical conversion.
>
>> @@ -152,8 +170,6 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t 
>> *read_handler,
>>   r->offset = offset;
>>   r->private = data;
>>   
>> -spin_lock(&vpci->lock);
>  From the description I can't deduce why this lock is fine to go away
> now, i.e. that all call sites have the lock now acquire earlier.
> Therefore I'd expect at least an assertion to be left here ...
>
>> @@ -183,7 +197,6 @@ int vpci_remove_register(struct vpci *vpci, unsigned int 
>> offset,
>>   const struct vpci_register r = { .offset = offset, .size = size };
>>   struct vpci_register *rm;
>>   
>> -spin_lock(&vpci->lock);
> ... and here.
Previously the lock lived in struct vpci and now it lives in struct pci_dev 
which
is not visible here, so:
1. we cannot take that lock here and do expect for it to be acquired outside
2. we cannot add an ASSERT here as we would need
ASSERT(spin_is_locked(&pdev->vpci_lock));
and pdev is not available here
All the callers of the vpci_{add|remove}_register are REGISTER_VPCI_INIT
functions which are called with &pdev->vpci_lock held.

So, while I agree that it would be indeed a good check with ASSERT here,
but adding an additional argument to the respective functions just for that
might not be a good idea IMO

I will describe this lock removal in the commit message

Thank you,
Oleksandr

Re: [PATCH v5 03/14] vpci: move lock outside of struct vpci

2022-01-28 Thread Oleksandr Andrushchenko
Hi, Roger, Jan!

On 13.01.22 10:58, Roger Pau Monné wrote:
> On Wed, Jan 12, 2022 at 04:52:51PM +0100, Jan Beulich wrote:
>> On 12.01.2022 16:42, Roger Pau Monné wrote:
>>> On Wed, Jan 12, 2022 at 03:57:36PM +0100, Jan Beulich wrote:
 On 25.11.2021 12:02, Oleksandr Andrushchenko wrote:
> @@ -379,7 +396,6 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>   
>   data = merge_result(data, tmp_data, size - data_offset, data_offset);
>   }
> -spin_unlock(&pdev->vpci->lock);
>   
>   return data & (0xffffffff >> (32 - 8 * size));
>   }
 Here and ...

> @@ -475,13 +498,12 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>   break;
>   ASSERT(data_offset < size);
>   }
> +spin_unlock(&pdev->vpci_lock);
>   
>   if ( data_offset < size )
>   /* Tailing gap, write the remaining. */
>   vpci_write_hw(sbdf, reg + data_offset, size - data_offset,
> data >> (data_offset * 8));
> -
> -spin_unlock(&pdev->vpci->lock);
>   }
 ... even more so here I'm not sure of the correctness of the moving
 you do: While pdev->vpci indeed doesn't get accessed, I wonder
 whether there wasn't an intention to avoid racing calls to
 vpci_{read,write}_hw() this way. In any event I think such movement
 would need justification in the description.
>>> I agree about the need for justification in the commit message, or
>>> even better this being split into a pre-patch, as it's not related to
>>> the lock switching done here.
>>>
>>> I do think this is fine however, as racing calls to
>>> vpci_{read,write}_hw() shouldn't be a problem. Those are just wrappers
>>> around pci_conf_{read,write} functions, and the required locking (in
>>> case of using the IO ports) is already taken care in
>>> pci_conf_{read,write}.
>> IOW you consider it acceptable for a guest (really: Dom0) read racing
>> a write to read back only part of what was written (so far)? I would
>> think individual multi-byte reads and writes should appear atomic to
>> the guest.
> We split 64bit writes into two 32bit ones without taking the lock for
> the whole duration of the access, so it's already possible to see a
> partially updated state as a result of a 64bit write.
>
> I'm going over the PCI(e) spec but I don't seem to find anything about
> whether the ECAM is allowed to split memory transactions into multiple
> Configuration Requests, and whether those could then interleave with
> requests from a different CPU.
So, with the above, is it still fine for you to have the change as is, or
do you want this optimization to go into a dedicated patch before this one?
>
> Thanks, Roger.
Thank you,
Oleksandr

[libvirt test] 167930: regressions - FAIL

2022-01-28 Thread osstest service owner
flight 167930 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167930/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  6abfe77f0b8d13b9208674330177532f61009b7e
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  567 days
Failing since151818  2020-07-11 04:18:52 Z  566 days  548 attempts
Testing same since   167930  2022-01-28 04:18:52 Z0 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brad Laue 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Christophe Fergeau 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Divya Garg 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Ivan Teterevkov 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nicolas Lécureuil 
  Nicolas Lécureuil 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Praveen K Paladugu 
  Richard W.M. Jones 
  Ricky Tigg 
  Robin Lee 
  Rohit Kumar 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  shenjiatong 
  Shi Lei 
  simmon 
  Simon 

[PATCH v2 0/9] x86: MSR_SPEC_CTRL support for SVM guests

2022-01-28 Thread Andrew Cooper
Fixes/extensions to allow HVM guests to use AMD hardware MSR_SPEC_CTRL
facilities.

No PV support yet - that will require some substantially more careful
unpicking of the PV entry/exit asm.

Andrew Cooper (9):
  x86/cpuid: Advertise SSB_NO to guests by default
  x86/spec-ctrl: Drop use_spec_ctrl boolean
  x86/spec-ctrl: Introduce new has_spec_ctrl boolean
  x86/spec-ctrl: Don't use spec_ctrl_{enter,exit}_idle() for S3
  x86/spec-ctrl: Record the last write to MSR_SPEC_CTRL
  x86/spec-ctrl: Use common MSR_SPEC_CTRL logic for AMD
  x86/svm: VMEntry/Exit logic for MSR_SPEC_CTRL
  x86/msr: AMD MSR_SPEC_CTRL infrastructure
  x86/cpuid: Enable MSR_SPEC_CTRL in SVM guests by default

 xen/arch/x86/acpi/power.c   |  8 +++-
 xen/arch/x86/cpu/amd.c  |  2 +-
 xen/arch/x86/cpuid.c| 16 +--
 xen/arch/x86/hvm/svm/entry.S| 12 +++---
 xen/arch/x86/hvm/svm/svm.c  | 40 +
 xen/arch/x86/include/asm/current.h  |  2 +-
 xen/arch/x86/include/asm/hvm/svm/svm.h  |  3 ++
 xen/arch/x86/include/asm/msr.h  |  9 
 xen/arch/x86/include/asm/spec_ctrl_asm.h|  7 +++
 xen/arch/x86/msr.c  |  8 ++--
 xen/arch/x86/setup.c|  5 ++-
 xen/arch/x86/smpboot.c  |  7 ++-
 xen/arch/x86/spec_ctrl.c| 66 -
 xen/include/public/arch-x86/cpufeatureset.h | 18 
 xen/tools/gen-cpuid.py  | 14 +++---
 15 files changed, 166 insertions(+), 51 deletions(-)

-- 
2.11.0




[PATCH v2 3/9] x86/spec-ctrl: Introduce new has_spec_ctrl boolean

2022-01-28 Thread Andrew Cooper
Most MSR_SPEC_CTRL setup will be common between Intel and AMD.  Instead of
opencoding an OR of two features everywhere, introduce has_spec_ctrl instead.

Reword the comment above the Intel specific alternatives block to highlight
that it is Intel specific, and pull the setting of default_xen_spec_ctrl.IBRS
out because it will want to be common.

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/spec_ctrl.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 8a550d0a0902..2072daf66245 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -927,7 +927,7 @@ static __init void mds_calculations(uint64_t caps)
 void __init init_speculation_mitigations(void)
 {
 enum ind_thunk thunk = THUNK_DEFAULT;
-bool ibrs = false, hw_smt_enabled;
+bool has_spec_ctrl, ibrs = false, hw_smt_enabled;
 bool cpu_has_bug_taa;
 uint64_t caps = 0;
 
@@ -936,6 +936,8 @@ void __init init_speculation_mitigations(void)
 
 hw_smt_enabled = check_smt_enabled();
 
+has_spec_ctrl = boot_cpu_has(X86_FEATURE_IBRSB);
+
 /*
  * First, disable the use of retpolines if Xen is using shadow stacks, as
  * they are incompatible.
@@ -973,11 +975,11 @@ void __init init_speculation_mitigations(void)
  */
 else if ( retpoline_safe(caps) )
 thunk = THUNK_RETPOLINE;
-else if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+else if ( has_spec_ctrl )
 ibrs = true;
 }
 /* Without compiler thunk support, use IBRS if available. */
-else if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+else if ( has_spec_ctrl )
 ibrs = true;
 }
 
@@ -1008,10 +1010,7 @@ void __init init_speculation_mitigations(void)
 else if ( thunk == THUNK_JMP )
 setup_force_cpu_cap(X86_FEATURE_IND_THUNK_JMP);
 
-/*
- * If we are on hardware supporting MSR_SPEC_CTRL, see about setting up
- * the alternatives blocks so we can virtualise support for guests.
- */
+/* Intel hardware: MSR_SPEC_CTRL alternatives setup. */
 if ( boot_cpu_has(X86_FEATURE_IBRSB) )
 {
 if ( opt_msr_sc_pv )
@@ -1030,11 +1029,12 @@ void __init init_speculation_mitigations(void)
 default_spec_ctrl_flags |= SCF_ist_wrmsr;
 setup_force_cpu_cap(X86_FEATURE_SC_MSR_HVM);
 }
-
-if ( ibrs )
-default_xen_spec_ctrl |= SPEC_CTRL_IBRS;
 }
 
+/* If we have IBRS available, see whether we should use it. */
+if ( has_spec_ctrl && ibrs )
+default_xen_spec_ctrl |= SPEC_CTRL_IBRS;
+
 /* If we have SSBD available, see whether we should use it. */
 if ( boot_cpu_has(X86_FEATURE_SSBD) && opt_ssbd )
 default_xen_spec_ctrl |= SPEC_CTRL_SSBD;
@@ -1268,7 +1268,7 @@ void __init init_speculation_mitigations(void)
  * boot won't have any other code running in a position to mount an
  * attack.
  */
-if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+if ( has_spec_ctrl )
 {
 bsp_delay_spec_ctrl = !cpu_has_hypervisor && default_xen_spec_ctrl;
 
-- 
2.11.0




[PATCH v2 5/9] x86/spec-ctrl: Record the last write to MSR_SPEC_CTRL

2022-01-28 Thread Andrew Cooper
In some cases, writes to MSR_SPEC_CTRL do not have interesting side effects,
and we should implement lazy context switching like we do with other MSRs.

In the short term, this will be used by the SVM infrastructure, but I expect
to extend it to other contexts in due course.

Introduce cpu_info.last_spec_ctrl for the purpose, and cache writes made from
the boot/resume paths.  The value can't live in regular per-cpu data when it
is eventually used for PV guests when XPTI might be active.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New
---
 xen/arch/x86/acpi/power.c|  3 +++
 xen/arch/x86/include/asm/current.h   |  2 +-
 xen/arch/x86/include/asm/spec_ctrl_asm.h |  4 
 xen/arch/x86/setup.c |  5 -
 xen/arch/x86/smpboot.c   |  5 +
 xen/arch/x86/spec_ctrl.c | 10 +++---
 6 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index ea2bd8bbfe93..5f2ec74f744a 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -296,7 +296,10 @@ static int enter_state(u32 state)
 ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
 
 if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+{
 wrmsrl(MSR_SPEC_CTRL, default_xen_mcu_opt_ctrl);
+ci->last_spec_ctrl = default_xen_mcu_opt_ctrl;
+}
 
 if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
 wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h
index cfbedc31983f..dc0edd9ed07d 100644
--- a/xen/arch/x86/include/asm/current.h
+++ b/xen/arch/x86/include/asm/current.h
@@ -56,6 +56,7 @@ struct cpu_info {
 /* See asm/spec_ctrl_asm.h for usage. */
 unsigned int shadow_spec_ctrl;
 uint8_t  xen_spec_ctrl;
+uint8_t  last_spec_ctrl;
 uint8_t  spec_ctrl_flags;
 
 /*
@@ -73,7 +74,6 @@ struct cpu_info {
  */
 bool use_pv_cr3;
 
-unsigned long __pad;
 /* get_stack_bottom() must be 16-byte aligned */
 };
 
diff --git a/xen/arch/x86/include/asm/spec_ctrl_asm.h b/xen/arch/x86/include/asm/spec_ctrl_asm.h
index bf82528a12ae..9c0c7622c41f 100644
--- a/xen/arch/x86/include/asm/spec_ctrl_asm.h
+++ b/xen/arch/x86/include/asm/spec_ctrl_asm.h
@@ -67,6 +67,10 @@
  * steps 2 and 6 will restore the shadow value rather than leaving Xen's value
  * loaded and corrupting the value used in guest context.
  *
+ * Additionally, in some cases it is safe to skip writes to MSR_SPEC_CTRL when
+ * we don't require any of the side effects of an identical write.  Maintain a
+ * per-cpu last_spec_ctrl value for this purpose.
+ *
  * The following ASM fragments implement this algorithm.  See their local
  * comments for further details.
  *  - SPEC_CTRL_ENTRY_FROM_PV
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index e716005145d3..115f8f651734 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1930,9 +1930,12 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
 if ( bsp_delay_spec_ctrl )
 {
-get_cpu_info()->spec_ctrl_flags &= ~SCF_use_shadow;
+struct cpu_info *info = get_cpu_info();
+
+info->spec_ctrl_flags &= ~SCF_use_shadow;
 barrier();
 wrmsrl(MSR_SPEC_CTRL, default_xen_spec_ctrl);
+info->last_spec_ctrl = default_xen_spec_ctrl;
 }
 
 /* Jump to the 1:1 virtual mappings of cpu0_stack. */
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 08c0f2d9df04..1cfdf96207d4 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -323,6 +323,8 @@ static void set_cpu_sibling_map(unsigned int cpu)
 
 void start_secondary(void *unused)
 {
+struct cpu_info *info = get_cpu_info();
+
 /*
  * Dont put anything before smp_callin(), SMP booting is so fragile that we
  * want to limit the things done here to the most necessary things.
@@ -379,7 +381,10 @@ void start_secondary(void *unused)
  * microcode.
  */
 if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+{
 wrmsrl(MSR_SPEC_CTRL, default_xen_spec_ctrl);
+info->last_spec_ctrl = default_xen_spec_ctrl;
+}
 if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
 wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 2072daf66245..b2fd86ebe587 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -1270,6 +1270,9 @@ void __init init_speculation_mitigations(void)
  */
 if ( has_spec_ctrl )
 {
+struct cpu_info *info = get_cpu_info();
+unsigned int val;
+
 bsp_delay_spec_ctrl = !cpu_has_hypervisor && default_xen_spec_ctrl;
 
 /*
@@ -1278,15 +1281,16 @@ void __init init_speculation_mitigations(void)
  */
 if ( bsp_delay_spec_ctrl )
 {
-struct cpu_info *info = 

[PATCH v2 4/9] x86/spec-ctrl: Don't use spec_ctrl_{enter,exit}_idle() for S3

2022-01-28 Thread Andrew Cooper
'idle' here refers to hlt/mwait.  The S3 path isn't an idle path - it is a
platform reset.

Conditionally clearing IBRS and flushing the store buffers on the way down is
a waste of time.

Furthermore, we want to load default_xen_mcu_opt_ctrl unilaterally on the way
back up.  Currently it happens as a side effect of X86_FEATURE_SC_MSR_IDLE or
the next return-to-guest, but that's fragile behaviour.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New
---
 xen/arch/x86/acpi/power.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 31a56f02d083..ea2bd8bbfe93 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -248,7 +248,6 @@ static int enter_state(u32 state)
 error = 0;
 
 ci = get_cpu_info();
-spec_ctrl_enter_idle(ci);
 /* Avoid NMI/#MC using MSR_SPEC_CTRL until we've reloaded microcode. */
 ci->spec_ctrl_flags &= ~SCF_ist_wrmsr;
 
@@ -295,7 +294,9 @@ static int enter_state(u32 state)
 
 /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
 ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
-spec_ctrl_exit_idle(ci);
+
+if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+wrmsrl(MSR_SPEC_CTRL, default_xen_mcu_opt_ctrl);
 
 if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
 wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
-- 
2.11.0




[PATCH v2 2/9] x86/spec-ctrl: Drop use_spec_ctrl boolean

2022-01-28 Thread Andrew Cooper
Several bugfixes have reduced the utility of this variable from its original
purpose, and now all it does is aid in the setup of SCF_ist_wrmsr.

Simplify the logic by dropping the variable, and doubling up the setting of
SCF_ist_wrmsr for the PV and HVM blocks, which will make the AMD SPEC_CTRL
support easier to follow.  Leave a comment explaining why SCF_ist_wrmsr is
still necessary for the VMExit case.

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/spec_ctrl.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index c18cc8aa493a..8a550d0a0902 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -927,7 +927,7 @@ static __init void mds_calculations(uint64_t caps)
 void __init init_speculation_mitigations(void)
 {
 enum ind_thunk thunk = THUNK_DEFAULT;
-bool use_spec_ctrl = false, ibrs = false, hw_smt_enabled;
+bool ibrs = false, hw_smt_enabled;
 bool cpu_has_bug_taa;
 uint64_t caps = 0;
 
@@ -1016,19 +1016,21 @@ void __init init_speculation_mitigations(void)
 {
 if ( opt_msr_sc_pv )
 {
-use_spec_ctrl = true;
+default_spec_ctrl_flags |= SCF_ist_wrmsr;
 setup_force_cpu_cap(X86_FEATURE_SC_MSR_PV);
 }
 
 if ( opt_msr_sc_hvm )
 {
-use_spec_ctrl = true;
+/*
+ * While the guest MSR_SPEC_CTRL value is loaded/saved atomically,
+ * Xen's value is not restored atomically.  An early NMI hitting
+ * the VMExit path needs to restore Xen's value for safety.
+ */
+default_spec_ctrl_flags |= SCF_ist_wrmsr;
 setup_force_cpu_cap(X86_FEATURE_SC_MSR_HVM);
 }
 
-if ( use_spec_ctrl )
-default_spec_ctrl_flags |= SCF_ist_wrmsr;
-
 if ( ibrs )
 default_xen_spec_ctrl |= SPEC_CTRL_IBRS;
 }
-- 
2.11.0




[PATCH v2 1/9] x86/cpuid: Advertise SSB_NO to guests by default

2022-01-28 Thread Andrew Cooper
This is a statement of hardware behaviour, and not related to controls for the
guest kernel to use.  Pass it straight through from hardware.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

Not currently enumerated by any CPU I'm aware of.

v2:
 * New
---
 xen/include/public/arch-x86/cpufeatureset.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 6e44148a0901..fd8ab2572304 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -266,7 +266,7 @@ XEN_CPUFEATURE(NO_LMSL,   8*32+20) /*S  EFER.LMSLE no longer supported. */
XEN_CPUFEATURE(AMD_PPIN,  8*32+23) /*   Protected Processor Inventory Number */
 XEN_CPUFEATURE(AMD_SSBD,  8*32+24) /*   MSR_SPEC_CTRL.SSBD available */
 XEN_CPUFEATURE(VIRT_SSBD, 8*32+25) /*   MSR_VIRT_SPEC_CTRL.SSBD */
-XEN_CPUFEATURE(SSB_NO,8*32+26) /*   Hardware not vulnerable to SSB */
+XEN_CPUFEATURE(SSB_NO,8*32+26) /*A  Hardware not vulnerable to SSB */
 XEN_CPUFEATURE(PSFD,  8*32+28) /*   MSR_SPEC_CTRL.PSFD */
 
 /* Intel-defined CPU features, CPUID level 0x0007:0.edx, word 9 */
-- 
2.11.0




[PATCH v2 6/9] x86/spec-ctrl: Use common MSR_SPEC_CTRL logic for AMD

2022-01-28 Thread Andrew Cooper
Currently, amd_init_ssbd() works by being the only write to MSR_SPEC_CTRL in
the system.  This ceases to be true when using the common logic.

Include AMD MSR_SPEC_CTRL in has_spec_ctrl to activate the common paths, and
introduce an AMD specific block to control alternatives.  Also update the
boot/resume paths to configure default_xen_spec_ctrl.

svm.h needs an adjustment to remove a dependency on include order.

For now, only activate alternatives for HVM; PV will require more work.  No
functional change, as no alternatives are defined for HVM yet.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Fix build in some PV Shim configurations
 * Adjust boot/resume paths too
 * Adjust commit message after rearranging in the series
 * Fix typo in comment
---
 xen/arch/x86/acpi/power.c  |  2 +-
 xen/arch/x86/cpu/amd.c |  2 +-
 xen/arch/x86/include/asm/hvm/svm/svm.h |  3 +++
 xen/arch/x86/smpboot.c |  2 +-
 xen/arch/x86/spec_ctrl.c   | 26 --
 5 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 5f2ec74f744a..0eae29b5687a 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -295,7 +295,7 @@ static int enter_state(u32 state)
 /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
 ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
 
-if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+if ( boot_cpu_has(X86_FEATURE_IBRSB) || boot_cpu_has(X86_FEATURE_IBRS) )
 {
 wrmsrl(MSR_SPEC_CTRL, default_xen_mcu_opt_ctrl);
 ci->last_spec_ctrl = default_xen_mcu_opt_ctrl;
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index f87484b7ce61..a8e37dbb1f5c 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -693,7 +693,7 @@ void amd_init_ssbd(const struct cpuinfo_x86 *c)
return;
 
if (cpu_has_amd_ssbd) {
-   wrmsrl(MSR_SPEC_CTRL, opt_ssbd ? SPEC_CTRL_SSBD : 0);
+   /* Handled by common MSR_SPEC_CTRL logic */
return;
}
 
diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 05e968502694..09c32044ec8a 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -45,6 +45,9 @@ static inline void svm_invlpga(unsigned long linear, uint32_t asid)
 "a" (linear), "c" (asid));
 }
 
+struct cpu_user_regs;
+struct vcpu;
+
 unsigned long *svm_msrbit(unsigned long *msr_bitmap, uint32_t msr);
 void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len);
 void svm_update_guest_cr(struct vcpu *, unsigned int cr, unsigned int flags);
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 1cfdf96207d4..22ae4c1b2de9 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -380,7 +380,7 @@ void start_secondary(void *unused)
  * settings.  Note: These MSRs may only become available after loading
  * microcode.
  */
-if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+if ( boot_cpu_has(X86_FEATURE_IBRSB) || boot_cpu_has(X86_FEATURE_IBRS) )
 {
 wrmsrl(MSR_SPEC_CTRL, default_xen_spec_ctrl);
 info->last_spec_ctrl = default_xen_spec_ctrl;
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index b2fd86ebe587..c210bb662f84 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -936,7 +937,8 @@ void __init init_speculation_mitigations(void)
 
 hw_smt_enabled = check_smt_enabled();
 
-has_spec_ctrl = boot_cpu_has(X86_FEATURE_IBRSB);
+has_spec_ctrl = (boot_cpu_has(X86_FEATURE_IBRSB) ||
+ boot_cpu_has(X86_FEATURE_IBRS));
 
 /*
  * First, disable the use of retpolines if Xen is using shadow stacks, as
@@ -1031,12 +1033,32 @@ void __init init_speculation_mitigations(void)
 }
 }
 
+/* AMD hardware: MSR_SPEC_CTRL alternatives setup. */
+if ( boot_cpu_has(X86_FEATURE_IBRS) )
+{
+/*
+ * Virtualising MSR_SPEC_CTRL for guests depends on SVM support, which
+ * on real hardware matches the availability of MSR_SPEC_CTRL in the
+ * first place.
+ *
+ * No need for SCF_ist_wrmsr because Xen's value is restored
+ * atomically WRT NMIs in the VMExit path.
+ *
+ * TODO Adjust cpu_has_svm_spec_ctrl to be configured earlier on boot.
+ */
+if ( opt_msr_sc_hvm &&
+ (boot_cpu_data.extended_cpuid_level >= 0x8000000a) &&
+ (cpuid_edx(0x8000000a) & (1u << SVM_FEATURE_SPEC_CTRL)) )
+setup_force_cpu_cap(X86_FEATURE_SC_MSR_HVM);
+}
+
 /* If we have IBRS available, see whether we should use it. */
 if ( has_spec_ctrl && ibrs )
 default_xen_spec_ctrl |= SPEC_CTRL_IBRS;

[PATCH v2 7/9] x86/svm: VMEntry/Exit logic for MSR_SPEC_CTRL

2022-01-28 Thread Andrew Cooper
Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests
run with the logical OR of both values.  Therefore, in principle we want to
clear Xen's value before entering the guest.  However, for migration
compatibility, and for performance reasons with SEV-SNP guests, we want the
ability to use a nonzero value behind the guest's back.  Use vcpu_msrs to hold
this value, with the guest value in the VMCB.

On the VMEntry path, adjusting MSR_SPEC_CTRL must be done after CLGI so as to
be atomic with respect to NMIs/etc.  The loading of spec_ctrl_raw into %eax
was also stale from the unused old code, so can be dropped too.

Implement both pieces of logic as small pieces of C, and alternative the call
to get there based on X86_FEATURE_SC_MSR_HVM.  The use of double alternative
blocks is due to a quirk of the current infrastructure, where call
displacements only get fixed up for the first replacement instruction.  While
adjusting the clobber lists, drop the stale requirements on the VMExit side.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

The RAS[:32] flushing side effect is under reconsideration.  It is actually a
very awkward side effect in practice, and not applicable to any
implementations (that I'm aware of), but for now, it's the documented safe
action to take.  Furthermore, it avoids complicating the logic with an lfence
in the else case for Spectre v1 safety.

v2:
 * Split last_spec_ctrl introduction into earlier patch.
 * Use STR() rather than __stringify() for brevity.
 * Use double alt blocks in order to pass function parameters.
---
 xen/arch/x86/hvm/svm/entry.S | 12 +++-
 xen/arch/x86/hvm/svm/svm.c   | 27 +++
 xen/arch/x86/include/asm/msr.h   |  9 +
 xen/arch/x86/include/asm/spec_ctrl_asm.h |  3 +++
 4 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S
index 276215d36aff..190f7095c65c 100644
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -55,11 +55,12 @@ __UNLIKELY_END(nsvm_hap)
 mov  %rsp, %rdi
 call svm_vmenter_helper
 
-mov VCPU_arch_msrs(%rbx), %rax
-mov VCPUMSR_spec_ctrl_raw(%rax), %eax
+clgi
 
 /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
-/* SPEC_CTRL_EXIT_TO_SVM   (nothing currently) */
+/* SPEC_CTRL_EXIT_TO_SVM   Req:   Clob: C   */
+ALTERNATIVE "", STR(mov %rbx, %rdi; mov %rsp, %rsi), X86_FEATURE_SC_MSR_HVM
+ALTERNATIVE "", STR(call vmentry_spec_ctrl), X86_FEATURE_SC_MSR_HVM
 
 pop  %r15
 pop  %r14
@@ -78,7 +79,6 @@ __UNLIKELY_END(nsvm_hap)
 pop  %rsi
 pop  %rdi
 
-clgi
 sti
 vmrun
 
@@ -86,8 +86,10 @@ __UNLIKELY_END(nsvm_hap)
 
 GET_CURRENT(bx)
 
-/* SPEC_CTRL_ENTRY_FROM_SVM    Req: b=curr %rsp=regs/cpuinfo, Clob: ac */
+/* SPEC_CTRL_ENTRY_FROM_SVM    Req:   Clob: C   */
 ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM
+ALTERNATIVE "", STR(mov %rsp, %rdi), X86_FEATURE_SC_MSR_HVM
+ALTERNATIVE "", STR(call vmexit_spec_ctrl), X86_FEATURE_SC_MSR_HVM
 /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
 stgi
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index bb6b8e560a9f..f753bf48c252 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -3086,6 +3086,33 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 vmcb_set_vintr(vmcb, intr);
 }
 
+/* Called with GIF=0. */
+void vmexit_spec_ctrl(struct cpu_info *info)
+{
+unsigned int val = info->xen_spec_ctrl;
+
+/*
+ * Write to MSR_SPEC_CTRL unconditionally, for the RAS[:32] flushing side
+ * effect.
+ */
+wrmsr(MSR_SPEC_CTRL, val, 0);
+info->last_spec_ctrl = val;
+}
+
+/* Called with GIF=0. */
+void vmentry_spec_ctrl(const struct vcpu *curr, struct cpu_info *info)
+{
+unsigned int val = curr->arch.msrs->spec_ctrl.raw;
+
+if ( val != info->last_spec_ctrl )
+{
+wrmsr(MSR_SPEC_CTRL, val, 0);
+info->last_spec_ctrl = val;
+}
+
+/* No Spectre v1 concerns.  Execution is going to hit VMRUN imminently. */
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index 657a3295613d..ce4fe51afe54 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -297,6 +297,15 @@ struct vcpu_msrs
  *
  * For VT-x guests, the guest value is held in the MSR guest load/save
  * list.
+ *
+ * For SVM, the guest value lives in the VMCB, and hardware saves/restores
+ * the host value automatically.  However, guests run with the OR of the
+ * host and guest value, which allows Xen to set protections behind the
+ * 

[PATCH v2 9/9] x86/cpuid: Enable MSR_SPEC_CTRL in SVM guests by default

2022-01-28 Thread Andrew Cooper
With all other pieces in place, MSR_SPEC_CTRL is fully working for HVM guests.

Update the CPUID derivation logic (both PV and HVM, to avoid losing subtle
changes), drop the MSR intercept, and explicitly enable the CPUID bits for HVM
guests.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Drop the MSR intercept too
 * Rework the comment block in gen-cpuid.py
 * Fix typo in comment
---
 xen/arch/x86/cpuid.c| 16 
 xen/arch/x86/hvm/svm/svm.c  |  4 
 xen/include/public/arch-x86/cpufeatureset.h | 16 
 xen/tools/gen-cpuid.py  | 14 +-
 4 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index b5af48324aef..e24dd283e761 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -433,6 +433,8 @@ static void __init guest_common_feature_adjustments(uint32_t *fs)
  */
 if ( test_bit(X86_FEATURE_IBRSB, fs) )
 __set_bit(X86_FEATURE_STIBP, fs);
+if ( test_bit(X86_FEATURE_IBRS, fs) )
+__set_bit(X86_FEATURE_AMD_STIBP, fs);
 
 /*
  * On hardware which supports IBRS/IBPB, we can offer IBPB independently
@@ -456,11 +458,14 @@ static void __init calculate_pv_max_policy(void)
 pv_featureset[i] &= pv_max_featuremask[i];
 
 /*
- * If Xen isn't virtualising MSR_SPEC_CTRL for PV guests because of
- * administrator choice, hide the feature.
+ * If Xen isn't virtualising MSR_SPEC_CTRL for PV guests (functional
+ * availability, or admin choice), hide the feature.
  */
 if ( !boot_cpu_has(X86_FEATURE_SC_MSR_PV) )
+{
 __clear_bit(X86_FEATURE_IBRSB, pv_featureset);
+__clear_bit(X86_FEATURE_IBRS, pv_featureset);
+}
 
 guest_common_feature_adjustments(pv_featureset);
 
@@ -530,11 +535,14 @@ static void __init calculate_hvm_max_policy(void)
 __set_bit(X86_FEATURE_SEP, hvm_featureset);
 
 /*
- * If Xen isn't virtualising MSR_SPEC_CTRL for HVM guests because of
- * administrator choice, hide the feature.
+ * If Xen isn't virtualising MSR_SPEC_CTRL for HVM guests (functional
+ * availability, or admin choice), hide the feature.
  */
 if ( !boot_cpu_has(X86_FEATURE_SC_MSR_HVM) )
+{
 __clear_bit(X86_FEATURE_IBRSB, hvm_featureset);
+__clear_bit(X86_FEATURE_IBRS, hvm_featureset);
+}
 
 /*
  * With VT-x, some features are only supported by Xen if dedicated
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index aa82fe29befb..01ce6c71b5f8 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -606,6 +606,10 @@ static void svm_cpuid_policy_changed(struct vcpu *v)
 
 vmcb_set_exception_intercepts(vmcb, bitmap);
 
+/* Give access to MSR_SPEC_CTRL if the guest has been told about it. */
+svm_intercept_msr(v, MSR_SPEC_CTRL,
+  cp->extd.ibrs ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW);
+
 /* Give access to MSR_PRED_CMD if the guest has been told about it. */
 svm_intercept_msr(v, MSR_PRED_CMD,
   cp->extd.ibpb ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW);
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index fd8ab2572304..957df23b65f2 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -256,18 +256,18 @@ XEN_CPUFEATURE(CLZERO,8*32+ 0) /*A  CLZERO instruction */
 XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A  (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */
 XEN_CPUFEATURE(WBNOINVD,  8*32+ 9) /*   WBNOINVD instruction */
 XEN_CPUFEATURE(IBPB,  8*32+12) /*A  IBPB support only (no IBRS, used by AMD) */
-XEN_CPUFEATURE(IBRS,  8*32+14) /*   MSR_SPEC_CTRL.IBRS */
-XEN_CPUFEATURE(AMD_STIBP, 8*32+15) /*   MSR_SPEC_CTRL.STIBP */
-XEN_CPUFEATURE(IBRS_ALWAYS,   8*32+16) /*   IBRS preferred always on */
-XEN_CPUFEATURE(STIBP_ALWAYS,  8*32+17) /*   STIBP preferred always on */
-XEN_CPUFEATURE(IBRS_FAST, 8*32+18) /*   IBRS preferred over software options */
-XEN_CPUFEATURE(IBRS_SAME_MODE, 8*32+19) /*   IBRS provides same-mode protection */
+XEN_CPUFEATURE(IBRS,  8*32+14) /*S  MSR_SPEC_CTRL.IBRS */
+XEN_CPUFEATURE(AMD_STIBP, 8*32+15) /*S  MSR_SPEC_CTRL.STIBP */
+XEN_CPUFEATURE(IBRS_ALWAYS,   8*32+16) /*S  IBRS preferred always on */
+XEN_CPUFEATURE(STIBP_ALWAYS,  8*32+17) /*S  STIBP preferred always on */
+XEN_CPUFEATURE(IBRS_FAST, 8*32+18) /*S  IBRS preferred over software options */
+XEN_CPUFEATURE(IBRS_SAME_MODE, 8*32+19) /*S  IBRS provides same-mode protection */
 XEN_CPUFEATURE(NO_LMSL,   8*32+20) /*S  EFER.LMSLE no longer supported. */
 XEN_CPUFEATURE(AMD_PPIN,  8*32+23) /*   Protected Processor Inventory Number */
-XEN_CPUFEATURE(AMD_SSBD,  8*32+24) /*   MSR_SPEC_CTRL.SSBD available */

[PATCH v2 8/9] x86/msr: AMD MSR_SPEC_CTRL infrastructure

2022-01-28 Thread Andrew Cooper
Fill in VMCB accessors for spec_ctrl in svm_{get,set}_reg(), and CPUID checks
for all supported bits in guest_{rd,wr}msr().
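The check-and-mask pattern these CPUID checks implement — accept a WRMSR only if every set bit is permitted by the policy, otherwise raise #GP — can be sketched standalone. The SPEC_CTRL bit positions below are the architectural ones; the struct and function names are hypothetical, not Xen's actual types:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR_SPEC_CTRL bit positions. */
#define SPEC_CTRL_IBRS  (1u << 0)
#define SPEC_CTRL_STIBP (1u << 1)
#define SPEC_CTRL_SSBD  (1u << 2)
#define SPEC_CTRL_PSFD  (1u << 7)

/* Hypothetical stand-in for the relevant cpuid_policy fields. */
struct policy { bool ssbd, amd_ssbd, psfd; };

/* Build the mask of bits the guest is allowed to write, from its policy. */
static uint64_t valid_bits(const struct policy *p)
{
    bool ssbd = p->ssbd || p->amd_ssbd;   /* Intel or AMD enumeration */

    return SPEC_CTRL_IBRS | SPEC_CTRL_STIBP |
           (ssbd    ? SPEC_CTRL_SSBD : 0) |
           (p->psfd ? SPEC_CTRL_PSFD : 0);
}

/* Returns false to signal that the write should fault (#GP). */
static bool wrmsr_spec_ctrl(const struct policy *p, uint64_t val)
{
    return !(val & ~valid_bits(p));
}
```

The key point is that the mask is derived from the guest-visible CPUID policy, so a bit becomes writable exactly when the corresponding feature is advertised.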

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/hvm/svm/svm.c | 9 +
 xen/arch/x86/msr.c | 8 +---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index f753bf48c252..aa82fe29befb 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2471,10 +2471,14 @@ static bool svm_get_pending_event(struct vcpu *v, struct x86_event *info)
 
 static uint64_t svm_get_reg(struct vcpu *v, unsigned int reg)
 {
+const struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
 struct domain *d = v->domain;
 
 switch ( reg )
 {
+case MSR_SPEC_CTRL:
+return vmcb->spec_ctrl;
+
 default:
 printk(XENLOG_G_ERR "%s(%pv, 0x%08x) Bad register\n",
__func__, v, reg);
@@ -2485,10 +2489,15 @@ static uint64_t svm_get_reg(struct vcpu *v, unsigned int reg)
 
 static void svm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 {
+struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
 struct domain *d = v->domain;
 
 switch ( reg )
 {
+case MSR_SPEC_CTRL:
+vmcb->spec_ctrl = val;
+break;
+
 default:
 printk(XENLOG_G_ERR "%s(%pv, 0x%08x, 0x%016"PRIx64") Bad register\n",
__func__, v, reg, val);
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 5e80c8b47c21..4ac5b5a048eb 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -265,7 +265,7 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
 break;
 
 case MSR_SPEC_CTRL:
-if ( !cp->feat.ibrsb )
+if ( !cp->feat.ibrsb && !cp->extd.ibrs )
 goto gp_fault;
 goto get_reg;
 
@@ -442,7 +442,8 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
  */
 uint64_t msr_spec_ctrl_valid_bits(const struct cpuid_policy *cp)
 {
-bool ssbd = cp->feat.ssbd;
+bool ssbd = cp->feat.ssbd || cp->extd.amd_ssbd;
+bool psfd = cp->extd.psfd;
 
 /*
  * Note: SPEC_CTRL_STIBP is specified as safe to use (i.e. ignored)
@@ -450,6 +451,7 @@ uint64_t msr_spec_ctrl_valid_bits(const struct cpuid_policy *cp)
  */
 return (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP |
 (ssbd   ? SPEC_CTRL_SSBD   : 0) |
+(psfd   ? SPEC_CTRL_PSFD   : 0) |
 0);
 }
 
@@ -526,7 +528,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
 break;
 
 case MSR_SPEC_CTRL:
-if ( !cp->feat.ibrsb ||
+if ( (!cp->feat.ibrsb && !cp->extd.ibrs) ||
  (val & ~msr_spec_ctrl_valid_bits(cp)) )
 goto gp_fault;
 goto set_reg;
-- 
2.11.0




Re: [PATCH v5 14/14] vpci: add TODO for the registers not explicitly handled

2022-01-28 Thread Oleksandr Andrushchenko
Hello, Roger, Jan!

On 13.01.22 15:38, Jan Beulich wrote:
> On 13.01.2022 14:27, Roger Pau Monné wrote:
>> On Thu, Nov 25, 2021 at 12:17:32PM +0100, Jan Beulich wrote:
>>> On 25.11.2021 12:02, Oleksandr Andrushchenko wrote:
 From: Oleksandr Andrushchenko 

 For unprivileged guests vpci_{read|write} need to be re-worked
 to not passthrough accesses to the registers not explicitly handled
 by the corresponding vPCI handlers: without fixing that passthrough
 to guests is completely unsafe as Xen allows them full access to
 the registers.

 Xen needs to be sure that every register a guest accesses is not
 going to cause the system to malfunction, so Xen needs to keep a
 list of the registers it is safe for a guest to access.

 For example, we should only expose the PCI capabilities that we know
 are safe for a guest to use, i.e.: MSI and MSI-X initially.
 The rest of the capabilities should be blocked from guest access,
 unless we audit them and declare safe for a guest to access.

 As a reference we might want to look at the approach currently used
 by QEMU in order to do PCI passthrough. A very limited set of PCI
 capabilities known to be safe for untrusted access are exposed to the
 guest and registers need to be explicitly handled or else access is
 rejected. Xen needs a fairly similar model in vPCI or else none of
 this will be safe for unprivileged access.
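The whitelist model described above — a table of explicitly handled registers, with every other access rejected rather than passed through — could look roughly like this. This is a sketch with hypothetical names, not the actual vPCI interfaces:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical whitelist entry: only registers with a handler are reachable. */
struct vreg_handler {
    uint32_t offset;
    uint32_t size;
    uint32_t (*read)(uint32_t offset);
};

static uint32_t read_vendor_id(uint32_t offset)
{
    (void)offset;
    return 0x10ee;   /* made-up vendor ID for illustration */
}

static const struct vreg_handler handlers[] = {
    { 0x00, 2, read_vendor_id },   /* PCI vendor ID */
};

/* Reject any access not covered by an explicit handler: no passthrough. */
static bool vreg_read(uint32_t offset, uint32_t size, uint32_t *val)
{
    for (size_t i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++)
        if (offset == handlers[i].offset && size == handlers[i].size) {
            *val = handlers[i].read(offset);
            return true;
        }

    return false;   /* unhandled register: access refused */
}
```

The default-deny return is what distinguishes this from the current behaviour, where unhandled registers fall through to the real device.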

 Add the corresponding TODO comment to highlight there is a problem that
 needs to be fixed.

 Suggested-by: Roger Pau Monné 
 Suggested-by: Jan Beulich 
 Signed-off-by: Oleksandr Andrushchenko 
>>> Looks okay to me in principle, but imo needs to come earlier in the
>>> series, before things actually get exposed to DomU-s.
>> Are domUs really allowed to use this code? Maybe it's done in a
>> separate series, but has_vpci is hardcoded to false on Arm, and
>> X86_EMU_VPCI can only be set for the hardware domain on x86.
That is by intention: we do not want to have this enabled on Arm until
it can really be used...
> I'm not sure either. This series gives the impression of exposing things,
> but I admit I didn't pay attention to has_vpci() being hardcoded on Arm.
...so we enable vPCI on Arm right after we are all set
> Then again there were at least 3 series in parallel originally, with
> interdependencies (iirc) not properly spelled out ...
Sorry about that, we should have said that explicitly
>
> Jan
>
Thank you,
Oleksandr

Re: [XEN PATCH v9 08/30] build: fix enforce unique symbols for recent clang version

2022-01-28 Thread Jan Beulich
On 28.01.2022 13:03, Anthony PERARD wrote:
> On Thu, Jan 27, 2022 at 04:57:20PM +0100, Jan Beulich wrote:
>> On 25.01.2022 12:00, Anthony PERARD wrote:
>>> clang 6.0 and newer behave like gcc with regard to the FILE symbol,
>>> storing only the filename rather than the full path to the source file.
>>>
>>> clang 3.8.1-24 (in our debian:stretch container) and 3.5.0-10
>>> (in our debian:jessie container) do store the full path to the source
>>> file in the FILE symbol.
>>>
>>> Also, based on commit 81ecb38b83 ("build: provide option to
>>> disambiguate symbol names"), which was using clang 5, the change of
>>> behavior likely happened in clang 6.0.
>>>
>>> This means that we also need to check clang version to figure out
>>> which command we need to use to redefine symbol.
>>>
>>> Signed-off-by: Anthony PERARD 
>>
>> The "likely" in the description still worries me some. Roger, would
>> you happen to know, or know of a way to find out for sure ("sure"
>> not meaning to exclude the usual risk associated with version
>> number checks)?
> 
> I found f5040b9685a7 ("Make .file directive to have basename only") as
> part of LLVM's "release/6.x" branch (and "llvmorg-6.0.0" tag), but not
> in "release/5.x".
> 
> https://github.com/llvm/llvm-project/commit/f5040b9685a760e584c576e9185295e54635d51e
> 
> This patch would seem to be the one changing the behavior. This still
> suggests clang 6.0.

Oh, thanks for digging this out. May I suggest to replace (or delete)
"likely" then in the description?

Acked-by: Jan Beulich 

Jan




Re: [PATCH v5 04/14] vpci: cancel pending map/unmap on vpci removal

2022-01-28 Thread Oleksandr Andrushchenko


On 12.01.22 17:27, Jan Beulich wrote:
> On 25.11.2021 12:02, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>>
>> When a vPCI is removed for a PCI device it is possible that we have
>> scheduled a delayed work for map/unmap operations for that device.
>> For example, the following scenario can illustrate the problem:
>>
>> pci_physdev_op
>> pci_add_device
>> init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
>> iommu_add_device <- FAILS
>> vpci_remove_device -> xfree(pdev->vpci)
>>
>> leave_hypervisor_to_guest
>> vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>>
>> For the hardware domain we continue execution as the worst that
>> could happen is that MMIO mappings are left in place when the
>> device has been deassigned.
>>
>> For unprivileged domains that get a failure in the middle of a vPCI
>> {un}map operation we need to destroy them, as we don't know in which
>> state the p2m is. This can only happen in vpci_process_pending for
>> DomUs as they won't be allowed to call pci_add_device.
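The race described boils down to a general pattern: deferred work holding a pointer to an object must be cancelled before that object is freed. A minimal single-threaded sketch of that discipline (hypothetical names, ignoring the real softirq and locking machinery):

```c
#include <stdbool.h>
#include <stdlib.h>

struct device;

/* Hypothetical deferred-work record, as a softirq handler might consume it. */
struct pending_work {
    bool scheduled;
    struct device *dev;   /* dangles if dev is freed while still scheduled */
};

struct device { int id; };

static void cancel_pending(struct pending_work *w, struct device *dev)
{
    if (w->scheduled && w->dev == dev) {
        w->scheduled = false;
        w->dev = NULL;
    }
}

/* Removal must cancel outstanding work *before* freeing the device, so the
 * deferred handler can never observe a freed pointer. */
static void device_remove(struct pending_work *w, struct device *dev)
{
    cancel_pending(w, dev);
    free(dev);
}
```

In the patch this corresponds to cancelling the vCPUs' pending map/unmap work in vpci_remove_device() before xfree(pdev->vpci).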
>>
>> Signed-off-by: Oleksandr Andrushchenko 
>>
>> ---
>> Cc: Roger Pau Monné 
>> ---
>> Since v4:
>>   - crash guest domain if map/unmap operation didn't succeed
>>   - re-work vpci cancel work to cancel work on all vCPUs
>>   - use new locking scheme with pdev->vpci_lock
>> New in v4
>>
>> Fixes: 86dbcf6e30cb ("vpci: cancel pending map/unmap on vpci removal")
> What is this about?
Just a leftover after squashing WIP patches, sorry
>
> Jan
>
Thank you,
Oleksandr

Re: [XEN v1] xen/arm: io: Check ESR_EL2.ISV != 0 before searching for a MMIO handler

2022-01-28 Thread Ayan Kumar Halder

Hi Julien/Stefano,

Good discussion to learn about Xen (from a newbie's perspective). :)

I am trying to clarify my understanding. Some queries as below :-

On 28/01/2022 09:46, Julien Grall wrote:



On 28/01/2022 01:20, Stefano Stabellini wrote:

On Thu, 27 Jan 2022, Julien Grall wrote:
On Thu, 27 Jan 2022 at 23:05, Julien Grall 
 wrote:


On Thu, 27 Jan 2022 at 22:40, Stefano Stabellini 
 wrote:

I am with you on both points.

One thing I noticed is that the code today is not able to deal with
IO_UNHANDLED for MMIO regions handled by IOREQ servers or Xen MMIO
emulator handlers. p2m_resolve_translation_fault and try_map_mmio are
called after try_handle_mmio returns IO_UNHANDLED but try_handle_mmio is
not called a second time (or am I mistaken?)


Why would you need it? If try_mmio_fault() doesn't work the first
time, then

Sorry I meant try_handle_mmio().

it will not work the second time.


I think I explained myself badly, I'll try again below.



Another thing I noticed is that currently find_mmio_handler and
try_fwd_ioserv expect dabt to be already populated and valid so it would
be better if we could get there only when dabt.valid.

With these two things in mind, I think maybe the best thing to do is to
change the code in do_trap_stage2_abort_guest slightly so that
p2m_resolve_translation_fault and try_map_mmio are called first when
!dabt.valid.


An abort will most likely happen because of emulated I/O. If we call
p2m_resolve_translation_fault() and try_map_mmio() first, then it means
the processing will take longer than necessary for the common case.

So I think we want to keep the order as it is, i.e. first trying the MMIO
and then falling back to the less likely reason for a trap.


Yeah I thought about it as well. The idea would be that if dabt.valid is
set then we leave things as they are (we call try_handle_mmio first) but
if dabt.valid is not set (it is not valid) then we skip the
try_handle_mmio() call because it wouldn't succeed anyway and go
directly to p2m_resolve_translation_fault() and try_map_mmio().

If either of them work (also reading what you wrote about it) then we
return immediately.


Ok. So the assumption is a data abort with invalid syndrome would most
likely be because of a fault handled by p2m_resolve_translation_fault().


I think this makes sense. However, I am not convinced we can currently 
safely call try_map_mmio() before try_handle_mmio(). This is because 
the logic in try_map_mmio() is quite fragile and we may mistakenly map 
an emulated region.


By emulated region, you mean vgic.dbase (Refer 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/vgic-v2.c;h=589c033eda8f5e11af33c868eae2c159f985eac9;hb=0bdc43c8dec993258e930b34855853c22b917519#l702, 
which has not been mapped to the guest) and thus requires an MMIO handler.


Is my understanding correct?

If so, can Xen maintain a table of such emulated regions? I am guessing
that all emulated regions will have an mmio_handler. Then, before
invoking try_map_mmio(), it can check the table.




Similarly, we can't call try_map_mmio() before
p2m_resolve_translation_fault() because a transient fault may be
misinterpreted.

I think we may be able to harden try_map_mmio() by checking if the I/O 
region is emulated. But this will need to be fully thought through first.




If not, then we call decode_instruction from do_trap_stage2_abort_guest
and try again. The second time dabt.valid is set so we end up calling
try_handle_mmio() as usual.


With the approach below, you will also end up calling
p2m_resolve_translation_fault() and try_map_mmio() a second time if
try_handle_mmio() fails.




Just for clarity let me copy/paste the relevant code, apologies if it
was already obvious to you -- I got the impression my suggestion wasn't
very clear.



+again:
+    if ( is_data && hsr.dabt.valid )
 {
 enum io_state state = try_handle_mmio(regs, hsr, gpa);

 switch ( state )
 {
 case IO_ABORT:
 goto inject_abt;
 case IO_HANDLED:
 advance_pc(regs, hsr);
 return;
 case IO_RETRY:
 /* finish later */
 return;
 case IO_UNHANDLED:
 /* IO unhandled, try another way to handle it. */
 break;
 }
 }

 /*
  * First check if the translation fault can be resolved by the
  * P2M subsystem. If that's the case nothing else to do.
  */
 if ( p2m_resolve_translation_fault(current->domain,
    gaddr_to_gfn(gpa)) )
 return;

 if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
 return;

+    if ( !hsr.dabt.valid )
One more thing I noticed, a stage 2 fault can also happen on an access 
made for a stage 1 translation walk. In this case, I think we don't 
want to decode the instruction.


So 

Re: [XEN PATCH v9 08/30] build: fix enforce unique symbols for recent clang version

2022-01-28 Thread Anthony PERARD
On Thu, Jan 27, 2022 at 04:57:20PM +0100, Jan Beulich wrote:
> On 25.01.2022 12:00, Anthony PERARD wrote:
> > clang 6.0 and newer behave like gcc with regard to the FILE symbol,
> > storing only the filename rather than the full path to the source file.
> > 
> > clang 3.8.1-24 (in our debian:stretch container) and 3.5.0-10
> > (in our debian:jessie container) do store the full path to the source
> > file in the FILE symbol.
> > 
> > Also, based on commit 81ecb38b83 ("build: provide option to
> > disambiguate symbol names"), which was using clang 5, the change of
> > behavior likely happened in clang 6.0.
> > 
> > This means that we also need to check clang version to figure out
> > which command we need to use to redefine symbol.
> > 
> > Signed-off-by: Anthony PERARD 
> 
> The "likely" in the description still worries me some. Roger, would
> you happen to know, or know of a way to find out for sure ("sure"
> not meaning to exclude the usual risk associated with version
> number checks)?

I found f5040b9685a7 ("Make .file directive to have basename only") as
part of LLVM's "release/6.x" branch (and "llvmorg-6.0.0" tag), but not
in "release/5.x".

https://github.com/llvm/llvm-project/commit/f5040b9685a760e584c576e9185295e54635d51e

This patch would seem to be the one changing the behavior. This still
suggests clang 6.0.

Cheers,

-- 
Anthony PERARD



Re: [XEN PATCH v9 04/30] build: set ALL_OBJS in main Makefile; move prelink.o to main Makefile

2022-01-28 Thread Jan Beulich
On 28.01.2022 12:32, Anthony PERARD wrote:
> On Thu, Jan 27, 2022 at 04:50:32PM +0100, Jan Beulich wrote:
>> On 25.01.2022 12:00, Anthony PERARD wrote:
>>> --- a/xen/Makefile
>>> +++ b/xen/Makefile
>>> @@ -285,6 +285,16 @@ CFLAGS += -flto
>>>  LDFLAGS-$(CONFIG_CC_IS_CLANG) += -plugin LLVMgold.so
>>>  endif
>>>  
>>> +# Note that link order matters!
>>
>> Merely as a remark: I wonder how applicable that comment is anymore.
>> If anything I'd expect it to be relevant to $(TARGET_SUBARCH)/head.o
>> (Arm) and boot/built_in.o (x86), neither of which get named here.
> 
> Indeed, the order here probably doesn't matter. I tried to build on x86
> with the list reversed (so still leaving boot/ first) and the build
> works. I didn't try to boot it.

It's quite unlikely for the order to matter at build time. Being able
to boot the result is the minimum. Even then you can't be sure you
merely avoided the problematic piece of code on the particular
hardware you did the test on. Perhaps the most fragile parts are
sections holding pointers which get processed in the order the linker
put them. E.g. unexpected interdependencies between initcalls.

> Maybe it's time to retire the comment?

Probably, but Arm folks would want to confirm that's fine on their side
as well.

Jan




Re: [XEN PATCH v9 04/30] build: set ALL_OBJS in main Makefile; move prelink.o to main Makefile

2022-01-28 Thread Anthony PERARD
On Thu, Jan 27, 2022 at 04:50:32PM +0100, Jan Beulich wrote:
> On 25.01.2022 12:00, Anthony PERARD wrote:
> > --- a/xen/Makefile
> > +++ b/xen/Makefile
> > @@ -285,6 +285,16 @@ CFLAGS += -flto
> >  LDFLAGS-$(CONFIG_CC_IS_CLANG) += -plugin LLVMgold.so
> >  endif
> >  
> > +# Note that link order matters!
> 
> Merely as a remark: I wonder how applicable that comment is anymore.
> If anything I'd expect it to be relevant to $(TARGET_SUBARCH)/head.o
> (Arm) and boot/built_in.o (x86), neither of which get named here.

Indeed, the order here probably doesn't matter. I tried to build on x86
with the list reversed (so still leaving boot/ first) and the build
works. I didn't try to boot it.

Maybe it's time to retire the comment?

> > @@ -407,7 +417,7 @@ $(TARGET): FORCE
> > $(MAKE) -f $(BASEDIR)/Rules.mk -C include
> > $(MAKE) -f $(BASEDIR)/Rules.mk -C arch/$(TARGET_ARCH) include
> > $(MAKE) -f $(BASEDIR)/Rules.mk 
> > arch/$(TARGET_ARCH)/include/asm/asm-offsets.h
> > -   $(MAKE) -f $(BASEDIR)/Rules.mk -C arch/$(TARGET_ARCH) 
> > MKRELOC=$(MKRELOC) $@
> > +   $(MAKE) -f $(BASEDIR)/Rules.mk MKRELOC=$(MKRELOC) 
> > "ALL_OBJS=$(ALL_OBJS-y)" "ALL_LIBS=$(ALL_LIBS-y)" $@
> 
> I'm always a little wary of using double quotes when it's not clear
> what exactly a macro may expand to. Single quotes at least have less
> restrictions ...

Using single quotes sounds good.

Thanks,

-- 
Anthony PERARD



[qemu-mainline test] 167922: tolerable FAIL - PUSHED

2022-01-28 Thread osstest service owner
flight 167922 qemu-mainline real [real]
flight 167932 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167922/
http://logs.test-lab.xenproject.org/osstest/logs/167932/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 167932-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167884
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167884
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167884
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167884
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167884
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167884
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167884
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167884
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu cfe63e46be0a1f8a7fd2fd5547222f8344a43279
baseline version:
 qemuu 48302d4eb628ff0bea4d7e92cbf6b726410eb4c3

Last test of basis   167884  

Re: [XEN PATCH v9 03/30] build: fix exported variable name CFLAGS_stack_boundary

2022-01-28 Thread Jan Beulich
On 25.01.2022 12:00, Anthony PERARD wrote:
> Exporting a variable with a dash doesn't work reliably; it may be
> stripped from the environment when calling a sub-make or sub-shell.
> 
> CFLAGS-stack-boundary starts to be removed from the environment in patch
> "build: set ALL_OBJS in main Makefile; move prelink.o to main Makefile"
> when running `make "ALL_OBJS=.."` due to the addition of the quotes. At
> least in my empirical tests.
> 
> Fixes: 2740d96efd ("xen/build: have the root Makefile generates the CFLAGS")
> Signed-off-by: Anthony PERARD 

While I did commit this, I'm still somewhat confused. How would quoting
of elements on a make command line make a difference to which variables
get exported?

In any event I understand the description that prior to the subsequent
change there's not actually any issue. Hence I'm not going to queue
this for backporting despite the Fixes: tag. Unless of course I'm told
otherwise (with justification).

Jan




Re: [PATCH v2] tools/libs/light: don't touch nr_vcpus_out if listing vcpus and returning NULL

2022-01-28 Thread Anthony PERARD
On Fri, Jan 28, 2022 at 09:48:05AM +0100, Dario Faggioli wrote:
> If we are in libxl_list_vcpu() and we are returning NULL, let's avoid
> touching the output parameter *nr_vcpus_out, which the caller should
> have initialized to 0.
> 
> The current behavior could be problematic if we are creating a domain and,
> in the meantime, an existing one is destroyed when we have already done
> some steps of the loop. At that point, we'd return a NULL list of vcpus
> but with something other than 0 as the number of vcpus in that list.
> And this can cause trouble in the callers (e.g., nr_vcpus_on_nodes()),
> when they do a libxl_vcpuinfo_list_free().
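The invariant being fixed reduces to a small sketch (hypothetical names, not the libxl code): the out-parameter is written only on success, so a caller that initialized it to 0 can always pass both values to a free routine without the pointer and the count disagreeing.

```c
#include <stdlib.h>

struct vcpuinfo { int id; };

/* On failure, return NULL and leave *nr_out alone: the caller initialized
 * it to 0, so a later list_free(NULL, *nr_out) frees nothing. */
static struct vcpuinfo *list_vcpus(int navail, int *nr_out)
{
    if (navail < 0)
        return NULL;              /* error path: *nr_out untouched */

    struct vcpuinfo *v = calloc(navail, sizeof *v);
    if (!v)
        return NULL;              /* allocation failure: *nr_out untouched */

    *nr_out = navail;             /* only set once the list really exists */
    return v;
}
```

The bug was the inverse: the count was updated during the loop, so an error exit could return NULL alongside a non-zero count.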
> 
> Crashes due to this are rare and difficult to reproduce, but have been
> observed, with stack traces looking like this one:
> 
> #0  libxl_bitmap_dispose (map=map@entry=0x50) at libxl_utils.c:626
> #1  0x7fe72c993a32 in libxl_vcpuinfo_dispose (p=p@entry=0x38) at 
> _libxl_types.c:692
> #2  0x7fe72c94e3c4 in libxl_vcpuinfo_list_free (list=0x0, nr=<optimized out>) at libxl_utils.c:1059
> #3  0x7fe72c9528bf in nr_vcpus_on_nodes (vcpus_on_node=0x7fe71000eb60, 
> suitable_cpumap=0x7fe721df0d38, tinfo_elements=48, tinfo=0x7fe7101b3900, 
> gc=0x7fe7101bbfa0) at libxl_numa.c:258
> #4  libxl__get_numa_candidate (gc=gc@entry=0x7fe7100033a0, 
> min_free_memkb=4233216, min_cpus=4, min_nodes=min_nodes@entry=0, 
> max_nodes=max_nodes@entry=0, 
> suitable_cpumap=suitable_cpumap@entry=0x7fe721df0d38, 
> numa_cmpf=0x7fe72c940110 , cndt_out=0x7fe721df0cf0, 
> cndt_found=0x7fe721df0cb4) at libxl_numa.c:394
> #5  0x7fe72c94152b in numa_place_domain (d_config=0x7fe721df11b0, 
> domid=975, gc=0x7fe7100033a0) at libxl_dom.c:209
> #6  libxl__build_pre (gc=gc@entry=0x7fe7100033a0, domid=domid@entry=975, 
> d_config=d_config@entry=0x7fe721df11b0, state=state@entry=0x7fe710077700) at 
> libxl_dom.c:436
> #7  0x7fe72c92c4a5 in libxl__domain_build (gc=0x7fe7100033a0, 
> d_config=d_config@entry=0x7fe721df11b0, domid=975, state=0x7fe710077700) at 
> libxl_create.c:444
> #8  0x7fe72c92de8b in domcreate_bootloader_done (egc=0x7fe721df0f60, 
> bl=0x7fe7100778c0, rc=) at libxl_create.c:1222
> #9  0x7fe72c980425 in libxl__bootloader_run 
> (egc=egc@entry=0x7fe721df0f60, bl=bl@entry=0x7fe7100778c0) at 
> libxl_bootloader.c:403
> #10 0x7fe72c92f281 in initiate_domain_create 
> (egc=egc@entry=0x7fe721df0f60, dcs=dcs@entry=0x7fe7100771b0) at 
> libxl_create.c:1159
> #11 0x7fe72c92f456 in do_domain_create (ctx=ctx@entry=0x7fe71001c840, 
> d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, 
> restore_fd=restore_fd@entry=-1, send_back_fd=send_back_fd@entry=-1, 
> params=params@entry=0x0, ao_how=0x0, aop_console_how=0x7fe721df10f0) at 
> libxl_create.c:1856
> #12 0x7fe72c92f776 in libxl_domain_create_new (ctx=0x7fe71001c840, 
> d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, 
> ao_how=ao_how@entry=0x0, 
> aop_console_how=aop_console_how@entry=0x7fe721df10f0) at libxl_create.c:2075
> 
> Signed-off-by: Dario Faggioli 
> Tested-by: James Fehlig 
> ---
> This change should be backported to all supported branches.
> ---
> Changes from v1:
> - dropped patch 1; this one is enough of a fix
> - removed an assert() deemed non necessary
> - kept GC_FREE just before return in libxl_list_vcpu()
> - nr_vcpus is now unsigned
> - fix some typos
> ---

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v2 1/4] IOMMU/x86: switch to alternatives-call patching in further instances

2022-01-28 Thread Andrew Cooper
On 28/01/2022 10:36, Jan Beulich wrote:
> On 28.01.2022 10:28, Durrant, Paul wrote:
>> On 27/01/2022 14:47, Jan Beulich wrote:
>>> @@ -1457,24 +1462,24 @@ static int iommu_get_device_group(
>>>   if ( !is_iommu_enabled(d) || !ops->get_device_group_id )
>>>   return 0;
>>>   
>>> -group_id = ops->get_device_group_id(seg, bus, devfn);
>>> +group_id = iommu_call(ops, get_device_group_id, seg, bus, devfn);
>>>   
>>>   pcidevs_lock();
>>>   for_each_pdev( d, pdev )
>>>   {
>>> -if ( (pdev->seg != seg) ||
>>> - ((pdev->bus == bus) && (pdev->devfn == devfn)) )
>>> +unsigned int b = pdev->bus;
>>> +unsigned int df = pdev->devfn;
>>> +
>>> +if ( (pdev->seg != seg) || ((b == bus) && (df == devfn)) )
>>>   continue;
>>>   
>>> -if ( xsm_get_device_group(XSM_HOOK, (seg << 16) | (pdev->bus << 8) | pdev->devfn) )
>>> +if ( xsm_get_device_group(XSM_HOOK, (seg << 16) | (b << 8) | df) )
>>>   continue;
>>>   
>>> -sdev_id = ops->get_device_group_id(seg, pdev->bus, pdev->devfn);
>>> +sdev_id = iommu_call(ops, get_device_group_id, seg, b, df);
>>>   if ( (sdev_id == group_id) && (i < max_sdevs) )
>>>   {
>>> -bdf = 0;
>>> -bdf |= (pdev->bus & 0xff) << 16;
>>> -bdf |= (pdev->devfn & 0xff) << 8;
>>> +bdf = (b << 16) | (df << 8);
>> Don't we have a macro for this now? Probably best to start using it 
>> whilst modifying the code.
> We don't. And it would feel somewhat misleading to use PCI_BDF2(b, df) << 8
> here. The situation is even worse imo: Besides there not being a macro, I
> also cannot seem to find any documentation on this non-standard layout (BDF
> shifted left by 8). Yet then again I also can't spot any caller of
> xc_get_device_group() ...
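For reference, the conventional 16-bit BDF packs bus:device:function into 8/5/3 bits; the layout discussed above is that kind of value shifted left by a further 8 bits. A sketch of both (macro and function names here are illustrative, not necessarily Xen's):

```c
#include <stdint.h>

/* Standard PCI BDF layout: bus[15:8] device[7:3] function[2:0]. */
#define PCI_BDF(b, d, f) \
    ((((b) & 0xffu) << 8) | (((d) & 0x1fu) << 3) | ((f) & 0x7u))

/* The non-standard layout built in iommu_get_device_group() above:
 * (bus << 16) | (devfn << 8), i.e. a bus/devfn pair shifted left by 8. */
static inline uint32_t group_entry(uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)bus << 16) | ((uint32_t)devfn << 8);
}
```

Comparing the two makes the "shifted left by 8" observation concrete: the hypercall's entries are not plain BDFs and cannot be consumed by code expecting the standard encoding.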

I'm sure I already did the archaeology.

device groups were broken by a hypercall bounce buffering change 2 years
before the only caller was dropped with Xend.

This mess of a hypercall has demonstrably not been used in a decade.  I
firmly suggest dropping it, rather than wasting effort trying to unbreak
an interface which needs deleting anyway as the first step to doing
IOMMU groups.

~Andrew
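For readers following the layout discussion above: the conventional packing (as used for the XSM check in the quoted hunk) and the shifted packing produced for the group list differ as sketched below. The helper names are hypothetical, for illustration only; they are not Xen macros.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helpers (not part of Xen). The conventional packing puts
 * seg in bits 16+, bus in bits 8-15 and devfn in bits 0-7; the device
 * group hypercall instead returns the BDF shifted left by 8, i.e. bus in
 * bits 16-23 and devfn in bits 8-15, as in the quoted loop. */
static inline uint32_t sbdf_pack(uint32_t seg, uint32_t bus, uint32_t devfn)
{
    return (seg << 16) | (bus << 8) | devfn;   /* layout used for the XSM check */
}

static inline uint32_t devgroup_pack(uint32_t bus, uint32_t devfn)
{
    return (bus << 16) | (devfn << 8);         /* non-standard, shifted layout */
}
```

This also makes Jan's remark concrete: the group-list value equals the seg-0 conventional packing shifted left by 8.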


Re: [PATCH v2 1/4] IOMMU/x86: switch to alternatives-call patching in further instances

2022-01-28 Thread Rahul Singh
Hi Jan,

> On 27 Jan 2022, at 2:47 pm, Jan Beulich  wrote:
> 
> This is, once again, to limit the number of indirect calls as much as
> possible. The only hook invocation which isn't sensible to convert is
> setup(). And of course Arm-only use sites are left alone as well.
> 
> Note regarding the introduction / use of local variables in pci.c:
> struct pci_dev's involved fields are const. This const propagates, via
> typeof(), to the local helper variables in the altcall macros. These
> helper variables are, however, used as outputs (and hence can't be
> const). In iommu_get_device_group() make use of the new local variables
> to also simplify some adjacent code.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Rahul Singh 
Tested-by: Rahul Singh 

Regards,
Rahul
> 
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -198,7 +198,7 @@ int iommu_domain_init(struct domain *d,
> return ret;
> 
> hd->platform_ops = iommu_get_ops();
> -ret = hd->platform_ops->init(d);
> +ret = iommu_call(hd->platform_ops, init, d);
> if ( ret || is_system_domain(d) )
> return ret;
> 
> @@ -233,7 +233,7 @@ void __hwdom_init iommu_hwdom_init(struc
> 
> register_keyhandler('o', &iommu_dump_page_tables, "dump iommu page tables", 0);
> 
> -hd->platform_ops->hwdom_init(d);
> +iommu_vcall(hd->platform_ops, hwdom_init, d);
> }
> 
> static void iommu_teardown(struct domain *d)
> @@ -576,7 +576,7 @@ int iommu_get_reserved_device_memory(iom
> if ( !ops->get_reserved_device_memory )
> return 0;
> 
> -return ops->get_reserved_device_memory(func, ctxt);
> +return iommu_call(ops, get_reserved_device_memory, func, ctxt);
> }
> 
> bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
> @@ -603,7 +603,7 @@ static void iommu_dump_page_tables(unsig
> continue;
> }
> 
> -dom_iommu(d)->platform_ops->dump_page_tables(d);
> +iommu_vcall(dom_iommu(d)->platform_ops, dump_page_tables, d);
> }
> 
> rcu_read_unlock(&domlist_read_lock);
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -861,15 +861,15 @@ static int deassign_device(struct domain
> devfn += pdev->phantom_stride;
> if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
> break;
> -ret = hd->platform_ops->reassign_device(d, target, devfn,
> -pci_to_dev(pdev));
> +ret = iommu_call(hd->platform_ops, reassign_device, d, target, devfn,
> + pci_to_dev(pdev));
> if ( ret )
> goto out;
> }
> 
> devfn = pdev->devfn;
> -ret = hd->platform_ops->reassign_device(d, target, devfn,
> -pci_to_dev(pdev));
> +ret = iommu_call(hd->platform_ops, reassign_device, d, target, devfn,
> + pci_to_dev(pdev));
> if ( ret )
> goto out;
> 
> @@ -1300,7 +1300,7 @@ static int iommu_add_device(struct pci_d
> {
> const struct domain_iommu *hd;
> int rc;
> -u8 devfn;
> +unsigned int devfn = pdev->devfn;
> 
> if ( !pdev->domain )
> return -EINVAL;
> @@ -1311,16 +1311,16 @@ static int iommu_add_device(struct pci_d
> if ( !is_iommu_enabled(pdev->domain) )
> return 0;
> 
> -rc = hd->platform_ops->add_device(pdev->devfn, pci_to_dev(pdev));
> +rc = iommu_call(hd->platform_ops, add_device, devfn, pci_to_dev(pdev));
> if ( rc || !pdev->phantom_stride )
> return rc;
> 
> -for ( devfn = pdev->devfn ; ; )
> +for ( ; ; )
> {
> devfn += pdev->phantom_stride;
> if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
> return 0;
> -rc = hd->platform_ops->add_device(devfn, pci_to_dev(pdev));
> +rc = iommu_call(hd->platform_ops, add_device, devfn, 
> pci_to_dev(pdev));
> if ( rc )
> printk(XENLOG_WARNING "IOMMU: add %pp failed (%d)\n",
>&pdev->sbdf, rc);
> @@ -1341,7 +1341,7 @@ static int iommu_enable_device(struct pc
>  !hd->platform_ops->enable_device )
> return 0;
> 
> -return hd->platform_ops->enable_device(pci_to_dev(pdev));
> +return iommu_call(hd->platform_ops, enable_device, pci_to_dev(pdev));
> }
> 
> static int iommu_remove_device(struct pci_dev *pdev)
> @@ -1363,7 +1363,8 @@ static int iommu_remove_device(struct pc
> devfn += pdev->phantom_stride;
> if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
> break;
> -rc = hd->platform_ops->remove_device(devfn, pci_to_dev(pdev));
> +rc = iommu_call(hd->platform_ops, remove_device, devfn,
> +pci_to_dev(pdev));
> if ( !rc )
> continue;
> 
> @@ -1371,7 +1372,9 @@ static int iommu_remove_device(struct pc
> return rc;
> }
> 
> -return hd->platform_ops->remove_device(pdev->devfn, pci_to_dev(pdev));
> +devfn = pdev->devfn;
> +
> +return 
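The conversion pattern in the patch above is mechanical: an indirect call through the ops table becomes an `iommu_call()`/`iommu_vcall()` invocation with the same arguments. As a rough sketch of the shape only (this is not Xen's real macro, which patches call sites into direct calls at boot; a plain fallback behaves identically):

```c
#include <assert.h>

/* Fallback-style sketch of the wrapper (assumption: simplified stand-in).
 * It takes the same arguments as the indirect call, so a call site like
 * ops->hook(a, b) converts to iommu_call(ops, hook, a, b) everywhere.
 * The real altcall machinery rewrites such sites at patch time. */
#define iommu_call(ops, hook, ...)  ((ops)->hook(__VA_ARGS__))
#define iommu_vcall(ops, hook, ...) ((void)(ops)->hook(__VA_ARGS__))

/* Hypothetical, cut-down ops table for illustration. */
struct iommu_ops_sketch {
    int (*init)(int d);
};

static int init_impl(int d) { return d + 1; }

static const struct iommu_ops_sketch ops_sketch = { .init = init_impl };
```

The `iommu_vcall()` variant mirrors the patch's use for void hooks such as `hwdom_init` and `dump_page_tables`.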

Re: [PATCH v2 1/4] IOMMU/x86: switch to alternatives-call patching in further instances

2022-01-28 Thread Jan Beulich
On 28.01.2022 10:28, Durrant, Paul wrote:
> On 27/01/2022 14:47, Jan Beulich wrote:
>> @@ -1457,24 +1462,24 @@ static int iommu_get_device_group(
>>   if ( !is_iommu_enabled(d) || !ops->get_device_group_id )
>>   return 0;
>>   
>> -group_id = ops->get_device_group_id(seg, bus, devfn);
>> +group_id = iommu_call(ops, get_device_group_id, seg, bus, devfn);
>>   
>>   pcidevs_lock();
>>   for_each_pdev( d, pdev )
>>   {
>> -if ( (pdev->seg != seg) ||
>> - ((pdev->bus == bus) && (pdev->devfn == devfn)) )
>> +unsigned int b = pdev->bus;
>> +unsigned int df = pdev->devfn;
>> +
>> +if ( (pdev->seg != seg) || ((b == bus) && (df == devfn)) )
>>   continue;
>>   
>> -if ( xsm_get_device_group(XSM_HOOK, (seg << 16) | (pdev->bus << 8) 
>> | pdev->devfn) )
>> +if ( xsm_get_device_group(XSM_HOOK, (seg << 16) | (b << 8) | df) )
>>   continue;
>>   
>> -sdev_id = ops->get_device_group_id(seg, pdev->bus, pdev->devfn);
>> +sdev_id = iommu_call(ops, get_device_group_id, seg, b, df);
>>   if ( (sdev_id == group_id) && (i < max_sdevs) )
>>   {
>> -bdf = 0;
>> -bdf |= (pdev->bus & 0xff) << 16;
>> -bdf |= (pdev->devfn & 0xff) << 8;
>> +bdf = (b << 16) | (df << 8);
> 
> Don't we have a macro for this now? Probably best to start using it 
> whilst modifying the code.

We don't. And it would feel somewhat misleading to use PCI_BDF2(b, df) << 8
here. The situation is even worse imo: Besides there not being a macro, I
also cannot seem to find any documentation on this non-standard layout (BDF
shifted left by 8). Yet then again I also can't spot any caller of
xc_get_device_group() ...

> Reviewed-by: Paul Durrant 

Thanks.

Jan




Re: [PATCH V3 2/2] iommu/arm: Remove code duplication in all IOMMU drivers

2022-01-28 Thread Rahul Singh
Hi Oleksandr,

> On 27 Jan 2022, at 7:55 pm, Oleksandr Tyshchenko  wrote:
> 
> From: Oleksandr Tyshchenko 
> 
> All IOMMU drivers on Arm perform almost the same generic actions in
> hwdom_init callback. Move this code to common arch_iommu_hwdom_init()
> in order to get rid of code duplication.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> Reviewed-by: Volodymyr Babchuk 
> Reviewed-by: Yoshihiro Shimoda 

Acked-by: Rahul Singh 

Regards,
Rahul
> ---
> Changes V1 -> V2:
>   - add R-b
> 
> Changes V2 -> V3:
>   - drop platform specific *_iommu_hwdom_init(), make .hwdom_init
> to directly point to the common arch_iommu_hwdom_init()
> ---
> xen/drivers/passthrough/arm/iommu.c  |  7 +++
> xen/drivers/passthrough/arm/ipmmu-vmsa.c | 15 +--
> xen/drivers/passthrough/arm/smmu-v3.c| 17 +
> xen/drivers/passthrough/arm/smmu.c   | 17 +
> 4 files changed, 10 insertions(+), 46 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/iommu.c 
> b/xen/drivers/passthrough/arm/iommu.c
> index ee653a9..fc45318 100644
> --- a/xen/drivers/passthrough/arm/iommu.c
> +++ b/xen/drivers/passthrough/arm/iommu.c
> @@ -134,6 +134,13 @@ void arch_iommu_domain_destroy(struct domain *d)
> 
> void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> {
> +/* Set to false options not supported on ARM. */
> +if ( iommu_hwdom_inclusive )
> +printk(XENLOG_WARNING "map-inclusive dom0-iommu option is not 
> supported on ARM\n");
> +iommu_hwdom_inclusive = false;
> +if ( iommu_hwdom_reserved == 1 )
> +printk(XENLOG_WARNING "map-reserved dom0-iommu option is not 
> supported on ARM\n");
> +iommu_hwdom_reserved = 0;
> }
> 
> /*
> diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c 
> b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> index c912120..d2572bc 100644
> --- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> @@ -1329,19 +1329,6 @@ static int ipmmu_iommu_domain_init(struct domain *d)
> return 0;
> }
> 
> -static void __hwdom_init ipmmu_iommu_hwdom_init(struct domain *d)
> -{
> -/* Set to false options not supported on ARM. */
> -if ( iommu_hwdom_inclusive )
> -printk(XENLOG_WARNING "ipmmu: map-inclusive dom0-iommu option is not 
> supported on ARM\n");
> -iommu_hwdom_inclusive = false;
> -if ( iommu_hwdom_reserved == 1 )
> -printk(XENLOG_WARNING "ipmmu: map-reserved dom0-iommu option is not 
> supported on ARM\n");
> -iommu_hwdom_reserved = 0;
> -
> -arch_iommu_hwdom_init(d);
> -}
> -
> static void ipmmu_iommu_domain_teardown(struct domain *d)
> {
> struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> @@ -1369,7 +1356,7 @@ static void ipmmu_iommu_domain_teardown(struct domain 
> *d)
> static const struct iommu_ops ipmmu_iommu_ops =
> {
> .init= ipmmu_iommu_domain_init,
> -.hwdom_init  = ipmmu_iommu_hwdom_init,
> +.hwdom_init  = arch_iommu_hwdom_init,
> .teardown= ipmmu_iommu_domain_teardown,
> .iotlb_flush = ipmmu_iotlb_flush,
> .iotlb_flush_all = ipmmu_iotlb_flush_all,
> diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
> b/xen/drivers/passthrough/arm/smmu-v3.c
> index d115df7..71b022f 100644
> --- a/xen/drivers/passthrough/arm/smmu-v3.c
> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
> @@ -3402,21 +3402,6 @@ static int arm_smmu_iommu_xen_domain_init(struct 
> domain *d)
> 
> }
> 
> -static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
> -{
> - /* Set to false options not supported on ARM. */
> - if (iommu_hwdom_inclusive)
> - printk(XENLOG_WARNING
> - "map-inclusive dom0-iommu option is not supported on ARM\n");
> - iommu_hwdom_inclusive = false;
> - if (iommu_hwdom_reserved == 1)
> - printk(XENLOG_WARNING
> - "map-reserved dom0-iommu option is not supported on ARM\n");
> - iommu_hwdom_reserved = 0;
> -
> - arch_iommu_hwdom_init(d);
> -}
> -
> static void arm_smmu_iommu_xen_domain_teardown(struct domain *d)
> {
>   struct arm_smmu_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> @@ -3427,7 +3412,7 @@ static void arm_smmu_iommu_xen_domain_teardown(struct 
> domain *d)
> 
> static const struct iommu_ops arm_smmu_iommu_ops = {
>   .init   = arm_smmu_iommu_xen_domain_init,
> - .hwdom_init = arm_smmu_iommu_hwdom_init,
> + .hwdom_init = arch_iommu_hwdom_init,
>   .teardown   = arm_smmu_iommu_xen_domain_teardown,
>   .iotlb_flush= arm_smmu_iotlb_flush,
>   .iotlb_flush_all= arm_smmu_iotlb_flush_all,
> diff --git a/xen/drivers/passthrough/arm/smmu.c 
> b/xen/drivers/passthrough/arm/smmu.c
> index c9dfc4c..b186c28 100644
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -2849,21 +2849,6 @@ static int arm_smmu_iommu_domain_init(struct domain *d)
>   return 0;
> }
> 
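The shape of the deduplication above can be restated in a toy model (hypothetical, cut-down types): instead of each driver wrapping the common `arch_iommu_hwdom_init()` in an identical shim, the per-driver ops tables point at the shared function directly.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the patch's structure, not real Xen types. */
struct domain;
struct iommu_ops { void (*hwdom_init)(struct domain *d); };

static int init_calls;

/* Common handler shared by all drivers, as in the patch. */
static void arch_iommu_hwdom_init(struct domain *d)
{
    (void)d;
    init_calls++;
}

/* Both ops tables reference the common function directly,
 * removing the per-driver wrappers. */
static const struct iommu_ops ipmmu_ops = { .hwdom_init = arch_iommu_hwdom_init };
static const struct iommu_ops smmu_ops  = { .hwdom_init = arch_iommu_hwdom_init };
```
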

Re: [XEN v1] xen/arm: io: Check ESR_EL2.ISV != 0 before searching for a MMIO handler

2022-01-28 Thread Julien Grall




On 28/01/2022 01:20, Stefano Stabellini wrote:

On Thu, 27 Jan 2022, Julien Grall wrote:

On Thu, 27 Jan 2022 at 23:05, Julien Grall  wrote:


On Thu, 27 Jan 2022 at 22:40, Stefano Stabellini  wrote:

I am with you on both points.

One thing I noticed is that the code today is not able to deal with
IO_UNHANDLED for MMIO regions handled by IOREQ servers or Xen MMIO
emulator handlers. p2m_resolve_translation_fault and try_map_mmio are
called after try_handle_mmio returns IO_UNHANDLED but try_handle_mmio is
not called a second time (or am I mistaken?)


Why would you need it? If try_mmio_fault() doesn't work the first time, then


Sorry I meant try_handle_mmio().


it will not work the second time.


I think I explained myself badly, I'll try again below.



Another thing I noticed is that currently find_mmio_handler and
try_fwd_ioserv expect dabt to be already populated and valid so it would
be better if we could get there only when dabt.valid.

With these two things in mind, I think maybe the best thing to do is to
change the code in do_trap_stage2_abort_guest slightly so that
p2m_resolve_translation_fault and try_map_mmio are called first when
!dabt.valid.


An abort will most likely happen because of emulated I/O. If we call
p2m_resolve_translation_fault() and try_map_mmio() first, then it means
the processing will take longer than necessary for the common case.

So I think we want to keep the order as it is. I.e first trying the MMIO
and then falling back to the less likely reason for a trap.


Yeah I thought about it as well. The idea would be that if dabt.valid is
set then we leave things as they are (we call try_handle_mmio first) but
if dabt.valid is not set (it is not valid) then we skip the
try_handle_mmio() call because it wouldn't succeed anyway and go
directly to p2m_resolve_translation_fault() and try_map_mmio().

If either of them work (also reading what you wrote about it) then we
return immediately.


Ok. So the assumption is that a data abort with an invalid syndrome would most
likely be because of a fault handled by p2m_resolve_translation_fault().


I think this makes sense. However, I am not convinced we can currently 
safely call try_map_mmio() before try_handle_mmio(). This is because the 
logic in try_map_mmio() is quite fragile and we may mistakenly map an 
emulated region.


Similarly, we can't call try_map_mmio() before
p2m_resolve_translation_fault() because a transient fault may be
misinterpreted.

I think we may be able to harden try_map_mmio() by checking if the I/O 
region is emulated. But this will need to be fully thought through first.




If not, then we call decode_instruction from do_trap_stage2_abort_guest
and try again. The second time dabt.valid is set so we end up calling
try_handle_mmio() as usual.


With the approach below, you will also end up calling 
p2m_resolve_translation_fault() and try_map_mmio() a second time if 
try_handle_mmio() fails.




Just for clarity let me copy/paste the relevant code, apologies if it
was already obvious to you -- I got the impression my suggestion wasn't
very clear.



+again:
+if ( is_data && hsr.dabt.valid )
 {
 enum io_state state = try_handle_mmio(regs, hsr, gpa);

 switch ( state )
 {
 case IO_ABORT:
 goto inject_abt;
 case IO_HANDLED:
 advance_pc(regs, hsr);
 return;
 case IO_RETRY:
 /* finish later */
 return;
 case IO_UNHANDLED:
 /* IO unhandled, try another way to handle it. */
 break;
 }
 }

 /*
  * First check if the translation fault can be resolved by the
  * P2M subsystem. If that's the case nothing else to do.
  */
 if ( p2m_resolve_translation_fault(current->domain,
gaddr_to_gfn(gpa)) )
 return;

 if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
 return;

+if ( !hsr.dabt.valid )

One more thing I noticed: a stage 2 fault can also happen on an access 
made for a stage 1 translation walk. In this case, I think we don't want 
to decode the instruction.


So this would need to be !hsr.dabt.valid && !hsr.dabt.s1ptw. Depending 
on which patch we go with, this would also need to be adjusted in the 
other one as well.


Cheers,

--
Julien Grall
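The condition the thread converges on fits in one predicate: the faulting instruction is only decoded when the syndrome is invalid and the abort was not taken on a stage-1 translation table walk. A minimal, self-contained restatement (simplified from the real handler, for clarity only):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified predicate for the discussion above: decode the instruction
 * only if ESR_EL2.ISV == 0 (syndrome invalid, dabt.valid clear) and the
 * abort was not on a stage-1 page-table walk (s1ptw clear). */
static bool should_decode_insn(bool dabt_valid, bool s1ptw)
{
    return !dabt_valid && !s1ptw;
}
```

With a valid syndrome, try_handle_mmio() runs as today; with s1ptw set, decoding is skipped regardless.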



[ovmf test] 167929: all pass - PUSHED

2022-01-28 Thread osstest service owner
flight 167929 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167929/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf a867f3a7042ae0b4c1189bececbe72aa8a8a8e27
baseline version:
 ovmf 6777e673839a510aaa62a48514a4223da7d3bba2

Last test of basis   167919  2022-01-27 13:10:49 Z0 days
Testing same since   167929  2022-01-28 02:41:45 Z0 days1 attempts


People who touched revisions under test:
  Liu, Zhiguang 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   6777e67383..a867f3a704  a867f3a7042ae0b4c1189bececbe72aa8a8a8e27 -> 
xen-tested-master



Re: [PATCH v2 4/4] IOMMU/PCI: propagate get_device_group_id() failure

2022-01-28 Thread Durrant, Paul

On 27/01/2022 14:49, Jan Beulich wrote:

The VT-d hook can indicate an error, which shouldn't be ignored. Convert
the hook's return value to a proper error code, and let that bubble up.

Signed-off-by: Jan Beulich 


Reviewed-by: Paul Durrant 



Re: [PATCH v2 3/4] VT-d: replace flush_all_cache()

2022-01-28 Thread Durrant, Paul

On 27/01/2022 14:49, Jan Beulich wrote:

Let's use infrastructure we have available instead of an open-coded
wbinvd() invocation.

Signed-off-by: Jan Beulich 


Reviewed-by: Paul Durrant 



Re: [PATCH v2 2/4] VT-d / x86: re-arrange cache syncing

2022-01-28 Thread Durrant, Paul

On 27/01/2022 14:48, Jan Beulich wrote:

The actual function should always have lived in core x86 code; move it
there, replacing get_cache_line_size() by readily available (except very
early during boot; see the code comment) data. Also rename the function.

Drop the respective IOMMU hook, (re)introducing a respective boolean
instead. Replace a true and an almost open-coding instance of
iommu_sync_cache().

Signed-off-by: Jan Beulich 


Reviewed-by: Paul Durrant 



Re: [PATCH v2 1/4] IOMMU/x86: switch to alternatives-call patching in further instances

2022-01-28 Thread Durrant, Paul

On 27/01/2022 14:47, Jan Beulich wrote:

This is, once again, to limit the number of indirect calls as much as
possible. The only hook invocation which isn't sensible to convert is
setup(). And of course Arm-only use sites are left alone as well.

Note regarding the introduction / use of local variables in pci.c:
struct pci_dev's involved fields are const. This const propagates, via
typeof(), to the local helper variables in the altcall macros. These
helper variables are, however, used as outputs (and hence can't be
const). In iommu_get_device_group() make use of the new local variables
to also simplify some adjacent code.

Signed-off-by: Jan Beulich 

--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -198,7 +198,7 @@ int iommu_domain_init(struct domain *d,
  return ret;
  
  hd->platform_ops = iommu_get_ops();

-ret = hd->platform_ops->init(d);
+ret = iommu_call(hd->platform_ops, init, d);
  if ( ret || is_system_domain(d) )
  return ret;
  
@@ -233,7 +233,7 @@ void __hwdom_init iommu_hwdom_init(struc
  
  register_keyhandler('o', &iommu_dump_page_tables, "dump iommu page tables", 0);
  
-hd->platform_ops->hwdom_init(d);

+iommu_vcall(hd->platform_ops, hwdom_init, d);
  }
  
  static void iommu_teardown(struct domain *d)

@@ -576,7 +576,7 @@ int iommu_get_reserved_device_memory(iom
  if ( !ops->get_reserved_device_memory )
  return 0;
  
-return ops->get_reserved_device_memory(func, ctxt);

+return iommu_call(ops, get_reserved_device_memory, func, ctxt);
  }
  
  bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)

@@ -603,7 +603,7 @@ static void iommu_dump_page_tables(unsig
  continue;
  }
  
-dom_iommu(d)->platform_ops->dump_page_tables(d);

+iommu_vcall(dom_iommu(d)->platform_ops, dump_page_tables, d);
  }
  
  rcu_read_unlock(&domlist_read_lock);

--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -861,15 +861,15 @@ static int deassign_device(struct domain
  devfn += pdev->phantom_stride;
  if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
  break;
-ret = hd->platform_ops->reassign_device(d, target, devfn,
-pci_to_dev(pdev));
+ret = iommu_call(hd->platform_ops, reassign_device, d, target, devfn,
+ pci_to_dev(pdev));
  if ( ret )
  goto out;
  }
  
  devfn = pdev->devfn;

-ret = hd->platform_ops->reassign_device(d, target, devfn,
-pci_to_dev(pdev));
+ret = iommu_call(hd->platform_ops, reassign_device, d, target, devfn,
+ pci_to_dev(pdev));
  if ( ret )
  goto out;
  
@@ -1300,7 +1300,7 @@ static int iommu_add_device(struct pci_d

  {
  const struct domain_iommu *hd;
  int rc;
-u8 devfn;
+unsigned int devfn = pdev->devfn;
  
  if ( !pdev->domain )

  return -EINVAL;
@@ -1311,16 +1311,16 @@ static int iommu_add_device(struct pci_d
  if ( !is_iommu_enabled(pdev->domain) )
  return 0;
  
-rc = hd->platform_ops->add_device(pdev->devfn, pci_to_dev(pdev));

+rc = iommu_call(hd->platform_ops, add_device, devfn, pci_to_dev(pdev));
  if ( rc || !pdev->phantom_stride )
  return rc;
  
-for ( devfn = pdev->devfn ; ; )

+for ( ; ; )
  {
  devfn += pdev->phantom_stride;
  if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
  return 0;
-rc = hd->platform_ops->add_device(devfn, pci_to_dev(pdev));
+rc = iommu_call(hd->platform_ops, add_device, devfn, pci_to_dev(pdev));
  if ( rc )
  printk(XENLOG_WARNING "IOMMU: add %pp failed (%d)\n",
 &pdev->sbdf, rc);
@@ -1341,7 +1341,7 @@ static int iommu_enable_device(struct pc
   !hd->platform_ops->enable_device )
  return 0;
  
-return hd->platform_ops->enable_device(pci_to_dev(pdev));

+return iommu_call(hd->platform_ops, enable_device, pci_to_dev(pdev));
  }
  
  static int iommu_remove_device(struct pci_dev *pdev)

@@ -1363,7 +1363,8 @@ static int iommu_remove_device(struct pc
  devfn += pdev->phantom_stride;
  if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
  break;
-rc = hd->platform_ops->remove_device(devfn, pci_to_dev(pdev));
+rc = iommu_call(hd->platform_ops, remove_device, devfn,
+pci_to_dev(pdev));
  if ( !rc )
  continue;
  
@@ -1371,7 +1372,9 @@ static int iommu_remove_device(struct pc

  return rc;
  }
  
-return hd->platform_ops->remove_device(pdev->devfn, pci_to_dev(pdev));

+devfn = pdev->devfn;
+
+return iommu_call(hd->platform_ops, remove_device, devfn, 
pci_to_dev(pdev));
  }
  
  static int device_assigned(u16 seg, u8 bus, u8 devfn)

@@ -1421,7 +1424,8 @@ static int assign_device(struct domain *
 

[PATCH v2] tools/libs/light: don't touch nr_vcpus_out if listing vcpus and returning NULL

2022-01-28 Thread Dario Faggioli
If we are in libxl_list_vcpu() and we are returning NULL, let's avoid
touching the output parameter *nr_vcpus_out, which the caller should
have initialized to 0.

The current behavior could be problematic if we are creating a domain and,
in the meantime, an existing one is destroyed when we have already done
some steps of the loop. At which point, we'd return a NULL list of vcpus
but with something different than 0 as the number of vcpus in that list.
And this can cause troubles in the callers (e.g., nr_vcpus_on_nodes()),
when they do a libxl_vcpuinfo_list_free().

Crashes due to this are rare and difficult to reproduce, but have been
observed, with stack traces looking like this one:

#0  libxl_bitmap_dispose (map=map@entry=0x50) at libxl_utils.c:626
#1  0x7fe72c993a32 in libxl_vcpuinfo_dispose (p=p@entry=0x38) at 
_libxl_types.c:692
#2  0x7fe72c94e3c4 in libxl_vcpuinfo_list_free (list=0x0, nr=) at libxl_utils.c:1059
#3  0x7fe72c9528bf in nr_vcpus_on_nodes (vcpus_on_node=0x7fe71000eb60, 
suitable_cpumap=0x7fe721df0d38, tinfo_elements=48, tinfo=0x7fe7101b3900, 
gc=0x7fe7101bbfa0) at libxl_numa.c:258
#4  libxl__get_numa_candidate (gc=gc@entry=0x7fe7100033a0, 
min_free_memkb=4233216, min_cpus=4, min_nodes=min_nodes@entry=0, 
max_nodes=max_nodes@entry=0, 
suitable_cpumap=suitable_cpumap@entry=0x7fe721df0d38, numa_cmpf=0x7fe72c940110 
, cndt_out=0x7fe721df0cf0, cndt_found=0x7fe721df0cb4) at 
libxl_numa.c:394
#5  0x7fe72c94152b in numa_place_domain (d_config=0x7fe721df11b0, 
domid=975, gc=0x7fe7100033a0) at libxl_dom.c:209
#6  libxl__build_pre (gc=gc@entry=0x7fe7100033a0, domid=domid@entry=975, 
d_config=d_config@entry=0x7fe721df11b0, state=state@entry=0x7fe710077700) at 
libxl_dom.c:436
#7  0x7fe72c92c4a5 in libxl__domain_build (gc=0x7fe7100033a0, 
d_config=d_config@entry=0x7fe721df11b0, domid=975, state=0x7fe710077700) at 
libxl_create.c:444
#8  0x7fe72c92de8b in domcreate_bootloader_done (egc=0x7fe721df0f60, 
bl=0x7fe7100778c0, rc=) at libxl_create.c:1222
#9  0x7fe72c980425 in libxl__bootloader_run (egc=egc@entry=0x7fe721df0f60, 
bl=bl@entry=0x7fe7100778c0) at libxl_bootloader.c:403
#10 0x7fe72c92f281 in initiate_domain_create (egc=egc@entry=0x7fe721df0f60, 
dcs=dcs@entry=0x7fe7100771b0) at libxl_create.c:1159
#11 0x7fe72c92f456 in do_domain_create (ctx=ctx@entry=0x7fe71001c840, 
d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, 
restore_fd=restore_fd@entry=-1, send_back_fd=send_back_fd@entry=-1, 
params=params@entry=0x0, ao_how=0x0, aop_console_how=0x7fe721df10f0) at 
libxl_create.c:1856
#12 0x7fe72c92f776 in libxl_domain_create_new (ctx=0x7fe71001c840, 
d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, 
ao_how=ao_how@entry=0x0, aop_console_how=aop_console_how@entry=0x7fe721df10f0) 
at libxl_create.c:2075

Signed-off-by: Dario Faggioli 
Tested-by: James Fehlig 
---
Cc: Wei Liu 
Cc: Anthony PERARD 
Cc: Juergen Gross 
---
This change should be backported to all supported branches.
---
Changes from v1:
- dropped patch 1; this one is enough of a fix
- removed an assert() deemed unnecessary
- kept GC_FREE just before return in libxl_list_vcpu()
- nr_vcpus is now unsigned
- fix some typos
---
 tools/libs/light/libxl_domain.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)
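The pattern of the fix, stripped of libxl specifics (all names below are hypothetical): accumulate into a local counter and write the output parameter only on the success path, so an error return leaves the caller's 0-initialised count untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the output-parameter discipline the patch adopts. On failure
 * the function returns NULL without touching *nr_out; on success it
 * publishes the final count exactly once. */
static int *list_items(unsigned int *nr_out, int fail_at)
{
    static int items[4];
    unsigned int nr;

    for (nr = 0; nr < 4; nr++) {
        if ((int)nr == fail_at)
            return NULL;      /* error path: *nr_out is not modified */
        items[nr] = (int)nr;
    }
    *nr_out = nr;             /* success path: publish the count once */
    return items;
}
```

Callers that free "NULL list, count N" (as nr_vcpus_on_nodes() does via libxl_vcpuinfo_list_free()) then never see a non-zero count paired with a NULL list.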

diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index 544a9bf59d..d438232117 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -1661,6 +1661,7 @@ libxl_vcpuinfo *libxl_list_vcpu(libxl_ctx *ctx, uint32_t 
domid,
 libxl_vcpuinfo *ptr, *ret;
 xc_domaininfo_t domaininfo;
 xc_vcpuinfo_t vcpuinfo;
+unsigned int nr_vcpus;
 
 if (xc_domain_getinfolist(ctx->xch, domid, 1, &domaininfo) != 1) {
 LOGED(ERROR, domid, "Getting infolist");
@@ -1677,33 +1678,34 @@ libxl_vcpuinfo *libxl_list_vcpu(libxl_ctx *ctx, 
uint32_t domid,
 ret = ptr = libxl__calloc(NOGC, domaininfo.max_vcpu_id + 1,
   sizeof(libxl_vcpuinfo));
 
-for (*nr_vcpus_out = 0;
- *nr_vcpus_out <= domaininfo.max_vcpu_id;
- ++*nr_vcpus_out, ++ptr) {
+for (nr_vcpus = 0;
+ nr_vcpus <= domaininfo.max_vcpu_id;
+ ++nr_vcpus, ++ptr) {
 libxl_bitmap_init(&ptr->cpumap);
 if (libxl_cpu_bitmap_alloc(ctx, &ptr->cpumap, 0))
 goto err;
 libxl_bitmap_init(&ptr->cpumap_soft);
 if (libxl_cpu_bitmap_alloc(ctx, &ptr->cpumap_soft, 0))
 goto err;
-if (xc_vcpu_getinfo(ctx->xch, domid, *nr_vcpus_out, &vcpuinfo) == -1) {
+if (xc_vcpu_getinfo(ctx->xch, domid, nr_vcpus, &vcpuinfo) == -1) {
 LOGED(ERROR, domid, "Getting vcpu info");
 goto err;
 }
 
-if (xc_vcpu_getaffinity(ctx->xch, domid, *nr_vcpus_out,
+if (xc_vcpu_getaffinity(ctx->xch, domid, nr_vcpus,
 ptr->cpumap.map, ptr->cpumap_soft.map,