Re: [PATCH] tools: ROUNDUP() related adjustments

2021-08-18 Thread Juergen Gross

On 10.08.21 12:03, Jan Beulich wrote:

For one xc_private.h needlessly repeats xen-tools/libs.h's definition.

And then there are two suspicious uses (resulting from the inconsistency
with the respective 2nd parameter of DIV_ROUNDUP()): While the one in
tools/console/daemon/io.c - as per the code comment - intentionally uses
8 as the second argument (meaning to align to a multiple of 256), the
one in alloc_magic_pages_hvm() pretty certainly does not: There the goal
is to align to a uint64_t boundary, for the following module struct to
end up aligned.

Signed-off-by: Jan Beulich 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH] mini-os: netfront: fix initialization without ip address in xenstore

2021-08-18 Thread Juergen Gross
Commit 4821876fcd2ff ("mini-os: netfront: fix suspend/resume handling")
introduced a NULL pointer dereference in the initialization of netfront
in the case of no IP address being set in Xenstore.

Fix that by testing this condition. At the same time fix a long
standing bug for the same condition if someone used init_netfront()
with a non-NULL ip parameter.

Fixes: 4821876fcd2ff ("mini-os: netfront: fix suspend/resume handling")
Signed-off-by: Juergen Gross 
---
 netfront.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/netfront.c b/netfront.c
index f927e99..dfe065b 100644
--- a/netfront.c
+++ b/netfront.c
@@ -365,7 +365,7 @@ out:
 rawmac[5] = dev->rawmac[5];
}
 if (ip)
-*ip = strdup(dev->ip);
+*ip = dev->ip ? strdup(dev->ip) : NULL;
 
 err:
 return dev;
@@ -527,7 +527,7 @@ done:
 snprintf(path, sizeof(path), "%s/ip", dev->backend);
 xenbus_read(XBT_NIL, path, >ip);
 
-p = strchr(dev->ip, ' ');
+p = dev->ip ? strchr(dev->ip, ' ') : NULL;
 if (p) {
 *p++ = '\0';
 dev->mask = p;
-- 
2.26.2




[xen-unstable test] 164237: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164237 xen-unstable real [real]
flight 164246 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/164237/
http://logs.test-lab.xenproject.org/osstest/logs/164246/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 164178
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 164178

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 164178
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164178
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 164178
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 164178
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 164178
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 164178
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 164178
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 164178
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164178
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164178
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 164178
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass

version targeted for testing:
 xen  54c9736382e0d558a6acd820e44185e020131c48
baseline version:
 xen  5a88d524857e5bf78b077d30ea515fcaac061bfc

Last test of basis   164178  2021-08-13 10:47:20 Z5 days
Failing since164182  2021-08-14 00:39:12 Z5 days8 attempts
Testing same since   164237  2021-08-18 07:24:21 Z0 days

[qemu-mainline test] 164234: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164234 qemu-mainline real [real]
flight 164243 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/164234/
http://logs.test-lab.xenproject.org/osstest/logs/164243/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 164152
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 164152

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 164152

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 164152
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 164152
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164152
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164152
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 164152
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164152
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 164152
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuuecf2706e271fa705621f0d5ad9517fe15a22bf22
baseline version:
 qemuu703e8cd6189cf699c8d5c094bc68b5f3afa6ad71

Last test of basis   164152  2021-08-10 21:08:02 Z8 days
Failing since164194  2021-08-15 10:38:08 Z3 days5 attempts
Testing same since   164234  2021-08-18 02:37:17 Z0 days1 attempts


People who touched revisions under test:
  Alexander Bulekov 
  Andrew Jones 
  Daniel P. Berrangé 
  

[PATCH v3 6/6] x86: change asm/debugger.h to xen/debugger.h

2021-08-18 Thread Bobby Eshleman
This commit allows non-x86 architecture to omit the file asm/debugger.h
if they do not require it.  It changes debugger.h to be a general
xen/debugger.h which, if CONFIG_CRASH_DEBUG, resolves to include
asm/debugger.h.

It also changes all asm/debugger.h includes to xen/debugger.h.

Because it is no longer required, arm/debugger.h is removed.

Signed-off-by: Bobby Eshleman 
---
Changes in v3:
- No longer introduces a fake TRAP_invalid_op for arm to consume, it
  is not necessary given the removal of the arm calls now precedes
  this patch.
- Instead of defining prototypes that arch/ is expected to implement,
  this version simply #includes  when CONFIG_CRASH_DEBUG.

 xen/arch/x86/traps.c   |  2 +-
 xen/common/domain.c|  2 +-
 xen/common/gdbstub.c   |  2 +-
 xen/common/keyhandler.c|  2 +-
 xen/common/shutdown.c  |  2 +-
 xen/drivers/char/console.c |  2 +-
 xen/include/asm-arm/debugger.h | 15 --
 xen/include/asm-x86/debugger.h | 15 --
 xen/include/xen/debugger.h | 51 ++
 9 files changed, 57 insertions(+), 36 deletions(-)
 delete mode 100644 xen/include/asm-arm/debugger.h
 create mode 100644 xen/include/xen/debugger.h

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 5947ed25d6..a540f0832e 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +63,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6b71c6d6a9..a87d814b38 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/common/gdbstub.c b/xen/common/gdbstub.c
index 848c1f4327..1d7b98cdac 100644
--- a/xen/common/gdbstub.c
+++ b/xen/common/gdbstub.c
@@ -38,12 +38,12 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 8b9f378371..1eafaef9b2 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -3,6 +3,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -20,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 static unsigned char keypress_key;
diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index abde48aa4c..a933ee001e 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -2,13 +2,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /* opt_noreboot: If true, machine will need manual reset on error. */
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 7d0a603d03..3d1cdde821 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include  /* for do_console_io */
 #include 
diff --git a/xen/include/asm-arm/debugger.h b/xen/include/asm-arm/debugger.h
deleted file mode 100644
index ac776efa78..00
--- a/xen/include/asm-arm/debugger.h
+++ /dev/null
@@ -1,15 +0,0 @@
-#ifndef __ARM_DEBUGGER_H__
-#define __ARM_DEBUGGER_H__
-
-#define debugger_trap_fatal(v, r) (0)
-#define debugger_trap_immediate() ((void) 0)
-
-#endif /* __ARM_DEBUGGER_H__ */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index 8f6222956e..b9eeed395c 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -25,9 +25,6 @@
 #include 
 #include 
 #include 
-
-#ifdef CONFIG_CRASH_DEBUG
-
 #include 
 
 static inline bool debugger_trap_fatal(
@@ -40,16 +37,4 @@ static inline bool debugger_trap_fatal(
 /* Int3 is a trivial way to gather cpu_user_regs context. */
 #define debugger_trap_immediate() __asm__ __volatile__ ( "int3" );
 
-#else
-
-static inline bool debugger_trap_fatal(
-unsigned int vector, struct cpu_user_regs *regs)
-{
-return false;
-}
-
-#define debugger_trap_immediate() ((void)0)
-
-#endif
-
 #endif /* __X86_DEBUGGER_H__ */
diff --git a/xen/include/xen/debugger.h b/xen/include/xen/debugger.h
new file mode 100644
index 00..166fad9d2e
--- /dev/null
+++ b/xen/include/xen/debugger.h
@@ -0,0 +1,51 @@
+/**
+ * Generic hooks into arch-dependent Xen.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * 

[PATCH v3 5/6] arch/x86: move domain_pause_for_debugger() to domain.h

2021-08-18 Thread Bobby Eshleman
domain_pause_for_debugger() was previously in debugger.h.  This commit
moves it to domain.h because its implementation is in domain.c.

Signed-off-by: Bobby Eshleman 
---
Changes in v3:
- domain_pause_for_debugger() is now moved into debugger.h, not a new
  file debugger.c

 xen/arch/x86/hvm/svm/svm.c  | 2 +-
 xen/arch/x86/hvm/vmx/realmode.c | 2 +-
 xen/arch/x86/hvm/vmx/vmx.c  | 2 +-
 xen/arch/x86/nmi.c  | 1 -
 xen/arch/x86/traps.c| 1 +
 xen/include/asm-x86/debugger.h  | 2 --
 xen/include/asm-x86/domain.h| 2 ++
 7 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 642a64b747..84448e496f 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -58,7 +59,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index cc23afa788..5c4b1910a9 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index e09b7e3af9..6fd59865c7 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,7 +52,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index ab94a96c4d..11d5f5a917 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -30,7 +30,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d0a4c0ea74..5947ed25d6 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -63,6 +63,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index ed4d5c829b..8f6222956e 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -26,8 +26,6 @@
 #include 
 #include 
 
-void domain_pause_for_debugger(void);
-
 #ifdef CONFIG_CRASH_DEBUG
 
 #include 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 92d54de0b9..de854b5bfa 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -672,6 +672,8 @@ void update_guest_memory_policy(struct vcpu *v,
 
 void domain_cpu_policy_changed(struct domain *d);
 
+void domain_pause_for_debugger(void);
+
 bool update_runstate_area(struct vcpu *);
 bool update_secondary_system_time(struct vcpu *,
   struct vcpu_time_info *);
-- 
2.32.0




[PATCH v3 2/6] x86/debugger: separate Xen and guest debugging debugger_trap_* functions

2021-08-18 Thread Bobby Eshleman
Unlike debugger_trap_fatal() and debugger_trap_immediate(),
debugger_trap_entry() is specific to guest debugging and *NOT* the
debugging of Xen itself. That is, it is part of gdbsx functionality and
not the Xen gdstub. This is evidenced by debugger_trap_fatal()'s usage
of domain_pause_for_debugger(). Because of this, debugger_trap_entry()
does not belong alongside the generic Xen debugger functionality.

This commit fixes this by expanding inline debugger_trap_entry() into
its usage in x86/traps.c and stubbing out domain_pause_for_debugger()
when !CONFIG_GDBSX. Properly placing what was debugger_trap_entry()
under the scope of gdbsx instead of gdbstub.

The function calls that caused an effective no-op and early exit out of
debugger_trap_entry() are removed completely (when the trapnr is not
int3/debug).

This commit is one of a series geared towards removing the unnecessary
requirement that all architectures to implement .

Signed-off-by: Bobby Eshleman 
---
 xen/arch/x86/domain.c  |  2 +-
 xen/arch/x86/traps.c   | 48 --
 xen/include/asm-x86/debugger.h | 42 ++---
 3 files changed, 31 insertions(+), 61 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index ef1812dc14..70894ff826 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2541,7 +2541,7 @@ __initcall(init_vcpu_kick_softirq);
 
 void domain_pause_for_debugger(void)
 {
-#ifdef CONFIG_CRASH_DEBUG
+#ifdef CONFIG_GDBSX
 struct vcpu *curr = current;
 struct domain *d = curr->domain;
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e60af16ddd..d0a4c0ea74 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -858,13 +858,20 @@ static void do_trap(struct cpu_user_regs *regs)
 if ( regs->error_code & X86_XEC_EXT )
 goto hardware_trap;
 
-if ( debugger_trap_entry(trapnr, regs) )
-return;
-
 ASSERT(trapnr < 32);
 
 if ( guest_mode(regs) )
 {
+struct vcpu *curr = current;
+if ( (trapnr == TRAP_debug || trapnr == TRAP_int3) &&
+  guest_kernel_mode(curr, regs) &&
+  curr->domain->debugger_attached )
+{
+if ( trapnr != TRAP_debug )
+curr->arch.gdbsx_vcpu_event = trapnr;
+domain_pause_for_debugger();
+return;
+}
 pv_inject_hw_exception(trapnr,
(TRAP_HAVE_EC & (1u << trapnr))
? regs->error_code : X86_EVENT_NO_EC);
@@ -1094,9 +1101,6 @@ void do_invalid_op(struct cpu_user_regs *regs)
 int id = -1, lineno;
 const struct virtual_region *region;
 
-if ( debugger_trap_entry(TRAP_invalid_op, regs) )
-return;
-
 if ( likely(guest_mode(regs)) )
 {
 if ( pv_emulate_invalid_op(regs) )
@@ -1201,8 +1205,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
 
 void do_int3(struct cpu_user_regs *regs)
 {
-if ( debugger_trap_entry(TRAP_int3, regs) )
-return;
+struct vcpu *curr = current;
 
 if ( !guest_mode(regs) )
 {
@@ -1215,6 +1218,12 @@ void do_int3(struct cpu_user_regs *regs)
 
 return;
 }
+else if ( guest_kernel_mode(curr, regs) && curr->domain->debugger_attached 
)
+{
+curr->arch.gdbsx_vcpu_event = TRAP_int3;
+domain_pause_for_debugger();
+return;
+}
 
 pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
 }
@@ -1492,9 +1501,6 @@ void do_page_fault(struct cpu_user_regs *regs)
 /* fixup_page_fault() might change regs->error_code, so cache it here. */
 error_code = regs->error_code;
 
-if ( debugger_trap_entry(TRAP_page_fault, regs) )
-return;
-
 perfc_incr(page_faults);
 
 /* Any shadow stack access fault is a bug in Xen. */
@@ -1593,9 +1599,6 @@ void do_general_protection(struct cpu_user_regs *regs)
 struct vcpu *v = current;
 #endif
 
-if ( debugger_trap_entry(TRAP_gp_fault, regs) )
-return;
-
 if ( regs->error_code & X86_XEC_EXT )
 goto hardware_gp;
 
@@ -1888,9 +1891,6 @@ void do_debug(struct cpu_user_regs *regs)
 /* Stash dr6 as early as possible. */
 dr6 = read_debugreg(6);
 
-if ( debugger_trap_entry(TRAP_debug, regs) )
-return;
-
 /*
  * At the time of writing (March 2018), on the subject of %dr6:
  *
@@ -1994,6 +1994,11 @@ void do_debug(struct cpu_user_regs *regs)
 
 return;
 }
+else if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
+{
+domain_pause_for_debugger();
+return;
+}
 
 /* Save debug status register where guest OS can peek at it */
 v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
@@ -2014,9 +2019,6 @@ void do_entry_CP(struct cpu_user_regs *regs)
 const char *err = "??";
 unsigned int ec = regs->error_code;
 
-if ( debugger_trap_entry(TRAP_debug, regs) )
-return;
-
 /* Decode ec if possible */
 if ( ec < ARRAY_SIZE(errors) && 

[PATCH v3 3/6] arch/x86: rename debug.c to gdbsx.c

2021-08-18 Thread Bobby Eshleman
This commit renames debug.c to gdbsx.c to clarify its purpose.

The function gdbsx_guest_mem_io() is moved from domctl.c to gdbsx.c.

Although gdbsx_guest_mem_io() is conditionally removed from its single
call site in domctl.c upon !CONFIG_GDBSX and so no stub is technically
necessary, this commit adds a stub that would preserve the functioning
of that call site if the #ifdef CONFIG_GDBSX were to ever be removed or
the function were to ever be called outside of such an ifdef block.

Signed-off-by: Bobby Eshleman 
---
 xen/arch/x86/Makefile |  2 +-
 xen/arch/x86/domctl.c | 12 +---
 xen/arch/x86/{debug.c => gdbsx.c} | 12 ++--
 xen/include/asm-x86/debugger.h|  6 --
 xen/include/asm-x86/gdbsx.h   | 17 +
 5 files changed, 29 insertions(+), 20 deletions(-)
 rename xen/arch/x86/{debug.c => gdbsx.c} (93%)
 create mode 100644 xen/include/asm-x86/gdbsx.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index fe38cfd544..ef8c2c4770 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -20,7 +20,7 @@ obj-y += cpuid.o
 obj-$(CONFIG_PV) += compat.o
 obj-$(CONFIG_PV32) += x86_64/compat.o
 obj-$(CONFIG_KEXEC) += crash.o
-obj-$(CONFIG_GDBSX) += debug.o
+obj-$(CONFIG_GDBSX) += gdbsx.o
 obj-y += delay.o
 obj-y += desc.o
 obj-bin-y += dmi_scan.init.o
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 26a76d2be9..a492fe140e 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,20 +34,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
-#ifdef CONFIG_GDBSX
-static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio 
*iop)
-{
-iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
- iop->len, domid, iop->gwr, iop->pgd3val);
-
-return iop->remain ? -EFAULT : 0;
-}
-#endif
-
 static int update_domain_cpu_policy(struct domain *d,
 xen_domctl_cpu_policy_t *xdpc)
 {
diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/gdbsx.c
similarity index 93%
rename from xen/arch/x86/debug.c
rename to xen/arch/x86/gdbsx.c
index d90dc93056..adea0f017b 100644
--- a/xen/arch/x86/debug.c
+++ b/xen/arch/x86/gdbsx.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 typedef unsigned long dbgva_t;
@@ -158,7 +158,7 @@ static unsigned int dbg_rw_guest_mem(struct domain *dp, 
unsigned long addr,
  * pgd3: value of init_mm.pgd[3] in guest. see above.
  * Returns: number of bytes remaining to be copied.
  */
-unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
+static unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) 
buf,
 unsigned int len, domid_t domid, bool toaddr,
 uint64_t pgd3)
 {
@@ -174,6 +174,14 @@ unsigned int dbg_rw_mem(unsigned long gva, 
XEN_GUEST_HANDLE_PARAM(void) buf,
 return len;
 }
 
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
+{
+iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
+ iop->len, domid, iop->gwr, iop->pgd3val);
+
+return iop->remain ? -EFAULT : 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index cd6b9477f7..ed4d5c829b 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -54,10 +54,4 @@ static inline bool debugger_trap_fatal(
 
 #endif
 
-#ifdef CONFIG_GDBSX
-unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
-unsigned int len, domid_t domid, bool toaddr,
-uint64_t pgd3);
-#endif
-
 #endif /* __X86_DEBUGGER_H__ */
diff --git a/xen/include/asm-x86/gdbsx.h b/xen/include/asm-x86/gdbsx.h
new file mode 100644
index 00..2b8812881d
--- /dev/null
+++ b/xen/include/asm-x86/gdbsx.h
@@ -0,0 +1,17 @@
+#ifndef __X86_GDBX_H
+#define __X86_GDBX_H__
+
+#ifdef CONFIG_GDBSX
+
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop);
+
+#else
+
+static inline int gdbsx_guest_mem_io(domid_t domid, struct 
xen_domctl_gdbsx_memio *iop)
+{
+return -EOPNOTSUPP;
+}
+
+#endif
+
+#endif
-- 
2.32.0




[PATCH v3 4/6] x86/gdbsx: expand dbg_rw_mem() inline

2021-08-18 Thread Bobby Eshleman
Because dbg_rw_mem() has only a single call site, this commit
expands it inline.
---
 xen/arch/x86/gdbsx.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/gdbsx.c b/xen/arch/x86/gdbsx.c
index adea0f017b..4cb8e042f9 100644
--- a/xen/arch/x86/gdbsx.c
+++ b/xen/arch/x86/gdbsx.c
@@ -151,33 +151,23 @@ static unsigned int dbg_rw_guest_mem(struct domain *dp, 
unsigned long addr,
 return len;
 }
 
-/*
- * addr is guest addr
- * buf is debugger buffer.
- * if toaddr, then addr = buf (write to addr), else buf = addr (rd from guest)
- * pgd3: value of init_mm.pgd[3] in guest. see above.
- * Returns: number of bytes remaining to be copied.
- */
-static unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) 
buf,
-unsigned int len, domid_t domid, bool toaddr,
-uint64_t pgd3)
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
 struct domain *d = rcu_lock_domain_by_id(domid);
 
-if ( d )
+if ( d && !d->is_dying )
 {
-if ( !d->is_dying )
-len = dbg_rw_guest_mem(d, gva, buf, len, toaddr, pgd3);
-rcu_unlock_domain(d);
+iop->remain = dbg_rw_guest_mem(
+d, iop->gva, guest_handle_from_ptr(iop->uva, void),
+iop->len, domid, iop->pgd3val);
+}
+else
+{
+iop->remain = iop->len;
 }
 
-return len;
-}
-
-int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
-{
-iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
- iop->len, domid, iop->gwr, iop->pgd3val);
+if ( d )
+rcu_unlock_domain(d);
 
 return iop->remain ? -EFAULT : 0;
 }
-- 
2.32.0




[PATCH v3 1/6] arm/traps: remove debugger_trap_fatal() calls

2021-08-18 Thread Bobby Eshleman
ARM doesn't actually use debugger_trap_* anything, and is stubbed out.

This commit simply removes the unneeded calls.

Signed-off-by: Bobby Eshleman 
---
 xen/arch/arm/traps.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 4ccb6e7d18..889650ba63 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -41,7 +41,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -1266,10 +1265,6 @@ int do_bug_frame(const struct cpu_user_regs *regs, 
vaddr_t pc)
 
 case BUGFRAME_bug:
 printk("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
-
-if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
-return 0;
-
 show_execution_state(regs);
 panic("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
 
@@ -1281,8 +1276,6 @@ int do_bug_frame(const struct cpu_user_regs *regs, 
vaddr_t pc)
 
 printk("Assertion '%s' failed at %s%s:%d\n",
predicate, prefix, filename, lineno);
-if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
-return 0;
 show_execution_state(regs);
 panic("Assertion '%s' failed at %s%s:%d\n",
   predicate, prefix, filename, lineno);
-- 
2.32.0




[PATCH v3 0/6] Remove unconditional arch dependency on asm/debugger.h

2021-08-18 Thread Bobby Eshleman
This series removes the unconditional requirement that all architectures
implement asm/debugger.h. It additionally removes arm's debugger.h and
disentangles some of the x86 gdbsx/gdbstub/generic debugger code.

Additionally, this series does the following:
- Provides generic stubs when !CONFIG_CRASH_DEBUG
- Adds stronger separation between gdbstub, gdbsx, and generic debugger
  code.

The patches in this v3 are quite a bit different than in v2, so
per-patch changes are omitted. This difference in the patchset version
is largely due to the need to decouple the debugger_trap_* functions.

Changes from v2:
- The first patch drops ARM's calls to the debugger stubs, removing the
  need to add fake values.
- No debugger.c is added, as it was unnecessary when code was moved into
  existing and appropriate files.
- debugger_trap_entry() expands inline into its few call sites
- debug.c becomes gdbsx.c
- All gdbsx functions move into gdbsx.c and become dependent on
  CONFIG_GDBSX (instead of CONFIG_CRASH_DEBUG as was the case for
  domain_pause_for_debugger(), for example)

It's worth noting that debugger.h is still not truly generic as
debugger_trap_fatal() for x86 necessarily calls into the gdbstub… but
further generalization is unnecessary while it is still only
backed by the gdbstub.

As I'm not *exactly* an expert on this code, so feel free to inform me of my
confusion where you see it.

Bobby Eshleman (6):
  arm/traps: remove debugger_trap_fatal() calls
  x86/debugger: separate Xen and guest debugging debugger_trap_*
functions
  arch/x86: rename debug.c to gdbsx.c
  x86/gdbsx: expand dbg_rw_mem() inline
  arch/x86: move domain_pause_for_debugger() to domain.h
  x86: change asm/debugger.h to xen/debugger.h

 xen/arch/arm/traps.c  |  7 
 xen/arch/x86/Makefile |  2 +-
 xen/arch/x86/domain.c |  2 +-
 xen/arch/x86/domctl.c | 12 +-
 xen/arch/x86/{debug.c => gdbsx.c} | 30 +++---
 xen/arch/x86/hvm/svm/svm.c|  2 +-
 xen/arch/x86/hvm/vmx/realmode.c   |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c|  2 +-
 xen/arch/x86/nmi.c|  1 -
 xen/arch/x86/traps.c  | 51 ++--
 xen/common/domain.c   |  2 +-
 xen/common/gdbstub.c  |  2 +-
 xen/common/keyhandler.c   |  2 +-
 xen/common/shutdown.c |  2 +-
 xen/drivers/char/console.c|  2 +-
 xen/include/asm-arm/debugger.h| 15 ---
 xen/include/asm-x86/debugger.h| 65 +--
 xen/include/asm-x86/domain.h  |  2 +
 xen/include/asm-x86/gdbsx.h   | 17 
 xen/include/xen/debugger.h| 51 
 20 files changed, 127 insertions(+), 144 deletions(-)
 rename xen/arch/x86/{debug.c => gdbsx.c} (89%)
 delete mode 100644 xen/include/asm-arm/debugger.h
 create mode 100644 xen/include/asm-x86/gdbsx.h
 create mode 100644 xen/include/xen/debugger.h

-- 
2.32.0




Re: [PATCH] Arm: relax iomem_access_permitted() check

2021-08-18 Thread Julien Grall

Hi Jan,

On 18/08/2021 08:52, Jan Beulich wrote:

Ranges checked by iomem_access_permitted() are inclusive; to permit a
mapping there's no need for access to also have been granted for the
subsequent page.


Good catch! OOI, how did you find it?



Fixes: 80f9c3167084 ("xen/arm: acpi: Map MMIO on fault in stage-2 page table for the 
hardware domain")
Signed-off-by: Jan Beulich 


Reviewed-by: Julien Grall 

Cheers,



--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1893,7 +1893,7 @@ static bool try_map_mmio(gfn_t gfn)
  return false;
  
  /* The hardware domain can only map permitted MMIO regions */

-if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) )
+if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn)) )
  return false;
  
  return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);




--
Julien Grall



Re: [PATCH v2] xen/arm: smmu: Set/clear IOMMU domain for device

2021-08-18 Thread Julien Grall

Hi Oleksandr,

On 18/08/2021 06:22, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 

When a device is assigned/de-assigned it is required to properly set
IOMMU domain used to protect the device. This assignment was missing,
thus it was not possible to de-assign the device:

(XEN) Deassigning device :03:00.0 from dom2
(XEN) smmu: :03:00.0:  not attached to domain 2
(XEN) d2: deassign (:03:00.0) failed (-3)

Fix this by assigning IOMMU domain on arm_smmu_assign_dev and reset it
to NULL on arm_smmu_deassign_dev.

Fixes: 06d1f7a278dd ("xen/arm: smmuv1: Keep track of S2CR state")

Signed-off-by: Oleksandr Andrushchenko 
Signed-off-by: Oleksandr Tyshchenko 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall



[linux-linus test] 164233: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164233 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164233/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-thunderx 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm 14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-qemuu-freebsd12-amd64 21 guest-start/freebsd.repeat fail 
REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152332
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 

[PATCH] mini-os: xenbus: support large messages

2021-08-18 Thread Juergen Gross
Today the implementation of the xenbus protocol in Mini-OS will only
allow to transfer the complete message to or from the ring page buffer.
This is limiting the maximum message size to lower values as the xenbus
protocol normally would allow.

Change that by allowing to transfer the xenbus message in chunks as
soon as they are available.

Avoid crashing Mini-OS in case of illegal data read from the ring
buffer.

Signed-off-by: Juergen Gross 
---
 xenbus/xenbus.c | 212 
 1 file changed, 124 insertions(+), 88 deletions(-)

diff --git a/xenbus/xenbus.c b/xenbus/xenbus.c
index 23de61e..3fbb122 100644
--- a/xenbus/xenbus.c
+++ b/xenbus/xenbus.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define min(x,y) ({   \
 typeof(x) tmpx = (x); \
@@ -46,6 +47,7 @@
 static struct xenstore_domain_interface *xenstore_buf;
 static DECLARE_WAIT_QUEUE_HEAD(xb_waitq);
 DECLARE_WAIT_QUEUE_HEAD(xenbus_watch_queue);
+static __DECLARE_SEMAPHORE_GENERIC(xb_write_sem, 1);
 
 xenbus_event_queue xenbus_events;
 static struct watch {
@@ -231,75 +233,105 @@ char *xenbus_wait_for_state_change(const char* path, 
XenbusState *state, xenbus_
 }
 
 
+static void xenbus_read_data(char *buf, unsigned int len)
+{
+unsigned int off = 0;
+unsigned int prod;
+unsigned int size;
+int notify;
+
+while (off != len)
+{
+if (xenstore_buf->rsp_prod == xenstore_buf->rsp_cons)
+wait_event(xb_waitq,
+   xenstore_buf->rsp_prod != xenstore_buf->rsp_cons);
+
+prod = xenstore_buf->rsp_prod;
+DEBUG("Rsp_cons %d, rsp_prod %d.\n", xenstore_buf->rsp_cons, prod);
+size = min(len - off, prod - xenstore_buf->rsp_cons);
+memcpy_from_ring(xenstore_buf->rsp, buf + off,
+ MASK_XENSTORE_IDX(xenstore_buf->rsp_cons), size);
+off += size;
+notify = (xenstore_buf->rsp_cons + XENSTORE_RING_SIZE ==
+  xenstore_buf->rsp_prod);
+rmb();
+xenstore_buf->rsp_cons += size;
+wmb();
+if (notify)
+notify_remote_via_evtchn(xenbus_evtchn);
+}
+}
+
 static void xenbus_thread_func(void *ign)
 {
 struct xsd_sockmsg msg;
-unsigned prod = xenstore_buf->rsp_prod;
+char *data;
 
 for (;;) {
-wait_event(xb_waitq, prod != xenstore_buf->rsp_prod);
-while (1) {
-prod = xenstore_buf->rsp_prod;
-DEBUG("Rsp_cons %d, rsp_prod %d.\n", xenstore_buf->rsp_cons,
-  xenstore_buf->rsp_prod);
-if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg))
-break;
-rmb();
-memcpy_from_ring(xenstore_buf->rsp, ,
- MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
- sizeof(msg));
-DEBUG("Msg len %d, %d avail, id %d.\n", msg.len + sizeof(msg),
-  xenstore_buf->rsp_prod - xenstore_buf->rsp_cons, msg.req_id);
-
-if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons <
-sizeof(msg) + msg.len)
-break;
-
-DEBUG("Message is good.\n");
-
-if (msg.type == XS_WATCH_EVENT) {
-struct xenbus_event *event = malloc(sizeof(*event) + msg.len);
-xenbus_event_queue *events = NULL;
-char *data = (char*)event + sizeof(*event);
-struct watch *watch;
-
-memcpy_from_ring(xenstore_buf->rsp, data,
-MASK_XENSTORE_IDX(xenstore_buf->rsp_cons + sizeof(msg)),
-msg.len);
-
-event->path = data;
-event->token = event->path + strlen(event->path) + 1;
-
-mb();
-xenstore_buf->rsp_cons += msg.len + sizeof(msg);
-
-for (watch = watches; watch; watch = watch->next)
-if (!strcmp(watch->token, event->token)) {
-events = watch->events;
-break;
-}
-
-if (events) {
-event->next = *events;
-*events = event;
-wake_up(_watch_queue);
-} else {
-printk("unexpected watch token %s\n", event->token);
-free(event);
+xenbus_read_data((char *), sizeof(msg));
+DEBUG("Msg len %d, %d avail, id %d.\n", msg.len + sizeof(msg),
+  xenstore_buf->rsp_prod - xenstore_buf->rsp_cons, msg.req_id);
+
+if (msg.len > XENSTORE_PAYLOAD_MAX) {
+printk("Xenstore violates protocol, message longer than 
allowed.\n");
+return;
+}
+
+if (msg.type == XS_WATCH_EVENT) {
+struct xenbus_event *event = malloc(sizeof(*event) + msg.len);
+xenbus_event_queue *events = NULL;
+struct watch *watch;
+

Xen 4.16: Proposed release manager and schedule

2021-08-18 Thread Ian Jackson
We haven't formally appointed a release manager for Xen 4.16.
I was approached and asked if I would do the job, and said yes,
but I think things got stuck there.  Taking this as a prima
faciae indication of community confidence, I hereby volunteer to
take on this role.

And, I would like to tentatively propose the following schedule and
policies for Xen 4.16.  This is based on the 4.15 schedule with some
tweaks, and is intended to align the ultimate date roughly with the
4.10 and 4.13 releases, which were also in early/mid December.

I suggest we use the Lazy Consensus procedure to decide on the Release
Manager appointment, starting now (since we're already rather late).
In particular, if you feel someone else would make a better release
manager, please say so right away.

For the release schedule, assuming I'm the RM, please send comments
ASAP and in any case by noon UTC on Wednesday the 25th of August.  I
hope to finalise the schedule then.

** DRAFT **

  Friday 17th September Last posting date

Patches adding new features should be posted to the mailing list
by this cate, although perhaps not in their final version.
(3 weeks)

  Friday 8th OctoberFeature freeze
 
Patches adding new features should be committed by this date.
Straightforward bugfixes may continue to be accepted by
maintainers.
(3 weeks)

  Friday 29th October **tentatve**  Code freeze

Bugfixes only, all changes to be approved by the Release Manager,
on the basis of a (progressively stricter[*]) risk assessment.
(2.5 weeks)

  Tuesday 16nd November **tentative**   Branch off staging-4.16
Hard code freeze [*]

Bugfixes for serious bugs (including regressions), and low-risk
fixes only.  All changes to be approved by the Release Manager on
the basis of a (progressively stricter[*]) risk assessment.

xen-unstable open again - with caveats to avoid release disruption.

(2.5 weeks)

  Friday 3rd December **tentative** Final commits (docs/prep only)
  Week of 6th December **tentative**Release
(probably Tuesday or Wednesday)

Any patches containing substantial refactoring are to treated as
new features, even if they intent is to fix bugs.

Freeze exceptions will not be routine, but may be granted in
exceptional cases for small changes (on the basis of risk assessment,
like any release-ack).  Large series will not get exceptions.
Contributors *must not* rely on, or expect, a freeze exception, or
release schedule slip.

If as a feature proponent you feel your feature is at risk and there
is something the Xen Project could do to help, please consult me or
ask here on xen-devel.  In such situations please reach out earlier
rather than later.  I will try to put you in touch with people who may
be able to help.

[*] The distinction between Code Freeze and Hard Code Freeze is a
matter of degree, not kind; the Hard Code Freeze data and associated
tighter policy text is indicative rather than normative.

** END OF DRAFT **

Thanks,
Ian.



[xtf test] 164235: all pass - PUSHED

2021-08-18 Thread osstest service owner
flight 164235 xtf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164235/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xtf  fc32c40a97069d4696fb7aa9cb76e3ae09aa18dd
baseline version:
 xtf  6d8000aeadb82c96567022a623f770348989

Last test of basis   164231  2021-08-17 20:47:09 Z0 days
Testing same since   164235  2021-08-18 03:44:23 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-amd64-pvopspass
 test-xtf-amd64-amd64-1   pass
 test-xtf-amd64-amd64-2   pass
 test-xtf-amd64-amd64-3   pass
 test-xtf-amd64-amd64-4   pass
 test-xtf-amd64-amd64-5   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xtf.git
   6d8000a..fc32c40  fc32c40a97069d4696fb7aa9cb76e3ae09aa18dd -> 
xen-tested-master



Re: meaning and use of IOMMU_FLUSHF_added

2021-08-18 Thread Paul Durrant

On 18/08/2021 13:09, Jan Beulich wrote:

On 18.08.2021 12:51, Jan Beulich wrote:

Paul,

back at the time I did already question your intended meaning of
this flag. I notice that there's presently no consumer of it being
set (apart from yielding non-zero flush_flags). I'm afraid this
model makes accumulation of flush flags not work properly: With
both flags set and more than a single page altered, it is
impossible to tell apart whether two present PTEs were altered, or
a non-present and a present one.

VT-d's flushing needs to know the distinction; it may in fact be
necessary to issue two flushes (or a single "heavier" one) when
both non-present and present entries got transitioned to present
in one go.


No two (or "heavier") flush looks to be needed upon further reading.
I did derive this from our setting of "did" to zero in that case,
but that looks to be wrong in the first place - it's correct only
for context cache entry flushes. I'll make a(nother) patch ...



Yes, the intention of the flag was simply to allow a 'lighter' flush in 
the case there are no modified entries as part of the accumulation. If 
it is impossible to tell the difference then I guess it's not useful.


  Paul




[linux-5.4 test] 164232: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164232 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164232/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 164131
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 164131

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-examine4 memdisk-try-append fail in 164224 pass in 164232
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass 
in 164224
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat  fail pass in 164224
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 
164224

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 164131
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 164131
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 164131
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 164131
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164131
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 164131
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 164131
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164131
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 164131
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 164131
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164131
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 linux

Re: [PATCH] xen/sched: fix get_cpu_idle_time() for smt=0 suspend/resume

2021-08-18 Thread Dario Faggioli
On Wed, 2021-08-18 at 12:21 +0200, Juergen Gross wrote:
> Fix that by letting get_cpu_idle_time() deal with this case.
> 
> Fixes: 132cbe8f35632fb2 ("sched: fix get_cpu_idle_time() with core
> scheduling")
> Reported-by: Marek Marczykowski-Górecki
> 
> Signed-off-by: Juergen Gross 
> Tested-by: Marek Marczykowski-Górecki
> 
>
Mmm... This is an interesting one. In fact, this fix is not only
correct, it's also simple, effective and (I guess) easy enough to
backport.

Considering all these things together with the fact that we have an
open issue, I'm going to provide my:

Acked-by: Dario Faggioli 

(and this remains valid with Jan's proposed change done upon
committing.)

That said, in the long run...

> ---
> An alternative way to fix the issue would be to keep the
> sched_resource
> of offline cpus allocated like we already do with idle vcpus and
> units.
> This fix would be more intrusive, but it would avoid similar other
> bugs
> like this one.
> 
... it would be probably interesting to go this route, as it looks both
more consistent and future proof (I mean implement it proactively,
independently of issues... when/if we have time, of course!)

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)


signature.asc
Description: This is a digitally signed message part


[libvirt test] 164236: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164236 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164236/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  5590fbf8d6adbf87999f5883bb97cf5782042fdf
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  404 days
Failing since151818  2020-07-11 04:18:52 Z  403 days  395 attempts
Testing same since   164236  2021-08-18 04:20:08 Z0 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Ian Wienand 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Richard W.M. Jones 
  Ricky Tigg 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Chopin 
  Simon Gaiser 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Szymon Scholz 
  Thomas Huth 
  Tim Wiederhake 
  Tomáš Golembiovský 
  Tomáš Janoušek 
  Tuguoyi 
  Ville Skyttä 
  Vinayak Kale 
  Wang Xin 
  WangJian 
  Weblate 
  Wei Liu 
  Wei Liu 
  William Douglas 
  Yalei Li <274268...@qq.com>
  Yalei Li 
  Yang Fei 
  Yang Hang 
  Yanqiu Zhang 
  Yaroslav Kargin 
  Yi Li 
  Yi Wang 
  Yuri Chornoivan 
  Zbigniew Jędrzejewski-Szmek 
  zhangjl02 
  Zheng Chuan 
  zhenwei pi 
  Zhenyu Ye 
  Zhenyu Zheng 
  Zhenzhong Duan 

jobs:
 build-amd64-xsm  pass
 

Re: [PATCH] x86/xstate: reset cached register values on resume

2021-08-18 Thread Andrew Cooper
On 18/08/2021 12:30, Marek Marczykowski-Górecki wrote:
> set_xcr0() and set_msr_xss() use cached value to avoid setting the
> register to the same value over and over. But suspend/resume implicitly
> reset the registers and since percpu areas are not deallocated on
> suspend anymore, the cache gets stale.
> Reset the cache on resume, to ensure the next write will really hit the
> hardware. Choose value 0, as it will never be a legitimate write to
> those registers - and so, will force write (and cache update).
>
> Note the cache is used io get_xcr0() and get_msr_xss() too, but:
> - set_xcr0() is called few lines below in xstate_init(), so it will
>   update the cache with appropriate value
> - get_msr_xss() is not used anywhere - and thus not before any
>   set_msr_xss() that will fill the cache
>
> Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
> Signed-off-by: Marek Marczykowski-Górecki 

I'd prefer to do this differently.  As I said in the thread, there are
other registers such as MSR_TSC_AUX which fall into the same category,
and I'd like to make something which works systematically.

~Andrew



[PATCH v2 2/2] ns16550: add Exar PCIe UART cards support

2021-08-18 Thread Marek Marczykowski-Górecki
Besides standard UART setup, this device needs enabling
(vendor-specific) "Enhanced Control Bits" - otherwise disabling hardware
control flow (MCR[2]) is ignored. Add appropriate quirk to the
ns16550_setup_preirq(), similar to the handle_dw_usr_busy_quirk(). The
new function act on Exar 2-, 4-, and 8- port cards only. I have tested
the functionality on 2-port card but based on the Linux driver, the same
applies to other models too.

Additionally, Exar card supports fractional divisor (DLD[3:0] register,
at 0x02). This part is not supported here yet, and seems to not
be required for working 115200bps at the very least.

The specification for the 2-port card is available at:
https://www.maxlinear.com/product/interface/uarts/pcie-uarts/xr17v352

Signed-off-by: Marek Marczykowski-Górecki 
---
Changes in v2:
 - use read-modify-write for enabling ECB
 - handle also 4- and 8-ports cards (but not everything from Exar)
 - formatting fixes
---
 xen/drivers/char/ns16550.c  | 81 -
 xen/include/xen/8250-uart.h |  5 +++
 xen/include/xen/pci_ids.h   |  2 +
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index 97b85b0225cc..27501f28aa7b 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -88,6 +88,9 @@ struct ns16550_config {
 param_pericom_2port,
 param_pericom_4port,
 param_pericom_8port,
+param_exar_xr17v352,
+param_exar_xr17v354,
+param_exar_xr17v358,
 } param;
 };
 
@@ -169,6 +172,29 @@ static void handle_dw_usr_busy_quirk(struct ns16550 *uart)
 }
 }
 
+static void enable_exar_enhanced_bits(struct ns16550 *uart)
+{
+#ifdef NS16550_PCI
+if ( uart->bar &&
+ pci_conf_read16(PCI_SBDF(0, uart->ps_bdf[0], uart->ps_bdf[2],
+ uart->ps_bdf[2]), PCI_VENDOR_ID) == 
PCI_VENDOR_ID_EXAR )
+{
+u8 devid = ns_read_reg(uart, UART_XR_DVID);
+u8 efr;
+/*
+ * Exar XR17V35x cards ignore setting MCR[2] (hardware flow control)
+ * unless "Enhanced control bits" is enabled.
+ * The below checks for a 2, 4 or 8 port UART, following Linux driver.
+ */
+if ( devid == 0x82 || devid == 0x84 || devid == 0x88 ) {
+efr = ns_read_reg(uart, UART_XR_EFR);
+efr |= UART_EFR_ECB;
+ns_write_reg(uart, UART_XR_EFR, efr);
+}
+}
+#endif
+}
+
 static void ns16550_interrupt(
 int irq, void *dev_id, struct cpu_user_regs *regs)
 {
@@ -303,6 +329,9 @@ static void ns16550_setup_preirq(struct ns16550 *uart)
 /* Handle the DesignWare 8250 'busy-detect' quirk. */
 handle_dw_usr_busy_quirk(uart);
 
+/* Enable Exar "Enhanced function bits" */
+enable_exar_enhanced_bits(uart);
+
 /* Line control and baud-rate generator. */
 ns_write_reg(uart, UART_LCR, lcr | UART_LCR_DLAB);
 if ( uart->baud != BAUD_AUTO )
@@ -781,7 +810,37 @@ static const struct ns16550_config_param __initconst 
uart_param[] = {
 .lsr_mask = UART_LSR_THRE,
 .bar0 = 1,
 .max_ports = 8,
-}
+},
+[param_exar_xr17v352] = {
+.base_baud = 7812500,
+.uart_offset = 0x400,
+.reg_width = 1,
+.fifo_size = 256,
+.lsr_mask = UART_LSR_THRE,
+.bar0 = 1,
+.mmio = 1,
+.max_ports = 2,
+},
+[param_exar_xr17v354] = {
+.base_baud = 7812500,
+.uart_offset = 0x400,
+.reg_width = 1,
+.fifo_size = 256,
+.lsr_mask = UART_LSR_THRE,
+.bar0 = 1,
+.mmio = 1,
+.max_ports = 4,
+},
+[param_exar_xr17v358] = {
+.base_baud = 7812500,
+.uart_offset = 0x400,
+.reg_width = 1,
+.fifo_size = 256,
+.lsr_mask = UART_LSR_THRE,
+.bar0 = 1,
+.mmio = 1,
+.max_ports = 8,
+},
 };
 
 static const struct ns16550_config __initconst uart_config[] =
@@ -1007,7 +1066,25 @@ static const struct ns16550_config __initconst 
uart_config[] =
 .vendor_id = PCI_VENDOR_ID_PERICOM,
 .dev_id = 0x7958,
 .param = param_pericom_8port
-}
+},
+/* Exar Corp. XR17V352 Dual PCIe UART */
+{
+.vendor_id = PCI_VENDOR_ID_EXAR,
+.dev_id = 0x0352,
+.param = param_exar_xr17v352
+},
+/* Exar Corp. XR17V354 Quad PCIe UART */
+{
+.vendor_id = PCI_VENDOR_ID_EXAR,
+.dev_id = 0x0354,
+.param = param_exar_xr17v354
+},
+/* Exar Corp. XR17V358 Octal PCIe UART */
+{
+.vendor_id = PCI_VENDOR_ID_EXAR,
+.dev_id = 0x0358,
+.param = param_exar_xr17v358
+},
 };
 
 static int __init
diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h
index 5c3bac33221e..74e9d552a718 100644
--- a/xen/include/xen/8250-uart.h
+++ b/xen/include/xen/8250-uart.h
@@ -35,6 +35,8 @@
 #define UART_USR  0x1f/* Status register (DW) */
 #define UART_DLL  

[PATCH v2 1/2] ns16550: do not override fifo size if explicitly set

2021-08-18 Thread Marek Marczykowski-Górecki
If fifo size is already set via uart_params, do not force it to 16 - which
may not match the actual hardware. Specifically Exar cards have fifo of
256 bytes.

Signed-off-by: Marek Marczykowski-Górecki 
Reviewed-by: Jan Beulich 
---
 xen/drivers/char/ns16550.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index 16a73d0c0e47..97b85b0225cc 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -357,7 +357,8 @@ static void __init ns16550_init_preirq(struct serial_port 
*port)
 ns16550_setup_preirq(uart);
 
 /* Check this really is a 16550+. Otherwise we have no FIFOs. */
-if ( ((ns_read_reg(uart, UART_IIR) & 0xc0) == 0xc0) &&
+if ( uart->fifo_size <= 1 &&
+ ((ns_read_reg(uart, UART_IIR) & 0xc0) == 0xc0) &&
  ((ns_read_reg(uart, UART_FCR) & UART_FCR_TRG14) == UART_FCR_TRG14) )
 uart->fifo_size = 16;
 }
-- 
2.31.1




Re: meaning and use of IOMMU_FLUSHF_added

2021-08-18 Thread Jan Beulich
On 18.08.2021 12:51, Jan Beulich wrote:
> Paul,
> 
> back at the time I did already question your intended meaning of
> this flag. I notice that there's presently no consumer of it being
> set (apart from yielding non-zero flush_flags). I'm afraid this
> model makes accumulation of flush flags not work properly: With
> both flags set and more than a single page altered, it is
> impossible to tell apart whether two present PTEs were altered, or
> a non-present and a present one.
> 
> VT-d's flushing needs to know the distinction; it may in fact be
> necessary to issue two flushes (or a single "heavier" one) when
> both non-present and present entries got transitioned to present
> in one go.

No two (or "heavier") flush looks to be needed upon further reading.
I did derive this from our setting of "did" to zero in that case,
but that looks to be wrong in the first place - it's correct only
for context cache entry flushes. I'll make a(nother) patch ...

Jan




Re: [PATCH] x86/xstate: reset cached register values on resume

2021-08-18 Thread Marek Marczykowski-Górecki
On Wed, Aug 18, 2021 at 02:05:05PM +0200, Jan Beulich wrote:
> On 18.08.2021 13:30, Marek Marczykowski-Górecki wrote:
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -642,6 +642,13 @@ void xstate_init(struct cpuinfo_x86 *c)
> >  return;
> >  }
> >  
> > +/*
> > + * Clear the cached value to make set_xcr0() and set_msr_xss() really
> > + * write it.
> > + */
> > +this_cpu(xcr0) = 0;
> 
> While XCR0 cannot be successfully written with 0, ...
> 
> > +this_cpu(xss) = 0;
> 
> ... XSS can. XSS can't be written with various other values; I'd
> suggest using ~0 here. Then
> 
> Reviewed-by: Jan Beulich 
> 
> and I'd be happy to make the adjustment while committing, so long
> as you agree.

Yes, makes sense. Thanks!

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature


Re: [PATCH] x86/xstate: reset cached register values on resume

2021-08-18 Thread Jan Beulich
On 18.08.2021 13:30, Marek Marczykowski-Górecki wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -642,6 +642,13 @@ void xstate_init(struct cpuinfo_x86 *c)
>  return;
>  }
>  
> +/*
> + * Clear the cached value to make set_xcr0() and set_msr_xss() really
> + * write it.
> + */
> +this_cpu(xcr0) = 0;

While XCR0 cannot be successfully written with 0, ...

> +this_cpu(xss) = 0;

... XSS can. XSS can't be written with various other values; I'd
suggest using ~0 here. Then

Reviewed-by: Jan Beulich 

and I'd be happy to make the adjustment while committing, so long
as you agree.

Jan




Re: [PATCH] VT-d: Tylersburg errata apply to further steppings

2021-08-18 Thread Jan Beulich
On 18.08.2021 13:32, Andrew Cooper wrote:
> On 03/08/2021 12:13, Jan Beulich wrote:
>> While for 5500 and 5520 chipsets only B3 and C2 are mentioned in the
>> spec update, X58's also mentions B2, and searching the internet suggests
>> systems with this stepping are actually in use. Even worse, for X58
>> erratum #69 is marked applicable even to C2. Split the check to cover
>> all applicable steppings and to also report applicable errata numbers in
>> the log message. The splitting requires using the DMI port instead of
>> the System Management Registers device, but that's then in line (also
>> revision checking wise) with the spec updates.
>>
>> Fixes: 6890cebc6a98 ("VT-d: deal with 5500/5520/X58 errata")
>> Signed-off-by: Jan Beulich 
>> ---
>> As to disabling just interrupt remapping (as the initial version of the
>> original patch did) vs disabling the IOMMU as a whole: Using a less
>> heavy workaround would of course be desirable, but then we need to
>> ensure not to misguide the tool stack about the state of the system.
> 
> This reasoning is buggy.
> 
> This errata is very specifically to do with interrupt remapping only. 
> Disabling the whole IOMMU in response is inappropriate.

That's your view, and I accept it as a reasonable one. I don't accept
it as being the only reasonable one though, and hence I object to you
tagging other views (here just like in various cases elsewhere) as
"buggy" (or sometimes worse).

>> It uses the PHYSCAP_directio sysctl output to determine whether PCI pass-
>> through can be made use of, yet that flag is driven by "iommu_enabled"
>> alone, without regard to the setting of "iommu_intremap".
> 
> The fact that range of hardware, including Tylersburg, don't have
> interrupt remapping, and noone plumbed this nicely to the toolstack is
> suboptimal.
> 
> But it is wholly inappropriate to punish users with Tylersburg hardware
> because you don't like the fact that the toolstack can't see when
> interrupt remapping is off.  The two issues are entirely orthogonal.
> 
> Tylersburg (taking this erratum into account) works just as well as and
> securely as several previous generations of hardware, and should behave
> the same.

Should behave the same - yes. Previous generations without interrupt
remapping also shouldn't allow pass-through by default, i.e. require
admin consent to run guests in this less secure mode (except, perhaps,
for devices without interrupts, albeit I'm unaware of ways to tell).

Jan




[xen-unstable-smoke test] 164238: tolerable all pass - PUSHED

2021-08-18 Thread osstest service owner
flight 164238 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164238/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  517a90d1ca09ce00e50d46ac25566cc3bd2eb34d
baseline version:
 xen  54c9736382e0d558a6acd820e44185e020131c48

Last test of basis   164228  2021-08-17 13:02:13 Z0 days
Testing same since   164238  2021-08-18 08:00:27 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   54c9736382..517a90d1ca  517a90d1ca09ce00e50d46ac25566cc3bd2eb34d -> smoke



Re: [PATCH] VT-d: Tylersburg errata apply to further steppings

2021-08-18 Thread Andrew Cooper
On 03/08/2021 12:13, Jan Beulich wrote:
> While for 5500 and 5520 chipsets only B3 and C2 are mentioned in the
> spec update, X58's also mentions B2, and searching the internet suggests
> systems with this stepping are actually in use. Even worse, for X58
> erratum #69 is marked applicable even to C2. Split the check to cover
> all applicable steppings and to also report applicable errata numbers in
> the log message. The splitting requires using the DMI port instead of
> the System Management Registers device, but that's then in line (also
> revision checking wise) with the spec updates.
>
> Fixes: 6890cebc6a98 ("VT-d: deal with 5500/5520/X58 errata")
> Signed-off-by: Jan Beulich 
> ---
> As to disabling just interrupt remapping (as the initial version of the
> original patch did) vs disabling the IOMMU as a whole: Using a less
> heavy workaround would of course be desirable, but then we need to
> ensure not to misguide the tool stack about the state of the system.

This reasoning is buggy.

This errata is very specifically to do with interrupt remapping only. 
Disabling the whole IOMMU in response is inappropriate.

> It uses the PHYSCAP_directio sysctl output to determine whether PCI pass-
> through can be made use of, yet that flag is driven by "iommu_enabled"
> alone, without regard to the setting of "iommu_intremap".

The fact that range of hardware, including Tylersburg, don't have
interrupt remapping, and noone plumbed this nicely to the toolstack is
suboptimal.

But it is wholly inappropriate to punish users with Tylersburg hardware
because you don't like the fact that the toolstack can't see when
interrupt remapping is off.  The two issues are entirely orthogonal.

Tylersburg (taking this erratum into account) works just as well as and
securely as several previous generations of hardware, and should behave
the same.

~Andrew




[PATCH] x86/xstate: reset cached register values on resume

2021-08-18 Thread Marek Marczykowski-Górecki
set_xcr0() and set_msr_xss() use cached value to avoid setting the
register to the same value over and over. But suspend/resume implicitly
reset the registers and since percpu areas are not deallocated on
suspend anymore, the cache gets stale.
Reset the cache on resume, to ensure the next write will really hit the
hardware. Choose value 0, as it will never be a legitimate write to
those registers - and so, will force write (and cache update).

Note the cache is used io get_xcr0() and get_msr_xss() too, but:
- set_xcr0() is called few lines below in xstate_init(), so it will
  update the cache with appropriate value
- get_msr_xss() is not used anywhere - and thus not before any
  set_msr_xss() that will fill the cache

Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
Signed-off-by: Marek Marczykowski-Górecki 
---
 xen/arch/x86/xstate.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 6aaf9a2f1546..28726d8fbf2b 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -642,6 +642,13 @@ void xstate_init(struct cpuinfo_x86 *c)
 return;
 }
 
+/*
+ * Clear the cached value to make set_xcr0() and set_msr_xss() really
+ * write it.
+ */
+this_cpu(xcr0) = 0;
+this_cpu(xss) = 0;
+
 cpuid_count(XSTATE_CPUID, 0, , , , );
 feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK;
 BUG_ON(!valid_xcr0(feature_mask));
-- 
2.31.1




Re: [PATCH] RFC: Version support policy

2021-08-18 Thread Marek Marczykowski-Górecki
On Fri, Aug 13, 2021 at 12:37:27PM +0100, Ian Jackson wrote:
> The current policy for minimum supported versions of tools, compilers,
> etc. is unsatisfactory: For many dependencies no minimum version is
> specified.  For those where a version is stated, updating it is a
> decision that has to be explicitly taken for that tool.
> 
> The result is persistent debates over what is good to support,
> conducted in detail in the context of individual patches.
> 
> Decisions about support involve tradeoffs, often tradeoffs between the
> interests of different people.  Currently we don't have anything
> resembling a guideline.  The result is that the individual debates are
> inconclusive; and also, this framework does not lead to good feelings
> amongst participants.
> 
> I suggest instead that we adopt a date-based policy: we define a
> maximum *age* of dependencies that we will support.

I wonder about another approach: specify supported toolchain version(s)
based on environments we choose to care about. That would be things like
"Debian, including LTS (or even ELTS) one", "RHEL/CentOS until X...",
etc. Based on this, it's easy to derive what's the oldest version that
needs to be supported.
This would be also much friendlier for testing - a clear definition
what environments should be used (in gitlab-ci, I guess).

Thoughts?

> The existing documentation about actually known working versions
> then becomes a practical consequence of that policy.
> 
> In this patch I propose a cutoff of 6 years.
> Obviously there will be debate about the precise value.
> 
> It will also be necessary to make exceptions, and/or to make different
> rules for different architectures.  In particular, new architectures,
> new configurations, or new features, may need an absolute earliest
> tooling date which is considerably less than the usual limit.
> 
> I have tried to transcribe the current compiler version info into this
> format.  The dates in the exceptions are all more recent than my
> suggested 6 year cutoff, so if this patch is applied to staging and
> not applied retrospectively, they could be removed.
> 
> I'm not sure if this policy should be here in README (where the
> version support was until now) or in SUPPORT.md.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature


meaning and use of IOMMU_FLUSHF_added

2021-08-18 Thread Jan Beulich
Paul,

back at the time I did already question your intended meaning of
this flag. I notice that there's presently no consumer of it being
set (apart from yielding non-zero flush_flags). I'm afraid this
model makes accumulation of flush flags not work properly: With
both flags set and more than a single page altered, it is
impossible to tell apart whether two present PTEs were altered, or
a non-present and a present one.

VT-d's flushing needs to know the distinction; it may in fact be
necessary to issue two flushes (or a single "heavier" one) when
both non-present and present entries got transitioned to present
in one go. Luckily no flush accumulation has been committed so
far (besides some during Dom0 construction), meaning this has only
been a latent issue until now that I try to get large page
mappings to work. (I think I have page table construction working,
but after the removal of some debug output I'm now facing faults
on non-present entries which I believe are actually present in the
page tables, albeit I yet have to check that.)

Question therefore is: Do we want to re-purpose the flag (my
preference), or do I need to add a 3rd one (in which case I'm
afraid I can't think of a good name, with "added" already in use)?

Jan




Re: [PATCH] xen/sched: fix get_cpu_idle_time() for smt=0 suspend/resume

2021-08-18 Thread Juergen Gross

On 18.08.21 12:35, Jan Beulich wrote:

On 18.08.2021 12:21, Juergen Gross wrote:

With smt=0 during a suspend/resume cycle of the machine the threads
which have been parked before will briefly come up again. This can
result in problems e.g. with cpufreq driver being active as this will
call into get_cpu_idle_time() for a cpu without initialized scheduler
data.

Fix that by letting get_cpu_idle_time() deal with this case.

Fixes: 132cbe8f35632fb2 ("sched: fix get_cpu_idle_time() with core scheduling")
Reported-by: Marek Marczykowski-Górecki 
Signed-off-by: Juergen Gross 
Tested-by: Marek Marczykowski-Górecki 


Reviewed-by: Jan Beulich 


--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -337,7 +337,7 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
  struct vcpu_runstate_info state = { 0 };
  const struct vcpu *v = idle_vcpu[cpu];
  
-if ( cpu_online(cpu) && v )

+if ( cpu_online(cpu) && v && get_sched_res(cpu) )
  vcpu_runstate_get(v, );


My earlier question was aiming at getting rid of the (now) middle part
of the condition; I thought this may be okay to do as a secondary change
here. But perhaps you intentionally left it there, so I'm unsure whether
to suggest to make the adjustment while committing (awaiting a
maintainer ack first anyway).


Ah, okay. Yes, I think the test of v being non-NULL can be removed.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] xen/sched: fix get_cpu_idle_time() for smt=0 suspend/resume

2021-08-18 Thread Jan Beulich
On 18.08.2021 12:21, Juergen Gross wrote:
> With smt=0 during a suspend/resume cycle of the machine the threads
> which have been parked before will briefly come up again. This can
> result in problems e.g. with cpufreq driver being active as this will
> call into get_cpu_idle_time() for a cpu without initialized scheduler
> data.
> 
> Fix that by letting get_cpu_idle_time() deal with this case.
> 
> Fixes: 132cbe8f35632fb2 ("sched: fix get_cpu_idle_time() with core 
> scheduling")
> Reported-by: Marek Marczykowski-Górecki 
> Signed-off-by: Juergen Gross 
> Tested-by: Marek Marczykowski-Górecki 

Reviewed-by: Jan Beulich 

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -337,7 +337,7 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
>  struct vcpu_runstate_info state = { 0 };
>  const struct vcpu *v = idle_vcpu[cpu];
>  
> -if ( cpu_online(cpu) && v )
> +if ( cpu_online(cpu) && v && get_sched_res(cpu) )
>  vcpu_runstate_get(v, );

My earlier question was aiming at getting rid of the (now) middle part
of the condition; I thought this may be okay to do as a secondary change
here. But perhaps you intentionally left it there, so I'm unsure whether
to suggest to make the adjustment while committing (awaiting a
maintainer ack first anyway).

Jan




[PATCH] xen/sched: fix get_cpu_idle_time() for smt=0 suspend/resume

2021-08-18 Thread Juergen Gross
With smt=0 during a suspend/resume cycle of the machine the threads
which have been parked before will briefly come up again. This can
result in problems e.g. with cpufreq driver being active as this will
call into get_cpu_idle_time() for a cpu without initialized scheduler
data.

Fix that by letting get_cpu_idle_time() deal with this case.

Fixes: 132cbe8f35632fb2 ("sched: fix get_cpu_idle_time() with core scheduling")
Reported-by: Marek Marczykowski-Górecki 
Signed-off-by: Juergen Gross 
Tested-by: Marek Marczykowski-Górecki 
---
An alternative way to fix the issue would be to keep the sched_resource
of offline cpus allocated like we already do with idle vcpus and units.
This fix would be more intrusive, but it would avoid similar other bugs
like this one.
---
 xen/common/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 6d34764d38..9ac1b01ca8 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -337,7 +337,7 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
 struct vcpu_runstate_info state = { 0 };
 const struct vcpu *v = idle_vcpu[cpu];
 
-if ( cpu_online(cpu) && v )
+if ( cpu_online(cpu) && v && get_sched_res(cpu) )
 vcpu_runstate_get(v, );
 
 return state.time[RUNSTATE_running];
-- 
2.26.2




Re: S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get

2021-08-18 Thread Marek Marczykowski-Górecki
On Wed, Aug 18, 2021 at 08:24:39AM +0200, Juergen Gross wrote:
> Marek,
> 
> On 17.08.21 17:15, Juergen Gross wrote:
> > On 17.08.21 16:50, Marek Marczykowski-Górecki wrote:
> > I'll be looking into that.
> 
> Could you please try the attached patch?

It works, thanks!

So, the only other suspend-related issue I'm aware of, is:
https://lore.kernel.org/xen-devel/20210131021526.GB6354@mail-itl/

But I don't have reliable reproducer for that one...


> From e38d0816a33fbaa3c0c45bcd6e9d433b1e021222 Mon Sep 17 00:00:00 2001
> From: Juergen Gross 
> To: xen-devel@lists.xenproject.org
> Cc: George Dunlap 
> Cc: Dario Faggioli 
> Date: Wed, 18 Aug 2021 08:18:20 +0200
> Subject: [PATCH] xen/sched: fix get_cpu_idle_time() for smt=0 suspend/resume
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> With smt=0 during a suspend/resume cycle of the machine the threads
> which have been parked before will briefly come up again. This can
> result in problems e.g. with cpufreq driver being active as this will
> call into get_cpu_idle_time() for a cpu without initialized scheduler
> data.
> 
> Fix that by letting get_cpu_idle_time() deal with this case.
> 
> Fixes: 132cbe8f35632fb2 ("sched: fix get_cpu_idle_time() with core 
> scheduling")
> Reported-by: Marek Marczykowski-Górecki 
> Signed-off-by: Juergen Gross 

Tested-by: Marek Marczykowski-Górecki 

> ---
> An alternative way to fix the issue would be to keep the sched_resource
> of offline cpus allocated like we already do with idle vcpus and units.
> This fix would be more intrusive, but it would avoid similar other bugs
> like this one.
> ---
>  xen/common/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 6d34764d38..9ac1b01ca8 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -337,7 +337,7 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
>  struct vcpu_runstate_info state = { 0 };
>  const struct vcpu *v = idle_vcpu[cpu];
>  
> -if ( cpu_online(cpu) && v )
> +if ( cpu_online(cpu) && v && get_sched_res(cpu) )
>  vcpu_runstate_get(v, );
>  
>  return state.time[RUNSTATE_running];
> -- 
> 2.26.2
> 






-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature


[xen-unstable-coverity test] 164239: all pass - PUSHED

2021-08-18 Thread osstest service owner
flight 164239 xen-unstable-coverity real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164239/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xen  54c9736382e0d558a6acd820e44185e020131c48
baseline version:
 xen  5c34b9af05b9e5abd25d88efc4fb783136547810

Last test of basis   164193  2021-08-15 09:18:30 Z3 days
Testing same since   164239  2021-08-18 09:19:40 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Jane Malalane 
  Kevin Stefanov 
  Marek Marczykowski-Górecki 

jobs:
 coverity-amd64   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   5c34b9af05..54c9736382  54c9736382e0d558a6acd820e44185e020131c48 -> 
coverity-tested/smoke



Re: [PATCH] libs/guest: Move the guest ABI check earlier into xc_dom_parse_image()

2021-08-18 Thread Andrew Cooper
On 17/08/2021 16:19, Jane Malalane wrote:
> Xen may not support 32-bit PV guest for a number of reasons (lack of
> CONFIG_PV32, explicit pv=no-32 command line argument, or implicitly
> due to CET being enabled) and advertises this to the toolstack via the
> absence of xen-3.0-x86_32p ABI.
>
> Currently, when trying to boot a 32-bit PV guest, the ABI check is too
> late and the build explodes in the following manner yielding an
> unhelpful error message:
>
>   xc: error: panic: xg_dom_boot.c:121: xc_dom_boot_mem_init: can't allocate 
> low memory for domain: Out of memory
>   libxl: error: libxl_dom.c:586:libxl__build_dom: xc_dom_boot_mem_init 
> failed: Operation not supported
>   libxl: error: libxl_create.c:1573:domcreate_rebuild_done: Domain 1:cannot 
> (re-)build domain: -3
>   libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 
> 1:Non-existant domain
>   libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 1:Unable 
> to destroy guest
>   libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 1:Destruction 
> of domain failed
>
> Move the ABI check earlier into xc_dom_parse_image() along with other
> ELF-note feature checks.  With this adjustment, it now looks like
> this:
>
>   xc: error: panic: xg_dom_boot.c:88: xc_dom_compat_check: guest type 
> xen-3.0-x86_32p not supported by xen kernel, sorry: Invalid kernel
>   libxl: error: libxl_dom.c:571:libxl__build_dom: xc_dom_parse_image failed
>   domainbuilder: detail: xc_dom_release: called
>   libxl: error: libxl_create.c:1573:domcreate_rebuild_done: Domain 11:cannot 
> (re-)build domain: -3
>   libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 
> 11:Non-existant domain
>   libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 11:Unable 
> to destroy guest
>   libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 11:Destruction 
> of domain failed
>
> Suggested-by: Andrew Cooper 
> Signed-off-by: Jane Malalane 

FWIW, Reviewed-by: Andrew Cooper 

The net behaviour of `xl create` is still not great (the -3 in
particular is ESRCH looking for qemu which isn't remotely relevant), but
at least with this change, you get "guest type xen-3.0-x86_32p not
supported by xen" out of libxc which is the root cause of the failure.

~Andrew




Re: [PATCH] x86: mark compat hypercall regs clobbering for intended fall-through

2021-08-18 Thread Andrew Cooper
On 13/07/2021 09:08, Jan Beulich wrote:
> Oddly enough in the original report Coverity only complained about the
> native hypercall related switch() statements. Now that it has seen those
> fixed, it complains about (only HVM) compat ones. Hence the CIDs below
> are all for the HVM side of things, yet while at it take care of the PV
> side as well.
>
> Coverity-ID: 1487105, 1487106, 1487107, 1487108, 1487109.
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



RE: Enabling hypervisor agnosticism for VirtIO backends

2021-08-18 Thread Wei Chen
Hi Akashi,

> -Original Message-
> From: AKASHI Takahiro 
> Sent: 2021年8月18日 13:39
> To: Wei Chen 
> Cc: Oleksandr Tyshchenko ; Stefano Stabellini
> ; Alex Benn??e ; Stratos
> Mailing List ; virtio-dev@lists.oasis-
> open.org; Arnd Bergmann ; Viresh Kumar
> ; Stefano Stabellini
> ; stefa...@redhat.com; Jan Kiszka
> ; Carl van Schaik ;
> prat...@quicinc.com; Srivatsa Vaddagiri ; Jean-
> Philippe Brucker ; Mathieu Poirier
> ; Oleksandr Tyshchenko
> ; Bertrand Marquis
> ; Artem Mygaiev ; Julien
> Grall ; Juergen Gross ; Paul Durrant
> ; Xen Devel 
> Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
>
> On Tue, Aug 17, 2021 at 08:39:09AM +, Wei Chen wrote:
> > Hi Akashi,
> >
> > > -Original Message-
> > > From: AKASHI Takahiro 
> > > Sent: 2021年8月17日 16:08
> > > To: Wei Chen 
> > > Cc: Oleksandr Tyshchenko ; Stefano Stabellini
> > > ; Alex Benn??e ;
> Stratos
> > > Mailing List ; virtio-
> dev@lists.oasis-
> > > open.org; Arnd Bergmann ; Viresh Kumar
> > > ; Stefano Stabellini
> > > ; stefa...@redhat.com; Jan Kiszka
> > > ; Carl van Schaik ;
> > > prat...@quicinc.com; Srivatsa Vaddagiri ; Jean-
> > > Philippe Brucker ; Mathieu Poirier
> > > ; Oleksandr Tyshchenko
> > > ; Bertrand Marquis
> > > ; Artem Mygaiev ;
> Julien
> > > Grall ; Juergen Gross ; Paul Durrant
> > > ; Xen Devel 
> > > Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
> > >
> > > Hi Wei, Oleksandr,
> > >
> > > On Mon, Aug 16, 2021 at 10:04:03AM +, Wei Chen wrote:
> > > > Hi All,
> > > >
> > > > Thanks for Stefano to link my kvmtool for Xen proposal here.
> > > > This proposal is still discussing in Xen and KVM communities.
> > > > The main work is to decouple the kvmtool from KVM and make
> > > > other hypervisors can reuse the virtual device implementations.
> > > >
> > > > In this case, we need to introduce an intermediate hypervisor
> > > > layer for VMM abstraction, Which is, I think it's very close
> > > > to stratos' virtio hypervisor agnosticism work.
> > >
> > > # My proposal[1] comes from my own idea and doesn't always represent
> > > # Linaro's view on this subject nor reflect Alex's concerns.
> Nevertheless,
> > >
> > > Your idea and my proposal seem to share the same background.
> > > Both have the similar goal and currently start with, at first, Xen
> > > and are based on kvm-tool. (Actually, my work is derived from
> > > EPAM's virtio-disk, which is also based on kvm-tool.)
> > >
> > > In particular, the abstraction of hypervisor interfaces has a same
> > > set of interfaces (for your "struct vmm_impl" and my "RPC interfaces").
> > > This is not co-incident as we both share the same origin as I said
> above.
> > > And so we will also share the same issues. One of them is a way of
> > > "sharing/mapping FE's memory". There is some trade-off between
> > > the portability and the performance impact.
> > > So we can discuss the topic here in this ML, too.
> > > (See Alex's original email, too).
> > >
> > Yes, I agree.
> >
> > > On the other hand, my approach aims to create a "single-binary"
> solution
> > > in which the same binary of BE vm could run on any hypervisors.
> > > Somehow similar to your "proposal-#2" in [2], but in my solution, all
> > > the hypervisor-specific code would be put into another entity (VM),
> > > named "virtio-proxy" and the abstracted operations are served via RPC.
> > > (In this sense, BE is hypervisor-agnostic but might have OS
> dependency.)
> > > But I know that we need discuss if this is a requirement even
> > > in Stratos project or not. (Maybe not)
> > >
> >
> > Sorry, I haven't had time to finish reading your virtio-proxy completely
> > (I will do it ASAP). But from your description, it seems we need a
> > 3rd VM between FE and BE? My concern is that, if my assumption is right,
> > will it increase the latency in data transport path? Even if we're
> > using some lightweight guest like RTOS or Unikernel,
>
> Yes, you're right. But I'm afraid that it is a matter of degree.
> As far as we execute 'mapping' operations at every fetch of payload,
> we will see latency issue (even in your case) and if we have some solution
> for it, we won't see it neither in my proposal :)
>

Oleksandr has sent a proposal to Xen mailing list to reduce this kind
of "mapping/unmapping" operations. So the latency caused by this behavior
on Xen may eventually be eliminated, and Linux-KVM doesn't have that problem.

> > > Specifically speaking about kvm-tool, I have a concern about its
> > > license term; Targeting different hypervisors and different OSs
> > > (which I assume includes RTOS's), the resultant library should be
> > > license permissive and GPL for kvm-tool might be an issue.
> > > Any thoughts?
> > >
> >
> > Yes. If user want to implement a FreeBSD device model, but the virtio
> > library is GPL. Then GPL would be a problem. If we have another good
> > candidate, I am open to it.
>
> I have some candidates, particularly for vq/vring, in my mind:
> * 

[PATCH] Arm: relax iomem_access_permitted() check

2021-08-18 Thread Jan Beulich
Ranges checked by iomem_access_permitted() are inclusive; to permit a
mapping there's no need for access to also have been granted for the
subsequent page.

Fixes: 80f9c3167084 ("xen/arm: acpi: Map MMIO on fault in stage-2 page table 
for the hardware domain")
Signed-off-by: Jan Beulich 

--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1893,7 +1893,7 @@ static bool try_map_mmio(gfn_t gfn)
 return false;
 
 /* The hardware domain can only map permitted MMIO regions */
-if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) )
+if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn)) )
 return false;
 
 return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);




[xen-unstable test] 164230: regressions - FAIL

2021-08-18 Thread osstest service owner
flight 164230 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/164230/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-arndale  broken  in 164219
 test-amd64-amd64-amd64-pvgrub 12 debian-di-install   fail REGR. vs. 164178
 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 164178
 build-armhf-libvirt   6 libvirt-build  fail in 164219 REGR. vs. 164178

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-examine  5 host-install   broken in 164219 pass in 164230
 test-armhf-armhf-xl-arndale  5 host-install(5) broken in 164219 pass in 164230
 test-amd64-amd64-xl-credit2  22 guest-start/debian.repeat  fail pass in 164219

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked in 164219 n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked in 164219 n/a
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 164178
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164178
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 164178
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 164178
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 164178
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 164178
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 164178
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 164178
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164178
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164178
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 164178
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 

Re: S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get

2021-08-18 Thread Juergen Gross

On 18.08.21 08:41, Jan Beulich wrote:

On 18.08.2021 08:24, Juergen Gross wrote:

Could you please try the attached patch?


Seeing the patch, two questions:

1) Can idle_vcpu[cpu] ever be NULL when cpu_online(cpu) returned true?


No.


2) Seeing get_sched_res()'es access to per-CPU data, would it make sense
to move the cpu_online() check into there? While I guess the majority of
users are guaranteed to invoke it for online CPUs, I wonder if there
aren't any further uses on the CPU bringup / teardown code paths.


As get_sched_res() is private to the scheduler sources, this would imply
a scheduler function being used for an offline cpu, which is rather
unlikely. Especially as get_cpu_idle_time() is the only official
scheduler function taking a physical cpu number as parameter. All other
functions only work with vcpus, and this would require to wake, pause,
... an idle vcpu of an offline cpu to hit the problem.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get

2021-08-18 Thread Jan Beulich
On 18.08.2021 08:24, Juergen Gross wrote:
> Could you please try the attached patch?

Seeing the patch, two questions:

1) Can idle_vcpu[cpu] ever be NULL when cpu_online(cpu) returned true?

2) Seeing get_sched_res()'es access to per-CPU data, would it make sense
to move the cpu_online() check into there? While I guess the majority of
users are guaranteed to invoke it for online CPUs, I wonder if there
aren't any further uses on the CPU bringup / teardown code paths.

Jan




Re: S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get

2021-08-18 Thread Juergen Gross

Marek,

On 17.08.21 17:15, Juergen Gross wrote:

On 17.08.21 16:50, Marek Marczykowski-Górecki wrote:

On Tue, Aug 17, 2021 at 04:04:24PM +0200, Jan Beulich wrote:

On 17.08.2021 15:48, Marek Marczykowski-Górecki wrote:

On Tue, Aug 17, 2021 at 02:29:20PM +0100, Andrew Cooper wrote:

On 17/08/2021 14:21, Jan Beulich wrote:

On 17.08.2021 15:06, Andrew Cooper wrote:

Perhaps we want the cpu_down() logic to explicitly invalidate their
lazily cached values?

I'd rather do this on the cpu_up() path (no point clobbering what may
get further clobbered, and then perhaps not to a value of our 
liking),
yet then we can really avoid doing this from a notifier and 
instead do
it early enough in xstate_init() (taking care of XSS at the same 
time).


Funny you mention notifiers. Apparently cpufreq driver does use it to
initialize things. And fails to do so:

(XEN) Finishing wakeup from ACPI S3 state.
(XEN) CPU0: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f
(XEN) Enabling non-boot CPUs  ...
(XEN) CPU1: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f
(XEN) [ Xen-4.16-unstable  x86_64  debug=y  Not tainted ]
(XEN) CPU:    0
(XEN) RIP:    e008:[] vcpu_runstate_get+0x153/0x244
(XEN) RFLAGS: 00010246   CONTEXT: hypervisor
(XEN) rax:    rbx: 830049667c50   rcx: 
0001
(XEN) rdx: 00321d74d000   rsi: 830049667c50   rdi: 
83025dcc
(XEN) rbp: 830049667c40   rsp: 830049667c10   r8:  
83020511a820
(XEN) r9:  82d04057ef78   r10: 0180   r11: 
8000
(XEN) r12: 83025dcc   r13: 830205118c60   r14: 
0001
(XEN) r15: 0010   cr0: 8005003b   cr4: 
003526e0

(XEN) cr3: 49656000   cr2: 0028
(XEN) fsb:    gsb:    gss: 


(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  
(vcpu_runstate_get+0x153/0x244):
(XEN)  48 8b 14 ca 48 8b 04 02 <4c> 8b 70 28 e9 01 ff ff ff 4c 8d 3d 
dd 64 32 00

(XEN) Xen stack trace from rsp=830049667c10:
(XEN)    0180 83025dcbd410 83020511bf30 
830205118c60
(XEN)    0001 0010 830049667c80 
82d04024ae73
(XEN)       

(XEN)      830049667cb8 
82d0402560a9
(XEN)    830205118320 0001 83020511bf30 
83025dc7a6f0
(XEN)     830049667d58 82d040254cb1 
0001402e9f74
(XEN)     830049667d10 82d040224eda 
0025dc81
(XEN)    00321d74d000 82d040571278 0001 
830049667d28
(XEN)    82d040228b44 82d0403102cf  
82d0402283a4
(XEN)    82d040459688 82d040459680 82d040459240 
0004
(XEN)     830049667d68 82d04025510e 
830049667db0
(XEN)    82d040221ba4  0001 
0001
(XEN)     830049667e00 0001 
82d04058a5c0
(XEN)    830049667dc8 82d040203867 0001 
830049667df0
(XEN)    82d040203c51 82d040459400 0001 
0010
(XEN)    830049667e20 82d040203e26 830049667ef8 

(XEN)    0003 0200 830049667e50 
82d040270bac
(XEN)    83020116a640 830258ff6000  

(XEN)    830049667e70 82d0402056aa 830258ff61b8 
82d0405701b0
(XEN)    830049667e88 82d04022963c 82d0405701a0 
830049667eb8

(XEN) Xen call trace:
(XEN)    [] R vcpu_runstate_get+0x153/0x244


This is xen/common/sched/core.c:322. get_sched_res(v->processor) is
NULL at this point for CPU1.

The only place that can calls set_sched_res() and doesn't expect the
previous value to be valid, is cpu_schedule_up(). For non-BSP its called
_only_ from notifier at CPU_UP_PREPARE (cpu_schedule_callback()), but
that notifier explicitly exclude suspend/resume case:

 static int cpu_schedule_callback(
 struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
 unsigned int cpu = (unsigned long)hcpu;
 int rc = 0;

 /*
  * All scheduler related suspend/resume handling needed is 
done in

  * cpupool.c.
  */
 if ( system_state > SYS_STATE_active )
 return NOTIFY_DONE;

But, nothing in cpupool.c is calling into set_sched_res().

On the other hand, sched_rm_cpu() (which I believe is called as part of
parking the CPU) calls cpu_schedule_down(), which then calls
set_sched_res(cpu, NULL).

In short: scheduler for parked CPUs is not re-initialized during resume.
But cpufreq expects it to be...


I'll be looking into that.


Could you please try the attached patch?


Juergen

From e38d0816a33fbaa3c0c45bcd6e9d433b1e021222 Mon Sep 17 00:00:00 2001
From: Juergen 

Re: [PATCH v2] x86/PV: assert page state in mark_pv_pt_pages_rdonly()

2021-08-18 Thread Jan Beulich
On 17.08.2021 19:25, Andrew Cooper wrote:
> On 17/08/2021 15:29, Jan Beulich wrote:
>> About every time I look at dom0_construct_pv()'s "calculation" of
>> nr_pt_pages I question (myself) whether the result is precise or merely
>> an upper bound. I think it is meant to be precise, but I think we would
>> be better off having some checking in place. Hence add ASSERT()s to
>> verify that
>> - all pages have a valid L1...Ln (currently L4) page table type and
>> - no other bits are set, in particular the type refcount is still zero.
>>
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Andrew Cooper , thanks.
> 
> Are you planning further cleanups here imminently?  If not, I can
> probably shuffle some of the easy ROUNDUP() refactoring in the direction
> of an intern or grad.

Not any cleanup, I don't think, but quite a few further changes to make
Dom0 use large IOMMU page mappings where possible (without introducing
logic yet to un-shatter split pages, not the least because relying on
just that would then undermine the secondary effect of improving boot
time). The two changes posted so far were really just fallout from that
work, with it seeming reasonable to post up front to reduce the size of
the actual series, which has grown quite a bit longer than I though it
would.

Jan