[Xen-devel] [ovmf baseline-only test] 66895: tolerable trouble: blocked/broken/pass

2016-08-03 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 66895 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66895/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64   3 host-install(3)  broken like 66893

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf 4884e81b5d0da1bcc05361dd4c5fc06b3722f144
baseline version:
 ovmf 846ea5f537c8f8313abb2f8f29794d13ac4becec

Last test of basis66893  2016-08-03 04:49:38 Z0 days
Testing same since66895  2016-08-03 07:16:34 Z0 days1 attempts


People who touched revisions under test:
  Feng Tian 
  Gary Lin 
  Hess Chen 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  broken  
 build-i386   pass
 build-amd64-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

broken-step build-amd64 host-install(3)

Push not applicable.

(No revision log; it would be 470 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [libvirt test] 99913: tolerable FAIL - PUSHED

2016-08-03 Thread osstest service owner
flight 99913 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99913/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  8dbc95cbd18a5abab8f313b1ea9da7b3d7faaa32
baseline version:
 libvirt  230c631917f293865856cf675172f1b36cb7e680

Last test of basis99895  2016-08-02 04:22:25 Z1 days
Testing same since99913  2016-08-03 04:21:11 Z0 days1 attempts


People who touched revisions under test:
  Chunyan Liu 
  Cédric Bosdonnat 
  Daniel Veillard 
  Erik Skultety 
  John Ferlan 
  Jovanka Gulicoska 
  Martin Kletzander 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Peter Krempa 
  Yuri Chornoivan 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=8dbc95cbd18a5abab8f313b1ea9da7b3d7faaa32
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home

[Xen-devel] [xen-unstable test] 99911: regressions - FAIL

2016-08-03 Thread osstest service owner
flight 99911 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99911/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-raw  9 debian-di-install fail REGR. vs. 99906

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen   6 xen-buildfail   like 99906
 build-i386-rumpuserxen6 xen-buildfail   like 99906
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 99906
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 99906
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 99906
 test-amd64-amd64-xl-rtds  9 debian-install   fail   like 99906
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 99906

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass

version targeted for testing:
 xen  e9522e4932aaa7f083688b6612b5897839409260
baseline version:
 xen  2eee1c746af6f683247700642786b7c21c991234

Last test of basis99906  2016-08-02 15:19:48 Z0 days
Testing same since99911  2016-08-03 00:44:31 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Boris Ostrovsky 
  George Dunlap 
  Ian Jackson 
  Jan Beulich 
  Jim Fehlig 
  Juergen Gross 
  Kevin Tian 
  Paul Durrant 
  Roger Pau Monne 
  Roger Pau Monné 
  Tamas K Lengyel 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern

[Xen-devel] [distros-debian-squeeze test] 66894: tolerable trouble: broken/fail/pass

2016-08-03 Thread Platform Team regression test user
flight 66894 distros-debian-squeeze real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66894/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-amd64-squeeze-netboot-pygrub 3 host-install(3) broken blocked 
in 66838
 test-amd64-amd64-i386-squeeze-netboot-pygrub 9 debian-di-install fail blocked 
in 66838
 test-amd64-i386-i386-squeeze-netboot-pygrub 9 debian-di-install fail blocked 
in 66838
 test-amd64-i386-amd64-squeeze-netboot-pygrub 9 debian-di-install fail blocked 
in 66838

baseline version:
 flight   66838

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-squeeze-netboot-pygrubbroken  
 test-amd64-i386-amd64-squeeze-netboot-pygrub fail
 test-amd64-amd64-i386-squeeze-netboot-pygrub fail
 test-amd64-i386-i386-squeeze-netboot-pygrub  fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] mwait-idle: add Denverton

2016-08-03 Thread Jan Beulich
Denverton is an Intel Atom based micro server which shares the same
Goldmont architecture as Broxton. The available C-states on
Denverton is a subset of Broxton with only C1, C1e, and C6.

Signed-off-by: Jacob Pan 
Signed-off-by: Len Brown 
Signed-off-by: Rafael J. Wysocki 
[Linux commit: 0080d65b7719fc58e60b5595fc61acded330004f]
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -658,6 +658,28 @@ static struct cpuidle_state bxt_cstates[
{}
 };
 
+static const struct cpuidle_state dnv_cstates[] = {
+   {
+   .name = "C1-DNV",
+   .flags = MWAIT2flg(0x00),
+   .exit_latency = 2,
+   .target_residency = 2,
+   },
+   {
+   .name = "C1E-DNV",
+   .flags = MWAIT2flg(0x01),
+   .exit_latency = 10,
+   .target_residency = 20,
+   },
+   {
+   .name = "C6-DNV",
+   .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+   .exit_latency = 50,
+   .target_residency = 500,
+   },
+   {}
+};
+
 static void mwait_idle(void)
 {
unsigned int cpu = smp_processor_id();
@@ -844,6 +866,11 @@ static const struct idle_cpu idle_cpu_bx
.disable_promotion_to_c1e = 1,
 };
 
+static const struct idle_cpu idle_cpu_dnv = {
+   .state_table = dnv_cstates,
+   .disable_promotion_to_c1e = 1,
+};
+
 #define ICPU(model, cpu) \
 { X86_VENDOR_INTEL, 6, model, X86_FEATURE_MONITOR, \
 &idle_cpu_##cpu}
@@ -881,6 +908,7 @@ static const struct x86_cpu_id intel_idl
ICPU(0x55, skx),
ICPU(0x57, knl),
ICPU(0x5c, bxt),
+   ICPU(0x5f, dnv),
{}
 };
 



mwait-idle: add Denverton

Denverton is an Intel Atom based micro server which shares the same
Goldmont architecture as Broxton. The available C-states on
Denverton is a subset of Broxton with only C1, C1e, and C6.

Signed-off-by: Jacob Pan 
Signed-off-by: Len Brown 
Signed-off-by: Rafael J. Wysocki 
[Linux commit: 0080d65b7719fc58e60b5595fc61acded330004f]
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -658,6 +658,28 @@ static struct cpuidle_state bxt_cstates[
{}
 };
 
+static const struct cpuidle_state dnv_cstates[] = {
+   {
+   .name = "C1-DNV",
+   .flags = MWAIT2flg(0x00),
+   .exit_latency = 2,
+   .target_residency = 2,
+   },
+   {
+   .name = "C1E-DNV",
+   .flags = MWAIT2flg(0x01),
+   .exit_latency = 10,
+   .target_residency = 20,
+   },
+   {
+   .name = "C6-DNV",
+   .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+   .exit_latency = 50,
+   .target_residency = 500,
+   },
+   {}
+};
+
 static void mwait_idle(void)
 {
unsigned int cpu = smp_processor_id();
@@ -844,6 +866,11 @@ static const struct idle_cpu idle_cpu_bx
.disable_promotion_to_c1e = 1,
 };
 
+static const struct idle_cpu idle_cpu_dnv = {
+   .state_table = dnv_cstates,
+   .disable_promotion_to_c1e = 1,
+};
+
 #define ICPU(model, cpu) \
 { X86_VENDOR_INTEL, 6, model, X86_FEATURE_MONITOR, \
 &idle_cpu_##cpu}
@@ -881,6 +908,7 @@ static const struct x86_cpu_id intel_idl
ICPU(0x55, skx),
ICPU(0x57, knl),
ICPU(0x5c, bxt),
+   ICPU(0x5f, dnv),
{}
 };
 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86: support newer Intel CPU models

2016-08-03 Thread Jan Beulich
... as per the June 2016 edition of the SDM.

Also remove a couple of dead break statements as well as unused
*MSR_PM_LASTBRANCH* #define-s.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -61,14 +61,14 @@
 
 #define GET_HW_RES_IN_NS(msr, val) \
 do { rdmsrl(msr, val); val = tsc_ticks2ns(val); } while( 0 )
-#define GET_MC6_RES(val)  GET_HW_RES_IN_NS(0x664, val) /* Atom E3000 only */
+#define GET_MC6_RES(val)  GET_HW_RES_IN_NS(0x664, val)
 #define GET_PC2_RES(val)  GET_HW_RES_IN_NS(0x60D, val) /* SNB onwards */
 #define GET_PC3_RES(val)  GET_HW_RES_IN_NS(0x3F8, val)
 #define GET_PC6_RES(val)  GET_HW_RES_IN_NS(0x3F9, val)
 #define GET_PC7_RES(val)  GET_HW_RES_IN_NS(0x3FA, val)
-#define GET_PC8_RES(val)  GET_HW_RES_IN_NS(0x630, val) /* some Haswells only */
-#define GET_PC9_RES(val)  GET_HW_RES_IN_NS(0x631, val) /* some Haswells only */
-#define GET_PC10_RES(val) GET_HW_RES_IN_NS(0x632, val) /* some Haswells only */
+#define GET_PC8_RES(val)  GET_HW_RES_IN_NS(0x630, val)
+#define GET_PC9_RES(val)  GET_HW_RES_IN_NS(0x631, val)
+#define GET_PC10_RES(val) GET_HW_RES_IN_NS(0x632, val)
 #define GET_CC1_RES(val)  GET_HW_RES_IN_NS(0x660, val) /* Silvermont only */
 #define GET_CC3_RES(val)  GET_HW_RES_IN_NS(0x3FC, val)
 #define GET_CC6_RES(val)  GET_HW_RES_IN_NS(0x3FD, val)
@@ -142,6 +142,8 @@ static void do_get_hw_residencies(void *
 {
 /* 4th generation Intel Core (Haswell) */
 case 0x45:
+/* Xeon E5/E7 v4 (Broadwell) */
+case 0x4F:
 GET_PC8_RES(hw_res->pc8);
 GET_PC9_RES(hw_res->pc9);
 GET_PC10_RES(hw_res->pc10);
@@ -158,10 +160,11 @@ static void do_get_hw_residencies(void *
 case 0x46:
 /* Broadwell */
 case 0x3D:
-case 0x4F:
+case 0x47:
 case 0x56:
-/* future */
+/* Skylake */
 case 0x4E:
+case 0x5E:
 GET_PC2_RES(hw_res->pc2);
 GET_CC7_RES(hw_res->cc7);
 /* fall through */
@@ -198,18 +201,28 @@ static void do_get_hw_residencies(void *
 break;
 /* Silvermont */
 case 0x37:
-GET_MC6_RES(hw_res->mc6);
-/* fall through */
 case 0x4A:
 case 0x4D:
 case 0x5A:
 case 0x5D:
 /* Airmont */
 case 0x4C:
+GET_MC6_RES(hw_res->mc6);
 GET_PC7_RES(hw_res->pc6); /* abusing GET_PC7_RES */
 GET_CC1_RES(hw_res->cc1);
 GET_CC6_RES(hw_res->cc6);
 break;
+/* Goldmont */
+case 0x5C:
+case 0x5F:
+GET_PC2_RES(hw_res->pc2);
+GET_PC3_RES(hw_res->pc3);
+GET_PC6_RES(hw_res->pc6);
+GET_PC10_RES(hw_res->pc10);
+GET_CC1_RES(hw_res->cc1);
+GET_CC3_RES(hw_res->cc3);
+GET_CC6_RES(hw_res->cc6);
+break;
 }
 }
 
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2529,6 +2529,14 @@ static const struct lbr_info {
 { MSR_P4_LASTBRANCH_0_FROM_LIP, NUM_MSR_P4_LASTBRANCH_FROM_TO },
 { MSR_P4_LASTBRANCH_0_TO_LIP,   NUM_MSR_P4_LASTBRANCH_FROM_TO },
 { 0, 0 }
+}, sk_lbr[] = {
+{ MSR_IA32_LASTINTFROMIP,   1 },
+{ MSR_IA32_LASTINTTOIP, 1 },
+{ MSR_SKL_LASTBRANCH_TOS,   1 },
+{ MSR_SKL_LASTBRANCH_0_FROM_IP, NUM_MSR_SKL_LASTBRANCH },
+{ MSR_SKL_LASTBRANCH_0_TO_IP,   NUM_MSR_SKL_LASTBRANCH },
+{ MSR_SKL_LASTBRANCH_0_INFO,NUM_MSR_SKL_LASTBRANCH },
+{ 0, 0 }
 }, at_lbr[] = {
 { MSR_IA32_LASTINTFROMIP,   1 },
 { MSR_IA32_LASTINTTOIP, 1 },
@@ -2536,6 +2544,13 @@ static const struct lbr_info {
 { MSR_C2_LASTBRANCH_0_FROM_IP,  NUM_MSR_ATOM_LASTBRANCH_FROM_TO },
 { MSR_C2_LASTBRANCH_0_TO_IP,NUM_MSR_ATOM_LASTBRANCH_FROM_TO },
 { 0, 0 }
+}, gm_lbr[] = {
+{ MSR_IA32_LASTINTFROMIP,   1 },
+{ MSR_IA32_LASTINTTOIP, 1 },
+{ MSR_GM_LASTBRANCH_TOS,1 },
+{ MSR_GM_LASTBRANCH_0_FROM_IP,  NUM_MSR_GM_LASTBRANCH_FROM_TO },
+{ MSR_GM_LASTBRANCH_0_TO_IP,NUM_MSR_GM_LASTBRANCH_FROM_TO },
+{ 0, 0 }
 };
 
 static const struct lbr_info *last_branch_msr_get(void)
@@ -2550,7 +2565,6 @@ static const struct lbr_info *last_branc
 /* Enhanced Core */
 case 23:
 return c2_lbr;
-break;
 /* Nehalem */
 case 26: case 30: case 31: case 46:
 /* Westmere */
@@ -2562,11 +2576,13 @@ static const struct lbr_info *last_branc
 /* Haswell */
 case 60: case 63: case 69: case 70:
 /* Broadwell */
-case 61: case 79: case 86:
-/* future */
-case 78:
+case 61: case 71: case 79: case 86:
 return nh_lbr;
-break;
+/* Skylake */
+case 78: case 94:
+/* future */
+case 142: case 158:
+return sk_lbr;
 /* Atom */
 case 28: case 38: case 39: case 53: case 54:
 /* Silvermont */
@@ -2576,7 +2592,9 @@ static const struct lbr_info *last_branc
 /* Airmont */
 case 76:
 return at_lbr;
-b

Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 08:54,  wrote:
> On 08/02/16 08:46, Jan Beulich wrote:
>> >>> On 18.07.16 at 02:29,  wrote:
>> >  (4) Because the reserved area is now used by Xen hypervisor, it
>> >  should not be accessible by Dom0 any more. Therefore, if a host
>> >  pmem device is recorded by Xen hypervisor, Xen will unmap its
>> >  reserved area from Dom0. Our design also needs to extend Linux
>> >  NVDIMM driver to "balloon out" the reserved area after it
>> >  successfully reports a pmem device to Xen hypervisor.
>> 
>> ... "balloon out" ... _after_? That'd be unsafe.
>>
> 
> Before ballooning is accomplished, the pmem driver does not create any
> device node under /dev/ and hence no one except the pmem drive can
> access the reserved area on pmem, so I think it's okey to balloon
> after reporting.

Right now Dom0 isn't allowed to access any memory in use by Xen
(and not explicitly shared), and I don't think we should deviate
from that model for pmem.

>> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
>> > 
>> >  Before a pmem file is assigned to a domain, we need to know the host
>> >  SPA ranges that are allocated to this file. We do this work in xl.
>> > 
>> >  If a pmem device /dev/pmem0 is given, xl will read
>> >  /sys/block/pmem0/device/{resource,size} respectively for the start
>> >  SPA and size of the pmem device.
>> > 
>> >  If a pre-allocated file /mnt/dax/file is given,
>> >  (1) xl first finds the host pmem device where /mnt/dax/file is. Then
>> >  it uses the method above to get the start SPA of the host pmem
>> >  device.
>> >  (2) xl then uses fiemap ioctl to get the extend mappings of
>> >  /mnt/dax/file, and adds the corresponding physical offsets and
>> >  lengths in each mapping entries to above start SPA to get the SPA
>> >  ranges pre-allocated for this file.
>> 
>> Remind me again: These extents never change, not even across
>> reboot? I think this would be good to be written down here explicitly.
> 
> Yes
> 
>> Hadn't there been talk of using labels to be able to allow a guest to
>> own the exact same physical range again after reboot or guest or
>> host?
> 
> You mean labels in NVDIMM label storage area? As defined in Intel
> NVDIMM Namespace Specification, labels are used to specify
> namespaces. For a pmem interleave set (possible cross several dimms),
> at most one pmem namespace (and hence at most one label) is
> allowed. Therefore, labels can not be used to partition pmem.

Okay. But then how do particular ranges get associated with the
owning guest(s)? Merely by SPA would seem rather fragile to me.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/cpuid: AVX-512 Feature Detection

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 03:32,  wrote:
>>  On 25/07/16 07:09, Kang, Luwei wrote:
>>  First of all - please don't top post.
>> 
>> > What about remove the dependency between AVX2 and AVX512F
>> >> ( AVX2:
>>  [AVX512F], ) ?
>> 
>>  Yes, that's what I think we want, but we need Andrew's agreement here.
>> 
>> >>> Hi Andrew,  what is your opinion ?
>> >> You are in a better position to answer than me.
>> >>
>> >> For a specific instruction which may be VEX and EVEX encoded, is the
>> >> circuitry for a specific instruction shared, or discrete?  Is there a
>> >> plausible future where you might support only the EVEX variant of an
>> >> instruction, and not the VEX variant?
>> >>
>> >> These dependences are about what may be reasonably assumed about the
>> >> way the environment is structured.  It doesn't look reasonable to
>> >> advertise an AVX512 environment to guests while at the same time
>> >> stating that AVX2 is not present.  If this is correct, then the 
>> >> dependency 
> should stay.
>> >> If Intel plausibly things it might release hardware with !AVX2 but
>> >> AVX512, then the dependency should be dropped.
>> > Regarding the dependency between AVX2 and AVX512F, I have ask some 
>> > hardware 
> architecture engineer.
>> >
>> > AVX512 is a superset of AVX2, the most important item being the state. If 
> the state for AVX2 isn't enabled (in XCR0), then AVX512
>> also can't function.
>> >
>> > So if we want to use AVX512, AVX2 must be supported and enabled. The 
> dependence between AVX512F and AVX2 may need be
>> reserved.
>> 
>> Ok, so AVX512 strictly depends on AVX2.
>> 
>> In which case, the python code was correct.  The meaning of the key/value 
> relationship is "if the feature in the key is not present, all
>> features in the value must also be disabled".
> 
> Hi jan, what is your opinion?

There's no opinion to be had here, according to your earlier reply. I
do, however, continue to question that model: State and instruction
set are independent items. Of course YMM state is a prereq to ZMM
state, but I do not buy AVX2 insn support being a prereq to AVX-512
insns. Yet to me the AVX2 and AVX-512F feature flags solely
represent the instructions, while the XSTATE leaf bits represent the
states.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/9] x86/hypercall: Move some of the hvm hypercall infrastructure into hypercall.h

2016-08-03 Thread Jan Beulich
>>> On 02.08.16 at 20:43,  wrote:
> On Tue, 2 Aug 2016, Jan Beulich wrote:
>> >>> On 02.08.16 at 16:59,  wrote:
>> > On 02/08/16 15:54, Jan Beulich wrote:
>> > On 02.08.16 at 16:26,  wrote:
>> >>> On 02/08/16 15:17, Jan Beulich wrote:
>>  Well, I find it quite odd for hypercall argument counts to differ
>>  between arches. I.e. I'd conclude the ABI was mis-specified.
>> >>> Is it documented somewhere for the x86 code? Looking at Linux, the 
>> >>> privcmd call is only passing 5 arguments on both ARM and x86.
>> >> arch-x86/xen-x86_32.h has
>> >>
>> >>  * Hypercall interface:
>> >>  *  Input:  %ebx, %ecx, %edx, %esi, %edi, %ebp (arguments 1-6)
>> >>  *  Output: %eax
>> >>
>> >> while arch-x86/xen-x86_64.h has
>> >>
>> >>  * Hypercall interface:
>> >>  *  Input:  %rdi, %rsi, %rdx, %r10, %r8, %r9 (arguments 1-6)
>> >>  *  Output: %rax
>> > 
>> > The only actual 6 argument hypercall is the v4v hypercall, better known
>> > as __HYPERVISOR_xc_reserved_op at index 39, but that isn't implemented
>> > anywhere upstream.
>> 
>> But it serves as an example what now wouldn't work on ARM.
> 
> At the time the arm hypercall ABI was published, it matched the x86
> hypercall ABI, which had only 5 hypercall arguments.
> 
> The issue is that the x86 hypercall ABI changed, and now is out of sync
> with ARM. The faulty commit being:

That's one way of viewing it, but I don't think an appropriate one.
6-argument hypercalls had always been possible on x86, just that
they might not have been documented in the public headers (but
instead only in the actual hypercall implementation).

Jan

> commit 4af64160c580b02f28c992c09d55957cb20a9b91
> Author: David Vrabel 
> Date:   Wed May 30 09:25:11 2012 +0100
> 
> x86: document register for 6th hypercall argument
> 
> From: David Vrabel 
> 
> Signed-off-by: David Vrabel 
> Committed-by: Keir Fraser 
> 
> 
> While the ARM ABI is from few months earlier:
> 
> commit 40f20c4bfcd5d25c90f9419250ca8a229bf4c1e5
> Author: Stefano Stabellini 
> Date:   Tue Mar 13 16:04:05 2012 +
> 
> arm: use r12 to pass the hypercall number
> 
> ** This is a guest visible ABI change which requires an updated guest 
> kernel **
> 
> Use r12 to pass the hypercall number and r0-r4 for the hypercall
> arguments.
> 
> Use the ISS to pass an hypervisor specific tag.
> 
> Remove passing unused registers to arm_hypercall_table: we don't have 6
> arguments hypercalls and we never use 64 bit values as hypercall
> arguments, 64 bit values are only contained within structs passed as
> arguments.
> 
> Signed-off-by: Stefano Stabellini 
> [ use #ifndef NDEBUG, fix coding style, expand calling convention 
> comment
>   slightly and added a big fat note about ABI change - ijc ]
>  




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.7.0 boot PANIC on kernel 4.7.0-4 + UEFI ?

2016-08-03 Thread Jan Beulich
>>> On 02.08.16 at 21:02,  wrote:
>> > Can you try
>> > 
>> > ((void *)(md) + (m)->desc_size - 1) < 
>> > (m)->map_end;   \
>> > 
>> > instead?
> 
> with the 'baseline' as referenced + a patched kernel
> 
>   > Can you try
>   >((void *)(md) + (m)->desc_size - 1) < (m)->map_end;
>\
> 
> with efi cmd line opts: +"/mapbs"
> 
> The system now boots correctly

Thanks. Does the use of /mapbs really matter for booting? I was
assuming it would be relevant only for shutdown/reboot?

I'd like to credit you with Reported-by and Tested-by tags, but those
will look odd without an real name. Please let me know if I should
rather not add such tags to the patch or what your real name is.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] mwait-idle: add Denverton

2016-08-03 Thread Andrew Cooper
On 03/08/16 09:36, Jan Beulich wrote:
> Denverton is an Intel Atom based micro server which shares the same
> Goldmont architecture as Broxton. The available C-states on
> Denverton is a subset of Broxton with only C1, C1e, and C6.
>
> Signed-off-by: Jacob Pan 
> Signed-off-by: Len Brown 
> Signed-off-by: Rafael J. Wysocki 
> [Linux commit: 0080d65b7719fc58e60b5595fc61acded330004f]
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools:libxl: return tty path for all serials

2016-08-03 Thread Wei Liu
I would suggest changing the title to 

  libxl: return any serial tty path in libxl_console_get_tty

On Wed, Aug 03, 2016 at 09:27:19AM +0800, Bob Liu wrote:
> When specifying a serial list in domain config, users of
> libxl_console_get_tty cannot get the tty path of a second specified pty 
> serial,
> since right now it always returns the tty path of serial 0.
> 
> Signed-off-by: Bob Liu 
> ---
>  tools/libxl/libxl.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 2cf7451..00af286 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1795,7 +1795,7 @@ int libxl_console_get_tty(libxl_ctx *ctx, uint32_t 
> domid, int cons_num,
>  
>  switch (type) {
>  case LIBXL_CONSOLE_TYPE_SERIAL:
> -tty_path = GCSPRINTF("%s/serial/0/tty", dom_path);
> +tty_path = GCSPRINTF("%s/serial/%d/tty", dom_path, cons_num);

Code-wise, this patch looks good to me. It's a valid bug fix that should
be backported.

Please CC Ian and me in your future submissions.

Wei.
>  break;
>  case LIBXL_CONSOLE_TYPE_PV:
>  if (cons_num == 0)
> -- 
> 1.7.10.4
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: use llabs() instead abs() for int64_t argument

2016-08-03 Thread Wei Liu
On Tue, Aug 02, 2016 at 06:28:37PM +0100, Andrew Cooper wrote:
> On 02/08/16 18:25, Juergen Gross wrote:
> > Commit 57f8b13c724023c78fa15a80452d1de3e51a1418 ("libxl: memory size
> > in kb requires 64 bit variable") introduced a bug: abs() shouldn't
> > be called with an int64_t argument. llabs() is to be used here.
> 
> Possibly worth identifying that this was caught by a clang build, citing:
> 
> libxl.c:4198:33: error: absolute value function 'abs' given an argument
> of type
> 'int64_t' (aka 'long') but has parameter of type 'int' which may cause
> truncation of value [-Werror,-Wabsolute-value]
> if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
>  ^

Juergen, I can fix up the commit message for you, if you don't object to
Andrew's suggestion.

---
Commit 57f8b13c724023c78fa15a80452d1de3e51a1418 ("libxl: memory size
in kb requires 64 bit variable") introduced a bug: abs() shouldn't
be called with an int64_t argument. llabs() is to be used here.

Caught by clang build with error message:

libxl.c:4198:33: error: absolute value function 'abs' given an argument
of type
'int64_t' (aka 'long') but has parameter of type 'int' which may cause
truncation of value [-Werror,-Wabsolute-value]
if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
---

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] tools: split out xenstored starting form xencommons

2016-08-03 Thread Wei Liu
On Tue, Aug 02, 2016 at 06:10:45PM +0200, Juergen Gross wrote:
> In order to prepare starting a xenstore domain split out the starting
> of the xenstore daemon from the xencommons script into a dedicated
> launch-xenstore script.
> 
> A rerun of autogen.sh is required.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86: support newer Intel CPU models

2016-08-03 Thread Andrew Cooper
On 03/08/16 09:38, Jan Beulich wrote:
> ... as per the June 2016 edition of the SDM.
>
> Also remove a couple of dead break statements as well as unused
> *MSR_PM_LASTBRANCH* #define-s.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: use llabs() instead abs() for int64_t argument

2016-08-03 Thread Juergen Gross
On 03/08/16 11:13, Wei Liu wrote:
> On Tue, Aug 02, 2016 at 06:28:37PM +0100, Andrew Cooper wrote:
>> On 02/08/16 18:25, Juergen Gross wrote:
>>> Commit 57f8b13c724023c78fa15a80452d1de3e51a1418 ("libxl: memory size
>>> in kb requires 64 bit variable") introduced a bug: abs() shouldn't
>>> be called with an int64_t argument. llabs() is to be used here.
>>
>> Possibly worth identifying that this was caught by a clang build, citing:
>>
>> libxl.c:4198:33: error: absolute value function 'abs' given an argument
>> of type
>> 'int64_t' (aka 'long') but has parameter of type 'int' which may cause
>> truncation of value [-Werror,-Wabsolute-value]
>> if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
>>  ^
> 
> Juergen, I can fix up the commit message for you, if you don't object to
> Andrew's suggestion.

No, I don't object.


Thanks,

Juergen

> 
> ---
> Commit 57f8b13c724023c78fa15a80452d1de3e51a1418 ("libxl: memory size
> in kb requires 64 bit variable") introduced a bug: abs() shouldn't
> be called with an int64_t argument. llabs() is to be used here.
> 
> Caught by clang build with error message:
> 
> libxl.c:4198:33: error: absolute value function 'abs' given an argument
> of type
> 'int64_t' (aka 'long') but has parameter of type 'int' which may cause
> truncation of value [-Werror,-Wabsolute-value]
> if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
> ---
> 
> Wei.
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/8] x86/time: calibrate TSC against platform timer

2016-08-03 Thread Jan Beulich
>>> On 02.08.16 at 21:25,  wrote:
> On 20/06/16 16:19, Jan Beulich wrote:
> On 20.06.16 at 16:20,  wrote:
>>> On 15/06/16 11:28, Jan Beulich wrote:
 --- a/xen/arch/x86/i8259.c
 +++ b/xen/arch/x86/i8259.c
 @@ -359,12 +359,7 @@ void __init init_IRQ(void)
  
  apic_intr_init();
  
 -/* Set the clock to HZ Hz */
 -#define CLOCK_TICK_RATE 1193182 /* crystal freq (Hz) */
 -#define LATCH (((CLOCK_TICK_RATE)+(HZ/2))/HZ)
 -outb_p(0x34, PIT_MODE);/* binary, mode 2, LSB/MSB, ch 0 */
 -outb_p(LATCH & 0xff, PIT_CH0); /* LSB */
 -outb(LATCH >> 8, PIT_CH0); /* MSB */
 +preinit_pit();
>>> This highlights the fact that we have two unconditional calls to
>>> preinit_pit() during startup, which is one too many.
>>>
>>> init_IRQ() is called rather earlier than early_time_init(), but I can't
>>> spot anything inbetween the two calls which would require the PIT to be
>>> set up.  AFAICT, it is safe to just drop the preinit_pit() call here.
>> LAPIC initialization makes use of the PIT, and I think that would
>> break when removing it here. And since doing it twice is benign,
>> I'd also like to not drop it from early_time_init().
> 
> Where? LAPIC initialisation is before this point - its higher up in
> init_IRQ() so surely can't depend on this currently working.

Hmm, looks like my earlier flow analysis was wrong (or perhaps
based on the old placement of the call to init_platform_timer() in
init_xen_time()):

__start_xen()
-> [line 1388] init_IRQ()
-> [line 1429] early_time_init()
-> [line 1447] smp_prepare_cpus()
 -> setup_boot_APIC_clock()
  -> calibrate_APIC_clock()
-> [line 1456] init_xen_time();

Let me verify that removing the one here also works in practice.

> As for benign, at the best it is a waste of time, and on reduced
> hardware platforms, wrong.  We shouldn't be propagating problems like these.
> 
>>
 @@ -340,7 +348,8 @@ static struct platform_timesource __init
  .frequency = CLOCK_TICK_RATE,
  .read_counter = read_pit_count,
  .counter_bits = 32,
 -.init = init_pit
 +.init = init_pit,
 +.resume = resume_pit
>>> Please add a redundant comma at the end, to reduce the next diff to
>>> change this block.
>> Well, I'd like the three instance to remain consistent in this regard
>> (yet touching the others doesn't seem warranted). And a change
>> here isn't all that likely.
> 
> This is just like any other bit of style - it should be fixed up while
> editing even if the rest of the file isn't completely up to scratch.

Well, I actually had done that part already.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen

2016-08-03 Thread Haozhong Zhang
On 08/03/16 02:45, Jan Beulich wrote:
> >>> On 03.08.16 at 08:54,  wrote:
> > On 08/02/16 08:46, Jan Beulich wrote:
> >> >>> On 18.07.16 at 02:29,  wrote:
> >> >  (4) Because the reserved area is now used by Xen hypervisor, it
> >> >  should not be accessible by Dom0 any more. Therefore, if a host
> >> >  pmem device is recorded by Xen hypervisor, Xen will unmap its
> >> >  reserved area from Dom0. Our design also needs to extend Linux
> >> >  NVDIMM driver to "balloon out" the reserved area after it
> >> >  successfully reports a pmem device to Xen hypervisor.
> >> 
> >> ... "balloon out" ... _after_? That'd be unsafe.
> >>
> > 
> > Before ballooning is accomplished, the pmem driver does not create any
> > device node under /dev/ and hence no one except the pmem drive can
> > access the reserved area on pmem, so I think it's okey to balloon
> > after reporting.
> 
> Right now Dom0 isn't allowed to access any memory in use by Xen
> (and not explicitly shared), and I don't think we should deviate
> from that model for pmem.
>

In this design, Xen hypervisor unmaps the reserved area from Dom0 so
that Dom0 cannot access the reserved area afterwards. And "balloon" is
in fact not a memory ballooning, because Linux kernel never allocates
from pmem like normal ram. In my current implementation, it's just to
remove the reserved area from a resource struct covering pmem.

> >> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
> >> > 
> >> >  Before a pmem file is assigned to a domain, we need to know the host
> >> >  SPA ranges that are allocated to this file. We do this work in xl.
> >> > 
> >> >  If a pmem device /dev/pmem0 is given, xl will read
> >> >  /sys/block/pmem0/device/{resource,size} respectively for the start
> >> >  SPA and size of the pmem device.
> >> > 
> >> >  If a pre-allocated file /mnt/dax/file is given,
> >> >  (1) xl first finds the host pmem device where /mnt/dax/file is. Then
> >> >  it uses the method above to get the start SPA of the host pmem
> >> >  device.
> >> >  (2) xl then uses fiemap ioctl to get the extend mappings of
> >> >  /mnt/dax/file, and adds the corresponding physical offsets and
> >> >  lengths in each mapping entries to above start SPA to get the SPA
> >> >  ranges pre-allocated for this file.
> >> 
> >> Remind me again: These extents never change, not even across
> >> reboot? I think this would be good to be written down here explicitly.
> > 
> > Yes
> > 
> >> Hadn't there been talk of using labels to be able to allow a guest to
> >> own the exact same physical range again after reboot or guest or
> >> host?
> > 
> > You mean labels in NVDIMM label storage area? As defined in Intel
> > NVDIMM Namespace Specification, labels are used to specify
> > namespaces. For a pmem interleave set (possible cross several dimms),
> > at most one pmem namespace (and hence at most one label) is
> > allowed. Therefore, labels can not be used to partition pmem.
> 
> Okay. But then how do particular ranges get associated with the
> owning guest(s)? Merely by SPA would seem rather fragile to me.
> 

By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
in a domain config file, SPA occupied by /mnt/dax/foo are mapped to
the domain.  If the same file is used every time the domain is created,
the same virtual device will be seen by that domain.

Thanks,
Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] live migration to qemu.git fails

2016-08-03 Thread Wei Liu
Cc Anthony and Stefano

On Tue, Aug 02, 2016 at 11:58:05AM +0200, Olaf Hering wrote:
> Doing a live migration of a HVM domU from staging-4.5 with its included
> qemu-xen to staging-4.7 with qemu-xen from qemu.org (either master or
> 2.6.0) fails:
> 
> # cat /var/log/xen/qemu-dm-fv-x64-sles12sp1-clean--incoming.log
> char device redirected to /dev/pts/4 (label serial0)
> xen_ram_alloc: do not alloc f80 bytes of ram at 0 when runstate is 
> INMIGRATE
> xen_ram_alloc: do not alloc 80 bytes of ram at f80 when runstate is 
> INMIGRATE
> xen_ram_alloc: do not alloc 1 bytes of ram at 1000 when runstate is 
> INMIGRATE
> xen_ram_alloc: do not alloc 4 bytes of ram at 1001 when runstate is 
> INMIGRATE
> VNC server running on 127.0.0.1:5900
> qemu-system-i386: Expected vmdescription section, but got 0
> 
> 
> 'xl vnc domU' on the receiver side just shows "guest has not initialized
> display yet". For some reason the domU is still alive, 'vmstat 1' via
> ssh keeps running.
> 
> 'xl shutdown domU' fails. It looks like the guest OS shuts down, the ssh
> connection gets terminated, but the domU becomes busy according to 'xl
> top', and never shuts down.
> 
> domU.cfg looks like this:
> name="fv-x64-sles12sp1-clean"
> memory=256
> serial="pty"
> builder="hvm"
> boot="cd"
> disk=[
> 'file:/disk0.raw,hda,w',
> 'file:/DVD1.iso,hdc:cdrom,r',
> ]
> vif=[
> 'bridge=br0'
> ]
> keymap="de"
> vfb = [
> 'type=vnc,vncunused=1,keymap=de'
> ]
> 
> usb=1
> usbdevice='tablet'
> device_model_override="/root/dm.sh"
> 
> 
> The override is to make sure to use qemu-xen as included in xen-4.5 on
> the sending side and qemu.git on the receiving side.
> 
> Olaf



> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/8] x86/time: correctly honor late clearing of TSC related feature flags

2016-08-03 Thread Jan Beulich
>>> On 02.08.16 at 21:08,  wrote:
> On 04/07/16 16:53, Jan Beulich wrote:
> On 04.07.16 at 17:39,  wrote:
>>> On 20/06/16 16:20, Jan Beulich wrote:
>>> On 20.06.16 at 16:32,  wrote:
> On 15/06/16 11:28, Jan Beulich wrote:
>> --- a/xen/arch/x86/time.c
>> +++ b/xen/arch/x86/time.c
>> @@ -1358,6 +1358,24 @@ static void time_calibration(void *unuse
>>   &r, 1);
>>  }
>>  
>> +void __init clear_tsc_cap(unsigned int feature)
>> +{
>> +void (*rendezvous_fn)(void *) = time_calibration_std_rendezvous;
> This should read time_calibration_rendezvous_fn rather than assuming
> time_calibration_std_rendezvous.
>
> Otherwise, there is a risk that it could be reset back to
> time_calibration_std_rendezvous.
 But that's the purpose: We may need to switch back.
>>> Under what circumstances could we ever move from re-syncing back to not
>>> re-syncing?
>> verify_tsc_reliability() may result in X86_FEATURE_TSC_RELIABLE
>> getting cleared. That's an initcall, which means it runs after
>> init_xen_time(), and hence after the rendezvous function got
>> established initially.
> 
> Right, but that isn't important.
> 
> There will never be a case where, once TSC_RELIABLE is cleared, it is
> safe to revert back to std_rendezvous, even if TSC_RELIABLE is
> subsequently re-set.

You've got this backwards: TSC_RELIABLE may get _cleared_ late.
Nothing can ever set it late, due to the use of setup_clear_cpu_cap().
Reverting back to time_calibration_std_rendezvous() would only be
possible if CONSTANT_TSC got cleared late, ...

> Therefore, this function must never accidentally revert
> time_calibration_rendezvous_fn from time_calibration_tsc_rendezvous back
> to time_calibration_std_rendezvous.

... yet I can't see what would be wrong with that especially when
that still happened at boot time. Or else how would it be safe to
switch the other direction? (If neither switch was safe, then all we
could do upon finding the TSC unreliable in verify_tsc_reliability()
would be to panic().)

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 11:37,  wrote:
> On 08/03/16 02:45, Jan Beulich wrote:
>> >>> On 03.08.16 at 08:54,  wrote:
>> > On 08/02/16 08:46, Jan Beulich wrote:
>> >> >>> On 18.07.16 at 02:29,  wrote:
>> >> >  (4) Because the reserved area is now used by Xen hypervisor, it
>> >> >  should not be accessible by Dom0 any more. Therefore, if a host
>> >> >  pmem device is recorded by Xen hypervisor, Xen will unmap its
>> >> >  reserved area from Dom0. Our design also needs to extend Linux
>> >> >  NVDIMM driver to "balloon out" the reserved area after it
>> >> >  successfully reports a pmem device to Xen hypervisor.
>> >> 
>> >> ... "balloon out" ... _after_? That'd be unsafe.
>> >>
>> > 
>> > Before ballooning is accomplished, the pmem driver does not create any
>> > device node under /dev/ and hence no one except the pmem drive can
>> > access the reserved area on pmem, so I think it's okey to balloon
>> > after reporting.
>> 
>> Right now Dom0 isn't allowed to access any memory in use by Xen
>> (and not explicitly shared), and I don't think we should deviate
>> from that model for pmem.
> 
> In this design, Xen hypervisor unmaps the reserved area from Dom0 so
> that Dom0 cannot access the reserved area afterwards. And "balloon" is
> in fact not a memory ballooning, because Linux kernel never allocates
> from pmem like normal ram. In my current implementation, it's just to
> remove the reserved area from a resource struct covering pmem.

Ah, in that case please either use a different term, or explain what
"balloon out" is meant to mean in this context.

>> >> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
>> >> > 
>> >> >  Before a pmem file is assigned to a domain, we need to know the host
>> >> >  SPA ranges that are allocated to this file. We do this work in xl.
>> >> > 
>> >> >  If a pmem device /dev/pmem0 is given, xl will read
>> >> >  /sys/block/pmem0/device/{resource,size} respectively for the start
>> >> >  SPA and size of the pmem device.
>> >> > 
>> >> >  If a pre-allocated file /mnt/dax/file is given,
>> >> >  (1) xl first finds the host pmem device where /mnt/dax/file is. Then
>> >> >  it uses the method above to get the start SPA of the host pmem
>> >> >  device.
>> >> >  (2) xl then uses fiemap ioctl to get the extend mappings of
>> >> >  /mnt/dax/file, and adds the corresponding physical offsets and
>> >> >  lengths in each mapping entries to above start SPA to get the SPA
>> >> >  ranges pre-allocated for this file.
>> >> 
>> >> Remind me again: These extents never change, not even across
>> >> reboot? I think this would be good to be written down here explicitly.
>> > 
>> > Yes
>> > 
>> >> Hadn't there been talk of using labels to be able to allow a guest to
>> >> own the exact same physical range again after reboot or guest or
>> >> host?
>> > 
>> > You mean labels in NVDIMM label storage area? As defined in Intel
>> > NVDIMM Namespace Specification, labels are used to specify
>> > namespaces. For a pmem interleave set (possible cross several dimms),
>> > at most one pmem namespace (and hence at most one label) is
>> > allowed. Therefore, labels can not be used to partition pmem.
>> 
>> Okay. But then how do particular ranges get associated with the
>> owning guest(s)? Merely by SPA would seem rather fragile to me.
>> 
> 
> By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
> in a domain config file, SPA occupied by /mnt/dax/foo are mapped to
> the domain.  If the same file is used every time the domain is created,
> the same virtual device will be seen by that domain.

So what if the file got deleted and re-created in between? Since
I don't think you can specify the SPAs to use when creating such
a file, such an operation would be quite different from removing
and re-adding e.g. a specific PCI device (to be used by a guest)
on a host (while the guest is not running).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: use llabs() instead abs() for int64_t argument

2016-08-03 Thread Wei Liu
On Wed, Aug 03, 2016 at 11:27:04AM +0200, Juergen Gross wrote:
> On 03/08/16 11:13, Wei Liu wrote:
> > On Tue, Aug 02, 2016 at 06:28:37PM +0100, Andrew Cooper wrote:
> >> On 02/08/16 18:25, Juergen Gross wrote:
> >>> Commit 57f8b13c724023c78fa15a80452d1de3e51a1418 ("libxl: memory size
> >>> in kb requires 64 bit variable") introduced a bug: abs() shouldn't
> >>> be called with an int64_t argument. llabs() is to be used here.
> >>
> >> Possibly worth identifying that this was caught by a clang build, citing:
> >>
> >> libxl.c:4198:33: error: absolute value function 'abs' given an argument
> >> of type
> >> 'int64_t' (aka 'long') but has parameter of type 'int' which may cause
> >> truncation of value [-Werror,-Wabsolute-value]
> >> if (target_memkb < 0 && abs(target_memkb) > current_target_memkb)
> >>  ^
> > 
> > Juergen, I can fix up the commit message for you, if you don't object to
> > Andrew's suggestion.
> 
> No, I don't object.
> 

Thanks. I will make the adjustment while committing.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/mmcfg: Fix initalisation of variables in pci_mmcfg_nvidia_mcp55()

2016-08-03 Thread Andrew Cooper
Shifting into the sign bit of an integer is undefined behaviour.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

I was experimenting with -fsanitise=undefined and this bit failed to compile,
as the initialisation became a non-constant expression.
---
 xen/arch/x86/x86_64/mmconfig-shared.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
b/xen/arch/x86/x86_64/mmconfig-shared.c
index 742bc18..a7592c6 100644
--- a/xen/arch/x86/x86_64/mmconfig-shared.c
+++ b/xen/arch/x86/x86_64/mmconfig-shared.c
@@ -182,10 +182,10 @@ static const char __init *pci_mmcfg_nvidia_mcp55(void)
 int bus, i;
 
 static const u32 extcfg_regnum  = 0x90;
-static const u32 extcfg_enable_mask = 1<<31;
-static const u32 extcfg_start_mask  = 0xff<<16;
+static const u32 extcfg_enable_mask = 1u << 31;
+static const u32 extcfg_start_mask  = 0xffu << 16;
 static const int extcfg_start_shift = 16;
-static const u32 extcfg_size_mask   = 0x3<<28;
+static const u32 extcfg_size_mask   = 3u << 28;
 static const int extcfg_size_shift  = 28;
 static const int extcfg_sizebus[]   = {0xff, 0x7f, 0x3f, 0x1f};
 static const u32 extcfg_base_mask[] = {0x7ff8, 0x7ffc, 0x7ffe, 0x7fff};
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 99915: all pass - PUSHED

2016-08-03 Thread osstest service owner
flight 99915 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99915/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf deaacda3b2740477733564066eb69d5c94b41bba
baseline version:
 ovmf 4884e81b5d0da1bcc05361dd4c5fc06b3722f144

Last test of basis99914  2016-08-03 04:47:07 Z0 days
Testing same since99915  2016-08-03 07:44:35 Z0 days1 attempts


People who touched revisions under test:
  Liming Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=deaacda3b2740477733564066eb69d5c94b41bba
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
deaacda3b2740477733564066eb69d5c94b41bba
+ branch=ovmf
+ revision=deaacda3b2740477733564066eb69d5c94b41bba
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.7-testing
+ '[' xdeaacda3b2740477733564066eb69d5c94b41bba = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : 
'git://cache:9419

Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen

2016-08-03 Thread Haozhong Zhang
On 08/03/16 03:47, Jan Beulich wrote:
> >>> On 03.08.16 at 11:37,  wrote:
> > On 08/03/16 02:45, Jan Beulich wrote:
> >> >>> On 03.08.16 at 08:54,  wrote:
> >> > On 08/02/16 08:46, Jan Beulich wrote:
> >> >> >>> On 18.07.16 at 02:29,  wrote:
> >> >> >  (4) Because the reserved area is now used by Xen hypervisor, it
> >> >> >  should not be accessible by Dom0 any more. Therefore, if a host
> >> >> >  pmem device is recorded by Xen hypervisor, Xen will unmap its
> >> >> >  reserved area from Dom0. Our design also needs to extend Linux
> >> >> >  NVDIMM driver to "balloon out" the reserved area after it
> >> >> >  successfully reports a pmem device to Xen hypervisor.
> >> >> 
> >> >> ... "balloon out" ... _after_? That'd be unsafe.
> >> >>
> >> > 
> >> > Before ballooning is accomplished, the pmem driver does not create any
> >> > device node under /dev/ and hence no one except the pmem drive can
> >> > access the reserved area on pmem, so I think it's okey to balloon
> >> > after reporting.
> >> 
> >> Right now Dom0 isn't allowed to access any memory in use by Xen
> >> (and not explicitly shared), and I don't think we should deviate
> >> from that model for pmem.
> > 
> > In this design, Xen hypervisor unmaps the reserved area from Dom0 so
> > that Dom0 cannot access the reserved area afterwards. And "balloon" is
> > in fact not a memory ballooning, because Linux kernel never allocates
> > from pmem like normal ram. In my current implementation, it's just to
> > remove the reserved area from a resource struct covering pmem.
> 
> Ah, in that case please either use a different term, or explain what
> "balloon out" is meant to mean in this context.
> 
> >> >> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
> >> >> > 
> >> >> >  Before a pmem file is assigned to a domain, we need to know the host
> >> >> >  SPA ranges that are allocated to this file. We do this work in xl.
> >> >> > 
> >> >> >  If a pmem device /dev/pmem0 is given, xl will read
> >> >> >  /sys/block/pmem0/device/{resource,size} respectively for the start
> >> >> >  SPA and size of the pmem device.
> >> >> > 
> >> >> >  If a pre-allocated file /mnt/dax/file is given,
> >> >> >  (1) xl first finds the host pmem device where /mnt/dax/file is. Then
> >> >> >  it uses the method above to get the start SPA of the host pmem
> >> >> >  device.
> >> >> >  (2) xl then uses fiemap ioctl to get the extend mappings of
> >> >> >  /mnt/dax/file, and adds the corresponding physical offsets and
> >> >> >  lengths in each mapping entries to above start SPA to get the SPA
> >> >> >  ranges pre-allocated for this file.
> >> >> 
> >> >> Remind me again: These extents never change, not even across
> >> >> reboot? I think this would be good to be written down here explicitly.
> >> > 
> >> > Yes
> >> > 
> >> >> Hadn't there been talk of using labels to be able to allow a guest to
> >> >> own the exact same physical range again after reboot or guest or
> >> >> host?
> >> > 
> >> > You mean labels in NVDIMM label storage area? As defined in Intel
> >> > NVDIMM Namespace Specification, labels are used to specify
> >> > namespaces. For a pmem interleave set (possible cross several dimms),
> >> > at most one pmem namespace (and hence at most one label) is
> >> > allowed. Therefore, labels can not be used to partition pmem.
> >> 
> >> Okay. But then how do particular ranges get associated with the
> >> owning guest(s)? Merely by SPA would seem rather fragile to me.
> >> 
> > 
> > By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
> > in a domain config file, SPA occupied by /mnt/dax/foo are mapped to
> > the domain.  If the same file is used every time the domain is created,
> > the same virtual device will be seen by that domain.
> 
> So what if the file got deleted and re-created in between? Since
> I don't think you can specify the SPAs to use when creating such
> a file, such an operation would be quite different from removing
> and re-adding e.g. a specific PCI device (to be used by a guest)
> on a host (while the guest is not running).
> 

If modified in between, guest will see a virtual pmem device of
different data. But the usage of pmem is similar to disk: if a file of
the same content is given every time, the guest can get a virtual
pmem/disk of the same data as last reboot/shutdown; keeping the data
unchanged between multiple boots is out of the scope of Xen.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH -v3 1/1] ratelimit: Implement rate limit for credit2 scheduler Rate limit assures that a vcpu will execute for a minimum amount of time before being put at the back of a queue o

2016-08-03 Thread George Dunlap
On 26/07/16 17:21, Dario Faggioli wrote:
> On Mon, 2016-07-25 at 17:12 +0100, Anshul Makkar wrote:
>> It introduces context-switch rate-limiting. The patch enables the VM
>> to batch
>> its work and prevents the system from spending most of its time in
>> context
>> switches because of a VM that is waking/sleeping at high rate.
>>
>> ratelimit can be disabled by setting it to 0.
>>
> Thanks Anshul. I've looked at the patch, and it seemed all right to me.
> 
> I decided to build and test it, and I've seem something that I found
> weird.
> 
> But first of all, one thing that I'm sorry I'm mentioning now
> (especially considering it was quite evident! :-/).
> 
> The subject, which will become the "title" of the commit, is way too
> long. That must be a very concise headline of what the patch is about,
> and should also have tags, specifying what areas of the codebase are
> affected. So what do you think of this:
> 
>   xen: credit2: implement context switch rate-limiting.

It looks like it's just missing a carrage return -- I could fix that up
on check-in.

> 
> On a more technical side, I think that...
> 
>> @@ -2116,9 +2147,22 @@ csched2_runtime(const struct scheduler *ops,
>> int cpu, struct csched2_vcpu *snext
>>   * 1) Run until snext's credit will be 0
>>   * 2) But if someone is waiting, run until snext's credit is
>> equal
>>   * to his
>> - * 3) But never run longer than MAX_TIMER or shorter than
>> MIN_TIMER.
>> + * 3) But never run longer than MAX_TIMER or shorter than
>> MIN_TIMER or
>> + * the ratelimit time.
>>   */
>>  
>> +/* Calculate mintime */
>> +min_time = CSCHED2_MIN_TIMER;
>> +if ( prv->ratelimit_us )
>> +{
>> +s_time_t ratelimit_min = prv->ratelimit_us;
>>
> ... this should be:
> 
>  s_time_t ratelimit_min = MICROSECS(prv->ratelimit_us);
> 
> I realized that by looking at traces and seeing entries for which
> CSCHED2_MIN_TIMER was being returned, even if I had set
> sched_ratelimit_us to a value greater than that.

Yes, Dario is correct here.

There's also a small typo in one of the comments ("onw" instead of "own").

With all those changed:

Reviewed-by: George Dunlap 

If Anshul and Dario are happy, I can fix all those thing up on commit.

-George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 12:08,  wrote:
> On 08/03/16 03:47, Jan Beulich wrote:
>> >>> On 03.08.16 at 11:37,  wrote:
>> > By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
>> > in a domain config file, SPA occupied by /mnt/dax/foo are mapped to
>> > the domain.  If the same file is used every time the domain is created,
>> > the same virtual device will be seen by that domain.
>> 
>> So what if the file got deleted and re-created in between? Since
>> I don't think you can specify the SPAs to use when creating such
>> a file, such an operation would be quite different from removing
>> and re-adding e.g. a specific PCI device (to be used by a guest)
>> on a host (while the guest is not running).
> 
> If modified in between, guest will see a virtual pmem device of
> different data. But the usage of pmem is similar to disk: if a file of
> the same content is given every time, the guest can get a virtual
> pmem/disk of the same data as last reboot/shutdown; keeping the data
> unchanged between multiple boots is out of the scope of Xen.

Except that here we're talking of handing a piece of hardware to a
guest, which to me is more like a PCI device than a (virtual) disk. But
anyway ...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mmcfg: Fix initalisation of variables in pci_mmcfg_nvidia_mcp55()

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 11:51,  wrote:
> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
> @@ -182,10 +182,10 @@ static const char __init *pci_mmcfg_nvidia_mcp55(void)
>  int bus, i;
>  
>  static const u32 extcfg_regnum  = 0x90;
> -static const u32 extcfg_enable_mask = 1<<31;
> -static const u32 extcfg_start_mask  = 0xff<<16;
> +static const u32 extcfg_enable_mask = 1u << 31;

Aiui only this one change is really related to the patch subject; the
other two I suppose you just alter for consistency? Anyway,
Acked-by: Jan Beulich 

> +static const u32 extcfg_start_mask  = 0xffu << 16;
>  static const int extcfg_start_shift = 16;
> -static const u32 extcfg_size_mask   = 0x3<<28;
> +static const u32 extcfg_size_mask   = 3u << 28;
>  static const int extcfg_size_shift  = 28;
>  static const int extcfg_sizebus[]   = {0xff, 0x7f, 0x3f, 0x1f};
>  static const u32 extcfg_base_mask[] = {0x7ff8, 0x7ffc, 0x7ffe, 0x7fff};
> -- 
> 2.1.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

2016-08-03 Thread Ian Jackson
Wei Liu writes ("Re: Device model operation hypercall (DMOP, re qemu depriv)"):
> On Mon, Aug 01, 2016 at 06:41:20AM -0600, Jan Beulich wrote:
> > > A DMOP is defined to never put at risk the stability or security of
> > > the whole system, nor of the domain which calls DMOP.  However, a DMOP
> > > may have arbitrary effects on the target domid.
> > 
> > With the exception of this and the privcmd layer described below,
> > DMOP == HVMCTL afaics. The privcmd layer is independent anyway.
> > And the security aspect mentioned above won't disappear if we
> > use DMOP instead of HVMCTL. So I don't see why the hvmctl
> > series as is can't be the starting point of this, with the stability/
> > security concerns addressed subsequently, for being orthogonal.

I don't (currently) have a clear understanding of how my proposed DMOP
relates to HVMCTL.

I thought it useful to set out the DMOP proposal from first
principles, with clear motivation, discussion of not-chosen
alternatives, and of course with a clear statement of the principles
of operation and of the security design.

The security property I have quoted above is absolutely critical to
the DMOP proposal.  I'm a bit concerned by comments like the above
`with the exception of this' (which seems to refer to the security
property).

Earlier during one of the HVMCTL threads I asked

This is a slight digression, but is it intended that all of these
hvmctl's are safe to expose to a deprivileged device model process in
dom0, or to a device model stub domain ?

Jan replied:

Yes, afaict (they've been exposed the same way before).

Does that mean that functionality exposed by all the prooposed HVMCTLs
is currently available to stubdoms ?

> Yeah, to turn HVMCTL to DMOP:
> 
> 1. s/HVMCTL/DMOP/
> 2. maybe s/interface_version//

Well, that would certainly be nice.  But there are some caveats I
would like sorting out.

> >  So I don't see why the hvmctl series as is can't be the starting
> > point of this, with the stability/ security concerns addressed
> > subsequently, for being orthogonal.

Please don't misunderstand me as trying to compete with or block
your HVMCTL work.  It may well be that HVMCTL is what I want, but:

If we adopt the design principles I describe in my DMOP proposal, I
don't think the security concerns are separable.

ISTM that a patch series introducing DMOP should start with a patch
which introduces the DMOP hypercall, with no sub-operations.

Such a patch would have code content very like that in
  [PATCH 01/11] public / x86: introduce hvmctl hypercall

But, such a patch should also explain the semantics.  The Xen public
headers ought to contain explanations of the promises that the
hypervisor makes about DMOP.  Importantly:
 - the promise that a DMOP cannot harm anyone except the target domid
 - the ABI stability of the target domid field
 - what the ABI stability policy is wrt the actual DMOPs themselves

If the 01/ patch contains such promises, then logically the 02/ patch
which introduces the first DMOP is extending that promise to that
operation.  It is at that point that the security decision should be
made.

Now, there may be other ways to represent/record the security status.
But it will be necessary to either (i) avoid violating the DMOP
security promise, by making questionable calls not available via DMOP
or (ii) trying to retrofit the security promise to DMOP later.

I think (ii) is not a good approach.  It would amount to introducing a
whole new set of interfaces, and then later trying to redefine them to
have a particular security property which was not originally there.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-coverity test] 99918: regressions - ALL FAIL

2016-08-03 Thread osstest service owner
flight 99918 xen-unstable-coverity real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99918/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 coverity-amd645 coverity-buildfail REGR. vs. 96924

version targeted for testing:
 xen  e9522e4932aaa7f083688b6612b5897839409260
baseline version:
 xen  7da483b0236d8974cc97f81780dcf8e559a63175

Last test of basis96924  2016-07-10 09:19:23 Z   24 days
Failing since 97501  2016-07-17 09:26:52 Z   17 days4 attempts
Testing same since99918  2016-08-03 09:19:54 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anshul Makkar 
  Boris Ostrovsky 
  Chao Gao 
  Chris Patterson 
  Corneliu ZUZU 
  Daniel De Graaf 
  Dario Faggioli 
  David Scott 
  Doug Goldstein 
  Elena Ufimtseva 
  George Dunlap 
  Ian Jackson 
  Jan Beulich 
  Jim Fehlig 
  Jonathan Daugherty 
  Juergen Gross 
  Julien Grall 
  Kevin Tian 
  Konrad Rzeszutek Wilk 
  Marek Marczykowski-Górecki 
  Paul Durrant 
  Quan Xu 
  Quan Xu 
  Razvan Cojocaru 
  Roger Pau Monne 
  Roger Pau Monné 
  Sander Eikelenboom 
  Shanker Donthineni 
  Stefano Stabellini 
  Tamas K Lengyel 
  Tim Deegan 
  Vitaly Kuznetsov 
  Wei Liu 

jobs:
 coverity-amd64   fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 2846 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable baseline-only test] 66891: trouble: blocked/broken/fail/pass

2016-08-03 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 66891 xen-unstable real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66891/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-xsm  3 host-install(3) broken REGR. vs. 66886
 test-amd64-amd64-xl-multivcpu  3 host-install(3)broken REGR. vs. 66886
 test-amd64-amd64-libvirt-pair 4 host-install/dst_host(4) broken REGR. vs. 66886
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 3 host-install(3) broken REGR. 
vs. 66886
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken REGR. 
vs. 66886
 test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken REGR. vs. 
66886

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-qemut-rhel6hvm-amd  3 host-install(3)  broken blocked in 66886
 test-amd64-i386-freebsd10-amd64  3 host-install(3) broken blocked in 66886
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 3 host-install(3) broken blocked in 
66886
 test-amd64-i386-migrupgrade 3 host-install/src_host(3) broken blocked in 66886
 test-amd64-i386-pair3 host-install/src_host(3) broken blocked in 66886
 test-amd64-i386-xl-qemuu-ovmf-amd64  3 host-install(3) broken blocked in 66886
 test-amd64-i386-xl-qemut-winxpsp3  3 host-install(3)   broken blocked in 66886
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  3 host-install(3) broken like 66886
 test-amd64-amd64-xl-credit2   3 host-install(3)  broken like 66886
 test-amd64-amd64-qemuu-nested-amd  3 host-install(3) broken like 66886
 test-amd64-amd64-libvirt  3 host-install(3)  broken like 66886
 test-amd64-amd64-pygrub   3 host-install(3)  broken like 66886
 test-amd64-amd64-migrupgrade  4 host-install/dst_host(4) broken like 66886
 test-amd64-amd64-pair 4 host-install/dst_host(4) broken like 66886
 test-amd64-amd64-libvirt-vhd  3 host-install(3)  broken like 66886
 test-amd64-amd64-xl-rtds  3 host-install(3)  broken like 66886
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 3 host-install(3) broken 
like 66886
 test-amd64-amd64-xl-qemuu-ovmf-amd64  3 host-install(3)  broken like 66886
 build-i386-rumpuserxen6 xen-buildfail blocked in 66886
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stopfail blocked in 66886
 build-amd64-rumpuserxen   6 xen-buildfail   like 66886
 test-amd64-amd64-i386-pvgrub 10 guest-start  fail   like 66886
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 66886

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-am

Re: [Xen-devel] [PATCH 1/9] x86/hypercall: Move some of the hvm hypercall infrastructure into hypercall.h

2016-08-03 Thread Julien Grall

Hi Jan,

On 03/08/16 09:53, Jan Beulich wrote:

On 02.08.16 at 20:43,  wrote:

On Tue, 2 Aug 2016, Jan Beulich wrote:

On 02.08.16 at 16:59,  wrote:

On 02/08/16 15:54, Jan Beulich wrote:

On 02.08.16 at 16:26,  wrote:

On 02/08/16 15:17, Jan Beulich wrote:

Well, I find it quite odd for hypercall argument counts to differ
between arches. I.e. I'd conclude the ABI was mis-specified.

Is it documented somewhere for the x86 code? Looking at Linux, the
privcmd call is only passing 5 arguments on both ARM and x86.

arch-x86/xen-x86_32.h has

 * Hypercall interface:
 *  Input:  %ebx, %ecx, %edx, %esi, %edi, %ebp (arguments 1-6)
 *  Output: %eax

while arch-x86/xen-x86_64.h has

 * Hypercall interface:
 *  Input:  %rdi, %rsi, %rdx, %r10, %r8, %r9 (arguments 1-6)
 *  Output: %rax


The only actual 6 argument hypercall is the v4v hypercall, better known
as __HYPERVISOR_xc_reserved_op at index 39, but that isn't implemented
anywhere upstream.


But it serves as an example what now wouldn't work on ARM.


At the time the arm hypercall ABI was published, it matched the x86
hypercall ABI, which had only 5 hypercall arguments.

The issue is that the x86 hypercall ABI changed, and now is out of sync
with ARM. The faulty commit being:


That's one way of viewing it, but I don't think an appropriate one.
6-argument hypercalls had always been possible on x86, just that
they might not have been documented in the public headers (but
instead only in the actual hypercall implementation).


I would tend to say that anything not documented in the public header is 
not part of the ABI regardless how it has been implemented before hand.


Anyway, I looked at the hypercall implementation on ARM and it seems 
that we half support the 6th argument. For instance 
hypercall_create_continuation is clobbering r5/x5 which is not part of 
the ABI.


However do_trap_hypercall is only supporting up to 5 argument.

I don't think it would be an issue to support 6 arguments on ARM. 
Stefano, what do you think?


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mmcfg: Fix initalisation of variables in pci_mmcfg_nvidia_mcp55()

2016-08-03 Thread Andrew Cooper
On 03/08/16 11:24, Jan Beulich wrote:
 On 03.08.16 at 11:51,  wrote:
>> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
>> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
>> @@ -182,10 +182,10 @@ static const char __init *pci_mmcfg_nvidia_mcp55(void)
>>  int bus, i;
>>  
>>  static const u32 extcfg_regnum  = 0x90;
>> -static const u32 extcfg_enable_mask = 1<<31;
>> -static const u32 extcfg_start_mask  = 0xff<<16;
>> +static const u32 extcfg_enable_mask = 1u << 31;
> Aiui only this one change is really related to the patch subject; the
> other two I suppose you just alter for consistency? Anyway,
> Acked-by: Jan Beulich 

Indeed.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 01/11] public / x86: introduce hvmctl hypercall

2016-08-03 Thread George Dunlap
On 24/06/16 11:28, Jan Beulich wrote:
> ... as a means to replace all HVMOP_* which a domain can't issue on
> itself (i.e. intended for use by only the control domain or device
> model).
> 
> Signed-off-by: Jan Beulich 
> Reviewed-by: Wei Liu 

Jan,

Could you post this branch to a git repo somewhere?

I normally review patches "in situ" by importing them to a tree; and
saving each of your individual patches and applying them in order is a
bit much.

 -George

> ---
> v2: Widen cmd field to 32 bits and opaque one to 64. Drop
> HVMCTL_iter_*.
> 
> --- a/xen/arch/x86/hvm/Makefile
> +++ b/xen/arch/x86/hvm/Makefile
> @@ -2,6 +2,7 @@ subdir-y += svm
>  subdir-y += vmx
>  
>  obj-y += asid.o
> +obj-y += control.o
>  obj-y += emulate.o
>  obj-y += event.o
>  obj-y += hpet.o
> --- /dev/null
> +++ b/xen/arch/x86/hvm/control.c
> @@ -0,0 +1,82 @@
> +/*
> + * control.c: Hardware virtual machine control operations.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program; If not, see .
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +long do_hvmctl(XEN_GUEST_HANDLE_PARAM(xen_hvmctl_t) u_hvmctl)
> +{
> +xen_hvmctl_t op;
> +struct domain *d;
> +int rc;
> +
> +BUILD_BUG_ON(sizeof(op.u) > sizeof(op.u.pad));
> +
> +if ( copy_from_guest(&op, u_hvmctl, 1) )
> +return -EFAULT;
> +
> +if ( op.interface_version != XEN_HVMCTL_INTERFACE_VERSION )
> +return -EACCES;
> +
> +rc = rcu_lock_remote_domain_by_id(op.domain, &d);
> +if ( rc )
> +return rc;
> +
> +if ( !has_hvm_container_domain(d) )
> +{
> +rcu_unlock_domain(d);
> +return -EINVAL;
> +}
> +
> +rc = xsm_hvm_control(XSM_DM_PRIV, d, op.cmd);
> +if ( rc )
> +{
> +rcu_unlock_domain(d);
> +return rc;
> +}
> +
> +switch ( op.cmd )
> +{
> +default:
> +rc = -EOPNOTSUPP;
> +break;
> +}
> +
> +rcu_unlock_domain(d);
> +
> +if ( rc == -ERESTART )
> +{
> +if ( unlikely(copy_field_to_guest(u_hvmctl, &op, opaque)) )
> +rc = -EFAULT;
> +else
> +rc = hypercall_create_continuation(__HYPERVISOR_hvmctl, "h",
> +   u_hvmctl);
> +}
> +
> +return rc;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4111,6 +4111,7 @@ static const struct {
>  COMPAT_CALL(platform_op),
>  COMPAT_CALL(mmuext_op),
>  HYPERCALL(xenpmu_op),
> +HYPERCALL(hvmctl),
>  HYPERCALL(arch_1)
>  };
>  
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -469,6 +469,7 @@ ENTRY(compat_hypercall_table)
>  .quad do_tmem_op
>  .quad do_ni_hypercall   /* reserved for XenClient */
>  .quad do_xenpmu_op  /* 40 */
> +.quad do_hvmctl
>  .rept __HYPERVISOR_arch_0-((.-compat_hypercall_table)/8)
>  .quad compat_ni_hypercall
>  .endr
> @@ -520,6 +521,7 @@ ENTRY(compat_hypercall_args_table)
>  .byte 1 /* do_tmem_op   */
>  .byte 0 /* reserved for XenClient   */
>  .byte 2 /* do_xenpmu_op */  /* 40 */
> +.byte 1 /* do_hvmctl*/
>  .rept __HYPERVISOR_arch_0-(.-compat_hypercall_args_table)
>  .byte 0 /* compat_ni_hypercall  */
>  .endr
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -792,6 +792,7 @@ ENTRY(hypercall_table)
>  .quad do_tmem_op
>  .quad do_ni_hypercall   /* reserved for XenClient */
>  .quad do_xenpmu_op  /* 40 */
> +.quad do_hvmctl
>  .rept __HYPERVISOR_arch_0-((.-hypercall_table)/8)
>  .quad do_ni_hypercall
>  .endr
> @@ -843,6 +844,7 @@ ENTRY(hypercall_args_table)
>  .byte 1 /* do_tmem_op   */
>  .byte 0 /* reserved for XenClient */
>  .byte 2 /* do_xenpmu_op */  /* 40 */
> +.byte 1 /* do_hvmctl*/
>  .rept __HYPERVISOR_arch_0-(.-hypercall_args_table)
>  .byte 0 /* do_ni_hypercall  */
>  .endr
> --- a/xen/include/Makefile
> +++ b/xen/include/Makefile
> @@ -93,7 +93,7 @@ all: headers.chk headers++.chk
>  
>  PUBLIC_HEA

Re: [Xen-devel] [PATCH v3] xen: credit2: issues in csched2_cpu_pick(), when tracing is enabled.

2016-08-03 Thread George Dunlap
On 27/07/16 04:09, Dario Faggioli wrote:
> In fact, when not finding a suitable runqueue where to
> place a vCPU, and hence using a fallback, we either:
>  - don't issue any trace record (while we should, at
>least, output the chosen pcpu),
>  - risk underruning when accessing the runqueues
>array, while preparing the trace record.
> 
> Fix both issues and, while there, also a couple of style
> problems found nearby.
> 
> Spotted by Coverity.
> 
> Signed-off-by: Dario Faggioli 
> Reported-by: Andrew Cooper 

Thanks,

Reviewed-by: George Dunlap 



> ---
> Cc: George Dunlap 
> Cc: Anshul Makkar 
> ---
> Changes from v2:
>  * use local variables to prevent unlocked access to prv, as
>suggested during review.
> Changes from v1:
>  * cite Coverity in the changelog.
> ---
>  xen/common/sched_credit2.c |   23 ++-
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 7ec7f62..00d3300 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1464,7 +1464,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  struct csched2_private *prv = CSCHED2_PRIV(ops);
>  int i, min_rqi = -1, new_cpu;
>  struct csched2_vcpu *svc = CSCHED2_VCPU(vc);
> -s_time_t min_avgload;
> +s_time_t min_avgload = MAX_LOAD;
>  
>  ASSERT(!cpumask_empty(&prv->active_queues));
>  
> @@ -1488,7 +1488,12 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  {
>  /* We may be here because someone requested us to migrate. */
>  __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
> -return get_fallback_cpu(svc);
> +new_cpu = get_fallback_cpu(svc);
> +/*
> + * Tracing of runq and its load won't be accurate, since we could
> + * not get the lock, but at least we will output the chosen pcpu.
> + */
> +goto out;
>  }
>  
>  /*
> @@ -1513,8 +1518,6 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  /* Fall-through to normal cpu pick */
>  }
>  
> -min_avgload = MAX_LOAD;
> -
>  /* Find the runqueue with the lowest average load. */
>  for_each_cpu(i, &prv->active_queues)
>  {
> @@ -1552,7 +1555,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  if ( rqd_avgload < min_avgload )
>  {
>  min_avgload = rqd_avgload;
> -min_rqi=i;
> +min_rqi = i;
>  }
>  }
>  
> @@ -1560,6 +1563,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  if ( min_rqi == -1 )
>  {
>  new_cpu = get_fallback_cpu(svc);
> +min_rqi = c2r(ops, new_cpu);
> +min_avgload = prv->rqd[min_rqi].b_avgload;
>  goto out_up;
>  }
>  
> @@ -1570,18 +1575,18 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  
>   out_up:
>  read_unlock(&prv->lock);
> -
> + out:
>  if ( unlikely(tb_init_done) )
>  {
>  struct {
>  uint64_t b_avgload;
>  unsigned vcpu:16, dom:16;
>  unsigned rq_id:16, new_cpu:16;
> -   } d;
> -d.b_avgload = prv->rqd[min_rqi].b_avgload;
> +} d;
>  d.dom = vc->domain->domain_id;
>  d.vcpu = vc->vcpu_id;
> -d.rq_id = c2r(ops, new_cpu);
> +d.rq_id = min_rqi;
> +d.b_avgload = min_avgload;
>  d.new_cpu = new_cpu;
>  __trace_var(TRC_CSCHED2_PICKED_CPU, 1,
>  sizeof(d),
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf baseline-only test] 66896: tolerable trouble: blocked/broken/pass

2016-08-03 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 66896 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66896/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64   3 host-install(3)  broken like 66895

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf deaacda3b2740477733564066eb69d5c94b41bba
baseline version:
 ovmf 4884e81b5d0da1bcc05361dd4c5fc06b3722f144

Last test of basis66895  2016-08-03 07:16:34 Z0 days
Testing same since66896  2016-08-03 10:18:33 Z0 days1 attempts


People who touched revisions under test:
  Liming Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  broken  
 build-i386   pass
 build-amd64-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

broken-step build-amd64 host-install(3)

Push not applicable.


commit deaacda3b2740477733564066eb69d5c94b41bba
Author: Liming Gao 
Date:   Tue Aug 2 13:37:55 2016 +0800

MdeModulePkg LoadFileOnFv2: Fix the potential NULL pointer access

Check NULL pointer before access it.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao 
Reviewed-by: Feng Tian 

commit b40ad7b54d1dbdded77f779fa4d1dbe860783c8e
Author: Liming Gao 
Date:   Tue Aug 2 10:33:53 2016 +0800

MdeModulePkg LoadFileOnFv2: Correct copy right format

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao 
Reviewed-by: Feng Tian 

commit e9e44f654711c894bedd24ae1db7a6a2b106dc19
Author: Liming Gao 
Date:   Tue Aug 2 10:26:50 2016 +0800

MdeModulePkg UefiBootManagerLib: Fix VS2012 build failure

Initialize local variable Description as NULL first.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao 
Reviewed-by: Feng Tian 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Xen 4.8 Development Update

2016-08-03 Thread Wei Liu
This email only tracks big items for xen.git tree. Please reply for items you
woulk like to see in 4.8 so that people have an idea what is going on and
prioritise accordingly.

You're welcome to provide description and use cases of the feature you're
working on.

= Timeline =

We now adopt a fixed cut-off date scheme. We will release twice a
year. The upcoming 4.8 timeline are as followed:

* Last posting date: September 16, 2016
* Hard code freeze: September 30, 2016
* RC1: TBD
* Release: December 2, 2016

Note that we don't have freeze exception scheme anymore. All patches
that wish to go into 4.8 must be posted no later than the last posting
date. All patches posted after that date will be automatically queued
into next release.

RCs will be arranged immediately after freeze.

= Projects =

== Hypervisor == 

*  Make credit2 default scheduler for Xen
  -  Dario Faggioli

*  Boot Xen on EFI platforms using GRUB2 (multiboot2 protocol)
  -  Daniel Kiper

=== x86 === 

*  Allow ioreq server interface to support XenGT
  -  Yu Zhang
  -  Paul Durrant

*  PVHv2 support
  -  Roger Pau Monne

*  vNVDIMM support
  -  Haozhong Zhang

=== ARM === 

*  ITS emulation (Dom0 only)
  -  Andre Przywara

*  Xen ARM DomU ACPI support
  -  Shannon Zhao

*  Alternative patching support
  -  Julien Grall

*  Altp2m for ARM
  -  Sergej Proskurin

== Toolstack == 

*  Make ACPI builder available to components other than hvmloader
  -  Boris Ostrovsky

*  Libxl PVSCSI support
  -  Olaf Hering

*  Load BIOS via toolstack
  -  Anthony Perard

*  Libxl depriv QEMU
  -  Ian Jackson

*  Logging solution for Xen system
  -  Wei Liu

== Mini-OS == 

*  Mini-os ballooning support
  -  Juergen Gross

== Testing == 

*  Have OSSTest running XTF tests
  -  Wei Liu
  -  Andrew Cooper
  -  Ian Jackson

== Completed == 

*  Clean up all hard-coded paths in toolstack code
  -  Wei Liu

*  Refactor libxl device handling framework
  -  Juergen Gross

*  IOMMU flush issue
  -  Quan Xu

*  Refactor XSM policy
  -  Daniel De Graaf


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] arm/mem_access: properly handle traps caused by no-longer current settings

2016-08-03 Thread Julien Grall

Hi Tamas,

On 02/08/16 23:47, Tamas K Lengyel wrote:

On Tue, Aug 2, 2016 at 4:05 PM, Julien Grall  wrote:

On 02/08/2016 22:34, Tamas K Lengyel wrote:

On Tue, Aug 2, 2016 at 2:02 PM, Julien Grall  wrote:

On 02/08/2016 19:53, Tamas K Lengyel wrote:

Well, the data abort can only be a permission fault if memaccess is inuse so
far. Unless there is another race condition in the memaccess code and in
this case this is not the fault of the guest. So sending a data abort to the
guest will not really help to know what's going on.


From my perspective it doesn't matter whether the fault is injected
into the guest or not when mem_access is not in use. Since that's the
default behavior right now, my opinion is that it should get
reinjected but that's beyond the scope of mem_access itself so it's up
to you to decide. If you really prefer to have the mem_access check
just be a void function and not inject a fault to the guest, I'm
certainly fine with that.


I would prefer to see the function p2m_mem_access_check unchanged. The 
current behavior of the function is to return either "the fault is not 
for me" or "I handled the fault".


With this patch, this function could also return "this might be a race 
condition, try again".


The function p2m_mem_access_check is returning a boolean. You cannot 
encode 3 states in a boolean. So you could not hand over the fault to 
another helper.





Also, you are assuming that it will never be possible in the future to have
another usage of the permission fault. By returning false you say "I handled
the fault, it is not necessary to hand over to someone else".


I only return false here if mem_access is enabled.


But this code can never be reached when mem_access is not enabled. And 
once mem_access is turned on, it is never possible to turned off back.


If you think that in the future mem_access can be turned off, then the 
code could be racy. IHMO it is already kind of racy because the code 
assumes that p2m->mem_access_enabled will be seen well before the P2M 
was changed.



If any other system
in the future is going to make use of these permissions then it needs
to be carefully evaluated what the handover setup should be when the
mem_access state doesn't seem to be the reason for the violation. I
can forsee some issues if one system would like the instruction to be
reexecuted while the other to do something else. As this all being
hypothetical at this point I don't really know what to do with this
right now.


IHMO, right now, the best solution is to re-execute the instruction in 
any case and keep p2m_mem_access_check unchanged.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH -v3 1/1] ratelimit: Implement rate limit for credit2 scheduler Rate limit assures that a vcpu will execute for a minimum amount of time before being put at the back of a queue o

2016-08-03 Thread anshul makkar

On 03/08/16 11:16, George Dunlap wrote:

On 26/07/16 17:21, Dario Faggioli wrote:

On Mon, 2016-07-25 at 17:12 +0100, Anshul Makkar wrote:

It introduces context-switch rate-limiting. The patch enables the VM

The subject, which will become the "title" of the commit, is way too
long. That must be a very concise headline of what the patch is about,
and should also have tags, specifying what areas of the codebase are
affected. So what do you think of this:

   xen: credit2: implement context switch rate-limiting.


It looks like it's just missing a carrage return -- I could fix that up
on check-in.



On a more technical side, I think that...

+if ( prv->ratelimit_us )
+{
+s_time_t ratelimit_min = prv->ratelimit_us;


... this should be:

  s_time_t ratelimit_min = MICROSECS(prv->ratelimit_us);

I realized that by looking at traces and seeing entries for which
CSCHED2_MIN_TIMER was being returned, even if I had set
sched_ratelimit_us to a value greater than that.


Yes, Dario is correct here.

There's also a small typo in one of the comments ("onw" instead of "own").

With all those changed:

Reviewed-by: George Dunlap 

If Anshul and Dario are happy, I can fix all those thing up on commit.


Fine with me.

-George


Anshul


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 01/11] public / x86: introduce hvmctl hypercall

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 13:06,  wrote:
> On 24/06/16 11:28, Jan Beulich wrote:
>> ... as a means to replace all HVMOP_* which a domain can't issue on
>> itself (i.e. intended for use by only the control domain or device
>> model).
>> 
>> Signed-off-by: Jan Beulich 
>> Reviewed-by: Wei Liu 
> 
> Could you post this branch to a git repo somewhere?

Well, I can see to find time to get some git tree set up for such a
purpose, but I don't normally use git for my work, so I have
nothing available right away.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Julien Grall



On 02/08/16 17:08, Andrew Cooper wrote:

On 02/08/16 08:34, Julien Grall wrote:

Hi Andrew,

On 02/08/2016 00:14, Andrew Cooper wrote:

On 01/08/2016 19:15, Julien Grall wrote:

On 01/08/16 18:10, Sergej Proskurin wrote:


Hello all,


Hello Sergej,


The following patch series can be found on Github[0] and is part of my
contribution to this year's Google Summer of Code (GSoC)[1]. My
project is
managed by the organization The Honeynet Project. As part of GSoC, I
am being
supervised by the Xen developer Tamas K. Lengyel
, George
D. Webster, and Steven Maresca.

In this patch series, we provide an implementation of the altp2m
subsystem for
ARM. Our implementation is based on the altp2m subsystem for x86,
providing
additional --alternate-- views on the guest's physical memory by
means of the
ARM 2nd stage translation mechanism. The patches introduce new HVMOPs
and
extend the p2m subsystem. Also, we extend libxl to support altp2m on
ARM and
modify xen-access to test the suggested functionality.

To be more precise, altp2m allows to create and switch to additional
p2m views
(i.e. gfn to mfn mappings). These views can be manipulated and
activated as
will through the provided HVMOPs. In this way, the active guest
instance in
question can seamlessly proceed execution without noticing that
anything has
changed. The prime scope of application of altp2m is Virtual Machine
Introspection, where guest systems are analyzed from the outside of
the VM.

Altp2m can be activated by means of the guest control parameter
"altp2m" on x86
and ARM architectures.  The altp2m functionality by default can also
be used
from within the guest by design. For use-cases requiring purely
external access
to altp2m, a custom XSM policy is necessary on both x86 and ARM.


As said on the previous version, altp2m operation *should not* be
exposed to ARM guest. Any design written for x86 may not fit exactly
for ARM (and vice versa), you will need to explain why you think we
should follow the same pattern.


Sorry, but I am going to step in here and disagree.  All the x86
justifications for altp2m being accessible to guests apply equally to
ARM, as they are hardware independent.

I realise you are maintainer, but the onus is on you to justify why the
behaviour should be different between x86 and ARM, rather than simply to
complain at it being the same.

Naturally, technical issues about the details of the implementation, or
the algorithms etc. are of course fine, but I don't see any plausible
reason why ARM should purposefully different from x86 in terms of
available functionality, and several good reasons why it should be the
same (least of all, feature parity across architectures).


The question here, is how a guest could take advantage to access to
altp2m on ARM today? Whilst on x86 a guest could be notified about
memaccess change, this is not yet the case on ARM.


Does ARM have anything like #VE whereby an in-guest entity can receive
notification of violations?


I am not entirely sure what is exactly the #VE. From my understanding, 
it use to report stage 2 violation to the guest, right? If so, I am not 
aware of any.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy

2016-08-03 Thread George Dunlap
On Thu, Jul 28, 2016 at 6:29 PM, Dario Faggioli
 wrote:
> On Mon, 2016-07-18 at 16:09 +0200, Juergen Gross wrote:
>> Acked-by: Juergen Gross 
>>
>> for this patch then.
>>
> George,
>
> Ping about this series.

Dario,

Somehow I can only find patch 1/2 in either Google or my Citrix
mailbox.  What's the title of the 2nd patch?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.8 Development Update

2016-08-03 Thread Julien Grall

Hi Wei,

On 03/08/16 12:33, Wei Liu wrote:

=== ARM ===

*  ITS emulation (Dom0 only)
  -  Andre Przywara

*  Xen ARM DomU ACPI support
  -  Shannon Zhao

*  Alternative patching support
  -  Julien Grall

*  Altp2m for ARM
  -  Sergej Proskurin


Another big item to track for ARM:

P2M rework to follow break-before-make sequence.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [DRAFT v4] PV Calls protocol design document (former XenSock)

2016-08-03 Thread Wei Liu
On Tue, Aug 02, 2016 at 05:35:08PM -0700, Stefano Stabellini wrote:
> Hi all,
> 
> This is the design document of the PV Calls protocol. You can find
> prototypes of the Linux frontend and backend drivers here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git pvcalls-4
> 
> To use them, make sure to enable CONFIG_PVCALLS in your kernel config
> and add "pvcalls=1" to the command line of your DomU Linux kernel. You
> also need the toolstack to create the initial xenstore nodes for the
> protocol. To do that, please apply the attached patch to libxl (the
> patch is based on Xen 4.7.0-rc3) and add "pvcalls=1" to your DomU config
> file.
> 
> Note that previous versions of the protocols were named XenSock. It has
> been renamed for clarity of scope and to avoid confusion with hv_sock
> and vsock, which are used for inter-VMs communications.
> 
> Cheers,
> 
> Stefano
> 
> Changes in v4:
> - rename xensock to pvcalls
> 
> Changes in v3:
> - add a dummy element to struct xen_xensock_request to make sure the
>   size of the struct is the same on both x86_32 and x86_64
> 
> Changes in v2:
> - add max-dataring-page-order
> - add "Publish backend features and transport parameters" to backend
>   xenbus workflow
> - update new cmd values
> - update xen_xensock_request
> - add backlog parameter to listen and binary layout
> - add description of new data ring format (interface+data)
> - modify connect and accept to reflect new data ring format
> - add link to POSIX docs
> - add error numbers
> - add address format section and relevant numeric definitions
> - add explicit mention of unimplemented commands
> - add protocol node name
> - add xenbus shutdown diagram
> - add socket operation
> 
> ---
> 
> # PV Calls Protocol
> 
> ## Rationale
> 
> PV Calls is a paravirtualized protocol for the POSIX socket API.
> 
> The purpose of PV Calls is to allow the implementation of a specific set
> of POSIX functions to be done in a domain other than your own. It allows
> connect, accept, bind, release, listen, poll, recvmsg and sendmsg to be
> implemented in another domain.
> 

The wording isn't really clear here. This design document as-is would
inevitably make people start to compare PV Calls to various HV socks I'm
afraid.

Is PV Calls going to cover other stuff other than socket API? If it
targets POSIX interfaces, maybe call it PV POSIX?

But then, if you extend the scope to cover POSIX APIs, I think you might
want some discovery mechanism to see what APIs are paravirtualised?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Andrew Cooper
On 03/08/16 12:53, Julien Grall wrote:
>
>
> On 02/08/16 17:08, Andrew Cooper wrote:
>> On 02/08/16 08:34, Julien Grall wrote:
>>> Hi Andrew,
>>>
>>> On 02/08/2016 00:14, Andrew Cooper wrote:
 On 01/08/2016 19:15, Julien Grall wrote:
> On 01/08/16 18:10, Sergej Proskurin wrote:
>>
>> Hello all,
>
> Hello Sergej,
>
>> The following patch series can be found on Github[0] and is part
>> of my
>> contribution to this year's Google Summer of Code (GSoC)[1]. My
>> project is
>> managed by the organization The Honeynet Project. As part of GSoC, I
>> am being
>> supervised by the Xen developer Tamas K. Lengyel
>> , George
>> D. Webster, and Steven Maresca.
>>
>> In this patch series, we provide an implementation of the altp2m
>> subsystem for
>> ARM. Our implementation is based on the altp2m subsystem for x86,
>> providing
>> additional --alternate-- views on the guest's physical memory by
>> means of the
>> ARM 2nd stage translation mechanism. The patches introduce new
>> HVMOPs
>> and
>> extend the p2m subsystem. Also, we extend libxl to support altp2m on
>> ARM and
>> modify xen-access to test the suggested functionality.
>>
>> To be more precise, altp2m allows to create and switch to additional
>> p2m views
>> (i.e. gfn to mfn mappings). These views can be manipulated and
>> activated as
>> will through the provided HVMOPs. In this way, the active guest
>> instance in
>> question can seamlessly proceed execution without noticing that
>> anything has
>> changed. The prime scope of application of altp2m is Virtual Machine
>> Introspection, where guest systems are analyzed from the outside of
>> the VM.
>>
>> Altp2m can be activated by means of the guest control parameter
>> "altp2m" on x86
>> and ARM architectures.  The altp2m functionality by default can also
>> be used
>> from within the guest by design. For use-cases requiring purely
>> external access
>> to altp2m, a custom XSM policy is necessary on both x86 and ARM.
>
> As said on the previous version, altp2m operation *should not* be
> exposed to ARM guest. Any design written for x86 may not fit exactly
> for ARM (and vice versa), you will need to explain why you think we
> should follow the same pattern.

 Sorry, but I am going to step in here and disagree.  All the x86
 justifications for altp2m being accessible to guests apply equally to
 ARM, as they are hardware independent.

 I realise you are maintainer, but the onus is on you to justify why
 the
 behaviour should be different between x86 and ARM, rather than
 simply to
 complain at it being the same.

 Naturally, technical issues about the details of the
 implementation, or
 the algorithms etc. are of course fine, but I don't see any plausible
 reason why ARM should purposefully different from x86 in terms of
 available functionality, and several good reasons why it should be the
 same (least of all, feature parity across architectures).
>>>
>>> The question here, is how a guest could take advantage to access to
>>> altp2m on ARM today? Whilst on x86 a guest could be notified about
>>> memaccess change, this is not yet the case on ARM.
>>
>> Does ARM have anything like #VE whereby an in-guest entity can receive
>> notification of violations?
>
> I am not entirely sure what is exactly the #VE. From my understanding,
> it use to report stage 2 violation to the guest, right? If so, I am
> not aware of any.

#VE is a newly specified CPU exception, precisely for reporting state 2
violations (in ARM terminology).  It works very much like a pagefault.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 12:29,  wrote:
> Wei Liu writes ("Re: Device model operation hypercall (DMOP, re qemu 
> depriv)"):
>> On Mon, Aug 01, 2016 at 06:41:20AM -0600, Jan Beulich wrote:
>> > > A DMOP is defined to never put at risk the stability or security of
>> > > the whole system, nor of the domain which calls DMOP.  However, a DMOP
>> > > may have arbitrary effects on the target domid.
>> > 
>> > With the exception of this and the privcmd layer described below,
>> > DMOP == HVMCTL afaics. The privcmd layer is independent anyway.
>> > And the security aspect mentioned above won't disappear if we
>> > use DMOP instead of HVMCTL. So I don't see why the hvmctl
>> > series as is can't be the starting point of this, with the stability/
>> > security concerns addressed subsequently, for being orthogonal.
> 
> I don't (currently) have a clear understanding of how my proposed DMOP
> relates to HVMCTL.
> 
> I thought it useful to set out the DMOP proposal from first
> principles, with clear motivation, discussion of not-chosen
> alternatives, and of course with a clear statement of the principles
> of operation and of the security design.

Okay; nevertheless I did get the feeling that some of this was
prompted by the hvmctl series posting.

> The security property I have quoted above is absolutely critical to
> the DMOP proposal.  I'm a bit concerned by comments like the above
> `with the exception of this' (which seems to refer to the security
> property).

Indeed it does.

> Earlier during one of the HVMCTL threads I asked
> 
> This is a slight digression, but is it intended that all of these
> hvmctl's are safe to expose to a deprivileged device model process in
> dom0, or to a device model stub domain ?
> 
> Jan replied:
> 
> Yes, afaict (they've been exposed the same way before).
> 
> Does that mean that functionality exposed by all the prooposed HVMCTLs
> is currently available to stubdoms ?

That series only moves code from one hypercall to another (new) one,
without any security implications at all. What has been available to
stubdoms prior to that series will be available the same way once it
got applied.

>> >  So I don't see why the hvmctl series as is can't be the starting
>> > point of this, with the stability/ security concerns addressed
>> > subsequently, for being orthogonal.
> 
> Please don't misunderstand me as trying to compete with or block
> your HVMCTL work.  It may well be that HVMCTL is what I want, but:
> 
> If we adopt the design principles I describe in my DMOP proposal, I
> don't think the security concerns are separable.
> 
> ISTM that a patch series introducing DMOP should start with a patch
> which introduces the DMOP hypercall, with no sub-operations.
> 
> Such a patch would have code content very like that in
>   [PATCH 01/11] public / x86: introduce hvmctl hypercall
> 
> But, such a patch should also explain the semantics.  The Xen public
> headers ought to contain explanations of the promises that the
> hypervisor makes about DMOP.  Importantly:
>  - the promise that a DMOP cannot harm anyone except the target domid
>  - the ABI stability of the target domid field
>  - what the ABI stability policy is wrt the actual DMOPs themselves

Well, none of that was the original goal of the series; some of this
could be merged in.

> If the 01/ patch contains such promises, then logically the 02/ patch
> which introduces the first DMOP is extending that promise to that
> operation.  It is at that point that the security decision should be
> made.

Correct. Yet again the original goal of the series was just proper
separation of two groups of operations that should never have
been all thrown under the same hypercall.

> Now, there may be other ways to represent/record the security status.
> But it will be necessary to either (i) avoid violating the DMOP
> security promise, by making questionable calls not available via DMOP
> or (ii) trying to retrofit the security promise to DMOP later.
> 
> I think (ii) is not a good approach.  It would amount to introducing a
> whole new set of interfaces, and then later trying to redefine them to
> have a particular security property which was not originally there.

I agree that (i) would be the better approach, but I don't think I
can assess when I would find the time to do the required auditing
of all involved code. Plus I don't see the difference between going
the (ii) route with the presented hvmctl series vs keeping things as
they are (under hvmop) - in both cases the security promise will
need to be retrofit onto existing code.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.8 Development Update

2016-08-03 Thread Wei Liu
On Wed, Aug 03, 2016 at 12:56:14PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 03/08/16 12:33, Wei Liu wrote:
> >=== ARM ===
> >
> >*  ITS emulation (Dom0 only)
> >  -  Andre Przywara
> >
> >*  Xen ARM DomU ACPI support
> >  -  Shannon Zhao
> >
> >*  Alternative patching support
> >  -  Julien Grall
> >
> >*  Altp2m for ARM
> >  -  Sergej Proskurin
> 
> Another big item to track for ARM:
> 
> P2M rework to follow break-before-make sequence.
> 

What does break-before-make mean? I'm afraid I can't parse this
sentence. Is there any patch series that I can look at?

Wei.

> Regards,
> 
> -- 
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Julien Grall



On 03/08/16 13:00, Andrew Cooper wrote:

On 03/08/16 12:53, Julien Grall wrote:

On 02/08/16 17:08, Andrew Cooper wrote:

On 02/08/16 08:34, Julien Grall wrote:

Hi Andrew,

On 02/08/2016 00:14, Andrew Cooper wrote:

On 01/08/2016 19:15, Julien Grall wrote:

On 01/08/16 18:10, Sergej Proskurin wrote:


Hello all,


Hello Sergej,


The following patch series can be found on Github[0] and is part
of my
contribution to this year's Google Summer of Code (GSoC)[1]. My
project is
managed by the organization The Honeynet Project. As part of GSoC, I
am being
supervised by the Xen developer Tamas K. Lengyel
, George
D. Webster, and Steven Maresca.

In this patch series, we provide an implementation of the altp2m
subsystem for
ARM. Our implementation is based on the altp2m subsystem for x86,
providing
additional --alternate-- views on the guest's physical memory by
means of the
ARM 2nd stage translation mechanism. The patches introduce new
HVMOPs
and
extend the p2m subsystem. Also, we extend libxl to support altp2m on
ARM and
modify xen-access to test the suggested functionality.

To be more precise, altp2m allows to create and switch to additional
p2m views
(i.e. gfn to mfn mappings). These views can be manipulated and
activated as
will through the provided HVMOPs. In this way, the active guest
instance in
question can seamlessly proceed execution without noticing that
anything has
changed. The prime scope of application of altp2m is Virtual Machine
Introspection, where guest systems are analyzed from the outside of
the VM.

Altp2m can be activated by means of the guest control parameter
"altp2m" on x86
and ARM architectures.  The altp2m functionality by default can also
be used
from within the guest by design. For use-cases requiring purely
external access
to altp2m, a custom XSM policy is necessary on both x86 and ARM.


As said on the previous version, altp2m operation *should not* be
exposed to ARM guest. Any design written for x86 may not fit exactly
for ARM (and vice versa), you will need to explain why you think we
should follow the same pattern.


Sorry, but I am going to step in here and disagree.  All the x86
justifications for altp2m being accessible to guests apply equally to
ARM, as they are hardware independent.

I realise you are maintainer, but the onus is on you to justify why
the
behaviour should be different between x86 and ARM, rather than
simply to
complain at it being the same.

Naturally, technical issues about the details of the
implementation, or
the algorithms etc. are of course fine, but I don't see any plausible
reason why ARM should purposefully different from x86 in terms of
available functionality, and several good reasons why it should be the
same (least of all, feature parity across architectures).


The question here, is how a guest could take advantage to access to
altp2m on ARM today? Whilst on x86 a guest could be notified about
memaccess change, this is not yet the case on ARM.


Does ARM have anything like #VE whereby an in-guest entity can receive
notification of violations?


I am not entirely sure what is exactly the #VE. From my understanding,
it use to report stage 2 violation to the guest, right? If so, I am
not aware of any.


#VE is a newly specified CPU exception, precisely for reporting state 2
violations (in ARM terminology).  It works very much like a pagefault.


Thank you for the explanation. We don't have any specific exception to 
report stage 2 (I guess EPT for x86 terminology) violations.


If the guest physical address does not belong to an emulated device or 
does not have an associated host address, the hypervisor will inject a 
data/prefetch abort to the guest.


Those aborts contains a fault status. For now it is always the same 
fault: debug fault on AArch32 and address size fault on AArch64. I don't 
think we can re-use one of the fault (see ARM D7-1949 in DDI 0487A.j for 
the list of fault code) to behave as #VE.


I guess the best would be an event channel for this purpose.


~Andrew



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.8 Development Update

2016-08-03 Thread Julien Grall



On 03/08/16 13:03, Wei Liu wrote:

On Wed, Aug 03, 2016 at 12:56:14PM +0100, Julien Grall wrote:

Hi Wei,

On 03/08/16 12:33, Wei Liu wrote:

=== ARM ===

*  ITS emulation (Dom0 only)
 -  Andre Przywara

*  Xen ARM DomU ACPI support
 -  Shannon Zhao

*  Alternative patching support
 -  Julien Grall

*  Altp2m for ARM
 -  Sergej Proskurin


Another big item to track for ARM:

P2M rework to follow break-before-make sequence.



What does break-before-make mean? I'm afraid I can't parse this
sentence. Is there any patch series that I can look at?


break-before-make is a 3 steps sequence the P2M code has to use to 
update the page tables under certain circumstances.


See 
https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg02952.html


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Andrew Cooper
On 03/08/16 13:13, Julien Grall wrote:
>
>
> On 03/08/16 13:00, Andrew Cooper wrote:
>> On 03/08/16 12:53, Julien Grall wrote:
>>> On 02/08/16 17:08, Andrew Cooper wrote:
 On 02/08/16 08:34, Julien Grall wrote:
> Hi Andrew,
>
> On 02/08/2016 00:14, Andrew Cooper wrote:
>> On 01/08/2016 19:15, Julien Grall wrote:
>>> On 01/08/16 18:10, Sergej Proskurin wrote:

 Hello all,
>>>
>>> Hello Sergej,
>>>
 The following patch series can be found on Github[0] and is part
 of my
 contribution to this year's Google Summer of Code (GSoC)[1]. My
 project is
 managed by the organization The Honeynet Project. As part of
 GSoC, I
 am being
 supervised by the Xen developer Tamas K. Lengyel
 , George
 D. Webster, and Steven Maresca.

 In this patch series, we provide an implementation of the altp2m
 subsystem for
 ARM. Our implementation is based on the altp2m subsystem for x86,
 providing
 additional --alternate-- views on the guest's physical memory by
 means of the
 ARM 2nd stage translation mechanism. The patches introduce new
 HVMOPs
 and
 extend the p2m subsystem. Also, we extend libxl to support
 altp2m on
 ARM and
 modify xen-access to test the suggested functionality.

 To be more precise, altp2m allows to create and switch to
 additional
 p2m views
 (i.e. gfn to mfn mappings). These views can be manipulated and
 activated as
 will through the provided HVMOPs. In this way, the active guest
 instance in
 question can seamlessly proceed execution without noticing that
 anything has
 changed. The prime scope of application of altp2m is Virtual
 Machine
 Introspection, where guest systems are analyzed from the
 outside of
 the VM.

 Altp2m can be activated by means of the guest control parameter
 "altp2m" on x86
 and ARM architectures.  The altp2m functionality by default can
 also
 be used
 from within the guest by design. For use-cases requiring purely
 external access
 to altp2m, a custom XSM policy is necessary on both x86 and ARM.
>>>
>>> As said on the previous version, altp2m operation *should not* be
>>> exposed to ARM guest. Any design written for x86 may not fit
>>> exactly
>>> for ARM (and vice versa), you will need to explain why you think we
>>> should follow the same pattern.
>>
>> Sorry, but I am going to step in here and disagree.  All the x86
>> justifications for altp2m being accessible to guests apply
>> equally to
>> ARM, as they are hardware independent.
>>
>> I realise you are maintainer, but the onus is on you to justify why
>> the
>> behaviour should be different between x86 and ARM, rather than
>> simply to
>> complain at it being the same.
>>
>> Naturally, technical issues about the details of the
>> implementation, or
>> the algorithms etc. are of course fine, but I don't see any
>> plausible
>> reason why ARM should purposefully different from x86 in terms of
>> available functionality, and several good reasons why it should
>> be the
>> same (least of all, feature parity across architectures).
>
> The question here, is how a guest could take advantage to access to
> altp2m on ARM today? Whilst on x86 a guest could be notified about
> memaccess change, this is not yet the case on ARM.

 Does ARM have anything like #VE whereby an in-guest entity can receive
 notification of violations?
>>>
>>> I am not entirely sure what is exactly the #VE. From my understanding,
>>> it use to report stage 2 violation to the guest, right? If so, I am
>>> not aware of any.
>>
>> #VE is a newly specified CPU exception, precisely for reporting state 2
>> violations (in ARM terminology).  It works very much like a pagefault.
>
> Thank you for the explanation. We don't have any specific exception to
> report stage 2 (I guess EPT for x86 terminology) violations.

It is currently specific to Intel's EPT implementation, but there is
nothing in principle preventing AMD reusing the meaning for their NPT
implementation.

>
> If the guest physical address does not belong to an emulated device or
> does not have an associated host address, the hypervisor will inject a
> data/prefetch abort to the guest.

This is where x86 and ARM differ quite a bit.  For "areas which don't
exist", the default is to discard writes and reads return ~0, rather
than to raise a fault with the software.

>
> Those aborts contains a fault status. For now it is always the same
> fault: debug fault on AArch32 and address size fault on AArch64. I
> don't think we can re-use one of the fault (

Re: [Xen-devel] [PATCH -v3 1/1] ratelimit: Implement rate limit for credit2 scheduler Rate limit assures that a vcpu will execute for a minimum amount of time before being put at the back of a queue o

2016-08-03 Thread Dario Faggioli
On Wed, 2016-08-03 at 12:43 +0100, anshul makkar wrote:
> On 03/08/16 11:16, George Dunlap wrote:
> > 
> > Reviewed-by: George Dunlap 
> > 
> > If Anshul and Dario are happy, I can fix all those thing up on
> > commit.
> > 
> Fine with me.
>
I'm ok too.

And with those changes, you can also add:

Reviewed-by: Dario Faggioli 

Thanks and Regards,
Dario

-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy

2016-08-03 Thread George Dunlap
On 18/07/16 15:09, Juergen Gross wrote:
> On 18/07/16 16:03, Dario Faggioli wrote:
>> On Fri, 2016-07-15 at 16:23 +0200, Dario Faggioli wrote:
>>> On Fri, 2016-07-15 at 14:52 +0200, Juergen Gross wrote:
 On 15/07/16 13:52, Dario Faggioli wrote:
> Therefore, I still think this patch is correct, but I'm up for
> investigating further and finding a way to solve the "zombie in
> cpupool" issue as well.
 I'm not saying your patch is wrong. I just wanted to give you a
 hint
 about the history of the stuff you are changing. :-)

>>> Sure, and that's much appreciated! :-)
>>>
 If it is working I'd really prefer it over the current situation.

>>> Right. I've got to leave now. But I'll produce some zombies on
>>> Monday,
>>> and will see if they move. :-D
>>>
>> Here you go:
>>
>> http://pastebin.com/r93TGCZU
>>
>> Is that "vm1 -b-" --> "(null) ---" domain zombie enough?
> 
> Yes, seems to be okay for a zombie.
> 
>> If yes, as you can see, it moves to cpupool0 while being destroyed, and
>> I can destroy the pool without problems.
> 
> Great.
> 
>> I also shutdown the system with the domain still there (in cpupool0),
>> and it works.
> 
> Fine. You can add
> 
> Acked-by: Juergen Gross 

Acked-by: George Dunlap 

And queued.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] live migration to qemu.git fails

2016-08-03 Thread Anthony PERARD
On Wed, Aug 03, 2016 at 10:41:25AM +0100, Wei Liu wrote:
> Cc Anthony and Stefano
> 
> On Tue, Aug 02, 2016 at 11:58:05AM +0200, Olaf Hering wrote:
> > Doing a live migration of a HVM domU from staging-4.5 with its included
> > qemu-xen to staging-4.7 with qemu-xen from qemu.org (either master or
> > 2.6.0) fails:
> > 
> > # cat /var/log/xen/qemu-dm-fv-x64-sles12sp1-clean--incoming.log
> > char device redirected to /dev/pts/4 (label serial0)
> > xen_ram_alloc: do not alloc f80 bytes of ram at 0 when runstate is 
> > INMIGRATE
> > xen_ram_alloc: do not alloc 80 bytes of ram at f80 when runstate is 
> > INMIGRATE
> > xen_ram_alloc: do not alloc 1 bytes of ram at 1000 when runstate is 
> > INMIGRATE
> > xen_ram_alloc: do not alloc 4 bytes of ram at 1001 when runstate is 
> > INMIGRATE
> > VNC server running on 127.0.0.1:5900
> > qemu-system-i386: Expected vmdescription section, but got 0
> > 
> > 
> > 'xl vnc domU' on the receiver side just shows "guest has not initialized
> > display yet". For some reason the domU is still alive, 'vmstat 1' via
> > ssh keeps running.
> > 
> > 'xl shutdown domU' fails. It looks like the guest OS shuts down, the ssh
> > connection gets terminated, but the domU becomes busy according to 'xl
> > top', and never shuts down.

Does the patch "xen: handle inbound migration of VMs without ioreq
server pages" from Paul fix the issue?
<1470042985-680-1-git-send-email-paul.durr...@citrix.com>
https://lists.xen.org/archives/html/xen-devel/2016-08/msg00020.html

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Sergej Proskurin


On 08/03/2016 02:18 PM, Andrew Cooper wrote:
> On 03/08/16 13:13, Julien Grall wrote:
>>
>> On 03/08/16 13:00, Andrew Cooper wrote:
>>> On 03/08/16 12:53, Julien Grall wrote:
 On 02/08/16 17:08, Andrew Cooper wrote:
> On 02/08/16 08:34, Julien Grall wrote:
>> Hi Andrew,
>>
>> On 02/08/2016 00:14, Andrew Cooper wrote:
>>> On 01/08/2016 19:15, Julien Grall wrote:
 On 01/08/16 18:10, Sergej Proskurin wrote:
> Hello all,
 Hello Sergej,

> The following patch series can be found on Github[0] and is part
> of my
> contribution to this year's Google Summer of Code (GSoC)[1]. My
> project is
> managed by the organization The Honeynet Project. As part of
> GSoC, I
> am being
> supervised by the Xen developer Tamas K. Lengyel
> , George
> D. Webster, and Steven Maresca.
>
> In this patch series, we provide an implementation of the altp2m
> subsystem for
> ARM. Our implementation is based on the altp2m subsystem for x86,
> providing
> additional --alternate-- views on the guest's physical memory by
> means of the
> ARM 2nd stage translation mechanism. The patches introduce new
> HVMOPs
> and
> extend the p2m subsystem. Also, we extend libxl to support
> altp2m on
> ARM and
> modify xen-access to test the suggested functionality.
>
> To be more precise, altp2m allows to create and switch to
> additional
> p2m views
> (i.e. gfn to mfn mappings). These views can be manipulated and
> activated as
> will through the provided HVMOPs. In this way, the active guest
> instance in
> question can seamlessly proceed execution without noticing that
> anything has
> changed. The prime scope of application of altp2m is Virtual
> Machine
> Introspection, where guest systems are analyzed from the
> outside of
> the VM.
>
> Altp2m can be activated by means of the guest control parameter
> "altp2m" on x86
> and ARM architectures.  The altp2m functionality by default can
> also
> be used
> from within the guest by design. For use-cases requiring purely
> external access
> to altp2m, a custom XSM policy is necessary on both x86 and ARM.
 As said on the previous version, altp2m operation *should not* be
 exposed to ARM guest. Any design written for x86 may not fit
 exactly
 for ARM (and vice versa), you will need to explain why you think we
 should follow the same pattern.
>>> Sorry, but I am going to step in here and disagree.  All the x86
>>> justifications for altp2m being accessible to guests apply
>>> equally to
>>> ARM, as they are hardware independent.
>>>
>>> I realise you are maintainer, but the onus is on you to justify why
>>> the
>>> behaviour should be different between x86 and ARM, rather than
>>> simply to
>>> complain at it being the same.
>>>
>>> Naturally, technical issues about the details of the
>>> implementation, or
>>> the algorithms etc. are of course fine, but I don't see any
>>> plausible
>>> reason why ARM should purposefully different from x86 in terms of
>>> available functionality, and several good reasons why it should
>>> be the
>>> same (least of all, feature parity across architectures).
>> The question here, is how a guest could take advantage to access to
>> altp2m on ARM today? Whilst on x86 a guest could be notified about
>> memaccess change, this is not yet the case on ARM.
> Does ARM have anything like #VE whereby an in-guest entity can receive
> notification of violations?
 I am not entirely sure what is exactly the #VE. From my understanding,
 it use to report stage 2 violation to the guest, right? If so, I am
 not aware of any.
>>> #VE is a newly specified CPU exception, precisely for reporting state 2
>>> violations (in ARM terminology).  It works very much like a pagefault.
>> Thank you for the explanation. We don't have any specific exception to
>> report stage 2 (I guess EPT for x86 terminology) violations.
> It is currently specific to Intel's EPT implementation, but there is
> nothing in principle preventing AMD reusing the meaning for their NPT
> implementation.
>
>> If the guest physical address does not belong to an emulated device or
>> does not have an associated host address, the hypervisor will inject a
>> data/prefetch abort to the guest.
> This is where x86 and ARM differ quite a bit.  For "areas which don't
> exist", the default is to discard writes and reads return ~0, rather
> than to raise a fault with the software.
>
>> Those aborts contains a fault status. For now it is always the same
>> fault: d

Re: [Xen-devel] live migration to qemu.git fails

2016-08-03 Thread Olaf Hering
On Wed, Aug 03, Anthony PERARD wrote:

> Does the patch "xen: handle inbound migration of VMs without ioreq
> server pages" from Paul fix the issue?
> <1470042985-680-1-git-send-email-paul.durr...@citrix.com>
> https://lists.xen.org/archives/html/xen-devel/2016-08/msg00020.html

Would that help with VMs created by staging-4.7 and its qemu-xen?
I think that change is just for staging-4.4 and its qemu-xen.
But I will try it anyway.

Olaf


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 0/6] x86/time: improve cross-CPU clock monotonicity (and more)

2016-08-03 Thread Jan Beulich
1: calibrate TSC against platform timer
2: correctly honor late clearing of TSC related feature flags
3: support 32-bit wide ACPI PM timer
4: fold recurring code
5: group time stamps into a structure
6: relax barriers

Signed-off-by: Jan Beulich 
---
v2: Only patch 1 changed (see there) and a new 6th patch got added.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 1/6] x86/time: calibrate TSC against platform timer

2016-08-03 Thread Jan Beulich
... instead of unconditionally against the PIT. This allows for local
and master system times to remain in better sync (which matters even
when, on any modern system, the master time is really used only during
secondary CPU bringup, as the error between the two is in fact
noticable in cross-CPU NOW() invocation monotonicity).

This involves moving the init_platform_timer() invocation into
early_time_init(), splitting out the few things which really need to be
done in init_xen_time(). That in turn allows dropping the open coded
PIT initialization from init_IRQ() (it was needed for APIC clock
calibration, which runs between early_time_init() and init_xen_time()).

In the course of this re-ordering also set the timer channel 2 gate low
after having finished calibration. This should be benign to overall
system operation, but appears to be the more clean state.

Also do away with open coded 8254 register manipulation from 8259 code.

Signed-off-by: Jan Beulich 
---
v2: Ditch preinit_pit() call from init_IRQ(). Adjust style.

--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -359,13 +359,6 @@ void __init init_IRQ(void)
 
 apic_intr_init();
 
-/* Set the clock to HZ Hz */
-#define CLOCK_TICK_RATE 1193182 /* crystal freq (Hz) */
-#define LATCH (((CLOCK_TICK_RATE)+(HZ/2))/HZ)
-outb_p(0x34, PIT_MODE);/* binary, mode 2, LSB/MSB, ch 0 */
-outb_p(LATCH & 0xff, PIT_CH0); /* LSB */
-outb(LATCH >> 8, PIT_CH0); /* MSB */
-
 setup_irq(2, 0, &cascade);
 }
 
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -59,7 +59,7 @@ struct platform_timesource {
 char *name;
 u64 frequency;
 u64 (*read_counter)(void);
-int (*init)(struct platform_timesource *);
+s64 (*init)(struct platform_timesource *);
 void (*resume)(struct platform_timesource *);
 int counter_bits;
 };
@@ -224,49 +224,18 @@ static struct irqaction __read_mostly ir
 timer_interrupt, "timer", NULL
 };
 
-/* -- Calibrate the TSC --- 
- * Return processor ticks per second / CALIBRATE_FRAC.
- */
-
 #define CLOCK_TICK_RATE 1193182 /* system crystal frequency (Hz) */
 #define CALIBRATE_FRAC  20  /* calibrate over 50ms */
-#define CALIBRATE_LATCH ((CLOCK_TICK_RATE+(CALIBRATE_FRAC/2))/CALIBRATE_FRAC)
+#define CALIBRATE_VALUE(freq) (((freq) + CALIBRATE_FRAC / 2) / CALIBRATE_FRAC)
 
-static u64 init_pit_and_calibrate_tsc(void)
+static void preinit_pit(void)
 {
-u64 start, end;
-unsigned long count;
-
 /* Set PIT channel 0 to HZ Hz. */
 #define LATCH (((CLOCK_TICK_RATE)+(HZ/2))/HZ)
 outb_p(0x34, PIT_MODE);/* binary, mode 2, LSB/MSB, ch 0 */
 outb_p(LATCH & 0xff, PIT_CH0); /* LSB */
 outb(LATCH >> 8, PIT_CH0); /* MSB */
-
-/* Set the Gate high, disable speaker */
-outb((inb(0x61) & ~0x02) | 0x01, 0x61);
-
-/*
- * Now let's take care of CTC channel 2
- *
- * Set the Gate high, program CTC channel 2 for mode 0, (interrupt on
- * terminal count mode), binary count, load 5 * LATCH count, (LSB and MSB)
- * to begin countdown.
- */
-outb(0xb0, PIT_MODE);   /* binary, mode 0, LSB/MSB, Ch 2 */
-outb(CALIBRATE_LATCH & 0xff, PIT_CH2); /* LSB of count */
-outb(CALIBRATE_LATCH >> 8, PIT_CH2);   /* MSB of count */
-
-start = rdtsc_ordered();
-for ( count = 0; (inb(0x61) & 0x20) == 0; count++ )
-continue;
-end = rdtsc_ordered();
-
-/* Error if the CTC doesn't behave itself. */
-if ( count == 0 )
-return 0;
-
-return ((end - start) * (u64)CALIBRATE_FRAC);
+#undef LATCH
 }
 
 void set_time_scale(struct time_scale *ts, u64 ticks_per_sec)
@@ -327,10 +296,49 @@ static u64 read_pit_count(void)
 return count32;
 }
 
-static int __init init_pit(struct platform_timesource *pts)
+static s64 __init init_pit(struct platform_timesource *pts)
 {
+u8 portb = inb(0x61);
+u64 start, end;
+unsigned long count;
+
 using_pit = 1;
-return 1;
+
+/* Set the Gate high, disable speaker. */
+outb((portb & ~0x02) | 0x01, 0x61);
+
+/*
+ * Now let's take care of CTC channel 2: mode 0, (interrupt on
+ * terminal count mode), binary count, load CALIBRATE_LATCH count,
+ * (LSB and MSB) to begin countdown.
+ */
+#define CALIBRATE_LATCH CALIBRATE_VALUE(CLOCK_TICK_RATE)
+outb(0xb0, PIT_MODE);  /* binary, mode 0, LSB/MSB, Ch 2 */
+outb(CALIBRATE_LATCH & 0xff, PIT_CH2); /* LSB of count */
+outb(CALIBRATE_LATCH >> 8, PIT_CH2);   /* MSB of count */
+#undef CALIBRATE_LATCH
+
+start = rdtsc_ordered();
+for ( count = 0; !(inb(0x61) & 0x20); ++count )
+continue;
+end = rdtsc_ordered();
+
+/* Set the Gate low, disable speaker. */
+outb(portb & ~0x03, 0x61);
+
+/* Error if the CTC doesn't behave itself. */
+if ( count == 0 )
+return 0;
+
+return (end - start) * CALIBRATE_FRAC;
+}
+
+static void resume_pit(struct platform_timesource *pts)
+{
+/* Set CTC channel 2 to mode 0 again; initial v

[Xen-devel] [PATCH v2 2/6] x86/time: correctly honor late clearing of TSC related feature flags

2016-08-03 Thread Jan Beulich
As such clearing of flags may have an impact on the selected rendezvous
function, handle such in a central place.

But don't allow such feature flags to be cleared during CPU hotplug:
Platform and local system times may have diverged significantly by
then, potentially causing noticably (even if only temporary) strange
behavior. As we're anyway expecting only sufficiently similar CPUs to
appear during hotplug, this shouldn't be introducing new limitations.

Reported-by: Joao Martins 
Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -1162,8 +1162,8 @@ static int mwait_idle_cpu_init(struct no
}
 
if (state > 2 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
-   !pm_idle_save)
-   setup_clear_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+   !pm_idle_save && system_state < SYS_STATE_active)
+   clear_tsc_cap(X86_FEATURE_TSC_RELIABLE);
 
cx = dev->states + dev->count;
cx->type = state;
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1353,6 +1353,24 @@ static void time_calibration(void *unuse
  &r, 1);
 }
 
+void __init clear_tsc_cap(unsigned int feature)
+{
+void (*rendezvous_fn)(void *) = time_calibration_std_rendezvous;
+
+if ( feature )
+setup_clear_cpu_cap(feature);
+
+/* If we have constant-rate TSCs then scale factor can be shared. */
+if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC) )
+{
+/* If TSCs are not marked as 'reliable', re-sync during rendezvous. */
+if ( !boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
+rendezvous_fn = time_calibration_tsc_rendezvous;
+}
+
+time_calibration_rendezvous_fn = rendezvous_fn;
+}
+
 static struct {
 s_time_t local_stime, master_stime;
 } ap_bringup_ref;
@@ -1478,7 +1496,7 @@ static int __init verify_tsc_reliability
 {
 printk("%s: TSC warp detected, disabling TSC_RELIABLE\n",
__func__);
-setup_clear_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+clear_tsc_cap(X86_FEATURE_TSC_RELIABLE);
 }
 }
 
@@ -1491,13 +1509,7 @@ int __init init_xen_time(void)
 {
 tsc_check_writability();
 
-/* If we have constant-rate TSCs then scale factor can be shared. */
-if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC) )
-{
-/* If TSCs are not marked as 'reliable', re-sync during rendezvous. */
-if ( !boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
-time_calibration_rendezvous_fn = time_calibration_tsc_rendezvous;
-}
+clear_tsc_cap(0);
 
 open_softirq(TIME_CALIBRATE_SOFTIRQ, local_time_calibration);
 
--- a/xen/include/asm-x86/time.h
+++ b/xen/include/asm-x86/time.h
@@ -70,6 +70,7 @@ void tsc_get_info(struct domain *d, uint
 void force_update_vcpu_system_time(struct vcpu *v);
 
 int host_tsc_is_safe(void);
+void clear_tsc_cap(unsigned int feature);
 void cpuid_time_leaf(uint32_t sub_idx, uint32_t *eax, uint32_t *ebx,
  uint32_t *ecx, uint32_t *edx);
 



x86/time: correctly honor late clearing of TSC related feature flags

As such clearing of flags may have an impact on the selected rendezvous
function, handle such in a central place.

But don't allow such feature flags to be cleared during CPU hotplug:
Platform and local system times may have diverged significantly by
then, potentially causing noticably (even if only temporary) strange
behavior. As we're anyway expecting only sufficiently similar CPUs to
appear during hotplug, this shouldn't be introducing new limitations.

Reported-by: Joao Martins 
Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -1162,8 +1162,8 @@ static int mwait_idle_cpu_init(struct no
}
 
if (state > 2 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
-   !pm_idle_save)
-   setup_clear_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+   !pm_idle_save && system_state < SYS_STATE_active)
+   clear_tsc_cap(X86_FEATURE_TSC_RELIABLE);
 
cx = dev->states + dev->count;
cx->type = state;
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1353,6 +1353,24 @@ static void time_calibration(void *unuse
  &r, 1);
 }
 
+void __init clear_tsc_cap(unsigned int feature)
+{
+void (*rendezvous_fn)(void *) = time_calibration_std_rendezvous;
+
+if ( feature )
+setup_clear_cpu_cap(feature);
+
+/* If we have constant-rate TSCs then scale factor can be shared. */
+if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC) )
+{
+/* If TSCs are not marked as 'reliable', re-sync during rendezvous. */
+if ( !boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
+rendezvous_fn = time_calibrat

[Xen-devel] [PATCH v2 3/6] x86/time: support 32-bit wide ACPI PM timer

2016-08-03 Thread Jan Beulich
I have no idea why we didn't do so from the beginning.

Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -481,22 +481,23 @@ static int __init acpi_parse_fadt(struct
if (fadt->header.revision >= FADT2_REVISION_ID) {
/* FADT rev. 2 */
if (fadt->xpm_timer_block.space_id ==
-   ACPI_ADR_SPACE_SYSTEM_IO)
+   ACPI_ADR_SPACE_SYSTEM_IO) {
pmtmr_ioport = fadt->xpm_timer_block.address;
-   /*
-* "X" fields are optional extensions to the original V1.0
-* fields, so we must selectively expand V1.0 fields if the
-* corresponding X field is zero.
-*/
-   if (!pmtmr_ioport)
-   pmtmr_ioport = fadt->pm_timer_block;
-   } else {
-   /* FADT rev. 1 */
+   pmtmr_width = fadt->xpm_timer_block.bit_width;
+   }
+   }
+   /*
+* "X" fields are optional extensions to the original V1.0
+* fields, so we must selectively expand V1.0 fields if the
+* corresponding X field is zero.
+*/
+   if (!pmtmr_ioport) {
pmtmr_ioport = fadt->pm_timer_block;
+   pmtmr_width = fadt->pm_timer_length == 4 ? 24 : 0;
}
if (pmtmr_ioport)
-   printk(KERN_INFO PREFIX "PM-Timer IO Port: %#x\n",
-  pmtmr_ioport);
+   printk(KERN_INFO PREFIX "PM-Timer IO Port: %#x (%u bits)\n",
+  pmtmr_ioport, pmtmr_width);
 #endif
 
acpi_smi_cmd   = fadt->smi_command;
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -403,6 +403,7 @@ static struct platform_timesource __init
  */
 
 u32 __read_mostly pmtmr_ioport;
+unsigned int __initdata pmtmr_width;
 
 /* ACPI PM timer ticks at 3.579545 MHz. */
 #define ACPI_PM_FREQUENCY 3579545
@@ -417,9 +418,15 @@ static s64 __init init_pmtimer(struct pl
 u64 start;
 u32 count, target, mask = 0xff;
 
-if ( pmtmr_ioport == 0 )
+if ( !pmtmr_ioport || !pmtmr_width )
 return 0;
 
+if ( pmtmr_width == 32 )
+{
+pts->counter_bits = 32;
+mask = 0x;
+}
+
 count = inl(pmtmr_ioport) & mask;
 start = rdtsc_ordered();
 target = count + CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -146,6 +146,7 @@ extern u32 x86_acpiid_to_apicid[];
 #define INVALID_ACPIID (-1U)
 
 extern u32 pmtmr_ioport;
+extern unsigned int pmtmr_width;
 
 int acpi_dmar_init(void);
 void acpi_mmcfg_init(void);



x86/time: support 32-bit wide ACPI PM timer

I have no idea why we didn't do so from the beginning.

Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -481,22 +481,23 @@ static int __init acpi_parse_fadt(struct
if (fadt->header.revision >= FADT2_REVISION_ID) {
/* FADT rev. 2 */
if (fadt->xpm_timer_block.space_id ==
-   ACPI_ADR_SPACE_SYSTEM_IO)
+   ACPI_ADR_SPACE_SYSTEM_IO) {
pmtmr_ioport = fadt->xpm_timer_block.address;
-   /*
-* "X" fields are optional extensions to the original V1.0
-* fields, so we must selectively expand V1.0 fields if the
-* corresponding X field is zero.
-*/
-   if (!pmtmr_ioport)
-   pmtmr_ioport = fadt->pm_timer_block;
-   } else {
-   /* FADT rev. 1 */
+   pmtmr_width = fadt->xpm_timer_block.bit_width;
+   }
+   }
+   /*
+* "X" fields are optional extensions to the original V1.0
+* fields, so we must selectively expand V1.0 fields if the
+* corresponding X field is zero.
+*/
+   if (!pmtmr_ioport) {
pmtmr_ioport = fadt->pm_timer_block;
+   pmtmr_width = fadt->pm_timer_length == 4 ? 24 : 0;
}
if (pmtmr_ioport)
-   printk(KERN_INFO PREFIX "PM-Timer IO Port: %#x\n",
-  pmtmr_ioport);
+   printk(KERN_INFO PREFIX "PM-Timer IO Port: %#x (%u bits)\n",
+  pmtmr_ioport, pmtmr_width);
 #endif
 
acpi_smi_cmd   = fadt->smi_command;
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -403,6 +403,7 @@ static struct platform_timesource __init
  */
 
 u32 __read_mostly pmtmr_ioport;
+unsigned int __initdata pmtmr_width;
 
 /* ACPI PM timer ticks at 3.579545 MHz. */
 #define ACPI_PM_FREQUENCY 3579545
@@ -417,9 +418,15 @@ static s64 __init init_pmtimer(struct pl
 u64 start;
 u32 count, target, mask = 0xff;
 
-if ( pmtmr_ioport == 0 )
+

[Xen-devel] [PATCH v2 4/6] x86/time: fold recurring code

2016-08-03 Thread Jan Beulich
Common code between time_calibration_{std,tsc}_rendezvous() can better
live in a single place, eliminating the risk of adjusting one without
the other.

Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1258,6 +1258,18 @@ struct calibration_rendezvous {
 u64 master_tsc_stamp;
 };
 
+static void
+time_calibration_rendezvous_tail(const struct calibration_rendezvous *r)
+{
+struct cpu_calibration *c = &this_cpu(cpu_calibration);
+
+c->local_tsc_stamp = rdtsc_ordered();
+c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
+c->stime_master_stamp = r->master_stime;
+
+raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+}
+
 /*
  * Keep TSCs in sync when they run at the same rate, but may stop in
  * deep-sleep C states.
@@ -1265,7 +1277,6 @@ struct calibration_rendezvous {
 static void time_calibration_tsc_rendezvous(void *_r)
 {
 int i;
-struct cpu_calibration *c = &this_cpu(cpu_calibration);
 struct calibration_rendezvous *r = _r;
 unsigned int total_cpus = cpumask_weight(&r->cpu_calibration_map);
 
@@ -1306,17 +1317,12 @@ static void time_calibration_tsc_rendezv
 }
 }
 
-c->local_tsc_stamp = rdtsc_ordered();
-c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
-c->stime_master_stamp = r->master_stime;
-
-raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+time_calibration_rendezvous_tail(r);
 }
 
 /* Ordinary rendezvous function which does not modify TSC values. */
 static void time_calibration_std_rendezvous(void *_r)
 {
-struct cpu_calibration *c = &this_cpu(cpu_calibration);
 struct calibration_rendezvous *r = _r;
 unsigned int total_cpus = cpumask_weight(&r->cpu_calibration_map);
 
@@ -1336,11 +1342,7 @@ static void time_calibration_std_rendezv
 mb(); /* receive signal /then/ read r->master_stime */
 }
 
-c->local_tsc_stamp = rdtsc_ordered();
-c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
-c->stime_master_stamp = r->master_stime;
-
-raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+time_calibration_rendezvous_tail(r);
 }
 
 static void (*time_calibration_rendezvous_fn)(void *) =



x86/time: fold recurring code

Common code between time_calibration_{std,tsc}_rendezvous() can better
live in a single place, eliminating the risk of adjusting one without
the other.

Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1258,6 +1258,18 @@ struct calibration_rendezvous {
 u64 master_tsc_stamp;
 };
 
+static void
+time_calibration_rendezvous_tail(const struct calibration_rendezvous *r)
+{
+struct cpu_calibration *c = &this_cpu(cpu_calibration);
+
+c->local_tsc_stamp = rdtsc_ordered();
+c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
+c->stime_master_stamp = r->master_stime;
+
+raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+}
+
 /*
  * Keep TSCs in sync when they run at the same rate, but may stop in
  * deep-sleep C states.
@@ -1265,7 +1277,6 @@ struct calibration_rendezvous {
 static void time_calibration_tsc_rendezvous(void *_r)
 {
 int i;
-struct cpu_calibration *c = &this_cpu(cpu_calibration);
 struct calibration_rendezvous *r = _r;
 unsigned int total_cpus = cpumask_weight(&r->cpu_calibration_map);
 
@@ -1306,17 +1317,12 @@ static void time_calibration_tsc_rendezv
 }
 }
 
-c->local_tsc_stamp = rdtsc_ordered();
-c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
-c->stime_master_stamp = r->master_stime;
-
-raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+time_calibration_rendezvous_tail(r);
 }
 
 /* Ordinary rendezvous function which does not modify TSC values. */
 static void time_calibration_std_rendezvous(void *_r)
 {
-struct cpu_calibration *c = &this_cpu(cpu_calibration);
 struct calibration_rendezvous *r = _r;
 unsigned int total_cpus = cpumask_weight(&r->cpu_calibration_map);
 
@@ -1336,11 +1342,7 @@ static void time_calibration_std_rendezv
 mb(); /* receive signal /then/ read r->master_stime */
 }
 
-c->local_tsc_stamp = rdtsc_ordered();
-c->stime_local_stamp = get_s_time_fixed(c->local_tsc_stamp);
-c->stime_master_stamp = r->master_stime;
-
-raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+time_calibration_rendezvous_tail(r);
 }
 
 static void (*time_calibration_rendezvous_fn)(void *) =
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 5/6] x86/time: group time stamps into a structure

2016-08-03 Thread Jan Beulich
If that had been done from the beginning, mistakes like the one
corrected in commit b64438c7c1 ("x86/time: use correct (local) time
stamp in constant-TSC calibration fast path") would likely never have
happened.

Also add a few "const" to make more obvious when things aren't expected
to change.

Signed-off-by: Jan Beulich 
Tested-by: Dario Faggioli 
Tested-by: Joao Martins 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -47,10 +47,14 @@ unsigned long __read_mostly cpu_khz;  /*
 DEFINE_SPINLOCK(rtc_lock);
 unsigned long pit0_ticks;
 
+struct cpu_time_stamp {
+u64 local_tsc;
+s_time_t local_stime;
+s_time_t master_stime;
+};
+
 struct cpu_time {
-u64 local_tsc_stamp;
-s_time_t stime_local_stamp;
-s_time_t stime_master_stamp;
+struct cpu_time_stamp stamp;
 struct time_scale tsc_scale;
 };
 
@@ -118,7 +122,7 @@ static inline u32 mul_frac(u32 multiplic
  * Scale a 64-bit delta by scaling and multiplying by a 32-bit fraction,
  * yielding a 64-bit result.
  */
-u64 scale_delta(u64 delta, struct time_scale *scale)
+u64 scale_delta(u64 delta, const struct time_scale *scale)
 {
 u64 product;
 
@@ -635,11 +639,11 @@ u64 stime2tsc(s_time_t stime)
 t = &this_cpu(cpu_time);
 sys_to_tsc = scale_reciprocal(t->tsc_scale);
 
-stime_delta = stime - t->stime_local_stamp;
+stime_delta = stime - t->stamp.local_stime;
 if ( stime_delta < 0 )
 stime_delta = 0;
 
-return t->local_tsc_stamp + scale_delta(stime_delta, &sys_to_tsc);
+return t->stamp.local_tsc + scale_delta(stime_delta, &sys_to_tsc);
 }
 
 void cstate_restore_tsc(void)
@@ -790,7 +794,7 @@ static unsigned long get_cmos_time(void)
 
 s_time_t get_s_time_fixed(u64 at_tsc)
 {
-struct cpu_time *t = &this_cpu(cpu_time);
+const struct cpu_time *t = &this_cpu(cpu_time);
 u64 tsc, delta;
 s_time_t now;
 
@@ -798,8 +802,8 @@ s_time_t get_s_time_fixed(u64 at_tsc)
 tsc = at_tsc;
 else
 tsc = rdtsc_ordered();
-delta = tsc - t->local_tsc_stamp;
-now = t->stime_local_stamp + scale_delta(delta, &t->tsc_scale);
+delta = tsc - t->stamp.local_tsc;
+now = t->stamp.local_stime + scale_delta(delta, &t->tsc_scale);
 
 return now;
 }
@@ -818,7 +822,7 @@ uint64_t tsc_ticks2ns(uint64_t ticks)
 
 static void __update_vcpu_system_time(struct vcpu *v, int force)
 {
-struct cpu_time   *t;
+const struct cpu_time *t;
 struct vcpu_time_info *u, _u = {};
 struct domain *d = v->domain;
 s_time_t tsc_stamp;
@@ -831,7 +835,7 @@ static void __update_vcpu_system_time(st
 
 if ( d->arch.vtsc )
 {
-s_time_t stime = t->stime_local_stamp;
+s_time_t stime = t->stamp.local_stime;
 
 if ( is_hvm_domain(d) )
 {
@@ -853,20 +857,20 @@ static void __update_vcpu_system_time(st
 {
 if ( has_hvm_container_domain(d) && hvm_tsc_scaling_supported )
 {
-tsc_stamp= hvm_scale_tsc(d, t->local_tsc_stamp);
+tsc_stamp= hvm_scale_tsc(d, t->stamp.local_tsc);
 _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
 _u.tsc_shift = d->arch.vtsc_to_ns.shift;
 }
 else
 {
-tsc_stamp= t->local_tsc_stamp;
+tsc_stamp= t->stamp.local_tsc;
 _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
 _u.tsc_shift = t->tsc_scale.shift;
 }
 }
 
 _u.tsc_timestamp = tsc_stamp;
-_u.system_time   = t->stime_local_stamp;
+_u.system_time   = t->stamp.local_stime;
 
 if ( is_hvm_domain(d) )
 _u.tsc_timestamp += v->arch.hvm_vcpu.cache_tsc_offset;
@@ -966,12 +970,12 @@ int cpu_frequency_change(u64 freq)
 
 local_irq_disable();
 /* Platform time /first/, as we may be delayed by platform_timer_lock. */
-t->stime_master_stamp = read_platform_stime();
-/* TSC-extrapolated time may be bogus after frequency change. */
-/*t->stime_local_stamp = get_s_time();*/
-t->stime_local_stamp = t->stime_master_stamp;
+t->stamp.master_stime = read_platform_stime();
 curr_tsc = rdtsc_ordered();
-t->local_tsc_stamp = curr_tsc;
+/* TSC-extrapolated time may be bogus after frequency change. */
+/*t->stamp.local_stime = get_s_time_fixed(curr_tsc);*/
+t->stamp.local_stime = t->stamp.master_stime;
+t->stamp.local_tsc = curr_tsc;
 set_time_scale(&t->tsc_scale, freq);
 local_irq_enable();
 
@@ -988,28 +992,19 @@ int cpu_frequency_change(u64 freq)
 }
 
 /* Per-CPU communication between rendezvous IRQ and softirq handler. */
-struct cpu_calibration {
-u64 local_tsc_stamp;
-s_time_t stime_local_stamp;
-s_time_t stime_master_stamp;
-};
-static DEFINE_PER_CPU(struct cpu_calibration, cpu_calibration);
+static DEFINE_PER_CPU(struct cpu_time_stamp, cpu_calibration);
 
 /* Softirq handler for per-CPU time calibration. */
 static void local_time_calibration(void)
 {
 stru

[Xen-devel] [PATCH v2 6/6] x86/time: relax barriers

2016-08-03 Thread Jan Beulich
On x86 there's no need for full barriers in loops waiting for some
memory location to change. Nor do we need full barriers between two
reads and two writes - SMP ones fully suffice (and I actually think
they could in fact be dropped, since atomic_*() operations should
already provide enough ordering).

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1206,7 +1206,7 @@ static void tsc_check_slave(void *unused
 unsigned int cpu = smp_processor_id();
 local_irq_disable();
 while ( !cpumask_test_cpu(cpu, &tsc_check_cpumask) )
-mb();
+cpu_relax();
 check_tsc_warp(cpu_khz, &tsc_max_warp);
 cpumask_clear_cpu(cpu, &tsc_check_cpumask);
 local_irq_enable();
@@ -1271,7 +1271,7 @@ static void time_calibration_tsc_rendezv
 if ( smp_processor_id() == 0 )
 {
 while ( atomic_read(&r->semaphore) != (total_cpus - 1) )
-mb();
+cpu_relax();
 
 if ( r->master_stime == 0 )
 {
@@ -1284,21 +1284,21 @@ static void time_calibration_tsc_rendezv
 write_tsc(r->master_tsc_stamp);
 
 while ( atomic_read(&r->semaphore) != (2*total_cpus - 1) )
-mb();
+cpu_relax();
 atomic_set(&r->semaphore, 0);
 }
 else
 {
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) < total_cpus )
-mb();
+cpu_relax();
 
 if ( i == 0 )
 write_tsc(r->master_tsc_stamp);
 
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) > total_cpus )
-mb();
+cpu_relax();
 }
 }
 
@@ -1316,7 +1316,7 @@ static void time_calibration_std_rendezv
 while ( atomic_read(&r->semaphore) != (total_cpus - 1) )
 cpu_relax();
 r->master_stime = read_platform_stime();
-mb(); /* write r->master_stime /then/ signal */
+smp_wmb(); /* write r->master_stime /then/ signal */
 atomic_inc(&r->semaphore);
 }
 else
@@ -1324,7 +1324,7 @@ static void time_calibration_std_rendezv
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) != total_cpus )
 cpu_relax();
-mb(); /* receive signal /then/ read r->master_stime */
+smp_rmb(); /* receive signal /then/ read r->master_stime */
 }
 
 time_calibration_rendezvous_tail(r);


x86/time: relax barriers

On x86 there's no need for full barriers in loops waiting for some
memory location to change. Nor do we need full barriers between two
reads and two writes - SMP ones fully suffice (and I actually think
they could in fact be dropped, since atomic_*() operations should
already provide enough ordering).

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1206,7 +1206,7 @@ static void tsc_check_slave(void *unused
 unsigned int cpu = smp_processor_id();
 local_irq_disable();
 while ( !cpumask_test_cpu(cpu, &tsc_check_cpumask) )
-mb();
+cpu_relax();
 check_tsc_warp(cpu_khz, &tsc_max_warp);
 cpumask_clear_cpu(cpu, &tsc_check_cpumask);
 local_irq_enable();
@@ -1271,7 +1271,7 @@ static void time_calibration_tsc_rendezv
 if ( smp_processor_id() == 0 )
 {
 while ( atomic_read(&r->semaphore) != (total_cpus - 1) )
-mb();
+cpu_relax();
 
 if ( r->master_stime == 0 )
 {
@@ -1284,21 +1284,21 @@ static void time_calibration_tsc_rendezv
 write_tsc(r->master_tsc_stamp);
 
 while ( atomic_read(&r->semaphore) != (2*total_cpus - 1) )
-mb();
+cpu_relax();
 atomic_set(&r->semaphore, 0);
 }
 else
 {
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) < total_cpus )
-mb();
+cpu_relax();
 
 if ( i == 0 )
 write_tsc(r->master_tsc_stamp);
 
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) > total_cpus )
-mb();
+cpu_relax();
 }
 }
 
@@ -1316,7 +1316,7 @@ static void time_calibration_std_rendezv
 while ( atomic_read(&r->semaphore) != (total_cpus - 1) )
 cpu_relax();
 r->master_stime = read_platform_stime();
-mb(); /* write r->master_stime /then/ signal */
+smp_wmb(); /* write r->master_stime /then/ signal */
 atomic_inc(&r->semaphore);
 }
 else
@@ -1324,7 +1324,7 @@ static void time_calibration_std_rendezv
 atomic_inc(&r->semaphore);
 while ( atomic_read(&r->semaphore) != total_cpus )
 cpu_relax();
-mb(); /* receive signal /then/ read r->master_stime */
+smp_rmb(); /* receive signal /then/ read r->master_stime */
 }
 
 time_calibration_rendezvous_tail(r);
___

[Xen-devel] [PULL 4/6] xen: when removing a backend don't remove many of them

2016-08-03 Thread Gerd Hoffmann
From: Juergen Gross 

When a Xenstore watch fires indicating a backend has to be removed
don't remove all backends for that domain with the specified device
index, but just the one which has the correct type.

The easiest way to achieve this is to use the already determined
xendev as parameter for xen_be_del_xendev() instead of only the domid
and device index.

This at once removes the open coded QTAILQ_FOREACH_SAVE() in
xen_be_del_xendev() as there is no need to search for the correct
xendev any longer.

Signed-off-by: Juergen Gross 
Reviewed-by: Stefano Stabellini 
Message-id: 1470140044-16492-2-git-send-email-jgr...@suse.com
Signed-off-by: Gerd Hoffmann 
---
 hw/xen/xen_backend.c | 58 +---
 1 file changed, 19 insertions(+), 39 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index bab79b1..3ceb778 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -321,48 +321,28 @@ static struct XenDevice *xen_be_get_xendev(const char 
*type, int dom, int dev,
 /*
  * release xen backend device.
  */
-static struct XenDevice *xen_be_del_xendev(int dom, int dev)
+static void xen_be_del_xendev(struct XenDevice *xendev)
 {
-struct XenDevice *xendev, *xnext;
-
-/*
- * This is pretty much like QTAILQ_FOREACH(xendev, &xendevs, next) but
- * we save the next pointer in xnext because we might free xendev.
- */
-xnext = xendevs.tqh_first;
-while (xnext) {
-xendev = xnext;
-xnext = xendev->next.tqe_next;
-
-if (xendev->dom != dom) {
-continue;
-}
-if (xendev->dev != dev && dev != -1) {
-continue;
-}
-
-if (xendev->ops->free) {
-xendev->ops->free(xendev);
-}
-
-if (xendev->fe) {
-char token[XEN_BUFSIZE];
-snprintf(token, sizeof(token), "fe:%p", xendev);
-xs_unwatch(xenstore, xendev->fe, token);
-g_free(xendev->fe);
-}
+if (xendev->ops->free) {
+xendev->ops->free(xendev);
+}
 
-if (xendev->evtchndev != NULL) {
-xenevtchn_close(xendev->evtchndev);
-}
-if (xendev->gnttabdev != NULL) {
-xengnttab_close(xendev->gnttabdev);
-}
+if (xendev->fe) {
+char token[XEN_BUFSIZE];
+snprintf(token, sizeof(token), "fe:%p", xendev);
+xs_unwatch(xenstore, xendev->fe, token);
+g_free(xendev->fe);
+}
 
-QTAILQ_REMOVE(&xendevs, xendev, next);
-g_free(xendev);
+if (xendev->evtchndev != NULL) {
+xenevtchn_close(xendev->evtchndev);
 }
-return NULL;
+if (xendev->gnttabdev != NULL) {
+xengnttab_close(xendev->gnttabdev);
+}
+
+QTAILQ_REMOVE(&xendevs, xendev, next);
+g_free(xendev);
 }
 
 /*
@@ -682,7 +662,7 @@ static void xenstore_update_be(char *watch, char *type, int 
dom,
 if (xendev != NULL) {
 bepath = xs_read(xenstore, 0, xendev->be, &len);
 if (bepath == NULL) {
-xen_be_del_xendev(dom, dev);
+xen_be_del_xendev(xendev);
 } else {
 free(bepath);
 xen_be_backend_changed(xendev, path);
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PULL 6/6] xen: use a common function for pv and hvm guest backend register calls

2016-08-03 Thread Gerd Hoffmann
From: Juergen Gross 

Instead of calling xen_be_register() for each supported backend type
for hvm and pv guests in their machine init functions use a common
function in order not to have to add new backends twice.

This at once fixes the error that hvm domains couldn't use the qusb
backend.

Signed-off-by: Juergen Gross 
Acked-by: Anthony PERARD 
Message-id: 1470119552-16170-1-git-send-email-jgr...@suse.com
Signed-off-by: Gerd Hoffmann 
---
 hw/xen/xen_backend.c | 10 ++
 hw/xenpv/xen_machine_pv.c|  7 +--
 include/hw/xen/xen_backend.h |  1 +
 xen-hvm.c|  4 +---
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 3ceb778..69a2388 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -780,6 +780,16 @@ int xen_be_register(const char *type, struct XenDevOps 
*ops)
 return xenstore_scan(type, xen_domid, ops);
 }
 
+void xen_be_register_common(void)
+{
+xen_be_register("console", &xen_console_ops);
+xen_be_register("vkbd", &xen_kbdmouse_ops);
+xen_be_register("qdisk", &xen_blkdev_ops);
+#ifdef CONFIG_USB_LIBUSB
+xen_be_register("qusb", &xen_usb_ops);
+#endif
+}
+
 int xen_be_bind_evtchn(struct XenDevice *xendev)
 {
 if (xendev->local_port != -1) {
diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 48f725c..79aef4e 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -67,14 +67,9 @@ static void xen_init_pv(MachineState *machine)
 break;
 }
 
-xen_be_register("console", &xen_console_ops);
-xen_be_register("vkbd", &xen_kbdmouse_ops);
+xen_be_register_common();
 xen_be_register("vfb", &xen_framebuffer_ops);
-xen_be_register("qdisk", &xen_blkdev_ops);
 xen_be_register("qnic", &xen_netdev_ops);
-#ifdef CONFIG_USB_LIBUSB
-xen_be_register("qusb", &xen_usb_ops);
-#endif
 
 /* configure framebuffer */
 if (xenfb_enabled) {
diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h
index 754c0a4..0df282a 100644
--- a/include/hw/xen/xen_backend.h
+++ b/include/hw/xen/xen_backend.h
@@ -87,6 +87,7 @@ void xen_be_check_state(struct XenDevice *xendev);
 
 /* xen backend driver bits */
 int xen_be_init(void);
+void xen_be_register_common(void);
 int xen_be_register(const char *type, struct XenDevOps *ops);
 int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state);
 int xen_be_bind_evtchn(struct XenDevice *xendev);
diff --git a/xen-hvm.c b/xen-hvm.c
index eb57792..3b0343a 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -1318,9 +1318,7 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion 
**ram_memory)
 error_report("xen backend core setup failed");
 goto err;
 }
-xen_be_register("console", &xen_console_ops);
-xen_be_register("vkbd", &xen_kbdmouse_ops);
-xen_be_register("qdisk", &xen_blkdev_ops);
+xen_be_register_common();
 xen_read_physmap(state);
 return;
 
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/6] x86/time: calibrate TSC against platform timer

2016-08-03 Thread Andrew Cooper
On 03/08/16 14:00, Jan Beulich wrote:
> ... instead of unconditionally against the PIT. This allows for local
> and master system times to remain in better sync (which matters even
> when, on any modern system, the master time is really used only during
> secondary CPU bringup, as the error between the two is in fact
> noticable in cross-CPU NOW() invocation monotonicity).
>
> This involves moving the init_platform_timer() invocation into
> early_time_init(), splitting out the few things which really need to be
> done in init_xen_time(). That in turn allows dropping the open coded
> PIT initialization from init_IRQ() (it was needed for APIC clock
> calibration, which runs between early_time_init() and init_xen_time()).
>
> In the course of this re-ordering also set the timer channel 2 gate low
> after having finished calibration. This should be benign to overall
> system operation, but appears to be the more clean state.
>
> Also do away with open coded 8254 register manipulation from 8259 code.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 99919: all pass - PUSHED

2016-08-03 Thread osstest service owner
flight 99919 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99919/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
baseline version:
 ovmf deaacda3b2740477733564066eb69d5c94b41bba

Last test of basis99915  2016-08-03 07:44:35 Z0 days
Testing same since99919  2016-08-03 10:15:14 Z0 days1 attempts


People who touched revisions under test:
  Gary Lin 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
+ branch=ovmf
+ revision=e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.7-testing
+ '[' xe80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : 
'git://cache:9419/h

Re: [Xen-devel] live migration to qemu.git fails

2016-08-03 Thread Anthony PERARD
On Wed, Aug 03, 2016 at 02:52:54PM +0200, Olaf Hering wrote:
> On Wed, Aug 03, Anthony PERARD wrote:
> 
> > Does the patch "xen: handle inbound migration of VMs without ioreq
> > server pages" from Paul fix the issue?
> > <1470042985-680-1-git-send-email-paul.durr...@citrix.com>
> > https://lists.xen.org/archives/html/xen-devel/2016-08/msg00020.html
> 
> Would that help with VMs created by staging-4.7 and its qemu-xen?
> I think that change is just for staging-4.4 and its qemu-xen.
> But I will try it anyway.

I quote your original email:

On Tue, Aug 02, 2016 at 11:58:05AM +0200, Olaf Hering wrote:
> Doing a live migration of a HVM domU from staging-4.5 with its included
> qemu-xen to staging-4.7 with qemu-xen from qemu.org (either master or
> 2.6.0) fails:

Haven't you try to create a guest with Xen 4.5 and qemu-xen-4.5, and
then migrate to Xen 4.7 with QEMU-2.6/master?

If you did a live migration with Xen 4.7 from qemu-xen-4.7 to
QEMU-master, the patch might not help.


-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 6/6] x86/time: relax barriers

2016-08-03 Thread Andrew Cooper
On 03/08/16 14:04, Jan Beulich wrote:
> On x86 there's no need for full barriers in loops waiting for some
> memory location to change. Nor do we need full barriers between two
> reads and two writes - SMP ones fully suffice (and I actually think
> they could in fact be dropped, since atomic_*() operations should
> already provide enough ordering).

Missing a SoB,

Which "ones" are you referring to?  atomic_*() is only ordered with
respect to the atomic_t used.

Overall, I think the change is correct, so Reviewed-by: Andrew Cooper


There are definitely more mis-uses of mandatory barriers in Xen,
although I haven't done a full audit yet.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] live migration to qemu.git fails

2016-08-03 Thread Olaf Hering
On Wed, Aug 03, Anthony PERARD wrote:

> Haven't you try to create a guest with Xen 4.5 and qemu-xen-4.5, and
> then migrate to Xen 4.7 with QEMU-2.6/master?

In the end I tried xen-4.5/6/7/8 as source and their qemu-xen, and
migrated to qemu#master. All failed.

Olaf


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 6/6] x86/time: relax barriers

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 15:11,  wrote:
> On 03/08/16 14:04, Jan Beulich wrote:
>> On x86 there's no need for full barriers in loops waiting for some
>> memory location to change. Nor do we need full barriers between two
>> reads and two writes - SMP ones fully suffice (and I actually think
>> they could in fact be dropped, since atomic_*() operations should
>> already provide enough ordering).
> 
> Missing a SoB,

Oops.

> Which "ones" are you referring to?  atomic_*() is only ordered with
> respect to the atomic_t used.

Oh, right - the one followed by atomic_inc() could be converted
to plain barrier() (but not dropped entirely), while the other one
follows an atomic_read() only, and hence can't be dropped. I'll
remove that part of the description.

> Overall, I think the change is correct, so Reviewed-by: Andrew Cooper
> 

Thanks.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-03 Thread Kevin.Mayer
Hi guys

I got around to take a closer look at the crash dump today.

tl;dr:
You were right, vmx_vmenter_helper is not called at all in the call stack.
The real reason behind the [] vmx_vmenter_helper+0x27e/0x30a 
should be a failed
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void 
vmx_fpu_leave(struct vcpu *v).
Long story in "Chapter1".

Concerning the stray vmx_vcpu_update_eptp:
This seems to be leftovers (either due to a corrupted stack or simply 
uninitialized local variables) of previous function calls originating in 
hvm_vcpu_destroy.
More precisely:
if ( hvm_altp2m_supported() )
altp2m_vcpu_destroy(v);
being called BEFORE free_compat_arg_xlat.
I assume some kind of error in the altp2m_vcpu_destroy-path to be responsible 
for the crash, but I have no idea where and how to start investigating.
Long story in "Chapter2".



Chapter1:
I started with a function in the callstack and followed the assembly code to 
deduce where the (XEN)[] vmx_vmenter_helper+0x27e/0x30a 
comes from:

sync_local_execstate:
0x82d080178c36 : mov%rsp,%rbp
0x82d080178c39 : callq  0x82d080178bbb 
<__sync_local_execstate>
0x82d080178c3e : pop%rbp

__sync_local_execstate:
0x82d080178c09 <__sync_local_execstate+78>:  cmp%rsi,0x7fe8(%rax)
0x82d080178c10 <__sync_local_execstate+85>:  je 0x82d080178c14 
<__sync_local_execstate+89>
0x82d080178c12 <__sync_local_execstate+87>:  ud2
0x82d080178c13 <__sync_local_execstate+88>:  or %eax,%ebp
0x82d080178c15 <__sync_local_execstate+90>:  popfq
0x82d080178c16 <__sync_local_execstate+91>:  retq   $0x
0x82d080178c19 <__sync_local_execstate+94>:  and$0x200,%ebx

Here crash / gdb seem to get confused with the je.

crash> x /3i __sync_local_execstate+89
   0x82d080178c14 <__sync_local_execstate+89>:  callq  
0x82d080174eb6 <__context_switch>
   0x82d080178c19 <__sync_local_execstate+94>:  and$0x200,%ebx
   0x82d080178c1f <__sync_local_execstate+100>: pushfq
   
It seems this code calls the __context_switch:
switch_required = (this_cpu(curr_vcpu) != current);

if ( switch_required )
{
ASSERT(current == idle_vcpu[smp_processor_id()]);
__context_switch();
}

Up to the __context_switch everything seems to be running as it should. Except 
for the stray
[830617fd7d38] vmx_vcpu_update_eptp at 82d0801f7c6b
which can be found in the "crash> bt" output but not in the "dmesg".

__context_switch:
0x82d080174f7f <__context_switch+201>:   mov%r14,%rdi
0x82d080174f82 <__context_switch+204>:   callq  0x82d08017c474 

0x82d080174f87 <__context_switch+209>:   mov%r14,%rdi
0x82d080174f8a <__context_switch+212>:   callq  *0x3a8(%r14)

Following r14 / rdi ( 0x8300e6fc ) as given in the crash dump seemingly 
leads to a vtable with a function pointer at the offset 0x3a8:
0x82d0801fa06e
crash> x /i 0x82d0801fa06e
   0x82d0801fa06e :   push   %rbp

This call, which does not show up in the backtrace, is expected at this 
position when looking at the C-code:
static void __context_switch(void)
[...]
if ( !is_idle_domain(pd) )
{
memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
vcpu_save_fpu(p);
p->arch.ctxt_switch_from(p);
[...]

as it is set in:
static int vmx_vcpu_initialise(struct vcpu *v)
[...]
v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
[...]

Finally at:

0x82d0801fa0c3 :mov$0x6c00,%edx
0x82d0801fa0c8 :vmwrite %rax,%rdx
0x82d0801fa0cb :jbe0x82d0801fd23a

The jump to [] vmx_vmenter_helper+0x27e/0x30a (ud2 following 
vmx_vmenter_helper) is done.
vmx_ctxt_switch_from is rather short in C and the called static functions are 
inlined.
static void vmx_ctxt_switch_from(struct vcpu *v)
{
/*
 * Return early if trying to do a context switch without VMX enabled,
 * this can happen when the hypervisor shuts down with HVM guests
 * still running.
 */
if ( unlikely(!this_cpu(vmxon)) )
return;

vmx_fpu_leave(v);
vmx_save_guest_msrs(v);
vmx_restore_host_msrs();
vmx_save_dr(v);
}

The unlikely path is not taken and the two ud2 (I assume the ud2 are the 
ASSERTs in vmx_fpu_leave?) are not reached either:
0x82d0801fa077 : lea 0x15c692(%rip),%rax# 
0x82d080356710 
0x82d0801fa07e :mov%rsp,%rdx
0x82d0801fa081 :and
$0x8000,%rdx
0x82d0801fa088 :mov 0x7ff0(%rdx),%rdx
0x82d0801fa08f :cmpb   $0x0,(%rdx,%rax,1)
0x82d0801fa093 :je  0x82d0801fa1d9 

0x82d0801fa099 :cmpb   $0x0,0x109(%rdi)
0x82d0801fa0a0 :je  0x82d0801fa0a4 

0x82d0801fa0a2 :ud2
0x82d0801fa0a3 :or (%rdi),%ecx
0x82d0801fa0a5 :and%al,%al
0x82d0801fa0a7 :test   $0x8,%al
0x82d0801fa0a9 :jne 0x82d0801fa0ad 

0x82d080

Re: [Xen-devel] Xen 4.7.0 boot PANIC on kernel 4.7.0-4 + UEFI ?

2016-08-03 Thread lists
On Wed, Aug 3, 2016, at 02:01 AM, Jan Beulich wrote:
> Thanks. Does the use of /mapbs really matter for booting? I was
> assuming it would be relevant only for shutdown/reboot?

It has no effect on boot. With or without the "/mapbs" it boots Xen OK.

Without the "/mapbs" the system used to crash on reboot, but still continued 
with a reboot.  Reboot with the "/mapbs" was OK.

Now, even with the "/mapbs" the system crashes on reboot.


[  196.052042] reboot: Restarting system
(XEN) [2016-08-03 13:06:54] Hardware Dom0 shutdown: rebooting machine
(XEN) [2016-08-03 13:06:54] APIC error on CPU0: 40(00)
(XEN) [2016-08-03 13:06:54] [ Xen-4.7.0_09-454  x86_64  debug=n  
Not tainted ]
(XEN) [2016-08-03 13:06:54] CPU:0
(XEN) [2016-08-03 13:06:54] RIP:e008:[<9e746340>] 
9e746340
(XEN) [2016-08-03 13:06:54] RFLAGS: 00010202   CONTEXT: 
hypervisor (d0v0)
(XEN) [2016-08-03 13:06:54] rax: 9e746340   rbx: 
   rcx: 
(XEN) [2016-08-03 13:06:54] rdx:    rsi: 
   rdi: 
(XEN) [2016-08-03 13:06:54] rbp:    rsp: 
83008ce27dc0   r8:  
(XEN) [2016-08-03 13:06:54] r9:     r10: 
   r11: 
(XEN) [2016-08-03 13:06:54] r12:    r13: 
0cf9   r14: 0065
(XEN) [2016-08-03 13:06:54] r15: 8300   cr0: 
80050033   cr4: 001526e0
(XEN) [2016-08-03 13:06:54] cr3: 00084b4e5000   cr2: 
9e746340
(XEN) [2016-08-03 13:06:54] ds:    es:    fs:    gs:    
ss: e010   cs: e008
(XEN) [2016-08-03 13:06:54] Xen code around <9e746340> 
(9e746340):
(XEN) [2016-08-03 13:06:54]  00 00 00 00 00 00 00 00 <00> 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00
(XEN) [2016-08-03 13:06:54] Xen stack trace from rsp=83008ce27dc0:
(XEN) [2016-08-03 13:06:55]9efe42f6 0cf9 
82d08022f741 efff
(XEN) [2016-08-03 13:06:55]82d0807fe000 00084b4e5000 
82d08022f94a 00085da98000
(XEN) [2016-08-03 13:06:55] 0007 
e008 0292
Initializing Adapter 0

...
XEN) [2016-08-03 13:30:44] Xen call trace:
(XEN) [2016-08-03 13:30:44][<9e7463c6>] 9e7463c6
(XEN) [2016-08-03 13:30:44][] 
i387.c#_vcpu_save_fpu+0x86/0x190
(XEN) [2016-08-03 13:30:44][] 
efi_reset_system+0x3a/0x60
(XEN) [2016-08-03 13:30:44][] 
machine_restart+0x208/0x2d0
(XEN) [2016-08-03 13:30:44][] 
hwdom_shutdown+0x7d/0xf0
(XEN) [2016-08-03 13:30:44][] 
domain_shutdown+0xf1/0x100
(XEN) [2016-08-03 13:30:44][] 
do_sched_op+0x1b2/0x460
(XEN) [2016-08-03 13:30:44][] 
lstar_enter+0x9b/0xa0
(XEN) [2016-08-03 13:30:44]
(XEN) [2016-08-03 13:30:44]
(XEN) [2016-08-03 13:30:44] 
(XEN) [2016-08-03 13:30:44] Panic on CPU 0:
(XEN) [2016-08-03 13:30:44] GENERAL PROTECTION FAULT
(XEN) [2016-08-03 13:30:44] [error_code=]
(XEN) [2016-08-03 13:30:44] 

And still reboots.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

2016-08-03 Thread Ian Jackson
Jan Beulich writes ("Re: Device model operation hypercall (DMOP, re qemu 
depriv)"):
> On 03.08.16 at 12:29,  wrote:
> > I thought it useful to set out the DMOP proposal from first
> > principles, with clear motivation, discussion of not-chosen
> > alternatives, and of course with a clear statement of the principles
> > of operation and of the security design.
> 
> Okay; nevertheless I did get the feeling that some of this was
> prompted by the hvmctl series posting.

Well, I think the HVMCTL series is an important part of this
conversation - even though this DMOP idea is motivated by the qemu
depriv proposals (initiated mostly by Stefano in the last release
cycle).

> > Does that mean that functionality exposed by all the prooposed HVMCTLs
> > is currently available to stubdoms ?
> 
> That series only moves code from one hypercall to another (new) one,
> without any security implications at all. What has been available to
> stubdoms prior to that series will be available the same way once it
> got applied.

So what you mean is that in HVMCTL, the privilege check is retained in
the individual HVMCTL sub-operation ?  (Ie what used to be IS_PRIV or
IS_PRIV_FOR - and implicitly, some sub-ops would be IS_PRIV and some
IS_PRIV_FOR.)

But looking at your v2 01/11, I see this in do_hvmctl:

   +rc = xsm_hvm_control(XSM_DM_PRIV, d, op.cmd);
   +if ( rc )
   +{
   +rcu_unlock_domain(d);
   +return rc;
   +}

And looking at v2 02/11, I didn't see any privilege check in the
specific hypercall.

With DMOP it would make sense for the privilege check to be
IS_PRIV_FOR, in the DMOP dispatcher.  But it seems that this privilege
check already exists in HVMCTL in the right form.

So AFAICT HVMCTLs already guarantee not to have an adverse impact on
the whole system.  If that weren't the case, then a stub dm could
exploit the situation.

Is the audit that is required, to check that the DMOP doesn't have an
adverse effect on the _calling domain_ ?  AFAICT most HVMCTLs/DMOPs
have /no/ effect on the calling domain, other than as part of argument
passing.  So that audit should be easy.

So I am confused.  What am I missing ?

> > If the 01/ patch contains such promises, then logically the 02/ patch
> > which introduces the first DMOP is extending that promise to that
> > operation.  It is at that point that the security decision should be
> > made.
> 
> Correct. Yet again the original goal of the series was just proper
> separation of two groups of operations that should never have
> been all thrown under the same hypercall.

What I am doing is presenting another, additional goal.  I don't
dispute the usefulness of the HVMCTL cleanup.  But I want to
understand our future intentions.

In particular, as a proponent of the DMOP idea, I want to avoid the
situation that my idea is blocked somehow by a conflicting series.

> > Now, there may be other ways to represent/record the security status.
> > But it will be necessary to either (i) avoid violating the DMOP
> > security promise, by making questionable calls not available via DMOP
> > or (ii) trying to retrofit the security promise to DMOP later.
> > 
> > I think (ii) is not a good approach.  It would amount to introducing a
> > whole new set of interfaces, and then later trying to redefine them to
> > have a particular security property which was not originally there.
> 
> I agree that (i) would be the better approach, but I don't think I
> can assess when I would find the time to do the required auditing
> of all involved code. Plus I don't see the difference between going
> the (ii) route with the presented hvmctl series vs keeping things as
> they are (under hvmop) - in both cases the security promise will
> need to be retrofit onto existing code.

If we don't apply HVMCTL, we can introduce DMOP and then individually
move hypercalls from hvmop to DMOP as they are audited.

Would a similar approach be acceptable even after HVMCTL ?

That is, the following plan:

1. Apply HVMCTL right away.  This solves the cleanup problem,
   but leaves the qemu depriv problem unsolved.

2. After the necessary discussion to understand and refine the DMOP
   design, create a new DMOP hypercall with appropriate security
   promises and whatever stability promises are agreed, but with no
   sub-ops.

3. Move each HVMCTL to DMOP, one by one, as it is audited.

4. Perhaps some HVMCTLs will remain which are inherently unsuitable
   for use with qemu depriv.  If not, perhaps eventually delete HVMCTL
   entirely, or leave it empty (with no subops).

This has the downside that we end up moving all the DMOPs twice.  But
it does allow us to separate the audit work from the cleanup/reorg.

Regards,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86/hap: use the right cache attributes when MTRR is disabled

2016-08-03 Thread Jan Beulich
>>> On 02.08.16 at 15:55,  wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2283,7 +2283,11 @@ int hvm_set_cr0(unsigned long value, bool_t may_defer)
>  if ( !nestedhvm_vmswitch_in_progress(v) && 
> nestedhvm_vcpu_in_guestmode(v) )
>  paging_update_nestedmode(v);
>  else
> +{
>  paging_update_paging_modes(v);
> +/* Force an update of the memory cache attributes. */
> +memory_type_changed(d);
> +}

Why only in the "else" case, rather than after the if/else block?

> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -814,10 +814,19 @@ int epte_get_entry_emt(struct domain *d, unsigned long 
> gfn, mfn_t mfn,
>  if ( gmtrr_mtype == -EADDRNOTAVAIL )
>  return -1;
>  
> -gmtrr_mtype = is_hvm_domain(d) && v ?
> -  get_mtrr_type(&v->arch.hvm_vcpu.mtrr,
> -gfn << PAGE_SHIFT, order) :
> -  MTRR_TYPE_WRBACK;
> +if ( !v )
> +gmtrr_mtype = MTRR_TYPE_WRBACK;
> +else if ( v->arch.hvm_vcpu.mtrr.enabled & 0x2 )
> +/* MTRR is enabled, use MTRR. */
> +gmtrr_mtype = get_mtrr_type(&v->arch.hvm_vcpu.mtrr, gfn << 
> PAGE_SHIFT,
> +order);
> +else if ( !hvm_paging_enabled(v) )
> +/* MTRR is not enabled and paging is disabled, force UC. */
> +gmtrr_mtype = MTRR_TYPE_UNCACHABLE;
> +else
> +/* MTRR is not enabled and paging is enabled, use PAT. */
> +gmtrr_mtype = MTRR_TYPE_WRBACK;

I'm afraid this needs to be UC, or else you provide non-architectural
behavior to the guest (see the SDM's description of the enable flag
being clear). Which means the original code was correct. I'm sorry
for not having checked the SDM upon looking at v1 already.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 15:24,  wrote:
> I got around to take a closer look at the crash dump today.
> 
> tl;dr:
> You were right, vmx_vmenter_helper is not called at all in the call stack.
> The real reason behind the [] 
> vmx_vmenter_helper+0x27e/0x30a 
> should be a failed
> __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void 
> vmx_fpu_leave(struct vcpu *v).

Ah - that's what you get for not using most recent code, and what
I get for not considering the effect of you being on 4.6.x. In any
event - the call stack is then fine, and you'll want to figure out
which bit(s) of the new CR0 value are in conflict with the rest of the
active VMCS.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.7.0 boot PANIC on kernel 4.7.0-4 + UEFI ?

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 15:33,  wrote:
> On Wed, Aug 3, 2016, at 02:01 AM, Jan Beulich wrote:
>> Thanks. Does the use of /mapbs really matter for booting? I was
>> assuming it would be relevant only for shutdown/reboot?
> 
> It has no effect on boot. With or without the "/mapbs" it boots Xen OK.

As expected.

> Without the "/mapbs" the system used to crash on reboot, but still continued 
> with a reboot.  Reboot with the "/mapbs" was OK.
> 
> Now, even with the "/mapbs" the system crashes on reboot.

But afaict in a different way:

>   [  196.052042] reboot: Restarting system
>   (XEN) [2016-08-03 13:06:54] Hardware Dom0 shutdown: rebooting machine
>   (XEN) [2016-08-03 13:06:54] APIC error on CPU0: 40(00)
>   (XEN) [2016-08-03 13:06:54] [ Xen-4.7.0_09-454  x86_64  debug=n  
> Not tainted ]
>   (XEN) [2016-08-03 13:06:54] CPU:0
>   (XEN) [2016-08-03 13:06:54] RIP:e008:[<9e746340>] 
> 9e746340
>[...]
>   (XEN) [2016-08-03 13:30:44] 
>   (XEN) [2016-08-03 13:30:44] Panic on CPU 0:
>   (XEN) [2016-08-03 13:30:44] GENERAL PROTECTION FAULT
>   (XEN) [2016-08-03 13:30:44] [error_code=]
>   (XEN) [2016-08-03 13:30:44] 

A #GP fault in firmware code. Not much we can do about, I'm afraid,
except for having you go with one of the mentioned workarounds
(and perhaps getting in touch with the BIOS vendor).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] hvmloader: don't hard-code IO-APIC parameters

2016-08-03 Thread Jan Beulich
The IO-APIC address has variable bits determined by the PCI-to-ISA
bridge (albeit for now we refrain from actually evaluating them, as
there's still implicit rather than explicit agreement on the IO-APIC
base address between qemu and the hypervisor), and the IO-APIC version
should be read from the IO-APIC.

Signed-off-by: Jan Beulich 
---
v2: Comment out ioapic_base_address adjustment for now (see comment
in apic_setup()).

--- a/tools/firmware/hvmloader/acpi/build.c
+++ b/tools/firmware/hvmloader/acpi/build.c
@@ -137,7 +137,7 @@ static struct acpi_20_madt *construct_ma
 io_apic->type= ACPI_IO_APIC;
 io_apic->length  = sizeof(*io_apic);
 io_apic->ioapic_id   = IOAPIC_ID;
-io_apic->ioapic_addr = IOAPIC_BASE_ADDRESS;
+io_apic->ioapic_addr = ioapic_base_address;
 
 lapic = (struct acpi_20_madt_lapic *)(io_apic + 1);
 info->nr_cpus = hvm_info->nr_vcpus;
--- a/tools/firmware/hvmloader/config.h
+++ b/tools/firmware/hvmloader/config.h
@@ -42,9 +42,10 @@ extern struct bios_config ovmf_config;
 #define PAGE_SHIFT 12
 #define PAGE_SIZE  (1ul << PAGE_SHIFT)
 
-#define IOAPIC_BASE_ADDRESS 0xfec0
+extern uint32_t ioapic_base_address;
+extern uint8_t ioapic_version;
+
 #define IOAPIC_ID   0x01
-#define IOAPIC_VERSION  0x11
 
 #define LAPIC_BASE_ADDRESS  0xfee0
 #define LAPIC_ID(vcpu_id)   ((vcpu_id) * 2)
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -108,6 +108,9 @@ asm (
 
 unsigned long scratch_start = SCRATCH_PHYSICAL_ADDRESS;
 
+uint32_t ioapic_base_address = 0xfec0;
+uint8_t ioapic_version;
+
 static void init_hypercalls(void)
 {
 uint32_t eax, ebx, ecx, edx;
@@ -185,6 +188,14 @@ static void init_vm86_tss(void)
 
 static void apic_setup(void)
 {
+/*
+ * This would the The Right Thing To Do (tm), if only qemu
+ * negotiated with Xen where the IO-APIC actually sits. Uncomment
+ * this code once that is the case.
+ioapic_base_address |= (pci_readb(PCI_ISA_DEVFN, 0x80) & 0x3f) << 10;
+ */
+ioapic_version = ioapic_read(0x01) & 0xff;
+
 /* Set the IOAPIC ID to the static value used in the MP/ACPI tables. */
 ioapic_write(0x00, IOAPIC_ID);
 
--- a/tools/firmware/hvmloader/mp_tables.c
+++ b/tools/firmware/hvmloader/mp_tables.c
@@ -227,9 +227,9 @@ static void fill_mp_ioapic_entry(struct
 {
 mpie->type = ENTRY_TYPE_IOAPIC;
 mpie->ioapic_id = IOAPIC_ID;
-mpie->ioapic_version = IOAPIC_VERSION;
+mpie->ioapic_version = ioapic_version;
 mpie->ioapic_flags = 1; /* enabled */
-mpie->ioapic_addr = IOAPIC_BASE_ADDRESS;
+mpie->ioapic_addr = ioapic_base_address;
 }
 
 
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -490,14 +490,14 @@ void *scratch_alloc(uint32_t size, uint3
 
 uint32_t ioapic_read(uint32_t reg)
 {
-*(volatile uint32_t *)(IOAPIC_BASE_ADDRESS + 0x00) = reg;
-return *(volatile uint32_t *)(IOAPIC_BASE_ADDRESS + 0x10);
+*(volatile uint32_t *)(ioapic_base_address + 0x00) = reg;
+return *(volatile uint32_t *)(ioapic_base_address + 0x10);
 }
 
 void ioapic_write(uint32_t reg, uint32_t val)
 {
-*(volatile uint32_t *)(IOAPIC_BASE_ADDRESS + 0x00) = reg;
-*(volatile uint32_t *)(IOAPIC_BASE_ADDRESS + 0x10) = val;
+*(volatile uint32_t *)(ioapic_base_address + 0x00) = reg;
+*(volatile uint32_t *)(ioapic_base_address + 0x10) = val;
 }
 
 uint32_t lapic_read(uint32_t reg)



hvmloader: don't hard-code IO-APIC parameters

The IO-APIC address has variable bits determined by the PCI-to-ISA
bridge (albeit for now we refrain from actually evaluating them, as
there's still implicit rather than explicit agreement on the IO-APIC
base address between qemu and the hypervisor), and the IO-APIC version
should be read from the IO-APIC.

Signed-off-by: Jan Beulich 
---
v2: Comment out ioapic_base_address adjustment for now (see comment
in apic_setup()).

--- a/tools/firmware/hvmloader/acpi/build.c
+++ b/tools/firmware/hvmloader/acpi/build.c
@@ -137,7 +137,7 @@ static struct acpi_20_madt *construct_ma
 io_apic->type= ACPI_IO_APIC;
 io_apic->length  = sizeof(*io_apic);
 io_apic->ioapic_id   = IOAPIC_ID;
-io_apic->ioapic_addr = IOAPIC_BASE_ADDRESS;
+io_apic->ioapic_addr = ioapic_base_address;
 
 lapic = (struct acpi_20_madt_lapic *)(io_apic + 1);
 info->nr_cpus = hvm_info->nr_vcpus;
--- a/tools/firmware/hvmloader/config.h
+++ b/tools/firmware/hvmloader/config.h
@@ -42,9 +42,10 @@ extern struct bios_config ovmf_config;
 #define PAGE_SHIFT 12
 #define PAGE_SIZE  (1ul << PAGE_SHIFT)
 
-#define IOAPIC_BASE_ADDRESS 0xfec0
+extern uint32_t ioapic_base_address;
+extern uint8_t ioapic_version;
+
 #define IOAPIC_ID   0x01
-#define IOAPIC_VERSION  0x11
 
 #define LAPIC_BASE_ADDRESS  0xfee0
 #define LAPIC_ID(vcpu_id)   ((vcpu_id) * 2)
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -108,6 +108,9 @@ asm (
 
 unsigned long 

[Xen-devel] [xen-unstable-smoke test] 99921: tolerable all pass - PUSHED

2016-08-03 Thread osstest service owner
flight 99921 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99921/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  45a348ebaba3f6e0a26455a3ff181a41722943a0
baseline version:
 xen  e9522e4932aaa7f083688b6612b5897839409260

Last test of basis99909  2016-08-02 19:03:26 Z0 days
Testing same since99921  2016-08-03 12:02:09 Z0 days1 attempts


People who touched revisions under test:
  Juergen Gross 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=45a348ebaba3f6e0a26455a3ff181a41722943a0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
45a348ebaba3f6e0a26455a3ff181a41722943a0
+ branch=xen-unstable-smoke
+ revision=45a348ebaba3f6e0a26455a3ff181a41722943a0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.7-testing
+ '[' x45a348ebaba3f6e0a26455a3ff181a41722943a0 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/ht

Re: [Xen-devel] Xen 4.7.0 boot PANIC on kernel 4.7.0-4 + UEFI ?

2016-08-03 Thread Andrew Cooper
On 03/08/16 14:57, Jan Beulich wrote:
 On 03.08.16 at 15:33,  wrote:
>> On Wed, Aug 3, 2016, at 02:01 AM, Jan Beulich wrote:
>>> Thanks. Does the use of /mapbs really matter for booting? I was
>>> assuming it would be relevant only for shutdown/reboot?
>> It has no effect on boot. With or without the "/mapbs" it boots Xen OK.
> As expected.
>
>> Without the "/mapbs" the system used to crash on reboot, but still continued 
>> with a reboot.  Reboot with the "/mapbs" was OK.
>>
>> Now, even with the "/mapbs" the system crashes on reboot.
> But afaict in a different way:
>
>>  [  196.052042] reboot: Restarting system
>>  (XEN) [2016-08-03 13:06:54] Hardware Dom0 shutdown: rebooting machine
>>  (XEN) [2016-08-03 13:06:54] APIC error on CPU0: 40(00)
>>  (XEN) [2016-08-03 13:06:54] [ Xen-4.7.0_09-454  x86_64  debug=n  
>> Not tainted ]
>>  (XEN) [2016-08-03 13:06:54] CPU:0
>>  (XEN) [2016-08-03 13:06:54] RIP:e008:[<9e746340>] 
>> 9e746340
>> [...]
>>  (XEN) [2016-08-03 13:30:44] 
>>  (XEN) [2016-08-03 13:30:44] Panic on CPU 0:
>>  (XEN) [2016-08-03 13:30:44] GENERAL PROTECTION FAULT
>>  (XEN) [2016-08-03 13:30:44] [error_code=]
>>  (XEN) [2016-08-03 13:30:44] 
> A #GP fault in firmware code. Not much we can do about, I'm afraid,
> except for having you go with one of the mentioned workarounds
> (and perhaps getting in touch with the BIOS vendor).

One bit you snipped was

(XEN) [2016-08-03 13:06:54] Xen code around <9e746340>
(9e746340):
(XEN) [2016-08-03 13:06:54]  00 00 00 00 00 00 00 00 <00> 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00
(XEN) [2016-08-03 13:06:54] Xen stack trace from rsp=83008ce27dc0:

which shows that the EFI firmware has ended up in a block of zeroes. 
This clearly isn't conducive to its overall health.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] hvmloader: don't hard-code IO-APIC parameters

2016-08-03 Thread Andrew Cooper
On 03/08/16 14:58, Jan Beulich wrote:
> @@ -185,6 +188,14 @@ static void init_vm86_tss(void)
>  
>  static void apic_setup(void)
>  {
> +/*
> + * This would the The Right Thing To Do (tm), if only qemu
> + * negotiated with Xen where the IO-APIC actually sits. Uncomment
> + * this code once that is the case.

I would possibly extend the text with "This is currently hard coded in
Xen and can't be controlled externally."

Either way, Reviewed-by: Andrew Cooper 

> +ioapic_base_address |= (pci_readb(PCI_ISA_DEVFN, 0x80) & 0x3f) << 10;
> + */
> +ioapic_version = ioapic_read(0x01) & 0xff;
> +
>  /* Set the IOAPIC ID to the static value used in the MP/ACPI tables. */
>  ioapic_write(0x00, IOAPIC_ID);
>  

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Julien Grall

Hello Sergej,

Please try to reply to all when answering on the ML. Otherwise the 
answer may be delayed/lost.


On 03/08/16 13:45, Sergej Proskurin wrote:

The interesting part about #VE is that it allows to handle certain
violations (currently limited to EPT violations -- future
implementations might introduce also further violations) inside of the
guest, without the need to explicitly trap into the VMM. Thus, #VE allow
switching of different memory views in-guest. Because of this, I also
agree that event channels would suffice in our case, since we do not
have sufficient hardware support on ARM and would need to trap into the
VMM anyway.


The cost of doing an hypercall on ARM is very small compare to x86 (~1/3 
of the number of x86 cycles) because we don't have to save all the state 
every time. So I am not convinced by the argument of limiting the number 
of trap to the hypervisor and allow a guest to play with altp2m on ARM.


I will have to see a concrete example before going forward with the 
event channel.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/25] arm/altp2m: Introducing altp2m to ARM.

2016-08-03 Thread Sergej Proskurin
Hi Julien,


On 08/03/2016 04:08 PM, Julien Grall wrote:
> Hello Sergej,
>
> Please try to reply to all when answering on the ML. Otherwise the
> answer may be delayed/lost.
>

Right, sorry.

> On 03/08/16 13:45, Sergej Proskurin wrote:
>> The interesting part about #VE is that it allows to handle certain
>> violations (currently limited to EPT violations -- future
>> implementations might introduce also further violations) inside of the
>> guest, without the need to explicitly trap into the VMM. Thus, #VE allow
>> switching of different memory views in-guest. Because of this, I also
>> agree that event channels would suffice in our case, since we do not
>> have sufficient hardware support on ARM and would need to trap into the
>> VMM anyway.
>
> The cost of doing an hypercall on ARM is very small compare to x86
> (~1/3 of the number of x86 cycles) because we don't have to save all
> the state every time. So I am not convinced by the argument of
> limiting the number of trap to the hypervisor and allow a guest to
> play with altp2m on ARM.
>

I am not saying that we will be able to handle traps inside of the guest
-- simply because we don't have sufficient hardware support on ARM. What
I am trying to say is that an emulation of #VE is not required at this
point but could be implemented in the future by means of event channels,
if required.

Best regards,
~Sergej


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

2016-08-03 Thread George Dunlap
On 03/08/16 15:16, Jan Beulich wrote:
 On 03.08.16 at 15:37,  wrote:
>> Jan Beulich writes ("Re: Device model operation hypercall (DMOP, re qemu 
>> depriv)"):
>>> On 03.08.16 at 12:29,  wrote:
 Does that mean that functionality exposed by all the prooposed HVMCTLs
 is currently available to stubdoms ?
>>>
>>> That series only moves code from one hypercall to another (new) one,
>>> without any security implications at all. What has been available to
>>> stubdoms prior to that series will be available the same way once it
>>> got applied.
>>
>> So what you mean is that in HVMCTL, the privilege check is retained in
>> the individual HVMCTL sub-operation ?  (Ie what used to be IS_PRIV or
>> IS_PRIV_FOR - and implicitly, some sub-ops would be IS_PRIV and some
>> IS_PRIV_FOR.)
>>
>> But looking at your v2 01/11, I see this in do_hvmctl:
>>
>>+rc = xsm_hvm_control(XSM_DM_PRIV, d, op.cmd);
>>+if ( rc )
>>+{
>>+rcu_unlock_domain(d);
>>+return rc;
>>+}
>>
>> And looking at v2 02/11, I didn't see any privilege check in the
>> specific hypercall.
> 
> Of course - there were no explicit IS_PRIV{,_FOR}() uses before
> (i.e. the patch also doesn't remove any), these all sit behind the
> XSM hook. And no, there are no per-sub-op privilege checks,
> they're being consolidated to just the one you quote above.
> 
>> With DMOP it would make sense for the privilege check to be
>> IS_PRIV_FOR, in the DMOP dispatcher.  But it seems that this privilege
>> check already exists in HVMCTL in the right form.
>>
>> So AFAICT HVMCTLs already guarantee not to have an adverse impact on
>> the whole system.  If that weren't the case, then a stub dm could
>> exploit the situation.
> 
> And to tell you the truth, I'm not entirely convinced that all the
> code implementing those operations (be it in there current hvmop
> form or the new hvmctl one - again, the series only moves code
> around) is really safe in this regard. But with there even being
> at least one domctl not on xsm-flask.txt's safe-for-disaggregation
> list, but reportedly used by qemu (I don't recall right now which
> exact one it is), stubdom-s can't be considered fully secure right
> now anyway.
> 
>> Is the audit that is required, to check that the DMOP doesn't have an
>> adverse effect on the _calling domain_ ?  AFAICT most HVMCTLs/DMOPs
>> have /no/ effect on the calling domain, other than as part of argument
>> passing.  So that audit should be easy.
>>
>> So I am confused.  What am I missing ?
> 
> The main adverse effect on the whole system that I can see
> would be long latency operations, but I can't exclude others: Just
> look at the final patch of the series, which fixes a serialization bug
> which I noticed while doing the code movement. I don't think lack
> of proper serialization is guaranteed to only affect the target
> domain.
> 
 Now, there may be other ways to represent/record the security status.
 But it will be necessary to either (i) avoid violating the DMOP
 security promise, by making questionable calls not available via DMOP
 or (ii) trying to retrofit the security promise to DMOP later.

 I think (ii) is not a good approach.  It would amount to introducing a
 whole new set of interfaces, and then later trying to redefine them to
 have a particular security property which was not originally there.
>>>
>>> I agree that (i) would be the better approach, but I don't think I
>>> can assess when I would find the time to do the required auditing
>>> of all involved code. Plus I don't see the difference between going
>>> the (ii) route with the presented hvmctl series vs keeping things as
>>> they are (under hvmop) - in both cases the security promise will
>>> need to be retrofit onto existing code.
>>
>> If we don't apply HVMCTL, we can introduce DMOP and then individually
>> move hypercalls from hvmop to DMOP as they are audited.
>>
>> Would a similar approach be acceptable even after HVMCTL ?
>>
>> That is, the following plan:
>>
>> 1. Apply HVMCTL right away.  This solves the cleanup problem,
>>but leaves the qemu depriv problem unsolved.
>>
>> 2. After the necessary discussion to understand and refine the DMOP
>>design, create a new DMOP hypercall with appropriate security
>>promises and whatever stability promises are agreed, but with no
>>sub-ops.
>>
>> 3. Move each HVMCTL to DMOP, one by one, as it is audited.
>>
>> 4. Perhaps some HVMCTLs will remain which are inherently unsuitable
>>for use with qemu depriv.  If not, perhaps eventually delete HVMCTL
>>entirely, or leave it empty (with no subops).
>>
>> This has the downside that we end up moving all the DMOPs twice.  But
>> it does allow us to separate the audit work from the cleanup/reorg.
> 
> Well, moving things twice doesn't sound attractive. I was rather
> thinking of declaring hvmctl (or dmop, if that's to intended name)
> sub-ops secure one by one as they get audited.

Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 15:37,  wrote:
> Jan Beulich writes ("Re: Device model operation hypercall (DMOP, re qemu 
> depriv)"):
>> On 03.08.16 at 12:29,  wrote:
>> > Does that mean that functionality exposed by all the prooposed HVMCTLs
>> > is currently available to stubdoms ?
>> 
>> That series only moves code from one hypercall to another (new) one,
>> without any security implications at all. What has been available to
>> stubdoms prior to that series will be available the same way once it
>> got applied.
> 
> So what you mean is that in HVMCTL, the privilege check is retained in
> the individual HVMCTL sub-operation ?  (Ie what used to be IS_PRIV or
> IS_PRIV_FOR - and implicitly, some sub-ops would be IS_PRIV and some
> IS_PRIV_FOR.)
> 
> But looking at your v2 01/11, I see this in do_hvmctl:
> 
>+rc = xsm_hvm_control(XSM_DM_PRIV, d, op.cmd);
>+if ( rc )
>+{
>+rcu_unlock_domain(d);
>+return rc;
>+}
> 
> And looking at v2 02/11, I didn't see any privilege check in the
> specific hypercall.

Of course - there were no explicit IS_PRIV{,_FOR}() uses before
(i.e. the patch also doesn't remove any), these all sit behind the
XSM hook. And no, there are no per-sub-op privilege checks,
they're being consolidated to just the one you quote above.

> With DMOP it would make sense for the privilege check to be
> IS_PRIV_FOR, in the DMOP dispatcher.  But it seems that this privilege
> check already exists in HVMCTL in the right form.
> 
> So AFAICT HVMCTLs already guarantee not to have an adverse impact on
> the whole system.  If that weren't the case, then a stub dm could
> exploit the situation.

And to tell you the truth, I'm not entirely convinced that all the
code implementing those operations (be it in there current hvmop
form or the new hvmctl one - again, the series only moves code
around) is really safe in this regard. But with there even being
at least one domctl not on xsm-flask.txt's safe-for-disaggregation
list, but reportedly used by qemu (I don't recall right now which
exact one it is), stubdom-s can't be considered fully secure right
now anyway.

> Is the audit that is required, to check that the DMOP doesn't have an
> adverse effect on the _calling domain_ ?  AFAICT most HVMCTLs/DMOPs
> have /no/ effect on the calling domain, other than as part of argument
> passing.  So that audit should be easy.
> 
> So I am confused.  What am I missing ?

The main adverse effect on the whole system that I can see
would be long latency operations, but I can't exclude others: Just
look at the final patch of the series, which fixes a serialization bug
which I noticed while doing the code movement. I don't think lack
of proper serialization is guaranteed to only affect the target
domain.

>> > Now, there may be other ways to represent/record the security status.
>> > But it will be necessary to either (i) avoid violating the DMOP
>> > security promise, by making questionable calls not available via DMOP
>> > or (ii) trying to retrofit the security promise to DMOP later.
>> > 
>> > I think (ii) is not a good approach.  It would amount to introducing a
>> > whole new set of interfaces, and then later trying to redefine them to
>> > have a particular security property which was not originally there.
>> 
>> I agree that (i) would be the better approach, but I don't think I
>> can assess when I would find the time to do the required auditing
>> of all involved code. Plus I don't see the difference between going
>> the (ii) route with the presented hvmctl series vs keeping things as
>> they are (under hvmop) - in both cases the security promise will
>> need to be retrofit onto existing code.
> 
> If we don't apply HVMCTL, we can introduce DMOP and then individually
> move hypercalls from hvmop to DMOP as they are audited.
> 
> Would a similar approach be acceptable even after HVMCTL ?
> 
> That is, the following plan:
> 
> 1. Apply HVMCTL right away.  This solves the cleanup problem,
>but leaves the qemu depriv problem unsolved.
> 
> 2. After the necessary discussion to understand and refine the DMOP
>design, create a new DMOP hypercall with appropriate security
>promises and whatever stability promises are agreed, but with no
>sub-ops.
> 
> 3. Move each HVMCTL to DMOP, one by one, as it is audited.
> 
> 4. Perhaps some HVMCTLs will remain which are inherently unsuitable
>for use with qemu depriv.  If not, perhaps eventually delete HVMCTL
>entirely, or leave it empty (with no subops).
> 
> This has the downside that we end up moving all the DMOPs twice.  But
> it does allow us to separate the audit work from the cleanup/reorg.

Well, moving things twice doesn't sound attractive. I was rather
thinking of declaring hvmctl (or dmop, if that's to intended name)
sub-ops secure one by one as they get audited. Part of my reason
to be hesitant to do such an audit myself (apart from the time
constraint) is me remembering my failure to do so properly as a
f

Re: [Xen-devel] [PATCH v4 1/2] Interface for grant copy operation in libs.

2016-08-03 Thread David Vrabel
On 02/08/16 15:06, Paulina Szubarczyk wrote:
> 
> +/**
> + * Copy memory from or to grant references. The information of each 
> operations
> + * are contained in 'xengnttab_grant_copy_segment_t'. The @flag value 
> indicate
> + * the direction of an operation (GNTCOPY_source_gref\GNTCOPY_dest_gref).
> + *
> + * The sum of fields @offset[i] and @len[i] of 
> 'xengnttab_grant_copy_segment_t'
> + * should not exceed XEN_PAGE_SIZE

"For each segment, @virt may cross a page boundary but @offset + @len
must be less than XEN_PAGE_SIZE."  might be better.

With or without the above change:

Reviewed-by: David Vrabel 

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] mem_access: sanitize code around sending vm_event request

2016-08-03 Thread George Dunlap
On 01/08/16 17:52, Tamas K Lengyel wrote:
> The two functions monitor_traps and mem_access_send_req duplicate some of the
> same functionality. The mem_access_send_req however leaves a lot of the
> standard vm_event fields to be filled by other functions.
> 
> Remove mem_access_send_req() completely, making use of monitor_traps() to put
> requests into the monitor ring.  This in turn causes some cleanup around the
> old callsites of mem_access_send_req(), and on ARM, the introduction of the
> __p2m_mem_access_send_req() helper to fill in common mem_access information.
> We also update monitor_traps to now include setting the common vcpu_id field
> so that all other call-sites can ommit this step.
> 
> Finally, this change identifies that errors from mem_access_send_req() were
> never checked.  As errors constitute a problem with the monitor ring,
> crashing the domain is the most appropriate action to take.
> 
> Signed-off-by: Tamas K Lengyel 
> Reviewed-by: Andrew Cooper 

This appears to be v3, not v2?

> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 812dbf6..27f9d26 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1728,13 +1728,8 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long 
> gla,
>  if ( req )
>  {
>  *req_ptr = req;
> -req->reason = VM_EVENT_REASON_MEM_ACCESS;
> -
> -/* Pause the current VCPU */
> -if ( p2ma != p2m_access_n2rwx )
> -req->flags |= VM_EVENT_FLAG_VCPU_PAUSED;
>  
> -/* Send request to mem event */
> +req->reason = VM_EVENT_REASON_MEM_ACCESS;
>  req->u.mem_access.gfn = gfn;
>  req->u.mem_access.offset = gpa & ((1 << PAGE_SHIFT) - 1);
>  if ( npfec.gla_valid )
> @@ -1750,23 +1745,10 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned 
> long gla,
>  req->u.mem_access.flags |= npfec.read_access? MEM_ACCESS_R : 0;
>  req->u.mem_access.flags |= npfec.write_access   ? MEM_ACCESS_W : 0;
>  req->u.mem_access.flags |= npfec.insn_fetch ? MEM_ACCESS_X : 0;
> -req->vcpu_id = v->vcpu_id;
> -
> -vm_event_fill_regs(req);
> -
> -if ( altp2m_active(v->domain) )
> -{
> -req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
> -req->altp2m_idx = vcpu_altp2m(v).p2midx;
> -}
>  }
>  
> -/* Pause the current VCPU */
> -if ( p2ma != p2m_access_n2rwx )
> -vm_event_vcpu_pause(v);
> -
> -/* VCPU may be paused, return whether we promoted automatically */
> -return (p2ma == p2m_access_n2rwx);
> +/* Return whether vCPU pause is required (aka. sync event) */
> +return (p2ma != p2m_access_n2rwx);
>  }
>  
>  static inline

p2m-bits:

Acked-by: George Dunlap 

But I agree with Julien -- this patch has several independent changes
which makes it quite difficult to tell what's going on.  I'm sure it's
taken the two of us a lot more time together to figure out what is and
is not happening than it would have for you to break it down into
several little chunks.

If you're not already familiar with it, I would recommend looking into
stackgit.  My modus operandi for things like this is to get things
working in one big patch, then pop it off the stack and apply bits of it
at a time to make a series.

It's not only more considerate of your reviewers, but it's also a
helpful exercise for yourself.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.7.0 boot PANIC on kernel 4.7.0-4 + UEFI ?

2016-08-03 Thread lists
> > A #GP fault in firmware code. Not much we can do about, I'm afraid,
> > except for having you go with one of the mentioned workarounds

I tried

+ efi=no-rs
+ efi=attr=uc

on the Xen cmd line.

With efi=attr=uc, crashes on reboot with or without /mapbs

With efi=no-rs, reboots OK with or without /mapbs

So *today's* simplest working combination seems to be 

- /mapbs, + efi=no-rs

Two weeks  or so ago, it was just

+ /mapbs

> > (and perhaps getting in touch with the BIOS vendor).

The BIOS vendor refuses to talk to end users.  They deflect to the Motherboard 
vendor.

The motherboard vendor is Supermicro.  We have four different servers running 
on their motherboards.

Supermicro are not interested in a small user's problems.  Their responses have 
included

- we don't support linux
- we don't support Xen
- use Microsoft WIndows
- we don't write the BIOS. there's nothing we can do.

and finally just ignoring our emails.

At least Supermicro, being a 'server grade' mobo provider, has tech support you 
can get to that seem aware of linux & Xen.

The 'consumer grade' providers were just completely hopeless the moment you 
mention linux, xen, and or server.

And to be honest, what exactly would *I* tell them?  "It doesn't work"? It's 
not like I have any real idea what's broken in detail or how to fix it.

Unless devs that know what they're talking about, and are from a big vendor or 
project with visible presence or clout, get in touch with them nothing will 
change. 

> One bit you snipped was
> 
> (XEN) [2016-08-03 13:06:54] Xen code around <9e746340>
> (9e746340):
> (XEN) [2016-08-03 13:06:54]  00 00 00 00 00 00 00 00 <00> 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00
> (XEN) [2016-08-03 13:06:54] Xen stack trace from rsp=83008ce27dc0:
> 
> which shows that the EFI firmware has ended up in a block of zeroes. 
> This clearly isn't conducive to its overall health.

What options to the Xen or EFI command line can fix that?
Or is that a Xen or kernel code fix?
Or again a "BIOS issue"?

Fwiw 3 of those 4 servers are now being migrated to other Hypervisor/Container 
platforms.

So far KVM, FreeBSD and Microsoft are all booting & rebooting on UEFI hardware 
OK. Or at least not seeing crashes.  Haven't gotten much further in testing 
than that yet.

I'm not clear why 'BIOS' problems are only showing up when we use Xen, and only 
since 'recent' upgrades (it was working a few weeks ago), and now need (at 
least) efi=no-rs when they didn't before.

If I tell Supermicro that their motherboard works on these other platforms, 
particularly Microsoft, but not with Xen, I'm pretty sure I know exactly what 
they'll say.  And as an enduser I don't have the detailed knowledge to argue 
with them.  Let alone get them to do anything about it.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/9] xen/multicall: Rework arch multicall handling

2016-08-03 Thread Jan Beulich
>>> On 18.07.16 at 11:51,  wrote:
> --- a/xen/arch/x86/hypercall.c
> +++ b/xen/arch/x86/hypercall.c
> @@ -338,6 +338,34 @@ long pv_hypercall(struct cpu_user_regs *regs)
>  return ret;
>  }
>  
> +void arch_do_multicall_call(struct mc_state *state)
> +{
> +if ( !is_pv_32bit_vcpu(current) )
> +{
> +struct multicall_entry *call = &state->call;
> +
> +if ( (call->op < NR_hypercalls) && hypercall_table[call->op] )
> +call->result = hypercall_table[call->op](
> +call->args[0], call->args[1], call->args[2],
> +call->args[3], call->args[4], call->args[5]);
> +else
> +call->result = -ENOSYS;
> +}
> +#ifdef CONFIG_COMPAT
> +else
> +{
> +struct compat_multicall_entry *call = &state->compat_call;
> +
> +if ( (call->op < NR_hypercalls) && compat_hypercall_table[call->op] )

Why two distinct checks here when pv_hypercall() does just one
outside the if/else? With them folded (or if there is a good reason),
Reviewed-by: Jan Beulich 
with one more remark:

> --- a/xen/common/multicall.c
> +++ b/xen/common/multicall.c
> @@ -63,7 +63,7 @@ do_multicall(
>  
>  trace_multicall_call(&mcs->call);
>  
> -do_multicall_call(&mcs->call);
> +arch_do_multicall_call(mcs);

I think do_multicall_call() as a name was fine, but otoh I also don't
really mind the name change.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/9] x86/pv: Merge the pv hypercall tables

2016-08-03 Thread Jan Beulich
>>> On 18.07.16 at 11:51,  wrote:
> For the same reason as c/s 33a231e3f "x86/HVM: fold hypercall tables", this
> removes the risk of accidentally updating only one of the tables.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

But having come here I still can't see why this can't be folded with
patch 5 without also folding in patch 6. Anyway - as long as they're
going to get committed without too big of a time gap in between,
the final result is what matters most.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 01/12] x86/paging: introduce paging_set_allocation

2016-08-03 Thread George Dunlap
On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich  wrote:
 On 02.08.16 at 17:49,  wrote:
>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
>>> > As this is for the construction of dom0, it would be better to take a
>>> > preemptible pointer, loop in construct_dom0(), with a
>>> > process_pending_softirqs() call in between.
>>>
>>> Now fixed.
>>
>> Hm, I have to stand corrected, using hypercall_preempt_check (as
>> any of the *_set_allocation function use), is not safe at this point:
>>
>> (XEN) [ Xen-4.8-unstable  x86_64  debug=y  Tainted:C  ]
>> (XEN) CPU:0
>> (XEN) RIP:e008:[] 
>> hap.c#local_events_need_delivery+0x27/0x40
>> (XEN) RFLAGS: 00010246   CONTEXT: hypervisor
>> (XEN) rax:    rbx: 83023f5a5000   rcx: 82d080312900
>> (XEN) rdx: 0001   rsi: 83023f5a56c8   rdi: 8300b213d000
>> (XEN) rbp: 82d080307cc8   rsp: 82d080307cc8   r8:  0180
>> (XEN) r9:     r10: 00247000   r11: 82d08029a5b0
>> (XEN) r12: 0011   r13: 23ac   r14: 82d080307d4c
>> (XEN) r15: 83023f5a56c8   cr0: 8005003b   cr4: 001526e0
>> (XEN) cr3: b20fc000   cr2: 
>> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
>> (XEN) Xen code around  
>> (hap.c#local_events_need_delivery+0x27/0x40):
>> (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 
>> 31 c0
>> (XEN) Xen stack trace from rsp=82d080307cc8:
>> (XEN)82d080307d08 82d08022fc47  83023f5a5000
>> (XEN)83023f5a5648  82d080307d4c 2400
>> (XEN)82d080307d38 82d08020008c 000d 8300b1efd000
>> (XEN)83023f5a5000 82d080307d4c 82d080307d78 82d0802cad30
>> (XEN)00203000 83023f5a5000 82d0802bf860 
>> (XEN)0001 8308bef0 82d080307de8 82d0802c91e0
>> (XEN)82d080307de8 82d080143900 82d080307de8 
>> (XEN)8308bf00 82d0802eb480 82d080307dc4 82d08028b1cd
>> (XEN)8308bf00  0001 83023f5a5000
>> (XEN)82d080307f08 82d0802bf0c9  
>> (XEN) 82d080307f18 8308bee0 0001
>> (XEN)0001 0001  0010
>> (XEN)0001 00247000 8308bef4 0010
>> (XEN)8301 00247001 0014 0001
>> (XEN)8300ffec 8308bef0 82d0802e0640 8308bfb0
>> (XEN)  0111 0008
>> (XEN)0001006e 0003 02f8 
>> (XEN)ad5c0bd0  0001 0008
>> (XEN) 82d080100073  
>> (XEN)   
>> (XEN) Xen call trace:
>> (XEN)[] hap.c#local_events_need_delivery+0x27/0x40
>> (XEN)[] hap_set_allocation+0x107/0x130
>> (XEN)[] paging_set_allocation+0x4c/0x80
>> (XEN)[] domain_build.c#hvm_setup_p2m+0x70/0x1a0
>> (XEN)[] domain_build.c#construct_dom0_hvm+0x60/0x120
>> (XEN)[] __start_xen+0x1ea9/0x23a0
>> (XEN)[] __high_start+0x53/0x60
>> (XEN)
>> (XEN) Pagetable walk from :
>
> Sadly you don't make clear what pointer it is that is NULL at that point.

It sounds from what he says in the following paragraph like current is NULL.

>> I've tried setting current to d->vcpu[0], but that just makes the call to
>> hypercall_preempt_check crash in some scheduler assert. In any case, I've
>> added the preempt parameter to the paging_set_allocation function, but I
>> don't plan to use it in the domain builder for the time being. Does that
>> sound right?
>
> Not really, new huge latency issues like this shouldn't be reintroduced;
> we've been fighting hard to get rid of those (and we still occasionally
> find some no-one had noticed before).

You mean latency in processing softirqs?

Maybe what we need to do is to make local_events_need_delivery() safe
to call at this point by having it return 0 if current is NULL rather
than crashing?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/9] xen/multicall: Rework arch multicall handling

2016-08-03 Thread Andrew Cooper
On 03/08/16 16:02, Jan Beulich wrote:
 On 18.07.16 at 11:51,  wrote:
>> --- a/xen/arch/x86/hypercall.c
>> +++ b/xen/arch/x86/hypercall.c
>> @@ -338,6 +338,34 @@ long pv_hypercall(struct cpu_user_regs *regs)
>>  return ret;
>>  }
>>  
>> +void arch_do_multicall_call(struct mc_state *state)
>> +{
>> +if ( !is_pv_32bit_vcpu(current) )
>> +{
>> +struct multicall_entry *call = &state->call;
>> +
>> +if ( (call->op < NR_hypercalls) && hypercall_table[call->op] )
>> +call->result = hypercall_table[call->op](
>> +call->args[0], call->args[1], call->args[2],
>> +call->args[3], call->args[4], call->args[5]);
>> +else
>> +call->result = -ENOSYS;
>> +}
>> +#ifdef CONFIG_COMPAT
>> +else
>> +{
>> +struct compat_multicall_entry *call = &state->compat_call;
>> +
>> +if ( (call->op < NR_hypercalls) && compat_hypercall_table[call->op] 
>> )
> Why two distinct checks here when pv_hypercall() does just one
> outside the if/else? With them folded (or if there is a good reason),
> Reviewed-by: Jan Beulich 
> with one more remark:

multicall_entry and compat_multicall_entry are different.  call->op
lives at the same point in the union, but the field is a different width.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 8/9] x86/hypercall: Merge the hypercall arg tables

2016-08-03 Thread Jan Beulich
>>> On 18.07.16 at 11:51,  wrote:
> For the same reason as c/s 33a231e3f "x86/HVM: fold hypercall tables" and
> (TODO - changeset) "x86/pv: Merge the pv hypercall tables", this removes the
> risk of accidentally updating only one of the tables.

Based on this argument perhaps hypercall and args tables should
get folded too, but I guess that's a work item for another day.

> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 8/9] x86/hypercall: Merge the hypercall arg tables

2016-08-03 Thread Andrew Cooper
On 03/08/16 16:12, Jan Beulich wrote:
 On 18.07.16 at 11:51,  wrote:
>> For the same reason as c/s 33a231e3f "x86/HVM: fold hypercall tables" and
>> (TODO - changeset) "x86/pv: Merge the pv hypercall tables", this removes the
>> risk of accidentally updating only one of the tables.
> Based on this argument perhaps hypercall and args tables should
> get folded too, but I guess that's a work item for another day.

That is rather harder to do.  I thought about it, but couldn't find a
neat way of doing it.

The call table is an array of pointers, while the args table is an array
of bytes.  Merging them would result in excessive padding for alignment
purposes, unless it was packed, at which point we are calling
non-aligned function pointers, and taking that associated performance hit.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 9/9] x86/hypercall: Reduce the size of the hypercall tables

2016-08-03 Thread Jan Beulich
>>> On 18.07.16 at 11:51,  wrote:
> The highest populated entry in each hypercall table is currently at index 49.
> There is no need to extend both to tables to 64 entries.
> 
> Range check eax against the hypercall table array size, and use a
> BUILD_BUG_ON() to ensure that the hypercall tables don't grow larger than the
> args table.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] mem_access: sanitize code around sending vm_event request

2016-08-03 Thread Tamas K Lengyel
On Wed, Aug 3, 2016 at 8:41 AM, George Dunlap  wrote:
> On 01/08/16 17:52, Tamas K Lengyel wrote:
>> The two functions monitor_traps and mem_access_send_req duplicate some of the
>> same functionality. The mem_access_send_req however leaves a lot of the
>> standard vm_event fields to be filled by other functions.
>>
>> Remove mem_access_send_req() completely, making use of monitor_traps() to put
>> requests into the monitor ring.  This in turn causes some cleanup around the
>> old callsites of mem_access_send_req(), and on ARM, the introduction of the
>> __p2m_mem_access_send_req() helper to fill in common mem_access information.
>> We also update monitor_traps to now include setting the common vcpu_id field
>> so that all other call-sites can ommit this step.
>>
>> Finally, this change identifies that errors from mem_access_send_req() were
>> never checked.  As errors constitute a problem with the monitor ring,
>> crashing the domain is the most appropriate action to take.
>>
>> Signed-off-by: Tamas K Lengyel 
>> Reviewed-by: Andrew Cooper 
>
> This appears to be v3, not v2?

No, it's still just v2.

>
>> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
>> index 812dbf6..27f9d26 100644
>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -1728,13 +1728,8 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned 
>> long gla,
>>  if ( req )
>>  {
>>  *req_ptr = req;
>> -req->reason = VM_EVENT_REASON_MEM_ACCESS;
>> -
>> -/* Pause the current VCPU */
>> -if ( p2ma != p2m_access_n2rwx )
>> -req->flags |= VM_EVENT_FLAG_VCPU_PAUSED;
>>
>> -/* Send request to mem event */
>> +req->reason = VM_EVENT_REASON_MEM_ACCESS;
>>  req->u.mem_access.gfn = gfn;
>>  req->u.mem_access.offset = gpa & ((1 << PAGE_SHIFT) - 1);
>>  if ( npfec.gla_valid )
>> @@ -1750,23 +1745,10 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned 
>> long gla,
>>  req->u.mem_access.flags |= npfec.read_access? MEM_ACCESS_R : 0;
>>  req->u.mem_access.flags |= npfec.write_access   ? MEM_ACCESS_W : 0;
>>  req->u.mem_access.flags |= npfec.insn_fetch ? MEM_ACCESS_X : 0;
>> -req->vcpu_id = v->vcpu_id;
>> -
>> -vm_event_fill_regs(req);
>> -
>> -if ( altp2m_active(v->domain) )
>> -{
>> -req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
>> -req->altp2m_idx = vcpu_altp2m(v).p2midx;
>> -}
>>  }
>>
>> -/* Pause the current VCPU */
>> -if ( p2ma != p2m_access_n2rwx )
>> -vm_event_vcpu_pause(v);
>> -
>> -/* VCPU may be paused, return whether we promoted automatically */
>> -return (p2ma == p2m_access_n2rwx);
>> +/* Return whether vCPU pause is required (aka. sync event) */
>> +return (p2ma != p2m_access_n2rwx);
>>  }
>>
>>  static inline
>
> p2m-bits:
>
> Acked-by: George Dunlap 
>
> But I agree with Julien -- this patch has several independent changes
> which makes it quite difficult to tell what's going on.  I'm sure it's
> taken the two of us a lot more time together to figure out what is and
> is not happening than it would have for you to break it down into
> several little chunks.
>
> If you're not already familiar with it, I would recommend looking into
> stackgit.  My modus operandi for things like this is to get things
> working in one big patch, then pop it off the stack and apply bits of it
> at a time to make a series.
>
> It's not only more considerate of your reviewers, but it's also a
> helpful exercise for yourself.
>

The extra work doesn't just come from splitting the code itself
(although I don't know which bits would really make sense to split
here that would worth the effort) but testing a series on various
platforms. As you are in the same boat that this should be multiple
patches (so it's 3v2) I have no problem just postponing the ARM
sanitization beside what's absolutely required at the moment. I'm
already looking at how to move the mem_accss code out of p2m on both
platforms so we don't have to end up in this loop in the future.

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 01/12] x86/paging: introduce paging_set_allocation

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 17:11,  wrote:
> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich  wrote:
> On 02.08.16 at 17:49,  wrote:
>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
 On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
 > As this is for the construction of dom0, it would be better to take a
 > preemptible pointer, loop in construct_dom0(), with a
 > process_pending_softirqs() call in between.

 Now fixed.
>>>
>>> Hm, I have to stand corrected, using hypercall_preempt_check (as
>>> any of the *_set_allocation function use), is not safe at this point:
>>>
>>> (XEN) [ Xen-4.8-unstable  x86_64  debug=y  Tainted:C  ]
>>> (XEN) CPU:0
>>> (XEN) RIP:e008:[] 
> hap.c#local_events_need_delivery+0x27/0x40
>>> (XEN) RFLAGS: 00010246   CONTEXT: hypervisor
>>> (XEN) rax:    rbx: 83023f5a5000   rcx: 82d080312900
>>> (XEN) rdx: 0001   rsi: 83023f5a56c8   rdi: 8300b213d000
>>> (XEN) rbp: 82d080307cc8   rsp: 82d080307cc8   r8:  0180
>>> (XEN) r9:     r10: 00247000   r11: 82d08029a5b0
>>> (XEN) r12: 0011   r13: 23ac   r14: 82d080307d4c
>>> (XEN) r15: 83023f5a56c8   cr0: 8005003b   cr4: 001526e0
>>> (XEN) cr3: b20fc000   cr2: 
>>> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
>>> (XEN) Xen code around  
> (hap.c#local_events_need_delivery+0x27/0x40):
>>> (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 
>>> 31 
> c0
>>> (XEN) Xen stack trace from rsp=82d080307cc8:
>>> (XEN)82d080307d08 82d08022fc47  83023f5a5000
>>> (XEN)83023f5a5648  82d080307d4c 2400
>>> (XEN)82d080307d38 82d08020008c 000d 8300b1efd000
>>> (XEN)83023f5a5000 82d080307d4c 82d080307d78 82d0802cad30
>>> (XEN)00203000 83023f5a5000 82d0802bf860 
>>> (XEN)0001 8308bef0 82d080307de8 82d0802c91e0
>>> (XEN)82d080307de8 82d080143900 82d080307de8 
>>> (XEN)8308bf00 82d0802eb480 82d080307dc4 82d08028b1cd
>>> (XEN)8308bf00  0001 83023f5a5000
>>> (XEN)82d080307f08 82d0802bf0c9  
>>> (XEN) 82d080307f18 8308bee0 0001
>>> (XEN)0001 0001  0010
>>> (XEN)0001 00247000 8308bef4 0010
>>> (XEN)8301 00247001 0014 0001
>>> (XEN)8300ffec 8308bef0 82d0802e0640 8308bfb0
>>> (XEN)  0111 0008
>>> (XEN)0001006e 0003 02f8 
>>> (XEN)ad5c0bd0  0001 0008
>>> (XEN) 82d080100073  
>>> (XEN)   
>>> (XEN) Xen call trace:
>>> (XEN)[] hap.c#local_events_need_delivery+0x27/0x40
>>> (XEN)[] hap_set_allocation+0x107/0x130
>>> (XEN)[] paging_set_allocation+0x4c/0x80
>>> (XEN)[] domain_build.c#hvm_setup_p2m+0x70/0x1a0
>>> (XEN)[] domain_build.c#construct_dom0_hvm+0x60/0x120
>>> (XEN)[] __start_xen+0x1ea9/0x23a0
>>> (XEN)[] __high_start+0x53/0x60
>>> (XEN)
>>> (XEN) Pagetable walk from :
>>
>> Sadly you don't make clear what pointer it is that is NULL at that point.
> 
> It sounds from what he says in the following paragraph like current is NULL.

I don't recall us re-setting current to this late in the boot process.
Even during early boot we set it to a bogus non-NULL value rather
than NULL.

>>> I've tried setting current to d->vcpu[0], but that just makes the call to
>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've
>>> added the preempt parameter to the paging_set_allocation function, but I
>>> don't plan to use it in the domain builder for the time being. Does that
>>> sound right?
>>
>> Not really, new huge latency issues like this shouldn't be reintroduced;
>> we've been fighting hard to get rid of those (and we still occasionally
>> find some no-one had noticed before).
> 
> You mean latency in processing softirqs?
> 
> Maybe what we need to do is to make local_events_need_delivery() safe
> to call at this point by having it return 0 if current is NULL rather
> than crashing?

That would have the same effect - no softirq processing, and hence
possible time issues on huge systems.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen

Re: [Xen-devel] [PATCH RFC 01/12] x86/paging: introduce paging_set_allocation

2016-08-03 Thread George Dunlap
On 03/08/16 16:25, Jan Beulich wrote:
 On 03.08.16 at 17:11,  wrote:
>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich  wrote:
>> On 02.08.16 at 17:49,  wrote:
 On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
>> As this is for the construction of dom0, it would be better to take a
>> preemptible pointer, loop in construct_dom0(), with a
>> process_pending_softirqs() call in between.
>
> Now fixed.

 Hm, I have to stand corrected, using hypercall_preempt_check (as
 any of the *_set_allocation function use), is not safe at this point:

 (XEN) [ Xen-4.8-unstable  x86_64  debug=y  Tainted:C  ]
 (XEN) CPU:0
 (XEN) RIP:e008:[] 
>> hap.c#local_events_need_delivery+0x27/0x40
 (XEN) RFLAGS: 00010246   CONTEXT: hypervisor
 (XEN) rax:    rbx: 83023f5a5000   rcx: 82d080312900
 (XEN) rdx: 0001   rsi: 83023f5a56c8   rdi: 8300b213d000
 (XEN) rbp: 82d080307cc8   rsp: 82d080307cc8   r8:  0180
 (XEN) r9:     r10: 00247000   r11: 82d08029a5b0
 (XEN) r12: 0011   r13: 23ac   r14: 82d080307d4c
 (XEN) r15: 83023f5a56c8   cr0: 8005003b   cr4: 001526e0
 (XEN) cr3: b20fc000   cr2: 
 (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
 (XEN) Xen code around  
>> (hap.c#local_events_need_delivery+0x27/0x40):
 (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 
 31 
>> c0
 (XEN) Xen stack trace from rsp=82d080307cc8:
 (XEN)82d080307d08 82d08022fc47  
 83023f5a5000
 (XEN)83023f5a5648  82d080307d4c 
 2400
 (XEN)82d080307d38 82d08020008c 000d 
 8300b1efd000
 (XEN)83023f5a5000 82d080307d4c 82d080307d78 
 82d0802cad30
 (XEN)00203000 83023f5a5000 82d0802bf860 
 
 (XEN)0001 8308bef0 82d080307de8 
 82d0802c91e0
 (XEN)82d080307de8 82d080143900 82d080307de8 
 
 (XEN)8308bf00 82d0802eb480 82d080307dc4 
 82d08028b1cd
 (XEN)8308bf00  0001 
 83023f5a5000
 (XEN)82d080307f08 82d0802bf0c9  
 
 (XEN) 82d080307f18 8308bee0 
 0001
 (XEN)0001 0001  
 0010
 (XEN)0001 00247000 8308bef4 
 0010
 (XEN)8301 00247001 0014 
 0001
 (XEN)8300ffec 8308bef0 82d0802e0640 
 8308bfb0
 (XEN)  0111 
 0008
 (XEN)0001006e 0003 02f8 
 
 (XEN)ad5c0bd0  0001 
 0008
 (XEN) 82d080100073  
 
 (XEN)   
 
 (XEN) Xen call trace:
 (XEN)[] hap.c#local_events_need_delivery+0x27/0x40
 (XEN)[] hap_set_allocation+0x107/0x130
 (XEN)[] paging_set_allocation+0x4c/0x80
 (XEN)[] domain_build.c#hvm_setup_p2m+0x70/0x1a0
 (XEN)[] domain_build.c#construct_dom0_hvm+0x60/0x120
 (XEN)[] __start_xen+0x1ea9/0x23a0
 (XEN)[] __high_start+0x53/0x60
 (XEN)
 (XEN) Pagetable walk from :
>>>
>>> Sadly you don't make clear what pointer it is that is NULL at that point.
>>
>> It sounds from what he says in the following paragraph like current is NULL.
> 
> I don't recall us re-setting current to this late in the boot process.
> Even during early boot we set it to a bogus non-NULL value rather
> than NULL.
> 
 I've tried setting current to d->vcpu[0], but that just makes the call to
 hypercall_preempt_check crash in some scheduler assert. In any case, I've
 added the preempt parameter to the paging_set_allocation function, but I
 don't plan to use it in the domain builder for the time being. Does that
 sound right?
>>>
>>> Not really, new huge latency issues like this shouldn't be reintroduced;
>>> we've been fighting hard to get rid of those (and we still occasionally
>>> find some no-one had noticed before).
>>
>> You mean latency in processing softirqs?
>>
>> Maybe what we need to do is to make local_events_need_delivery() safe
>> to call at this point by having it return 0 if current is NULL rathe

Re: [Xen-devel] [PATCH 8/9] x86/hypercall: Merge the hypercall arg tables

2016-08-03 Thread Jan Beulich
>>> On 03.08.16 at 17:15,  wrote:
> On 03/08/16 16:12, Jan Beulich wrote:
> On 18.07.16 at 11:51,  wrote:
>>> For the same reason as c/s 33a231e3f "x86/HVM: fold hypercall tables" and
>>> (TODO - changeset) "x86/pv: Merge the pv hypercall tables", this removes the
>>> risk of accidentally updating only one of the tables.
>> Based on this argument perhaps hypercall and args tables should
>> get folded too, but I guess that's a work item for another day.
> 
> That is rather harder to do.  I thought about it, but couldn't find a
> neat way of doing it.
> 
> The call table is an array of pointers, while the args table is an array
> of bytes.  Merging them would result in excessive padding for alignment
> purposes, unless it was packed, at which point we are calling
> non-aligned function pointers, and taking that associated performance hit.

If we folded everything together (pv, hvm, args), there would be
6 bytes padding per 34 or actual data, so I wouldn't be worried
too much. But I admit that merging hvm and pv tables wouldn't be
entirely trivial.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf baseline-only test] 66904: all pass

2016-08-03 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 66904 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66904/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
baseline version:
 ovmf deaacda3b2740477733564066eb69d5c94b41bba

Last test of basis66896  2016-08-03 10:18:33 Z0 days
Testing same since66904  2016-08-03 13:16:29 Z0 days1 attempts


People who touched revisions under test:
  Gary Lin 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit e80cb37ee7b507b6cfe4e4b0f23dc4c5cb2c1d5d
Author: Gary Lin 
Date:   Fri Jul 22 17:47:12 2016 +0800

Vlv2TbltDevicePkg/FvbRuntimeDxe: Remove unused variables

Fix the following errors from gcc:

Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c: In function 
‘FvbWriteBlock’:
Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c:368:44: error: variable 
‘FwhInstance’ set but not used [-Werror=unused-but-set-variable]

Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c: In function 
‘FvbEraseBlock’:
Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c:448:44: error: variable 
‘FwhInstance’ set but not used [-Werror=unused-but-set-variable]

Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c: In function 
‘FvbInitialize’:
Vlv2TbltDevicePkg/FvbRuntimeDxe/FvbService.c:1028:41: error: variable 
‘FvHeaderValid’ set but not used [-Werror=unused-but-set-variable]

Cc: David Wei 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Gary Lin 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] mem_access: sanitize code around sending vm_event request

2016-08-03 Thread George Dunlap
On 03/08/16 16:18, Tamas K Lengyel wrote:
> On Wed, Aug 3, 2016 at 8:41 AM, George Dunlap  
> wrote:
>> On 01/08/16 17:52, Tamas K Lengyel wrote:
>>> The two functions monitor_traps and mem_access_send_req duplicate some of 
>>> the
>>> same functionality. The mem_access_send_req however leaves a lot of the
>>> standard vm_event fields to be filled by other functions.
>>>
>>> Remove mem_access_send_req() completely, making use of monitor_traps() to 
>>> put
>>> requests into the monitor ring.  This in turn causes some cleanup around the
>>> old callsites of mem_access_send_req(), and on ARM, the introduction of the
>>> __p2m_mem_access_send_req() helper to fill in common mem_access information.
>>> We also update monitor_traps to now include setting the common vcpu_id field
>>> so that all other call-sites can ommit this step.
>>>
>>> Finally, this change identifies that errors from mem_access_send_req() were
>>> never checked.  As errors constitute a problem with the monitor ring,
>>> crashing the domain is the most appropriate action to take.
>>>
>>> Signed-off-by: Tamas K Lengyel 
>>> Reviewed-by: Andrew Cooper 
>>
>> This appears to be v3, not v2?
> 
> No, it's still just v2.
> 
>>
>>> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
>>> index 812dbf6..27f9d26 100644
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -1728,13 +1728,8 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned 
>>> long gla,
>>>  if ( req )
>>>  {
>>>  *req_ptr = req;
>>> -req->reason = VM_EVENT_REASON_MEM_ACCESS;
>>> -
>>> -/* Pause the current VCPU */
>>> -if ( p2ma != p2m_access_n2rwx )
>>> -req->flags |= VM_EVENT_FLAG_VCPU_PAUSED;
>>>
>>> -/* Send request to mem event */
>>> +req->reason = VM_EVENT_REASON_MEM_ACCESS;
>>>  req->u.mem_access.gfn = gfn;
>>>  req->u.mem_access.offset = gpa & ((1 << PAGE_SHIFT) - 1);
>>>  if ( npfec.gla_valid )
>>> @@ -1750,23 +1745,10 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned 
>>> long gla,
>>>  req->u.mem_access.flags |= npfec.read_access? MEM_ACCESS_R : 0;
>>>  req->u.mem_access.flags |= npfec.write_access   ? MEM_ACCESS_W : 0;
>>>  req->u.mem_access.flags |= npfec.insn_fetch ? MEM_ACCESS_X : 0;
>>> -req->vcpu_id = v->vcpu_id;
>>> -
>>> -vm_event_fill_regs(req);
>>> -
>>> -if ( altp2m_active(v->domain) )
>>> -{
>>> -req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
>>> -req->altp2m_idx = vcpu_altp2m(v).p2midx;
>>> -}
>>>  }
>>>
>>> -/* Pause the current VCPU */
>>> -if ( p2ma != p2m_access_n2rwx )
>>> -vm_event_vcpu_pause(v);
>>> -
>>> -/* VCPU may be paused, return whether we promoted automatically */
>>> -return (p2ma == p2m_access_n2rwx);
>>> +/* Return whether vCPU pause is required (aka. sync event) */
>>> +return (p2ma != p2m_access_n2rwx);
>>>  }
>>>
>>>  static inline
>>
>> p2m-bits:
>>
>> Acked-by: George Dunlap 
>>
>> But I agree with Julien -- this patch has several independent changes
>> which makes it quite difficult to tell what's going on.  I'm sure it's
>> taken the two of us a lot more time together to figure out what is and
>> is not happening than it would have for you to break it down into
>> several little chunks.
>>
>> If you're not already familiar with it, I would recommend looking into
>> stackgit.  My modus operandi for things like this is to get things
>> working in one big patch, then pop it off the stack and apply bits of it
>> at a time to make a series.
>>
>> It's not only more considerate of your reviewers, but it's also a
>> helpful exercise for yourself.
>>
> 
> The extra work doesn't just come from splitting the code itself
> (although I don't know which bits would really make sense to split
> here that would worth the effort) but testing a series on various
> platforms.

I don't understand this statement -- why is testing a 3-patch series
more difficult than testing a one-patch series?  Are you testing each
individual patch?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


  1   2   3   >