[Xen-devel] [linux-next test] 128276: regressions - FAIL
flight 128276 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/128276/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-credit2 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemut-ws16-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-freebsd10-i386 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-qemuu-nested-intel 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-win10-i386 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-debianhvm-amd64 7 xen-bootfail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-ws16-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-i386-pvgrub 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemut-win7-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 128170 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 128170 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-libvirt 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-shadow7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-qemuu-nested-amd 7 xen-bootfail REGR. vs. 128170 test-amd64-amd64-xl-qemut-win10-i386 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-pvhv2-amd 7 xen-bootfail REGR. vs. 128170 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 128170 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 128170 test-amd64-amd64-xl-pvshim7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-libvirt 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-libvirt-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 128170 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 128170 test-amd64-amd64-xl-qemut-debianhvm-amd64 7 xen-bootfail REGR. vs. 128170 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-pvshim 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-amd64-pvgrub 7 xen-bootfail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-i386-xl-qemut-win7-amd64 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128170 test-amd64-amd64-xl-qemuu-win10-i386 7 xen-boot fail REGR. vs. 128170 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl 6 xen-installfail pass in 128167 test-armhf-armhf-xl-cubietruck 7 xen-boot fail pass in 128167 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-credit1 7 xen-bootfail baseline untested test-armhf-armhf-xl 13 migrate-support-check fail in 128167 never pass test-armhf-armhf-xl 14 saverestore-support-check fail in 128167 never pass
[Xen-devel] [xen-unstable-smoke test] 128296: tolerable all pass - PUSHED
flight 128296 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/128296/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 0bc6a68da585cb5d781f27404943a1dbd393e95f baseline version: xen 3722e563432bdbf0cbd16ace3075cf643186b01e Last test of basis 128288 2018-10-01 17:00:45 Z0 days Testing same since 128296 2018-10-01 22:00:33 Z0 days1 attempts People who touched revisions under test: Julien Grall Stefano Stabellini Volodymyr Babchuk jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 3722e56343..0bc6a68da5 0bc6a68da585cb5d781f27404943a1dbd393e95f -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
Hi Stefano, On 10/01/2018 09:52 PM, Stefano Stabellini wrote: On Mon, 1 Oct 2018, Stefano Stabellini wrote: On Mon, 1 Oct 2018, Julien Grall wrote: call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to do SMCCC call, replace all call to the former by the later. Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini (you might want to make sure the double [[ is removed on commit.) Hey Julien, I was going to commit the series, but I am getting a build error: psci.c: In function 'call_psci_cpu_on': psci.c:45:40: error: request for member 'a0' in something not a structure or union #define PSCI_RET(res) ((int32_t)(res).a0) ^ psci.c:54:12: note: in expansion of macro 'PSCI_RET' return PSCI_RET(res.a0); ^ psci.c:55:1: error: control reaches end of non-void function [-Werror=return-type] } I already mentioned this error in reply to the patch [1] and suggested it could be fixed on commit... Cheers, [1] <9e0160ad-853c-6efa-1a04-15f801ef0...@arm.com> -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] xen/arm: vgic-v3: Don't create empty re-distributor regions
On Mon, 1 Oct 2018, Julien Grall wrote: > At the moment, Xen is assuming the hardware domain will have the same > number of re-distributor regions as the host. However, as the > number of CPUs or the stride (e.g on GICv4) may be different we end up > exposing regions which does not contain any re-distributors. > > When booting, Linux will go through all the re-distributor region to > check whether a property (e.g vPLIs) is available accross all the > re-distributors. This will result to a data abort on empty regions > because there are no underlying re-distributor. > > So we need to limit the number of regions exposed to the hardware > domain. The code reworked to only expose the minimun number of regions > required by the hardware domain. It is assumed the regions will be > populated starting from the first one. > > Lastly, rename vgic_v3_rdist_count to reflect the value return by the > helper. > > Reported-by: Shameerali Kolothum Thodi > Signed-off-by: Julien Grall > Tested-by: Shameer Kolothum > > --- > Changes in v2: > - Rename vgic_v3_rdist_count to vgic_v3_max_rdist_count > - Fixup #re-distributors > - Fix typoes > - Add Shameer's tested tag > --- > xen/arch/arm/gic-v3.c | 14 +++--- > xen/arch/arm/vgic-v3.c | 21 ++--- > 2 files changed, 29 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c > index c98a163ee7..2c1454f425 100644 > --- a/xen/arch/arm/gic-v3.c > +++ b/xen/arch/arm/gic-v3.c > @@ -1265,7 +1265,8 @@ static int gicv3_make_hwdom_dt_node(const struct domain > *d, > if ( res ) > return res; > > -res = fdt_property_cell(fdt, "#redistributor-regions", > gicv3.rdist_count); > +res = fdt_property_cell(fdt, "#redistributor-regions", > +d->arch.vgic.nr_regions); > if ( res ) > return res; > > @@ -1274,8 +1275,10 @@ static int gicv3_make_hwdom_dt_node(const struct > domain *d, > * GIC has two memory regions: Distributor + rdist regions > * CPU interface and virtual cpu interfaces accessesed as System > registers > * So cells are created only for Distributor and rdist regions > + * The hardware domain may not use all the regions. So only copy > + * what is necessary. > */ > -new_len = new_len * (gicv3.rdist_count + 1); > +new_len = new_len * (d->arch.vgic.nr_regions + 1); > > hw_reg = dt_get_property(gic, "reg", ); > if ( !hw_reg ) > @@ -1466,6 +1469,7 @@ static inline bool gic_dist_supports_dvis(void) > } > > static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) > + > { Aside from this spurious change, the patch is OK. Provided you remove the blank on commit: Reviewed-by: Stefano Stabellini > struct acpi_subtable_header *header; > struct acpi_madt_generic_interrupt *host_gicc, *gicc; > @@ -1503,7 +1507,11 @@ static int gicv3_make_hwdom_madt(const struct domain > *d, u32 offset) > > /* Add Generic Redistributor */ > size = sizeof(struct acpi_madt_generic_redistributor); > -for ( i = 0; i < gicv3.rdist_count; i++ ) > +/* > + * The hardware domain may not used all the regions. So only copy > + * what is necessary. > + */ > +for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) > { > gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + > table_len); > gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR; > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > index df1bab3a35..efe824c6fb 100644 > --- a/xen/arch/arm/vgic-v3.c > +++ b/xen/arch/arm/vgic-v3.c > @@ -1640,7 +1640,11 @@ static int vgic_v3_vcpu_init(struct vcpu *v) > return 0; > } > > -static inline unsigned int vgic_v3_rdist_count(struct domain *d) > +/* > + * Return the maximum number possible of re-distributor regions for > + * a given domain. > + */ > +static inline unsigned int vgic_v3_max_rdist_count(struct domain *d) > { > /* > * Normally there is only one GICv3 redistributor region. > @@ -1662,7 +1666,7 @@ static int vgic_v3_real_domain_init(struct domain *d) > int rdist_count, i, ret; > > /* Allocate memory for Re-distributor regions */ > -rdist_count = vgic_v3_rdist_count(d); > +rdist_count = vgic_v3_max_rdist_count(d); > > rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); > if ( !rdist_regions ) > @@ -1695,8 +1699,19 @@ static int vgic_v3_real_domain_init(struct domain *d) > d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; > > first_cpu += size / GICV3_GICR_SIZE; > + > +if ( first_cpu >= d->max_vcpus ) > +break; > } > > +/* > + * The hardware domain may not use all the re-distributors > + * regions (e.g when the number of vCPUs does not match the > + * number of pCPUs). Update the number of regions to avoid > +
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
On Mon, 1 Oct 2018, Stefano Stabellini wrote: > On Mon, 1 Oct 2018, Julien Grall wrote: > > call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to > > do SMCCC call, replace all call to the former by the later. > > > > Signed-off-by: Julien Grall > > Reviewed-by: Stefano Stabellini > > (you might want to make sure the double [[ is removed on commit.) Hey Julien, I was going to commit the series, but I am getting a build error: psci.c: In function 'call_psci_cpu_on': psci.c:45:40: error: request for member 'a0' in something not a structure or union #define PSCI_RET(res) ((int32_t)(res).a0) ^ psci.c:54:12: note: in expansion of macro 'PSCI_RET' return PSCI_RET(res.a0); ^ psci.c:55:1: error: control reaches end of non-void function [-Werror=return-type] } > > --- > > > > Changes in v3: > > - Use PSCI_RET where needed > > --- > > xen/arch/arm/Makefile| 1 - > > xen/arch/arm/platforms/exynos5.c | 3 ++- > > xen/arch/arm/platforms/seattle.c | 4 ++-- > > xen/arch/arm/psci.c | 37 + > > xen/arch/arm/smc.S | 21 - > > xen/include/asm-arm/processor.h | 3 --- > > 6 files changed, 29 insertions(+), 40 deletions(-) > > delete mode 100644 xen/arch/arm/smc.S > > > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > > index b9b141dc84..37fa8268b3 100644 > > --- a/xen/arch/arm/Makefile > > +++ b/xen/arch/arm/Makefile > > @@ -39,7 +39,6 @@ obj-y += processor.o > > obj-y += psci.o > > obj-y += setup.o > > obj-y += shutdown.o > > -obj-y += smc.o > > obj-y += smp.o > > obj-y += smpboot.o > > obj-y += sysctl.o > > diff --git a/xen/arch/arm/platforms/exynos5.c > > b/xen/arch/arm/platforms/exynos5.c > > index c15ecf80f5..e2c0b7b878 100644 > > --- a/xen/arch/arm/platforms/exynos5.c > > +++ b/xen/arch/arm/platforms/exynos5.c > > @@ -26,6 +26,7 @@ > > #include > > #include > > #include > > +#include > > > > static bool secure_firmware; > > > > @@ -249,7 +250,7 @@ static int exynos5_cpu_up(int cpu) > > iounmap(power); > > > > if ( secure_firmware ) > > -call_smc(SMC_CMD_CPU1BOOT, cpu, 0, 0); > > +arm_smccc_smc(SMC_CMD_CPU1BOOT, cpu, NULL); > > > > return cpu_up_send_sgi(cpu); > > } > > diff --git a/xen/arch/arm/platforms/seattle.c > > b/xen/arch/arm/platforms/seattle.c > > index 893cc17972..64cc1868c2 100644 > > --- a/xen/arch/arm/platforms/seattle.c > > +++ b/xen/arch/arm/platforms/seattle.c > > @@ -33,12 +33,12 @@ static const char * const seattle_dt_compat[] > > __initconst = > > */ > > static void seattle_system_reset(void) > > { > > -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); > > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); > > } > > > > static void seattle_system_off(void) > > { > > -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); > > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); > > } > > > > PLATFORM_START(seattle, "SEATTLE") > > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > > index 941eec921b..4ae6504f3e 100644 > > --- a/xen/arch/arm/psci.c > > +++ b/xen/arch/arm/psci.c > > @@ -42,42 +42,53 @@ uint32_t smccc_ver; > > > > static uint32_t psci_cpu_on_nr; > > > > +#define PSCI_RET(res) ((int32_t)(res).a0) > > + > > int call_psci_cpu_on(int cpu) > > { > > -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), > > __pa(init_secondary), 0); > > +struct arm_smccc_res res; > > + > > +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), > > __pa(init_secondary), > > + ); > > + > > +return PSCI_RET(res.a0); > > } > > > > void call_psci_cpu_off(void) > > { > > if ( psci_ver > PSCI_VERSION(0, 1) ) > > { > > -int errno; > > +struct arm_smccc_res res; > > > > /* If successfull the PSCI cpu_off call doesn't return */ > > -errno = call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0); > > +arm_smccc_smc(PSCI_0_2_FN32_CPU_OFF, ); > > panic("PSCI cpu off failed for CPU%d err=%d\n", smp_processor_id(), > > - errno); > > + PSCI_RET(res)); > > } > > } > > > > void call_psci_system_off(void) > > { > > if ( psci_ver > PSCI_VERSION(0, 1) ) > > -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); > > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); > > } > > > > void call_psci_system_reset(void) > > { > > if ( psci_ver > PSCI_VERSION(0, 1) ) > > -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); > > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); > > } > > > > static int __init psci_features(uint32_t psci_func_id) > > { > > +struct arm_smccc_res res; > > + > > if ( psci_ver < PSCI_VERSION(1, 0) ) > > return PSCI_NOT_SUPPORTED; > > > > -return call_smc(PSCI_1_0_FN32_PSCI_FEATURES, psci_func_id, 0, 0); > > +
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
On Mon, 1 Oct 2018, Julien Grall wrote: > call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to > do SMCCC call, replace all call to the former by the later. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini (you might want to make sure the double [[ is removed on commit.) > --- > > Changes in v3: > - Use PSCI_RET where needed > --- > xen/arch/arm/Makefile| 1 - > xen/arch/arm/platforms/exynos5.c | 3 ++- > xen/arch/arm/platforms/seattle.c | 4 ++-- > xen/arch/arm/psci.c | 37 + > xen/arch/arm/smc.S | 21 - > xen/include/asm-arm/processor.h | 3 --- > 6 files changed, 29 insertions(+), 40 deletions(-) > delete mode 100644 xen/arch/arm/smc.S > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > index b9b141dc84..37fa8268b3 100644 > --- a/xen/arch/arm/Makefile > +++ b/xen/arch/arm/Makefile > @@ -39,7 +39,6 @@ obj-y += processor.o > obj-y += psci.o > obj-y += setup.o > obj-y += shutdown.o > -obj-y += smc.o > obj-y += smp.o > obj-y += smpboot.o > obj-y += sysctl.o > diff --git a/xen/arch/arm/platforms/exynos5.c > b/xen/arch/arm/platforms/exynos5.c > index c15ecf80f5..e2c0b7b878 100644 > --- a/xen/arch/arm/platforms/exynos5.c > +++ b/xen/arch/arm/platforms/exynos5.c > @@ -26,6 +26,7 @@ > #include > #include > #include > +#include > > static bool secure_firmware; > > @@ -249,7 +250,7 @@ static int exynos5_cpu_up(int cpu) > iounmap(power); > > if ( secure_firmware ) > -call_smc(SMC_CMD_CPU1BOOT, cpu, 0, 0); > +arm_smccc_smc(SMC_CMD_CPU1BOOT, cpu, NULL); > > return cpu_up_send_sgi(cpu); > } > diff --git a/xen/arch/arm/platforms/seattle.c > b/xen/arch/arm/platforms/seattle.c > index 893cc17972..64cc1868c2 100644 > --- a/xen/arch/arm/platforms/seattle.c > +++ b/xen/arch/arm/platforms/seattle.c > @@ -33,12 +33,12 @@ static const char * const seattle_dt_compat[] __initconst > = > */ > static void seattle_system_reset(void) > { > -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); > } > > static void seattle_system_off(void) > { > -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); > } > > PLATFORM_START(seattle, "SEATTLE") > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > index 941eec921b..4ae6504f3e 100644 > --- a/xen/arch/arm/psci.c > +++ b/xen/arch/arm/psci.c > @@ -42,42 +42,53 @@ uint32_t smccc_ver; > > static uint32_t psci_cpu_on_nr; > > +#define PSCI_RET(res) ((int32_t)(res).a0) > + > int call_psci_cpu_on(int cpu) > { > -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), > __pa(init_secondary), 0); > +struct arm_smccc_res res; > + > +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), > + ); > + > +return PSCI_RET(res.a0); > } > > void call_psci_cpu_off(void) > { > if ( psci_ver > PSCI_VERSION(0, 1) ) > { > -int errno; > +struct arm_smccc_res res; > > /* If successfull the PSCI cpu_off call doesn't return */ > -errno = call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_CPU_OFF, ); > panic("PSCI cpu off failed for CPU%d err=%d\n", smp_processor_id(), > - errno); > + PSCI_RET(res)); > } > } > > void call_psci_system_off(void) > { > if ( psci_ver > PSCI_VERSION(0, 1) ) > -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); > } > > void call_psci_system_reset(void) > { > if ( psci_ver > PSCI_VERSION(0, 1) ) > -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); > } > > static int __init psci_features(uint32_t psci_func_id) > { > +struct arm_smccc_res res; > + > if ( psci_ver < PSCI_VERSION(1, 0) ) > return PSCI_NOT_SUPPORTED; > > -return call_smc(PSCI_1_0_FN32_PSCI_FEATURES, psci_func_id, 0, 0); > +arm_smccc_smc(PSCI_1_0_FN32_PSCI_FEATURES, psci_func_id, NULL); > + > +return PSCI_RET(res); > } > > static int __init psci_is_smc_method(const struct dt_device_node *psci) > @@ -112,11 +123,11 @@ static void __init psci_init_smccc(void) > > if ( psci_features(ARM_SMCCC_VERSION_FID) != PSCI_NOT_SUPPORTED ) > { > -uint32_t ret; > +struct arm_smccc_res res; > > -ret = call_smc(ARM_SMCCC_VERSION_FID, 0, 0, 0); > -if ( ret != ARM_SMCCC_NOT_SUPPORTED ) > -smccc_ver = ret; > +arm_smccc_smc(ARM_SMCCC_VERSION_FID, ); > +if ( PSCI_RET(res) != ARM_SMCCC_NOT_SUPPORTED ) > +smccc_ver = PSCI_RET(res); > } > > if ( smccc_ver >= SMCCC_VERSION(1, 1) ) > @@ -165,6 +176,7 @@ static int __init psci_init_0_2(void) > { /*
Re: [Xen-devel] [[PATCH v3] 2/4] xen/arm: cpufeature: Add helper to check constant caps
On Mon, 1 Oct 2018, Julien Grall wrote: > Some capababilities are set right during boot and will never change > afterwards. At the moment, the function cpu_have_caps will check whether > the cap is enabled from the memory. > > It is possible to avoid the load from the memory by using an > ALTERNATIVE. With that the check is just reduced to 1 instruction. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > This is the static key for the poor. At some point we might want to > introduce something similar to static key in Xen. > > Changes in v2: > - Use unlikely > --- > xen/include/asm-arm/cpufeature.h | 12 > 1 file changed, 12 insertions(+) > > diff --git a/xen/include/asm-arm/cpufeature.h > b/xen/include/asm-arm/cpufeature.h > index 3de6b54301..c6cbc2ec84 100644 > --- a/xen/include/asm-arm/cpufeature.h > +++ b/xen/include/asm-arm/cpufeature.h > @@ -63,6 +63,18 @@ static inline bool cpus_have_cap(unsigned int num) > return test_bit(num, cpu_hwcaps); > } > > +/* System capability check for constant cap */ > +#define cpus_have_const_cap(num) ({ \ > +bool __ret; \ > +\ > +asm volatile (ALTERNATIVE("mov %0, #0", \ > + "mov %0, #1", \ > + num) \ > + : "=r" (__ret)); \ > +\ > +unlikely(__ret);\ > +}) > + > static inline void cpus_set_cap(unsigned int num) > { > if (num >= ARM_NCAPS) > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable-smoke test] 128288: tolerable all pass - PUSHED
flight 128288 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/128288/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 3722e563432bdbf0cbd16ace3075cf643186b01e baseline version: xen 5c28a3c6814cc3a054fde133c0f6ef77d80c0412 Last test of basis 128283 2018-10-01 13:00:52 Z0 days Testing same since 128288 2018-10-01 17:00:45 Z0 days1 attempts People who touched revisions under test: George Dunlap jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 5c28a3c681..3722e56343 3722e563432bdbf0cbd16ace3075cf643186b01e -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] xen/arm: vgic-v3: Don't create empty re-distributor regions
Hi, I forgot to remove patch from the previous series before sending the new one. Please ignore that patch. Sorry for the noise. Cheers, On 10/01/2018 07:57 PM, Julien Grall wrote: At the moment, Xen is assuming the hardware domain will have the same number of re-distributor regions as the host. However, as the number of CPUs or the stride (e.g on GICv4) may be different we end up exposing regions which does not contain any re-distributors. When booting, Linux will go through all the re-distributor region to check whether a property (e.g vPLIs) is available accross all the re-distributors. This will result to a data abort on empty regions because there are no underlying re-distributor. So we need to limit the number of regions exposed to the hardware domain. The code reworked to only expose the minimun number of regions required by the hardware domain. It is assumed the regions will be populated starting from the first one. Lastly, rename vgic_v3_rdist_count to reflect the value return by the helper. Reported-by: Shameerali Kolothum Thodi Signed-off-by: Julien Grall Tested-by: Shameer Kolothum --- Changes in v2: - Rename vgic_v3_rdist_count to vgic_v3_max_rdist_count - Fixup #re-distributors - Fix typoes - Add Shameer's tested tag --- xen/arch/arm/gic-v3.c | 14 +++--- xen/arch/arm/vgic-v3.c | 21 ++--- 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c98a163ee7..2c1454f425 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1265,7 +1265,8 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, if ( res ) return res; -res = fdt_property_cell(fdt, "#redistributor-regions", gicv3.rdist_count); +res = fdt_property_cell(fdt, "#redistributor-regions", +d->arch.vgic.nr_regions); if ( res ) return res; @@ -1274,8 +1275,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, * GIC has two memory regions: Distributor + rdist regions * CPU interface and virtual cpu interfaces accessesed as System registers * So cells are created only for Distributor and rdist regions + * The hardware domain may not use all the regions. So only copy + * what is necessary. */ -new_len = new_len * (gicv3.rdist_count + 1); +new_len = new_len * (d->arch.vgic.nr_regions + 1); hw_reg = dt_get_property(gic, "reg", ); if ( !hw_reg ) @@ -1466,6 +1469,7 @@ static inline bool gic_dist_supports_dvis(void) } static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) + { struct acpi_subtable_header *header; struct acpi_madt_generic_interrupt *host_gicc, *gicc; @@ -1503,7 +1507,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) /* Add Generic Redistributor */ size = sizeof(struct acpi_madt_generic_redistributor); -for ( i = 0; i < gicv3.rdist_count; i++ ) +/* + * The hardware domain may not used all the regions. So only copy + * what is necessary. + */ +for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) { gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + table_len); gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR; diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index df1bab3a35..efe824c6fb 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1640,7 +1640,11 @@ static int vgic_v3_vcpu_init(struct vcpu *v) return 0; } -static inline unsigned int vgic_v3_rdist_count(struct domain *d) +/* + * Return the maximum number possible of re-distributor regions for + * a given domain. + */ +static inline unsigned int vgic_v3_max_rdist_count(struct domain *d) { /* * Normally there is only one GICv3 redistributor region. @@ -1662,7 +1666,7 @@ static int vgic_v3_real_domain_init(struct domain *d) int rdist_count, i, ret; /* Allocate memory for Re-distributor regions */ -rdist_count = vgic_v3_rdist_count(d); +rdist_count = vgic_v3_max_rdist_count(d); rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); if ( !rdist_regions ) @@ -1695,8 +1699,19 @@ static int vgic_v3_real_domain_init(struct domain *d) d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; first_cpu += size / GICV3_GICR_SIZE; + +if ( first_cpu >= d->max_vcpus ) +break; } +/* + * The hardware domain may not use all the re-distributors + * regions (e.g when the number of vCPUs does not match the + * number of pCPUs). Update the number of regions to avoid + * exposing unused region as they will not get emulated. + */ +d->arch.vgic.nr_regions = i + 1; + d->arch.vgic.intid_bits = vgic_v3_hw.intid_bits;
Re: [Xen-devel] [PATCH v2 1/2] xen/arm: vgic-v3: Delay the initialization of the domain information
Hi, I forgot to remove patch from the previous series before sending the new one. Please ignore that patch. Sorry for the noise. Cheers, On 10/01/2018 07:57 PM, Julien Grall wrote: A follow-up patch will require to know the number of vCPUs when initializating the vGICv3 domain structure. However this information is not available at domain creation. This is only known once XEN_DOMCTL_max_vpus is called for that domain. In order to get the max vCPUs around, delay the domain part of the vGIC v3 initialization until the first vCPU of the domain is initialized. Signed-off-by: Julien Grall Tested-by: Shameer Kolothum Acked-but-disliked-by: Stefano Stabellini --- Cc: Andrew Cooper This is nasty but I can't find a better way for Xen 4.11 and older. We still need it for unstable because the number of vCPUs is not known in arch_domain_init. There are discussion to rework the domain creation a bit further but I would hope to fix the bug first. Andrew, I have CCed you to know whether you have a better idea where to place this call on Xen 4.11 and older. Changes in v2: - The patch is also needed for the time being on unstable - Add Stefano's recently invented tag - Add Shameer's tested tag --- xen/arch/arm/vgic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 4b42739a52..df1bab3a35 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1573,9 +1573,11 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = { .write = vgic_v3_distr_mmio_write, }; +static int vgic_v3_real_domain_init(struct domain *d); + static int vgic_v3_vcpu_init(struct vcpu *v) { -int i; +int i, rc; paddr_t rdist_base; struct vgic_rdist_region *region; unsigned int last_cpu; @@ -1584,6 +1586,19 @@ static int vgic_v3_vcpu_init(struct vcpu *v) struct domain *d = v->domain; /* + * This is the earliest place where the number of vCPUs is + * known. This is required to initialize correctly the vGIC v3 + * domain structure. We only to do that when vCPU 0 is + * initilialized. + */ +if ( v->vcpu_id == 0 ) +{ +rc = vgic_v3_real_domain_init(d); +if ( rc ) +return rc; +} + +/* * Find the region where the re-distributor lives. For this purpose, * we look one region ahead as we have only the first CPU in hand. */ @@ -1641,7 +1656,7 @@ static inline unsigned int vgic_v3_rdist_count(struct domain *d) GUEST_GICV3_RDIST_REGIONS; } -static int vgic_v3_domain_init(struct domain *d) +static int vgic_v3_real_domain_init(struct domain *d) { struct vgic_rdist_region *rdist_regions; int rdist_count, i, ret; @@ -1733,6 +1748,16 @@ static int vgic_v3_domain_init(struct domain *d) return 0; } +static int vgic_v3_domain_init(struct domain *d) +{ +/* + * The domain initialization for vGIC v3 is delayed until the first vCPU + * is created. This because the initialization may require to know the + * number of vCPUs that is not known when creating the domain. + */ +return 0; +} + static void vgic_v3_domain_free(struct domain *d) { vgic_v3_its_free_domain(d); -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 2/3] tools/libxl: Deprecate PV fields kernel, ramdisk, cmdline
The PV fields kernel, ramdisk, cmdline are only there for compatibility with old toolstack. Instead of manually copying them over to there new field, use the deprecated_by attribute in the IDL. Suggested-by: Roger Pau Monné Signed-off-by: Julien Grall --- Changes in v3: - Patch added --- tools/libxl/libxl_create.c | 19 --- tools/libxl/libxl_types.idl | 6 +++--- 2 files changed, 3 insertions(+), 22 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 580320d272..fe97eebdea 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -368,25 +368,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, b_info->shadow_memkb = 0; if (b_info->u.pv.slack_memkb == LIBXL_MEMKB_DEFAULT) b_info->u.pv.slack_memkb = 0; - -/* For compatibility, fill in b_info->kernel|ramdisk|cmdline - * with the value in u.pv, later processing will use - * b_info->kernel|ramdisk|cmdline only. - * User with old APIs that passes u.pv.kernel|ramdisk|cmdline - * is not affected. - */ -if (!b_info->kernel && b_info->u.pv.kernel) { -b_info->kernel = b_info->u.pv.kernel; -b_info->u.pv.kernel = NULL; -} -if (!b_info->ramdisk && b_info->u.pv.ramdisk) { -b_info->ramdisk = b_info->u.pv.ramdisk; -b_info->u.pv.ramdisk = NULL; -} -if (!b_info->cmdline && b_info->u.pv.cmdline) { -b_info->cmdline = b_info->u.pv.cmdline; -b_info->u.pv.cmdline = NULL; -} break; case LIBXL_DOMAIN_TYPE_PVH: libxl_defbool_setdefault(_info->u.pvh.pvshim, false); diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 2cceb8c057..3b8f967651 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -588,12 +588,12 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("rdm_mem_boundary_memkb", MemKB), ("mca_caps", uint64), ])), - ("pv", Struct(None, [("kernel", string), + ("pv", Struct(None, [("kernel", string, {'deprecated_by': 'kernel'}), ("slack_memkb", MemKB), ("bootloader", string, {'deprecated_by': 'bootloader'}), ("bootloader_args", libxl_string_list, {'deprecated_by': 'bootloader_args'}), - ("cmdline", string), - ("ramdisk", string), + ("cmdline", string, {'deprecated_by': 'cmdline'}), + ("ramdisk", string, {'deprecated_by': 'ramdisk'}), ("features", string, {'const': True}), # Use host's E820 for PCI passthrough. ("e820_host", libxl_defbool), -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 2/2] xen/arm: vgic-v3: Don't create empty re-distributor regions
At the moment, Xen is assuming the hardware domain will have the same number of re-distributor regions as the host. However, as the number of CPUs or the stride (e.g on GICv4) may be different we end up exposing regions which does not contain any re-distributors. When booting, Linux will go through all the re-distributor region to check whether a property (e.g vPLIs) is available accross all the re-distributors. This will result to a data abort on empty regions because there are no underlying re-distributor. So we need to limit the number of regions exposed to the hardware domain. The code reworked to only expose the minimun number of regions required by the hardware domain. It is assumed the regions will be populated starting from the first one. Lastly, rename vgic_v3_rdist_count to reflect the value return by the helper. Reported-by: Shameerali Kolothum Thodi Signed-off-by: Julien Grall Tested-by: Shameer Kolothum --- Changes in v2: - Rename vgic_v3_rdist_count to vgic_v3_max_rdist_count - Fixup #re-distributors - Fix typoes - Add Shameer's tested tag --- xen/arch/arm/gic-v3.c | 14 +++--- xen/arch/arm/vgic-v3.c | 21 ++--- 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c98a163ee7..2c1454f425 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1265,7 +1265,8 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, if ( res ) return res; -res = fdt_property_cell(fdt, "#redistributor-regions", gicv3.rdist_count); +res = fdt_property_cell(fdt, "#redistributor-regions", +d->arch.vgic.nr_regions); if ( res ) return res; @@ -1274,8 +1275,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, * GIC has two memory regions: Distributor + rdist regions * CPU interface and virtual cpu interfaces accessesed as System registers * So cells are created only for Distributor and rdist regions + * The hardware domain may not use all the regions. So only copy + * what is necessary. */ -new_len = new_len * (gicv3.rdist_count + 1); +new_len = new_len * (d->arch.vgic.nr_regions + 1); hw_reg = dt_get_property(gic, "reg", ); if ( !hw_reg ) @@ -1466,6 +1469,7 @@ static inline bool gic_dist_supports_dvis(void) } static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) + { struct acpi_subtable_header *header; struct acpi_madt_generic_interrupt *host_gicc, *gicc; @@ -1503,7 +1507,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) /* Add Generic Redistributor */ size = sizeof(struct acpi_madt_generic_redistributor); -for ( i = 0; i < gicv3.rdist_count; i++ ) +/* + * The hardware domain may not used all the regions. So only copy + * what is necessary. + */ +for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) { gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + table_len); gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR; diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index df1bab3a35..efe824c6fb 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1640,7 +1640,11 @@ static int vgic_v3_vcpu_init(struct vcpu *v) return 0; } -static inline unsigned int vgic_v3_rdist_count(struct domain *d) +/* + * Return the maximum number possible of re-distributor regions for + * a given domain. + */ +static inline unsigned int vgic_v3_max_rdist_count(struct domain *d) { /* * Normally there is only one GICv3 redistributor region. @@ -1662,7 +1666,7 @@ static int vgic_v3_real_domain_init(struct domain *d) int rdist_count, i, ret; /* Allocate memory for Re-distributor regions */ -rdist_count = vgic_v3_rdist_count(d); +rdist_count = vgic_v3_max_rdist_count(d); rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); if ( !rdist_regions ) @@ -1695,8 +1699,19 @@ static int vgic_v3_real_domain_init(struct domain *d) d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; first_cpu += size / GICV3_GICR_SIZE; + +if ( first_cpu >= d->max_vcpus ) +break; } +/* + * The hardware domain may not use all the re-distributors + * regions (e.g when the number of vCPUs does not match the + * number of pCPUs). Update the number of regions to avoid + * exposing unused region as they will not get emulated. + */ +d->arch.vgic.nr_regions = i + 1; + d->arch.vgic.intid_bits = vgic_v3_hw.intid_bits; } else @@ -1825,7 +1840,7 @@ int vgic_v3_init(struct domain *d, int *mmio_count) } /* GICD region + number of Redistributors */ -*mmio_count = vgic_v3_rdist_count(d) + 1; +*mmio_count = vgic_v3_max_rdist_count(d) + 1;
[Xen-devel] [PATCH v3 1/3] tools/libxl: Rename libxl__arch_domain_build_info_acpi_setdefault to...
libxl__arch_domain_build_info_setdefault A follow-up will require to modify default of multiple fields of build_info. So rename the function accordingly. No functional change. Signed-off-by: Julien Grall Reviewed-by: Roger Pau Monné Acked-by: Wei Liu --- Changes in v3: - Add Roger's reviewed-by - Add Wei's acked-by --- tools/libxl/libxl_arch.h | 3 +-- tools/libxl/libxl_arm.c| 4 ++-- tools/libxl/libxl_create.c | 2 +- tools/libxl/libxl_x86.c| 3 +-- 4 files changed, 5 insertions(+), 7 deletions(-) diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h index c8ccaaf14c..5ab0c95974 100644 --- a/tools/libxl/libxl_arch.h +++ b/tools/libxl/libxl_arch.h @@ -65,8 +65,7 @@ _hidden int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq); _hidden -void libxl__arch_domain_build_info_acpi_setdefault( -libxl_domain_build_info *b_info); +void libxl__arch_domain_build_info_setdefault(libxl_domain_build_info *b_info); _hidden int libxl__arch_extra_memory(libxl__gc *gc, diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c index baa0d38e01..699fd9ddc6 100644 --- a/tools/libxl/libxl_arm.c +++ b/tools/libxl/libxl_arm.c @@ -1110,9 +1110,9 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq) return xc_domain_bind_pt_spi_irq(CTX->xch, domid, irq, irq); } -void libxl__arch_domain_build_info_acpi_setdefault( -libxl_domain_build_info *b_info) +void libxl__arch_domain_build_info_setdefault(libxl_domain_build_info *b_info) { +/* ACPI is disabled by default */ libxl_defbool_setdefault(_info->acpi, false); } diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index dcfde7787e..580320d272 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -215,7 +215,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, if (!b_info->event_channels) b_info->event_channels = 1023; -libxl__arch_domain_build_info_acpi_setdefault(b_info); +libxl__arch_domain_build_info_setdefault(b_info); libxl_defbool_setdefault(_info->dm_restrict, false); switch (b_info->type) { diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c index 6f670b03b5..81523a568f 100644 --- a/tools/libxl/libxl_x86.c +++ b/tools/libxl/libxl_x86.c @@ -613,8 +613,7 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc, return rc; } -void libxl__arch_domain_build_info_acpi_setdefault( -libxl_domain_build_info *b_info) +void libxl__arch_domain_build_info_setdefault(libxl_domain_build_info *b_info) { libxl_defbool_setdefault(_info->acpi, true); } -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 1/2] xen/arm: vgic-v3: Delay the initialization of the domain information
A follow-up patch will require to know the number of vCPUs when initializating the vGICv3 domain structure. However this information is not available at domain creation. This is only known once XEN_DOMCTL_max_vpus is called for that domain. In order to get the max vCPUs around, delay the domain part of the vGIC v3 initialization until the first vCPU of the domain is initialized. Signed-off-by: Julien Grall Tested-by: Shameer Kolothum Acked-but-disliked-by: Stefano Stabellini --- Cc: Andrew Cooper This is nasty but I can't find a better way for Xen 4.11 and older. We still need it for unstable because the number of vCPUs is not known in arch_domain_init. There are discussion to rework the domain creation a bit further but I would hope to fix the bug first. Andrew, I have CCed you to know whether you have a better idea where to place this call on Xen 4.11 and older. Changes in v2: - The patch is also needed for the time being on unstable - Add Stefano's recently invented tag - Add Shameer's tested tag --- xen/arch/arm/vgic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 4b42739a52..df1bab3a35 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1573,9 +1573,11 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = { .write = vgic_v3_distr_mmio_write, }; +static int vgic_v3_real_domain_init(struct domain *d); + static int vgic_v3_vcpu_init(struct vcpu *v) { -int i; +int i, rc; paddr_t rdist_base; struct vgic_rdist_region *region; unsigned int last_cpu; @@ -1584,6 +1586,19 @@ static int vgic_v3_vcpu_init(struct vcpu *v) struct domain *d = v->domain; /* + * This is the earliest place where the number of vCPUs is + * known. This is required to initialize correctly the vGIC v3 + * domain structure. We only to do that when vCPU 0 is + * initilialized. + */ +if ( v->vcpu_id == 0 ) +{ +rc = vgic_v3_real_domain_init(d); +if ( rc ) +return rc; +} + +/* * Find the region where the re-distributor lives. For this purpose, * we look one region ahead as we have only the first CPU in hand. */ @@ -1641,7 +1656,7 @@ static inline unsigned int vgic_v3_rdist_count(struct domain *d) GUEST_GICV3_RDIST_REGIONS; } -static int vgic_v3_domain_init(struct domain *d) +static int vgic_v3_real_domain_init(struct domain *d) { struct vgic_rdist_region *rdist_regions; int rdist_count, i, ret; @@ -1733,6 +1748,16 @@ static int vgic_v3_domain_init(struct domain *d) return 0; } +static int vgic_v3_domain_init(struct domain *d) +{ +/* + * The domain initialization for vGIC v3 is delayed until the first vCPU + * is created. This because the initialization may require to know the + * number of vCPUs that is not known when creating the domain. + */ +return 0; +} + static void vgic_v3_domain_free(struct domain *d) { vgic_v3_its_free_domain(d); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 0/3] tools/libxl: Switch Arm guest type to PVH
Hi all, This small patch series adds switch Arm guest type to PVH in libxl. See patch #2 for the rationale. Cheers, Julien Grall (3): tools/libxl: Rename libxl__arch_domain_build_info_acpi_setdefault to... tools/libxl: Deprecate PV fields kernel, ramdisk, cmdline tools/libxl: Switch Arm guest type to PVH docs/man/xl.cfg.pod.5.in| 5 +++-- tools/libxl/libxl_arch.h| 4 ++-- tools/libxl/libxl_arm.c | 28 +--- tools/libxl/libxl_create.c | 21 + tools/libxl/libxl_types.idl | 6 +++--- tools/libxl/libxl_x86.c | 4 ++-- tools/xl/xl_parse.c | 4 7 files changed, 40 insertions(+), 32 deletions(-) -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 3/3] tools/libxl: Switch Arm guest type to PVH
Currently, the toolstack is considering Arm guest always PV. However, they are very similar to PVH because HW virtualization extension are used and QEMU is not started. So switch Arm guest type to PVH. To keep compatibility with toolstack creating Arm guest with PV type (e.g libvirt), libxl will now convert those guests to PVH. Furthermore, the default type for Arm in xl will now be PVH to allow smooth transition for user. Signed-off-by: Julien Grall --- This was discussed at Xen Summit and also in various thread on xen-devel. The latest one was when Andrew sent a patch to deny guest creation on Arm with XEN_DOMCTL_CDF_hap unset. I suspect we first implemented Arm guest as PV in libxl because PVH was non-existent and the type was easier to avoid spawning QEMU. Note that Linux and Xen are already considering Arm guest as PVH. Changes in v3: - Properly reset u.pvh - Update documentation and print - Return ERROR_INVAL rather than ERROR_FAIL Changes in v2: - Rather than denying PV guest, convert them to PVH --- docs/man/xl.cfg.pod.5.in | 5 +++-- tools/libxl/libxl_arch.h | 3 ++- tools/libxl/libxl_arm.c| 26 -- tools/libxl/libxl_create.c | 2 +- tools/libxl/libxl_x86.c| 3 ++- tools/xl/xl_parse.c| 4 6 files changed, 36 insertions(+), 7 deletions(-) diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in index b72718151b..b1c0be14cd 100644 --- a/docs/man/xl.cfg.pod.5.in +++ b/docs/man/xl.cfg.pod.5.in @@ -80,13 +80,14 @@ single host must be unique. =item B Specifies that this is to be a PV domain, suitable for hosting Xen-aware -guest operating systems. This is the default. +guest operating systems. This is the default on x86. =item B Specifies that this is to be an PVH domain. That is a lightweight HVM-like guest without a device model and without many of the emulated devices -available to HVM guests. Note that this mode requires a PVH aware kernel. +available to HVM guests. Note that this mode requires a PVH aware kernel on +x86. This is the default on Arm. =item B diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h index 5ab0c95974..930570ef1e 100644 --- a/tools/libxl/libxl_arch.h +++ b/tools/libxl/libxl_arch.h @@ -65,7 +65,8 @@ _hidden int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq); _hidden -void libxl__arch_domain_build_info_setdefault(libxl_domain_build_info *b_info); +void libxl__arch_domain_build_info_setdefault(libxl__gc *gc, + libxl_domain_build_info *b_info); _hidden int libxl__arch_extra_memory(libxl__gc *gc, diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c index 699fd9ddc6..25dc3defc6 100644 --- a/tools/libxl/libxl_arm.c +++ b/tools/libxl/libxl_arm.c @@ -953,7 +953,11 @@ int libxl__arch_domain_init_hw_description(libxl__gc *gc, int rc; uint64_t val; -assert(info->type == LIBXL_DOMAIN_TYPE_PV); +if (info->type != LIBXL_DOMAIN_TYPE_PVH) { +LOG(ERROR, "Unsupported Arm guest type %s", +libxl_domain_type_to_string(info->type)); +return ERROR_INVAL; +} /* Set the value of domain param HVM_PARAM_CALLBACK_IRQ. */ val = MASK_INSR(HVM_PARAM_CALLBACK_TYPE_PPI, @@ -1110,10 +1114,28 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq) return xc_domain_bind_pt_spi_irq(CTX->xch, domid, irq, irq); } -void libxl__arch_domain_build_info_setdefault(libxl_domain_build_info *b_info) +void libxl__arch_domain_build_info_setdefault(libxl__gc *gc, + libxl_domain_build_info *b_info) { /* ACPI is disabled by default */ libxl_defbool_setdefault(_info->acpi, false); + +/* + * Arm guest are now considered as PVH by the toolstack. To allow + * compatibility with previous toolstack, PV guest are automatically + * converted to PVH. + */ +if (b_info->type != LIBXL_DOMAIN_TYPE_PV) +return; + +LOG(WARN, "Converting PV guest to PVH."); +LOG(WARN, "Arm guest are now PVH."); +LOG(WARN, "Please fix your configuration file/toolstack."); + +/* Re-initialize type to PVH and all associated fields to defaults. */ +memset(_info->u, '\0', sizeof(b_info->u)); +b_info->type = LIBXL_DOMAIN_TYPE_INVALID; +libxl_domain_build_info_init_type(b_info, LIBXL_DOMAIN_TYPE_PVH); } /* diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index fe97eebdea..320dbed3c6 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -215,7 +215,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, if (!b_info->event_channels) b_info->event_channels = 1023; -libxl__arch_domain_build_info_setdefault(b_info); +libxl__arch_domain_build_info_setdefault(gc, b_info); libxl_defbool_setdefault(_info->dm_restrict, false); switch (b_info->type) { diff --git
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On Mon, 2018-10-01 at 16:35 +0100, Wei Liu wrote: > On Mon, Oct 01, 2018 at 04:19:07PM +0100, George Dunlap wrote: > > Wait, the migration code reads the scheduler parameters -- even if > > these > > have not been explicitly set by the admin -- and sends them along > > with > > the migration stream? And if the remote scheduler is different, > > the > > migration fails? > > > > That's not so good. :-) > > But one can argue that the guest is specific configured that way so > it's > parameters should be preserved. We normally analyse things on a case > by > case basis. > Sure, but then this means that, in the restore path, we should query the scheduler which is in use on the host. If it is the same one that we also see in the migration stream, we try to set the parameters. If it is a different one, we skip the setparam all together (which is basically what Juergen is suggesting later in the thread, AFAICT). I don't think I've ever worked on migration, but I can try to look into fixing this (tomorrow). Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Software Engineer @ SUSE https://www.suse.com/ signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On Mon, 2018-10-01 at 18:07 +0200, Juergen Gross wrote: > On 01/10/2018 17:48, George Dunlap wrote: > > On 10/01/2018 04:40 PM, Andrew Cooper wrote: > > > On 01/10/18 16:35, Wei Liu wrote: > > > > > Wait, the migration code reads the scheduler parameters -- > > > > > even if these > > > > > have not been explicitly set by the admin -- and sends them > > > > > along with > > > > > the migration stream? And if the remote scheduler is > > > > > different, the > > > > > migration fails? > > > > > > > > > > That's not so good. :-) > > > > > > > > But one can argue that the guest is specific configured that > > > > way so it's > > > > parameters should be preserved. We normally analyse things on a > > > > case by > > > > case basis. > > > > > > If there isn't an obvious fix, then the switch of default > > > scheduler > > > needs reverting until there is a fix present. This is currently > > > blocking master. > > > > Agreed. I'd argue for ignoring failures to set scheduler > > parameters on > > migrate, on the grounds that this will be less risk to the project > > as a > > whole than reverting credit2 again. But either way we should do > > something quickly. > > We should ignore a mismatch of the scheduler. Failures when setting > parameters for a matching scheduler should not be ignored IMO. > Indeed! Especially considering that this isn't really related on what the default scheduler is (despite it being making Credit2 default that triggers the issue). In fact, what if: - user uses Credit1 (default and supported) on host A - user uses Credit2 (supported) on host B - user migrates VM - BOOOM! So, unless it is intended --and, I'd say, also documented somewhere-- that migrating between hosts which use different schedulers is to be avoided, this is already a bug, whatever the default scheduler is... George, let me know if you're working on a fix already, or if I should do that myself. Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Software Engineer @ SUSE https://www.suse.com/ signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline test] 128274: regressions - FAIL
flight 128274 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/128274/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qcow217 guest-localmigrate/x10 fail REGR. vs. 128215 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 128215 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 128215 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 128215 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 128215 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 128215 test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass version targeted for testing: qemuu07f426c35eddd79388a23d11cb278600d7e3831d baseline version: qemuu042938f46e1c477419d1931381fdadffaa49d45e Last test of basis 128215 2018-09-29 02:08:34 Z2 days Testing same since 128274 2018-10-01 08:36:57 Z0 days1 attempts People who touched revisions under test: Emilio G. Cota Peter Maydell Richard Henderson Roman Kapl jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt
Re: [Xen-devel] [PATCH 2/2] tools/libxl: Switch Arm guest type to PVH
Hi Roger, On 10/01/2018 05:37 PM, Roger Pau Monné wrote: On Mon, Oct 01, 2018 at 04:33:45PM +0100, Julien Grall wrote: Hi, On 10/01/2018 04:27 PM, Roger Pau Monné wrote: On Tue, Sep 25, 2018 at 08:36:36PM +0100, Julien Grall wrote: Hi Roger, Sorry for the late reply. On 09/03/2018 03:40 PM, Roger Pau Monné wrote: On Mon, Sep 03, 2018 at 12:15:16PM +0100, Julien Grall wrote: On 03/09/18 12:09, Julien Grall wrote: On 23/08/18 08:58, Roger Pau Monné wrote: On Wed, Aug 22, 2018 at 06:48:05PM +0100, Julien Grall wrote: + + b_info->type = LIBXL_DOMAIN_TYPE_PVH; + + /* + * They only fie ld in u.pv that matters on Arm are: kernel, cmdline, + * ramdisk. + */ + + if (!b_info->kernel && b_info->u.pv.kernel) + b_info->kernel = b_info->u.pv.kernel; + + if (!b_info->ramdisk && b_info->u.pv.ramdisk) + b_info->ramdisk = b_info->u.pv.ramdisk; + + if (!b_info->cmdline && b_info->u.pv.cmdline) + b_info->cmdline = b_info->u.pv.cmdline; + + /* Reset b_info->u.pvh to default values */ + memset(_info->u.pvh, 0, sizeof(b_info->u.pvh)); I'm afraid that's not correct. The default values for u.pvh are set by libxl__domain_build_info_setdefault. I thought that this should be covered by the switch right after the call of libxl__arch_domain_build_info_setdefault. Did I miss anything? Oh right, libxl__arch_domain_build_info_setdefault is called by libxl__domain_build_info_setdefault. What I wanted to do here is resetting the union to 0 so you don't get data mangled by the pv fields. Another possible option I think would be to mark those fields as deprecated in the IDL, and libxl__domain_build_info_copy_deprecated will take care of copying them to the new place. In fact I think all guest types should be using the top level kernel, ramdisk and cmdline fields. I will have a look at it. I'm not specially comfortable with changing the guest type in the middle of libxl__domain_build_info_setdefault, but I also don't have a much better suggestion apart from using the deprecation helper. I forgot to answer to this bit. I don't think the deprecation helper will do all the jobs. There are still PV specific parameters: slack_memkb, features, e820_host. Those can be left inside the PV sub-struct and shouldn't be marked as deprecated. Those are not necessary for Arm, if you don't zero them then you will not initialize the PVH structure with default values. How do you suggest to handle them? But I guess ARM doesn't use any of those fields (or else they should be moved to the pvh sub-struct anyway)? Those fields should not be used by Arm. Looking at the current fields in pv, they all should be zeroed. However, I am not sure if we can assume that will always be the case. If we can assume it, then I think it would just be sufficient to have b_info->type = LIBXL_DOMAIN_TYPE_PVH in Arm. Any thoughts? If we go down this route for ARM I think you should add to xl.cfg that the `type` option only applies to x86, and that there's only one guest type on ARM. IMO it's best to not use type if there's only one type available, so that it can be intruded later on ARM if required. That's not what we discussed before. The idea is to default Arm guest to PVH for xl. But "type=" would still be available. We still have to support other type as other toolstack (e.g libvirt) may use PV for Arm. So the idea here is libxl will check if it was PV and convert to PVH. In an earlier reply, you said that we can use deprecated to copy most of the options over. However, can we always assume the pv.* fields will always be zero after the value was copied over? If not, then we would have to memset them. Yes, the field is cleared/disposed after the value is copied over, see libxl_C_type_copy_deprecated in gentypes.py. That's answer part of my question :). AFAIU, nothing promise that a default value will always be 0 (see libxl_devid). I am concerned that someone may decide to add such field in either PV or PVH in the future. Anyway, I think I find a way to address that, I will resend a new version later today. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 1/2] xen/arm: vgic-v3: Delay the initialization of the domain information
A follow-up patch will require to know the number of vCPUs when initializating the vGICv3 domain structure. However this information is not available at domain creation. This is only known once XEN_DOMCTL_max_vpus is called for that domain. In order to get the max vCPUs around, delay the domain part of the vGIC v3 initialization until the first vCPU of the domain is initialized. Signed-off-by: Julien Grall Tested-by: Shameer Kolothum Acked-but-disliked-by: Stefano Stabellini --- Cc: Andrew Cooper This is nasty but I can't find a better way for Xen 4.11 and older. We still need it for unstable because the number of vCPUs is not known in arch_domain_init. There are discussion to rework the domain creation a bit further but I would hope to fix the bug first. Andrew, I have CCed you to know whether you have a better idea where to place this call on Xen 4.11 and older. Changes in v2: - The patch is also needed for the time being on unstable - Add Stefano's recently invented tag - Add Shameer's tested tag --- xen/arch/arm/vgic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 4b42739a52..df1bab3a35 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1573,9 +1573,11 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = { .write = vgic_v3_distr_mmio_write, }; +static int vgic_v3_real_domain_init(struct domain *d); + static int vgic_v3_vcpu_init(struct vcpu *v) { -int i; +int i, rc; paddr_t rdist_base; struct vgic_rdist_region *region; unsigned int last_cpu; @@ -1584,6 +1586,19 @@ static int vgic_v3_vcpu_init(struct vcpu *v) struct domain *d = v->domain; /* + * This is the earliest place where the number of vCPUs is + * known. This is required to initialize correctly the vGIC v3 + * domain structure. We only to do that when vCPU 0 is + * initilialized. + */ +if ( v->vcpu_id == 0 ) +{ +rc = vgic_v3_real_domain_init(d); +if ( rc ) +return rc; +} + +/* * Find the region where the re-distributor lives. For this purpose, * we look one region ahead as we have only the first CPU in hand. */ @@ -1641,7 +1656,7 @@ static inline unsigned int vgic_v3_rdist_count(struct domain *d) GUEST_GICV3_RDIST_REGIONS; } -static int vgic_v3_domain_init(struct domain *d) +static int vgic_v3_real_domain_init(struct domain *d) { struct vgic_rdist_region *rdist_regions; int rdist_count, i, ret; @@ -1733,6 +1748,16 @@ static int vgic_v3_domain_init(struct domain *d) return 0; } +static int vgic_v3_domain_init(struct domain *d) +{ +/* + * The domain initialization for vGIC v3 is delayed until the first vCPU + * is created. This because the initialization may require to know the + * number of vCPUs that is not known when creating the domain. + */ +return 0; +} + static void vgic_v3_domain_free(struct domain *d) { vgic_v3_its_free_domain(d); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 2/2] xen/arm: vgic-v3: Don't create empty re-distributor regions
At the moment, Xen is assuming the hardware domain will have the same number of re-distributor regions as the host. However, as the number of CPUs or the stride (e.g on GICv4) may be different we end up exposing regions which does not contain any re-distributors. When booting, Linux will go through all the re-distributor region to check whether a property (e.g vPLIs) is available accross all the re-distributors. This will result to a data abort on empty regions because there are no underlying re-distributor. So we need to limit the number of regions exposed to the hardware domain. The code reworked to only expose the minimun number of regions required by the hardware domain. It is assumed the regions will be populated starting from the first one. Lastly, rename vgic_v3_rdist_count to reflect the value return by the helper. Reported-by: Shameerali Kolothum Thodi Signed-off-by: Julien Grall Tested-by: Shameer Kolothum --- Changes in v2: - Rename vgic_v3_rdist_count to vgic_v3_max_rdist_count - Fixup #re-distributors - Fix typoes - Add Shameer's tested tag --- xen/arch/arm/gic-v3.c | 14 +++--- xen/arch/arm/vgic-v3.c | 21 ++--- 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c98a163ee7..2c1454f425 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1265,7 +1265,8 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, if ( res ) return res; -res = fdt_property_cell(fdt, "#redistributor-regions", gicv3.rdist_count); +res = fdt_property_cell(fdt, "#redistributor-regions", +d->arch.vgic.nr_regions); if ( res ) return res; @@ -1274,8 +1275,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d, * GIC has two memory regions: Distributor + rdist regions * CPU interface and virtual cpu interfaces accessesed as System registers * So cells are created only for Distributor and rdist regions + * The hardware domain may not use all the regions. So only copy + * what is necessary. */ -new_len = new_len * (gicv3.rdist_count + 1); +new_len = new_len * (d->arch.vgic.nr_regions + 1); hw_reg = dt_get_property(gic, "reg", ); if ( !hw_reg ) @@ -1466,6 +1469,7 @@ static inline bool gic_dist_supports_dvis(void) } static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) + { struct acpi_subtable_header *header; struct acpi_madt_generic_interrupt *host_gicc, *gicc; @@ -1503,7 +1507,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) /* Add Generic Redistributor */ size = sizeof(struct acpi_madt_generic_redistributor); -for ( i = 0; i < gicv3.rdist_count; i++ ) +/* + * The hardware domain may not used all the regions. So only copy + * what is necessary. + */ +for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) { gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + table_len); gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR; diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index df1bab3a35..efe824c6fb 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1640,7 +1640,11 @@ static int vgic_v3_vcpu_init(struct vcpu *v) return 0; } -static inline unsigned int vgic_v3_rdist_count(struct domain *d) +/* + * Return the maximum number possible of re-distributor regions for + * a given domain. + */ +static inline unsigned int vgic_v3_max_rdist_count(struct domain *d) { /* * Normally there is only one GICv3 redistributor region. @@ -1662,7 +1666,7 @@ static int vgic_v3_real_domain_init(struct domain *d) int rdist_count, i, ret; /* Allocate memory for Re-distributor regions */ -rdist_count = vgic_v3_rdist_count(d); +rdist_count = vgic_v3_max_rdist_count(d); rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); if ( !rdist_regions ) @@ -1695,8 +1699,19 @@ static int vgic_v3_real_domain_init(struct domain *d) d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; first_cpu += size / GICV3_GICR_SIZE; + +if ( first_cpu >= d->max_vcpus ) +break; } +/* + * The hardware domain may not use all the re-distributors + * regions (e.g when the number of vCPUs does not match the + * number of pCPUs). Update the number of regions to avoid + * exposing unused region as they will not get emulated. + */ +d->arch.vgic.nr_regions = i + 1; + d->arch.vgic.intid_bits = vgic_v3_hw.intid_bits; } else @@ -1825,7 +1840,7 @@ int vgic_v3_init(struct domain *d, int *mmio_count) } /* GICD region + number of Redistributors */ -*mmio_count = vgic_v3_rdist_count(d) + 1; +*mmio_count = vgic_v3_max_rdist_count(d) + 1;
[Xen-devel] [PATCH v2 0/2] xen/arm: vgic-v3: bug fixes
Hi all, This series is meant to address Dom0 boot failure when running on GICv4 platforms as well as when the number of vCPUs is not equal to the numbers of pCPUs. They should be backported up Xen 4.8. Cheers, Julien Grall (2): xen/arm: vgic-v3: Delay the initialization of the domain information xen/arm: vgic-v3: Don't create empty re-distributor regions xen/arch/arm/gic-v3.c | 14 +++--- xen/arch/arm/vgic-v3.c | 50 +- 2 files changed, 56 insertions(+), 8 deletions(-) -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] tools/libxl: Switch Arm guest type to PVH
On Mon, Oct 01, 2018 at 04:33:45PM +0100, Julien Grall wrote: > Hi, > > On 10/01/2018 04:27 PM, Roger Pau Monné wrote: > > On Tue, Sep 25, 2018 at 08:36:36PM +0100, Julien Grall wrote: > > > Hi Roger, > > > > > > Sorry for the late reply. > > > > > > On 09/03/2018 03:40 PM, Roger Pau Monné wrote: > > > > On Mon, Sep 03, 2018 at 12:15:16PM +0100, Julien Grall wrote: > > > > > On 03/09/18 12:09, Julien Grall wrote: > > > > > > > > > > > > > > > > > > On 23/08/18 08:58, Roger Pau Monné wrote: > > > > > > > On Wed, Aug 22, 2018 at 06:48:05PM +0100, Julien Grall wrote: > > > > > > > > > > > > > > > > > > > + > > > > > > > > > > + b_info->type = LIBXL_DOMAIN_TYPE_PVH; > > > > > > > > > > + > > > > > > > > > > + /* > > > > > > > > > > + * They only fie ld in u.pv that matters on Arm are: > > > > > > > > > > kernel, cmdline, > > > > > > > > > > + * ramdisk. > > > > > > > > > > + */ > > > > > > > > > > + > > > > > > > > > > + if (!b_info->kernel && b_info->u.pv.kernel) > > > > > > > > > > + b_info->kernel = b_info->u.pv.kernel; > > > > > > > > > > + > > > > > > > > > > + if (!b_info->ramdisk && b_info->u.pv.ramdisk) > > > > > > > > > > + b_info->ramdisk = b_info->u.pv.ramdisk; > > > > > > > > > > + > > > > > > > > > > + if (!b_info->cmdline && b_info->u.pv.cmdline) > > > > > > > > > > + b_info->cmdline = b_info->u.pv.cmdline; > > > > > > > > > > + > > > > > > > > > > + /* Reset b_info->u.pvh to default values */ > > > > > > > > > > + memset(_info->u.pvh, 0, sizeof(b_info->u.pvh)); > > > > > > > > > > > > > > > > > > I'm afraid that's not correct. The default values for u.pvh > > > > > > > > > are set > > > > > > > > > by libxl__domain_build_info_setdefault. > > > > > > > > > > > > > > > > I thought that this should be covered by the switch right after > > > > > > > > the call of > > > > > > > > libxl__arch_domain_build_info_setdefault. Did I miss anything? > > > > > > > > > > > > > > Oh right, libxl__arch_domain_build_info_setdefault is called by > > > > > > > libxl__domain_build_info_setdefault. > > > > > > > > > > > > > > > What I wanted to do here is resetting the union to 0 so you > > > > > > > > don't get data > > > > > > > > mangled by the pv fields. > > > > > > > > > > > > > > Another possible option I think would be to mark those fields as > > > > > > > deprecated in the IDL, and > > > > > > > libxl__domain_build_info_copy_deprecated > > > > > > > will take care of copying them to the new place. In fact I think > > > > > > > all > > > > > > > guest types should be using the top level kernel, ramdisk and > > > > > > > cmdline > > > > > > > fields. > > > > > > > > > > > > I will have a look at it. > > > > > > > > > > > > > > > > > > > > I'm not specially comfortable with changing the guest type in the > > > > > > > middle of libxl__domain_build_info_setdefault, but I also don't > > > > > > > have a > > > > > > > much better suggestion apart from using the deprecation helper. > > > > > > > > > > I forgot to answer to this bit. I don't think the deprecation helper > > > > > will do > > > > > all the jobs. There are still PV specific parameters: slack_memkb, > > > > > features, > > > > > e820_host. > > > > > > > > Those can be left inside the PV sub-struct and shouldn't be marked as > > > > deprecated. > > > > > > > > > Those are not necessary for Arm, if you don't zero them then you will > > > > > not > > > > > initialize the PVH structure with default values. How do you suggest > > > > > to > > > > > handle them? > > > > > > > > But I guess ARM doesn't use any of those fields (or else they should > > > > be moved to the pvh sub-struct anyway)? > > > > > > Those fields should not be used by Arm. Looking at the current fields in > > > pv, > > > they all should be zeroed. > > > > > > However, I am not sure if we can assume that will always be the case. > > > If we can assume it, then I think it would just be sufficient to have > > > b_info->type = LIBXL_DOMAIN_TYPE_PVH in Arm. > > > > > > Any thoughts? > > > > If we go down this route for ARM I think you should add to xl.cfg that > > the `type` option only applies to x86, and that there's only one guest > > type on ARM. IMO it's best to not use type if there's only one type > > available, so that it can be intruded later on ARM if required. > > That's not what we discussed before. The idea is to default Arm guest to PVH > for xl. But "type=" would still be available. > > We still have to support other type as other toolstack (e.g libvirt) may use > PV for Arm. So the idea here is libxl will check if it was PV and convert to > PVH. > > In an earlier reply, you said that we can use deprecated to copy most of the > options over. However, can we always assume the pv.* fields will always be > zero after the value was copied over? If not, then we would have to memset > them. Yes, the field is cleared/disposed after the value is copied over, see libxl_C_type_copy_deprecated in
[Xen-devel] [PATCH v2 2/2] ns16550: enable use of PCI MSI
Which, on x86, requires fiddling with the INTx bit in PCI config space, since for internally used MSI we can't delegate this to Dom0. ns16550_init_postirq() also needs (benign) re-ordering of its operations. Signed-off-by: Jan Beulich --- v2: Re-base. --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -307,7 +307,7 @@ Flag to indicate whether to probe for a ACPI indicating none to be there. ### com1,com2 -> `= [/][,[DPS][,[|pci|amt][,[][,[][,[]]` +> `= [/][,[DPS][,[|pci|amt][,[|msi][,[][,[]]` Both option `com1` and `com2` follow the same format. @@ -328,7 +328,7 @@ Both option `com1` and `com2` follow the * `` is an integer which specifies the IO base port for UART registers. * `` is the IRQ number to use, or `0` to use the UART in poll - mode only. + mode only, or `msi` to set up a Message Signaled Interrupt. * `` is the PCI location of the UART, in `:.` notation. * `` is the PCI bridge behind which is the UART, in --- a/xen/arch/x86/msi.c +++ b/xen/arch/x86/msi.c @@ -742,6 +742,16 @@ static int msi_capability_init(struct pc *desc = entry; /* Restore the original MSI enabled bits */ +if ( !hardware_domain ) +{ +/* + * ..., except for internal requests (before Dom0 starts), in which + * case we rather need to behave "normally", i.e. not follow the split + * brain model where Dom0 actually enables MSI (and disables INTx). + */ +pci_intx(dev, 0); +control |= PCI_MSI_FLAGS_ENABLE; +} pci_conf_write16(seg, bus, slot, func, msi_control_reg(pos), control); return 0; @@ -1019,6 +1029,18 @@ static int msix_capability_init(struct p ++msix->used_entries; /* Restore MSI-X enabled bits */ +if ( !hardware_domain ) +{ +/* + * ..., except for internal requests (before Dom0 starts), in which + * case we rather need to behave "normally", i.e. not follow the split + * brain model where Dom0 actually enables MSI (and disables INTx). + */ +pci_intx(dev, 0); +control |= PCI_MSIX_FLAGS_ENABLE; +control &= ~PCI_MSIX_FLAGS_MASKALL; +maskall = 0; +} msix->host_maskall = maskall; pci_conf_write16(seg, bus, slot, func, msix_control_reg(pos), control); @@ -1073,6 +1095,8 @@ static void __pci_disable_msi(struct msi dev = entry->dev; msi_set_enable(dev, 0); +if ( entry->irq > 0 && !(irq_to_desc(entry->irq)->status & IRQ_GUEST) ) +pci_intx(dev, 1); BUG_ON(list_empty(>msi_list)); } --- a/xen/drivers/char/ns16550.c +++ b/xen/drivers/char/ns16550.c @@ -92,6 +92,7 @@ static struct ns16550 { u32 bar64; u16 cr; u8 bar_idx; +bool_t msi; const struct ns16550_config_param *param; /* Points into .init.*! */ #endif } ns16550_com[2] = { { 0 } }; @@ -712,6 +713,16 @@ static void __init ns16550_init_preirq(s uart->fifo_size = 16; } +static void __init ns16550_init_irq(struct serial_port *port) +{ +#ifdef CONFIG_HAS_PCI +struct ns16550 *uart = port->uart; + +if ( uart->msi ) +uart->irq = create_irq(0); +#endif +} + static void ns16550_setup_postirq(struct ns16550 *uart) { if ( uart->irq > 0 ) @@ -746,17 +757,6 @@ static void __init ns16550_init_postirq( uart->timeout_ms = max_t( unsigned int, 1, (bits * uart->fifo_size * 1000) / uart->baud); -if ( uart->irq > 0 ) -{ -uart->irqaction.handler = ns16550_interrupt; -uart->irqaction.name= "ns16550"; -uart->irqaction.dev_id = port; -if ( (rc = setup_irq(uart->irq, 0, >irqaction)) != 0 ) -printk("ERROR: Failed to allocate ns16550 IRQ %d\n", uart->irq); -} - -ns16550_setup_postirq(uart); - #ifdef CONFIG_HAS_PCI if ( uart->bar || uart->ps_bdf_enable ) { @@ -777,8 +777,65 @@ static void __init ns16550_init_postirq( uart->ps_bdf[0], uart->ps_bdf[1], uart->ps_bdf[2]); } + +if ( uart->msi ) +{ +struct msi_info msi = { +.bus = uart->ps_bdf[0], +.devfn = PCI_DEVFN(uart->ps_bdf[1], uart->ps_bdf[2]), +.irq = rc = uart->irq, +.entry_nr = 1 +}; + +if ( rc > 0 ) +{ +struct msi_desc *msi_desc = NULL; + +pcidevs_lock(); + +rc = pci_enable_msi(, _desc); +if ( !rc ) +{ +struct irq_desc *desc = irq_to_desc(msi.irq); +unsigned long flags; + +spin_lock_irqsave(>lock, flags); +rc = setup_msi_irq(desc, msi_desc); +spin_unlock_irqrestore(>lock, flags); +if ( rc ) +pci_disable_msi(msi_desc); +} + +pcidevs_unlock(); + +
Re: [Xen-devel] [PATCH] libxl: keep assigned pci devices across domain reboots
Fill the from_xenstore libxl_device_type hook for PCI devices so that libxl_retrieve_domain_configuration can properly retrieve PCI devices from xenstore. This fixes disappearing pci devices across domain reboots. This patch seems to be committed now. Please backport this to Xen 4.10 stable branch, for upcoming 4.10.3, because original bugreport was about Xen 4.10. Thanks to devs for the patch. I tested the patch and I can confirm that it fixes my original problem on Xen 4.10. To use with Xen 4.10 you need a small backport patch (attached). Regards Andreas backport-4.10.patch Description: Binary data ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 1/2] console: adjust IRQ initialization
In order for a Xen internal PCI device driver to enable MSI on the device, we need another hook which the driver can use to create the IRQ (doing this in the init_preirq hook is too early, since IRQ code hasn't got initialized at that time yet, and doing it in init_postirq is too late because at least on x86 smp_intr_init() needs to know the IRQ number). On x86 this additionally requires a slight ordering change to IRQ initialization, to facilitate calling the new hook between basic initialization and the call path leading to smp_intr_init(). Signed-off-by: Jan Beulich --- v2: Re-base. --- a/xen/arch/x86/i8259.c +++ b/xen/arch/x86/i8259.c @@ -341,8 +341,6 @@ void __init init_IRQ(void) init_8259A(0); -BUG_ON(init_irq_data() < 0); - for (irq = 0; platform_legacy_irq(irq); irq++) { struct irq_desc *desc = irq_to_desc(irq); --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -677,6 +677,7 @@ void __init noreturn __start_xen(unsigne unsigned long nr_pages, raw_max_page, modules_headroom, *module_map; int i, j, e820_warn = 0, bytes = 0; bool acpi_boot_table_init_done = false, relocated = false; +int ret; struct ns16550_defaults ns16550 = { .data_bits = 8, .parity= 'n', @@ -1559,6 +1560,12 @@ void __init noreturn __start_xen(unsigne x2apic_bsp_setup(); +ret = init_irq_data(); +if ( ret < 0 ) +panic("Error %d setting up IRQ data\n", ret); + +console_init_irq(); + init_IRQ(); module_map = xmalloc_array(unsigned long, BITS_TO_LONGS(mbi->mods_count)); @@ -1661,7 +1668,7 @@ void __init noreturn __start_xen(unsigne if ( (park_offline_cpus || num_online_cpus() < max_cpus) && !cpu_online(i) ) { -int ret = cpu_up(i); +ret = cpu_up(i); if ( ret != 0 ) printk("Failed to bring up CPU %u (error %d)\n", i, ret); else if ( num_online_cpus() > max_cpus || --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -909,6 +909,11 @@ void __init console_init_ring(void) printk("Allocated console ring of %u KiB.\n", opt_conring_size >> 10); } +void __init console_init_irq(void) +{ +serial_init_irq(); +} + void __init console_init_postirq(void) { serial_init_postirq(); --- a/xen/drivers/char/serial.c +++ b/xen/drivers/char/serial.c @@ -503,6 +503,15 @@ void __init serial_init_preirq(void) com[i].driver->init_preirq([i]); } +void __init serial_init_irq(void) +{ +unsigned int i; + +for ( i = 0; i < ARRAY_SIZE(com); i++ ) +if ( com[i].driver && com[i].driver->init_irq ) +com[i].driver->init_irq([i]); +} + void __init serial_init_postirq(void) { int i; --- a/xen/include/xen/console.h +++ b/xen/include/xen/console.h @@ -15,6 +15,7 @@ long read_console_ring(struct xen_sysctl void console_init_preirq(void); void console_init_ring(void); +void console_init_irq(void); void console_init_postirq(void); void console_endboot(void); int console_has(const char *device); --- a/xen/include/xen/serial.h +++ b/xen/include/xen/serial.h @@ -64,6 +64,7 @@ struct serial_port { struct uart_driver { /* Driver initialisation (pre- and post-IRQ subsystem setup). */ void (*init_preirq)(struct serial_port *); +void (*init_irq)(struct serial_port *); void (*init_postirq)(struct serial_port *); /* Hook to clean up after Xen bootstrap (before domain 0 runs). */ void (*endboot)(struct serial_port *); @@ -99,8 +100,9 @@ struct uart_driver { #define SERHND_LO (1<<3) /* Ditto, except that the MSB is cleared. */ #define SERHND_COOKED (1<<4) /* Newline/carriage-return translation?*/ -/* Two-stage initialisation (before/after IRQ-subsystem initialisation). */ +/* Three-stage initialisation (before/during/after IRQ-subsystem setup). */ void serial_init_preirq(void); +void serial_init_irq(void); void serial_init_postirq(void); /* Clean-up hook before domain 0 runs. */ ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH RFC] mm/memory_hotplug: Introduce memory block types
> How should a policy in user space look like when new memory gets added > - on s390x? Not onlining paravirtualized memory is very wrong. Because we're going to balloon it away in a moment anyway? We have auto-onlining. Why isn't that being used on s390? > So the type of memory is very important here to have in user space. > Relying on checks like "isS390()", "isKVMGuest()" or "isHyperVGuest()" > to decide whether to online memory and how to online memory is wrong. > Only some specific memory types (which I call "normal") are to be > handled by user space. > > For the other ones, we exactly know what to do: > - standby? don't online I think you're horribly conflating the software desire for what the stae should be and the hardware itself. >> As for the OOM issues, that sounds like something we need to fix by >> refusing to do (or delaying) hot-add operations once we consume too much >> ZONE_NORMAL from memmap[]s rather than trying to indirectly tell >> userspace to hurry thing along. > > That is a moving target and doing that automatically is basically > impossible. Nah. We know how much metadata we've allocated. We know how much ZONE_NORMAL we are eating. We can *easily* add something to add_memory() that just sleeps until the ratio is not out-of-whack. > You can add a lot of memory to the movable zone and > everything is fine. Suddenly a lot of processes are started - boom. > MOVABLE should only every be used if you expect an unplug. And for > paravirtualized devices, a "typical" unplug does not exist. No, it's more complicated than that. People use MOVABLE, for instance, to allow more consistent huge page allocations. It's certainly not just hot-remove. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On 01/10/2018 17:48, George Dunlap wrote: > On 10/01/2018 04:40 PM, Andrew Cooper wrote: >> On 01/10/18 16:35, Wei Liu wrote: >>> On Mon, Oct 01, 2018 at 04:19:07PM +0100, George Dunlap wrote: On 10/01/2018 04:17 PM, Wei Liu wrote: > On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: > On 01.10.18 at 16:33, wrote: >>> On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: >>> On 30.09.18 at 23:59, wrote: > flight 128240 xen-unstable real [real] > http://logs.test-lab.xenproject.org/osstest/logs/128240/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail > REGR. vs. >>> 128084 At the first glance libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting >>> domain sched credit: Invalid argument libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot >>> (re-)build domain: -3 might indicate a problem resulting from the switch to credit2 as the default scheduler. But "first glance" here really means what it says - I didn't look (yet) at what exactly libxl tries to do there, in the hope that others may know without much digging. >>> I think this is due to toolstack trying to set the same scheduler >>> parameters for the newly created guest. >>> >>> But in this test, the destination host is using a different scheduler >>> from the source host. Asking for credit scheduler on a credit2 host is >>> wrong. >>> >>> The relevant snippet in guest cfg (JSON) is: >>> >>> "sched_params": { >>> "sched": "credit", >>> "weight": 256, >>> "cap": 0 >>> }, >>> >>> I can't think of a method to fix it off the top of my head though. >> So is this something that was specified in the original config? Or >> is it just the current value which gets read and an attempt made >> to re-install. If there was no explicit setting in the guest config, >> shouldn't such a "default" setting be retained by not transferring >> any scheduler specifics during migration? >> > No setting in guest cfg. Those values are extracted from the hypervisor. > I think we may be able to not send default values to the remote end. Wait, the migration code reads the scheduler parameters -- even if these have not been explicitly set by the admin -- and sends them along with the migration stream? And if the remote scheduler is different, the migration fails? That's not so good. :-) >>> But one can argue that the guest is specific configured that way so it's >>> parameters should be preserved. We normally analyse things on a case by >>> case basis. >> >> If there isn't an obvious fix, then the switch of default scheduler >> needs reverting until there is a fix present. This is currently >> blocking master. > > Agreed. I'd argue for ignoring failures to set scheduler parameters on > migrate, on the grounds that this will be less risk to the project as a > whole than reverting credit2 again. But either way we should do > something quickly. We should ignore a mismatch of the scheduler. Failures when setting parameters for a matching scheduler should not be ignored IMO. Juergen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 0/2] ns16550: enable use of PCI MSI
Patch 2 is no longer RFC, now that I have a device where MSI actually works (suggesting that it was really the other device's fault that things didn't work). 1: console: adjust IRQ initialization 2: ns16550: enable use of PCI MSI Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On Mon, Oct 01, 2018 at 03:25:36PM +0100, George Dunlap wrote: > On 10/01/2018 03:00 PM, Jan Beulich wrote: > On 01.10.18 at 15:38, wrote: > >> On 10/01/2018 12:25 PM, Jan Beulich wrote: > >>> I think the main concern > >>> was with the way migration of the new value was implemented. But I > >>> really have to defer to Andrew for that, irrespective of him not > >>> having responded (on the list) to prior pings. > >> > >> Is Andrew really the only person who knows enough about migration to > >> give this the thumbs-up? > > > > That's not the point here, at least afaic: He had voiced _some_ > > concern on an earlier version. In such a case it is, I think, only > > appropriate to wait with committing until there was indication > > that the concerns were sufficiently addressed (verbally or by > > adjustments to the code). > > Right -- but it's not your job to make sure the migration stuff is > properly addressed; it's Wei and Ian's job. Wei's R-b was a statement > from him that the code was good; when Andy questioned that, I think it > was then *Wei's* job to address the question, not yours or Andy's (or > even Olaf's). If Wei says, "I've considered Andy's objections and I > think the patch is fine as-is", then it can be checked in (given a > reasonable amount of time for Andy to respond); and Wei can own whatever > consequences there are. This patch touched more than toolstack code, that's why Jan gave his R-b in the first place. The contention is not on the correctness of the code, but on if this mechanism had unintended consequences. Both Jan and I thought the code was correct, but we didn't feel comfortable enough to ignore objections. Sorry Olaf. Wei. > > -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 7/8] libxc/pvh: set default MTRR type to write-back
On 01/10/18 16:52, Roger Pau Monné wrote: > On Wed, Sep 26, 2018 at 01:29:27PM +0100, Andrew Cooper wrote: >> On 08/06/18 10:59, Roger Pau Monne wrote: >>> @@ -1014,6 +1034,30 @@ static int vcpu_hvm(struct xc_dom_image *dom) >>> if ( dom->start_info_seg.pfn ) >>> bsp_ctx.cpu.rbx = dom->start_info_seg.pfn << PAGE_SHIFT; >>> >>> +/* Set the MTRR. */ >>> +bsp_ctx.mtrr_d.typecode = HVM_SAVE_CODE(MTRR); >>> +bsp_ctx.mtrr_d.instance = 0; >>> +bsp_ctx.mtrr_d.length = HVM_SAVE_LENGTH(MTRR); >>> + >>> +mtrr_record = hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0); >>> +if ( !mtrr_record ) >>> +{ >>> +xc_dom_panic(dom->xch, XC_INTERNAL_ERROR, >>> + "%s: unable to get MTRR save record", __func__); >>> +goto out; >>> +} >>> + >>> +memcpy(_ctx.mtrr, mtrr_record, sizeof(bsp_ctx.mtrr)); >>> + >>> +/* TODO: maybe this should be a firmware option instead? */ >>> +if ( !dom->device_model ) >>> +/* >>> + * Enable MTRR, set default type to WB. >>> + * TODO: add MMIO areas as UC when passthrough is supported. >>> + */ >>> +bsp_ctx.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | >>> + MTRR_DEF_TYPE_ENABLE; >> This is buggy. MTRRs are per-vcpu and expected to match. This only >> works by chance in the HVM case because all settings are still 0 at this >> point. > Yes, I will look into it ASAP, but I'm not sure if the current > save/restore trick that's done here will be possible with vcpus that > are down or I will need to expand the save/restore logic to allow > setting the state of vcpus that are down. This gets back into "what is an offline vcpu" argument which happened recently on the Introspection side. As far as I'm concerned, once the vcpus have been allocated XEN_DOMCTL_max_vcpus, they are present and "around". Xen should be responsible for putting them into the correct INIT state (which is one of the many bugs I've yet to get fully fixed from the debug handling side of things), and the domain builder is responsible for making any changes that firmware would normally expect to make. Whether the vcpus are scheduled or not is unrelated to the state (and indeed set-ability) of their registers. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] ns16550/PCI: fix skipping of devices
Selecting between single/multiple BAR mode should happen after checking whether to skip the present device, or else multi-BAR devices won't be skipped correctly, due to port_idx getting set to zero in that case. Signed-off-by: Jan Beulich --- a/xen/drivers/char/ns16550.c +++ b/xen/drivers/char/ns16550.c @@ -1037,18 +1037,18 @@ pci_uart_config(struct ns16550 *uart, bo } } -if ( !param->bar0 ) -{ -bar_idx = idx; -port_idx = 0; -} - if ( port_idx >= param->max_ports ) { idx -= param->max_ports; continue; } +if ( !param->bar0 ) +{ +bar_idx = idx; +port_idx = 0; +} + uart->io_base = 0; bar = pci_conf_read32(0, b, d, f, PCI_BASE_ADDRESS_0 + bar_idx*4); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] IOREQ server on Arm
On Wed, Sep 26, 2018 at 11:32:38AM +0100, Julien Grall wrote: > Hi, > > On 09/26/2018 10:14 AM, Paul Durrant wrote: > > > -Original Message- > > > From: Jan Beulich [mailto:jbeul...@suse.com] > > > Sent: 26 September 2018 09:09 > > > To: Julien Grall ; Paul Durrant > > > > > > Cc: Andrew Cooper ; Roger Pau Monne > > > ; Stefano Stabellini ; xen- > > > devel > > > Subject: Re: IOREQ server on Arm > > > > > > > > > On 26.09.18 at 00:39, wrote: > > > > Hi Paul, > > > > > > > > I am looking at porting the IOREQ server infrastructure on Arm. I didn't > > > > need much modification to make it run for Arm. Although, the > > > > implementation could be simplified over the x86 implementation. > > > > > > > > I noticed some issue while trying to implement the hypercall > > > > XENMEM_acquire_resource. Per my understanding, all the page mapped via > > > > that hypercall will use the type p2m_mapping_foreign. > > > > > > > > This will result to trigger the ASSERT(fdom != dom) in get_page_from_gfn > > > > (asm-arm/p2m.h) because the IOREQ page has been allocated to the > > > > emulator domain and mapped to it. AFAICT x86 has the same assert in > > > > p2m_get_page_from_gfn(...). > > > > > > > > IHMO, the ASSERT makes sense because you are only meant to map page > > > > belonging to other domain with that type. > > > > > > > > So I am wondering whether IOREQ server running in PVH Dom0 has been > > > > tested? What would be the best course of action to fix the issue? > > > > > > I think the p2m type needs to be chosen based on > > > XENMEM_rsrc_acq_caller_owned. > > > > > > > Yes, that's correct. There is a FIXME clause in acquire_resource so that > > that only caller owned resources can be mapped by HVM/PVH domains. Thus the > > new call can be used for IOREQ server pages, but not grant frames. > > I don't understand your last sentence. IOREQ is caller owned and therefore > should work. > > As I said, I don't have any problem with mapping the resource. The problem > is when unmapping it because the region is set with p2m_mapping_foreign. You > will reach the ASSERT(fdom != p2m->domain) in p2m_get_page_from_gfn. > > Regardless the reference problem (we support it on Arm). Can you explain how > this is working on PVH Dom0 today? IOREQs (QEMU) used to work on a PVH Dom0 in the past, it's been some time since I've launched a HVM guest now, but it definitely worked before, like it worked for PVHv1 Dom0. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 7/8] libxc/pvh: set default MTRR type to write-back
On Wed, Sep 26, 2018 at 01:29:27PM +0100, Andrew Cooper wrote: > On 08/06/18 10:59, Roger Pau Monne wrote: > > @@ -1014,6 +1034,30 @@ static int vcpu_hvm(struct xc_dom_image *dom) > > if ( dom->start_info_seg.pfn ) > > bsp_ctx.cpu.rbx = dom->start_info_seg.pfn << PAGE_SHIFT; > > > > +/* Set the MTRR. */ > > +bsp_ctx.mtrr_d.typecode = HVM_SAVE_CODE(MTRR); > > +bsp_ctx.mtrr_d.instance = 0; > > +bsp_ctx.mtrr_d.length = HVM_SAVE_LENGTH(MTRR); > > + > > +mtrr_record = hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0); > > +if ( !mtrr_record ) > > +{ > > +xc_dom_panic(dom->xch, XC_INTERNAL_ERROR, > > + "%s: unable to get MTRR save record", __func__); > > +goto out; > > +} > > + > > +memcpy(_ctx.mtrr, mtrr_record, sizeof(bsp_ctx.mtrr)); > > + > > +/* TODO: maybe this should be a firmware option instead? */ > > +if ( !dom->device_model ) > > +/* > > + * Enable MTRR, set default type to WB. > > + * TODO: add MMIO areas as UC when passthrough is supported. > > + */ > > +bsp_ctx.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | > > + MTRR_DEF_TYPE_ENABLE; > > This is buggy. MTRRs are per-vcpu and expected to match. This only > works by chance in the HVM case because all settings are still 0 at this > point. Yes, I will look into it ASAP, but I'm not sure if the current save/restore trick that's done here will be possible with vcpus that are down or I will need to expand the save/restore logic to allow setting the state of vcpus that are down. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On 10/01/2018 04:40 PM, Andrew Cooper wrote: > On 01/10/18 16:35, Wei Liu wrote: >> On Mon, Oct 01, 2018 at 04:19:07PM +0100, George Dunlap wrote: >>> On 10/01/2018 04:17 PM, Wei Liu wrote: On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: On 01.10.18 at 16:33, wrote: >> On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: >> On 30.09.18 at 23:59, wrote: flight 128240 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/128240/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail REGR. vs. >> 128084 >>> At the first glance >>> >>> libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain >>> 1:Getting >> domain sched credit: Invalid argument >>> libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain >>> 1:cannot >> (re-)build domain: -3 >>> might indicate a problem resulting from the switch to credit2 as the >>> default >>> scheduler. But "first glance" here really means what it says - I didn't >>> look >>> (yet) at what exactly libxl tries to do there, in the hope that others >>> may >>> know without much digging. >> I think this is due to toolstack trying to set the same scheduler >> parameters for the newly created guest. >> >> But in this test, the destination host is using a different scheduler >> from the source host. Asking for credit scheduler on a credit2 host is >> wrong. >> >> The relevant snippet in guest cfg (JSON) is: >> >> "sched_params": { >> "sched": "credit", >> "weight": 256, >> "cap": 0 >> }, >> >> I can't think of a method to fix it off the top of my head though. > So is this something that was specified in the original config? Or > is it just the current value which gets read and an attempt made > to re-install. If there was no explicit setting in the guest config, > shouldn't such a "default" setting be retained by not transferring > any scheduler specifics during migration? > No setting in guest cfg. Those values are extracted from the hypervisor. I think we may be able to not send default values to the remote end. >>> Wait, the migration code reads the scheduler parameters -- even if these >>> have not been explicitly set by the admin -- and sends them along with >>> the migration stream? And if the remote scheduler is different, the >>> migration fails? >>> >>> That's not so good. :-) >> But one can argue that the guest is specific configured that way so it's >> parameters should be preserved. We normally analyse things on a case by >> case basis. > > If there isn't an obvious fix, then the switch of default scheduler > needs reverting until there is a fix present. This is currently > blocking master. Agreed. I'd argue for ignoring failures to set scheduler parameters on migrate, on the grounds that this will be less risk to the project as a whole than reverting credit2 again. But either way we should do something quickly. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/altp2m: propagate ept.ad changes to all active altp2ms
On 09/28/2018 05:19 PM, Razvan Cojocaru wrote: > On 9/28/18 6:55 PM, Jan Beulich wrote: > On 28.09.18 at 17:25, wrote: >>> On 9/28/18 5:52 PM, Jan Beulich wrote: >>> On 28.09.18 at 13:55, wrote: > @@ -1218,34 +1219,67 @@ static void ept_tlb_flush(struct p2m_domain *p2m) > ept_sync_domain_mask(p2m, p2m->domain->dirty_cpumask); > } > > +static void ept_set_ad_sync(struct p2m_domain *p2m, int value) > +{ > +struct domain *d = p2m->domain; > +unsigned int i; > + > +if ( likely(!altp2m_active(d)) ) > +{ > +p2m_lock(p2m); > +p2m->ept.ad = value; > +p2m_unlock(p2m); > + > +return; > +} Why would you want to skip updating the host p2m's flag when altp2m is active? >>> >>> It's not really skipped if I understand the altp2m code correctly: in >>> that case the hostp2m is d->arch.altp2m_p2m[0], which is take care of in >>> the loop below the code you've quoted. >> >> p2m_init_altp2m() (and other code in p2m.c) suggests otherwise to me. > > That's interesting, p2m_set_mem_access() is treating altp2m index 0 as > the hostp2m: > > 360 /* > 361 * Set access type for a region of gfns. > 362 * If gfn == INVALID_GFN, sets the default access type. > 363 */ > 364 long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr, > 365 uint32_t start, uint32_t mask, > xenmem_access_t access, > 366 unsigned int altp2m_idx) > 367 { > 368 struct p2m_domain *p2m = p2m_get_hostp2m(d), *ap2m = NULL; > 369 p2m_access_t a; > 370 unsigned long gfn_l; > 371 long rc = 0; > 372 > 373 /* altp2m view 0 is treated as the hostp2m */ > 374 #ifdef CONFIG_HVM > 375 if ( altp2m_idx ) > 376 { > 377 if ( altp2m_idx >= MAX_ALTP2M || > 378 d->arch.altp2m_eptp[altp2m_idx] == mfn_x(INVALID_MFN) ) > 379 return -EINVAL; > 380 > 381 ap2m = d->arch.altp2m_p2m[altp2m_idx]; > 382 } > 383 #else > 384 ASSERT(!altp2m_idx); > 385 #endif > > which would seem to imply that either we should be able to treat > d->arch.altp2m_p2m[0] and hostp2m interchangeably, or we are currently > wasting an altp2m array slot. Yes, ISTR that altp2m_idx 0 was specifically meant to be the host p2m (unmodified). But there seems to be some confusion over that -- this seems to say it needs to be "manually" interpreted; some of the mem_access code seems to think that it can just grab d->arch.altp2m_p2m[0] and that will affect the hostp2m. And as you say, if we do the "manual interpretation", then we're allocating an extra p2m in slot 0 that we're not using. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On 01/10/18 16:35, Wei Liu wrote: > On Mon, Oct 01, 2018 at 04:19:07PM +0100, George Dunlap wrote: >> On 10/01/2018 04:17 PM, Wei Liu wrote: >>> On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: >>> On 01.10.18 at 16:33, wrote: > On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: > On 30.09.18 at 23:59, wrote: >>> flight 128240 xen-unstable real [real] >>> http://logs.test-lab.xenproject.org/osstest/logs/128240/ >>> >>> Regressions :-( >>> >>> Tests which did not succeed and are blocking, >>> including tests which could not be run: >>> test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail >>> REGR. vs. > 128084 >> At the first glance >> >> libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain >> 1:Getting > domain sched credit: Invalid argument >> libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain >> 1:cannot > (re-)build domain: -3 >> might indicate a problem resulting from the switch to credit2 as the >> default >> scheduler. But "first glance" here really means what it says - I didn't >> look >> (yet) at what exactly libxl tries to do there, in the hope that others >> may >> know without much digging. > I think this is due to toolstack trying to set the same scheduler > parameters for the newly created guest. > > But in this test, the destination host is using a different scheduler > from the source host. Asking for credit scheduler on a credit2 host is > wrong. > > The relevant snippet in guest cfg (JSON) is: > > "sched_params": { > "sched": "credit", > "weight": 256, > "cap": 0 > }, > > I can't think of a method to fix it off the top of my head though. So is this something that was specified in the original config? Or is it just the current value which gets read and an attempt made to re-install. If there was no explicit setting in the guest config, shouldn't such a "default" setting be retained by not transferring any scheduler specifics during migration? >>> No setting in guest cfg. Those values are extracted from the hypervisor. >>> I think we may be able to not send default values to the remote end. >> Wait, the migration code reads the scheduler parameters -- even if these >> have not been explicitly set by the admin -- and sends them along with >> the migration stream? And if the remote scheduler is different, the >> migration fails? >> >> That's not so good. :-) > But one can argue that the guest is specific configured that way so it's > parameters should be preserved. We normally analyse things on a case by > case basis. If there isn't an obvious fix, then the switch of default scheduler needs reverting until there is a fix present. This is currently blocking master. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On Mon, Oct 01, 2018 at 04:19:07PM +0100, George Dunlap wrote: > On 10/01/2018 04:17 PM, Wei Liu wrote: > > On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: > > On 01.10.18 at 16:33, wrote: > >>> On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: > >>> On 30.09.18 at 23:59, wrote: > > flight 128240 xen-unstable real [real] > > http://logs.test-lab.xenproject.org/osstest/logs/128240/ > > > > Regressions :-( > > > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail > > REGR. vs. > >>> 128084 > > At the first glance > > libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain > 1:Getting > >>> domain sched credit: Invalid argument > libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain > 1:cannot > >>> (re-)build domain: -3 > > might indicate a problem resulting from the switch to credit2 as the > default > scheduler. But "first glance" here really means what it says - I didn't > look > (yet) at what exactly libxl tries to do there, in the hope that others > may > know without much digging. > >>> > >>> I think this is due to toolstack trying to set the same scheduler > >>> parameters for the newly created guest. > >>> > >>> But in this test, the destination host is using a different scheduler > >>> from the source host. Asking for credit scheduler on a credit2 host is > >>> wrong. > >>> > >>> The relevant snippet in guest cfg (JSON) is: > >>> > >>> "sched_params": { > >>> "sched": "credit", > >>> "weight": 256, > >>> "cap": 0 > >>> }, > >>> > >>> I can't think of a method to fix it off the top of my head though. > >> > >> So is this something that was specified in the original config? Or > >> is it just the current value which gets read and an attempt made > >> to re-install. If there was no explicit setting in the guest config, > >> shouldn't such a "default" setting be retained by not transferring > >> any scheduler specifics during migration? > >> > > > > No setting in guest cfg. Those values are extracted from the hypervisor. > > I think we may be able to not send default values to the remote end. > > Wait, the migration code reads the scheduler parameters -- even if these > have not been explicitly set by the admin -- and sends them along with > the migration stream? And if the remote scheduler is different, the > migration fails? > > That's not so good. :-) But one can argue that the guest is specific configured that way so it's parameters should be preserved. We normally analyse things on a case by case basis. Wei. > > -George > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] tools/libxl: Switch Arm guest type to PVH
Hi, On 10/01/2018 04:27 PM, Roger Pau Monné wrote: On Tue, Sep 25, 2018 at 08:36:36PM +0100, Julien Grall wrote: Hi Roger, Sorry for the late reply. On 09/03/2018 03:40 PM, Roger Pau Monné wrote: On Mon, Sep 03, 2018 at 12:15:16PM +0100, Julien Grall wrote: On 03/09/18 12:09, Julien Grall wrote: On 23/08/18 08:58, Roger Pau Monné wrote: On Wed, Aug 22, 2018 at 06:48:05PM +0100, Julien Grall wrote: + + b_info->type = LIBXL_DOMAIN_TYPE_PVH; + + /* + * They only fie ld in u.pv that matters on Arm are: kernel, cmdline, + * ramdisk. + */ + + if (!b_info->kernel && b_info->u.pv.kernel) + b_info->kernel = b_info->u.pv.kernel; + + if (!b_info->ramdisk && b_info->u.pv.ramdisk) + b_info->ramdisk = b_info->u.pv.ramdisk; + + if (!b_info->cmdline && b_info->u.pv.cmdline) + b_info->cmdline = b_info->u.pv.cmdline; + + /* Reset b_info->u.pvh to default values */ + memset(_info->u.pvh, 0, sizeof(b_info->u.pvh)); I'm afraid that's not correct. The default values for u.pvh are set by libxl__domain_build_info_setdefault. I thought that this should be covered by the switch right after the call of libxl__arch_domain_build_info_setdefault. Did I miss anything? Oh right, libxl__arch_domain_build_info_setdefault is called by libxl__domain_build_info_setdefault. What I wanted to do here is resetting the union to 0 so you don't get data mangled by the pv fields. Another possible option I think would be to mark those fields as deprecated in the IDL, and libxl__domain_build_info_copy_deprecated will take care of copying them to the new place. In fact I think all guest types should be using the top level kernel, ramdisk and cmdline fields. I will have a look at it. I'm not specially comfortable with changing the guest type in the middle of libxl__domain_build_info_setdefault, but I also don't have a much better suggestion apart from using the deprecation helper. I forgot to answer to this bit. I don't think the deprecation helper will do all the jobs. There are still PV specific parameters: slack_memkb, features, e820_host. Those can be left inside the PV sub-struct and shouldn't be marked as deprecated. Those are not necessary for Arm, if you don't zero them then you will not initialize the PVH structure with default values. How do you suggest to handle them? But I guess ARM doesn't use any of those fields (or else they should be moved to the pvh sub-struct anyway)? Those fields should not be used by Arm. Looking at the current fields in pv, they all should be zeroed. However, I am not sure if we can assume that will always be the case. If we can assume it, then I think it would just be sufficient to have b_info->type = LIBXL_DOMAIN_TYPE_PVH in Arm. Any thoughts? If we go down this route for ARM I think you should add to xl.cfg that the `type` option only applies to x86, and that there's only one guest type on ARM. IMO it's best to not use type if there's only one type available, so that it can be intruded later on ARM if required. That's not what we discussed before. The idea is to default Arm guest to PVH for xl. But "type=" would still be available. We still have to support other type as other toolstack (e.g libvirt) may use PV for Arm. So the idea here is libxl will check if it was PV and convert to PVH. In an earlier reply, you said that we can use deprecated to copy most of the options over. However, can we always assume the pv.* fields will always be zero after the value was copied over? If not, then we would have to memset them. I asked opinion on the last bits. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] AMD Ping [PATCH] amd/iommu: remove hidden AMD inclusive mappings
Can I please get an Ack or otherwise from the AMD maintainers? Thanks, Roger. On Fri, Sep 21, 2018 at 05:20:47PM +0200, Roger Pau Monne wrote: > And just rely on arch_iommu_hwdom_init to setup the correct inclusive > mappings as it's done for Intel. > > AMD has code in amd_iommu_hwdom_init to setup inclusive mappings up to > max_pdx, remove this since it's now a duplication of > arch_iommu_hwdom_init. Note that AMD mapped every page with a valid > mfn up to max_pdx, arch_iommu_hwdom_init will only do so for memory > below 4GB, so this is a functional change for AMD. > > Move the default setting of iommu_hwdom_{inclusive/reserved} to > arch_iommu_hwdom_init since the defaults are now the same for both > Intel and AMD. > > Reported-by: Paul Durrant > Signed-off-by: Roger Pau Monné > --- > Cc: Suravee Suthikulpanit > Cc: Brian Woods > Cc: Kevin Tian > Cc: Jan Beulich > Cc: Paul Durrant > --- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 39 - > xen/drivers/passthrough/vtd/iommu.c | 7 > xen/drivers/passthrough/x86/iommu.c | 8 - > 3 files changed, 7 insertions(+), 47 deletions(-) > > diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c > b/xen/drivers/passthrough/amd/pci_amd_iommu.c > index 4a633ca940..821fe03df5 100644 > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -250,50 +250,11 @@ static int amd_iommu_add_device(u8 devfn, struct > pci_dev *pdev); > > static void __hwdom_init amd_iommu_hwdom_init(struct domain *d) > { > -unsigned long i; > const struct amd_iommu *iommu; > > -/* Inclusive IOMMU mappings are disabled by default on AMD hardware. */ > -if ( iommu_hwdom_inclusive == -1 ) > -iommu_hwdom_inclusive = 0; > -/* Reserved IOMMU mappings are disabled by default on AMD hardware. */ > -if ( iommu_hwdom_reserved == -1 ) > -iommu_hwdom_reserved = 0; > - > if ( allocate_domain_resources(dom_iommu(d)) ) > BUG(); > > -if ( !iommu_hwdom_passthrough && !need_iommu(d) ) > -{ > -int rc = 0; > - > -/* Set up 1:1 page table for dom0 */ > -for ( i = 0; i < max_pdx; i++ ) > -{ > -unsigned long pfn = pdx_to_pfn(i); > - > -/* > - * XXX Should we really map all non-RAM (above 4G)? Minimally > - * a pfn_valid() check would seem desirable here. > - */ > -if ( mfn_valid(_mfn(pfn)) ) > -{ > -int ret = amd_iommu_map_page(d, pfn, pfn, > - > IOMMUF_readable|IOMMUF_writable); > - > -if ( !rc ) > -rc = ret; > -} > - > -if ( !(i & 0xf) ) > -process_pending_softirqs(); > -} > - > -if ( rc ) > -AMD_IOMMU_DEBUG("d%d: IOMMU mapping failed: %d\n", > -d->domain_id, rc); > -} > - > for_each_amd_iommu ( iommu ) > if ( iomem_deny_access(d, PFN_DOWN(iommu->mmio_base_phys), > PFN_DOWN(iommu->mmio_base_phys + > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index bb422ec58c..cf8a80d7a1 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1304,13 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct > domain *d) > { > struct acpi_drhd_unit *drhd; > > -/* Inclusive mappings are enabled by default on Intel hardware for PV. */ > -if ( iommu_hwdom_inclusive == -1 ) > -iommu_hwdom_inclusive = is_pv_domain(d); > -/* Reserved IOMMU mappings are enabled by default on Intel hardware. */ > -if ( iommu_hwdom_reserved == -1 ) > -iommu_hwdom_reserved = 1; > - > setup_hwdom_pci_devices(d, setup_hwdom_device); > setup_hwdom_rmrr(d); > /* Make sure workarounds are applied before enabling the IOMMU(s). */ > diff --git a/xen/drivers/passthrough/x86/iommu.c > b/xen/drivers/passthrough/x86/iommu.c > index b7c8b5be41..2de8822c59 100644 > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -210,7 +210,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) > > BUG_ON(!is_hardware_domain(d)); > > -ASSERT(iommu_hwdom_inclusive != -1 && iommu_hwdom_inclusive != -1); > +/* Inclusive mappings are enabled by default for PV. */ > +if ( iommu_hwdom_inclusive == -1 ) > +iommu_hwdom_inclusive = is_pv_domain(d); > +/* Reserved IOMMU mappings are enabled by default. */ > +if ( iommu_hwdom_reserved == -1 ) > +iommu_hwdom_reserved = 1; > + > if ( iommu_hwdom_inclusive && !is_pv_domain(d) ) > { > printk(XENLOG_WARNING > -- > 2.19.0 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 17:17, wrote: > On 01/10/18 14:29, Jan Beulich wrote: > On 01.10.18 at 14:39, wrote: >>> On 07/06/18 14:08, Olaf Hering wrote: Add an option to control when vTSC emulation will be activated for a domU with tsc_mode=default. Without such option each TSC access from domU will be emulated, which causes a significant perfomance drop for workloads that make use of rdtsc. One option to avoid the TSC option is to run domUs with tsc_mode=native. This has the drawback that migrating a domU from a "2.3GHz" class host to a "2.4GHz" class host may change the rate at wich the TSC counter increases, the domU may not be prepared for that. With the new option the host admin can decide how a domU should behave when it is migrated across systems of the same class. Since there is always some jitter when Xen calibrates the cpu_khz value, all hosts of the same class will most likely have slightly different values. As a result vTSC emulation is unavoidable. Data collected during the incident which triggered this change showed a jitter of up to 200 KHz across systems of the same class. >>> Do you have any further details of the systems involved? If they are >>> identical systems, they should all have the same real TSC frequency, and >>> its a known issue that Xen isn't very good at working out the >>> frequency. TBH, fixing that would be far better overall. >> Are you convinced all parts match their nominal frequency without >> _any_ deviation? > > That is the intent of publishing the numbers, yes. > >> If that was the case, we could indeed use CPUID >> leaves 0x15 / 0x16 output, if available. > > We very much should be doing this. There are also model-specific ways > of getting the same data on older processors. > >> But I very much doubt this. >> As an example, here's what bare metal Linux says on my newest >> system: >> >> tsc: Detected 2600.000 MHz processor >> tsc: Refined TSC clocksource calibration: 2591.990 MHz >> >> Xen figures: >> >> (XEN) Detected 2592.107 MHz processor. >> >> And then after another re-boot bare metal Linux again >> >> tsc: Refined TSC clocksource calibration: 2592.008 MHz > > What is surprising here? The calibration loop is not 100% accurate and > cannot be made to be perfect. > > The fact that Linux and Xen agree is because they basically share the > same calibration algorithm - not that the processor is really running at > 2592MHz. And I'm not claiming it is. I'm merely voicing my doubt that the processor is running at exactly the announced 2600.000 MHz. In which case it is simply unknown whether calibrated or nominal values come closer to the truth. > For one, all calibration options will read slow by the amount > of time it takes an interrupt to propagate through the system fabric, > and there is basically nothing software can do to account for this. Well, if you measure under otherwise identical conditions twice the arrival of instances of the same signal, then its time to travel through the fabric doesn't matter for the distance in time between the arrival of the two instances. But of course this is as idealized as is an assumption that humans would be able to manufacture clocks ticking at exactly their nominal frequency. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] tools/libxl: Switch Arm guest type to PVH
On Tue, Sep 25, 2018 at 08:36:36PM +0100, Julien Grall wrote: > Hi Roger, > > Sorry for the late reply. > > On 09/03/2018 03:40 PM, Roger Pau Monné wrote: > > On Mon, Sep 03, 2018 at 12:15:16PM +0100, Julien Grall wrote: > > > On 03/09/18 12:09, Julien Grall wrote: > > > > > > > > > > > > On 23/08/18 08:58, Roger Pau Monné wrote: > > > > > On Wed, Aug 22, 2018 at 06:48:05PM +0100, Julien Grall wrote: > > > > > > > > > > > > > > > + > > > > > > > > + b_info->type = LIBXL_DOMAIN_TYPE_PVH; > > > > > > > > + > > > > > > > > + /* > > > > > > > > + * They only fie ld in u.pv that matters on Arm are: > > > > > > > > kernel, cmdline, > > > > > > > > + * ramdisk. > > > > > > > > + */ > > > > > > > > + > > > > > > > > + if (!b_info->kernel && b_info->u.pv.kernel) > > > > > > > > + b_info->kernel = b_info->u.pv.kernel; > > > > > > > > + > > > > > > > > + if (!b_info->ramdisk && b_info->u.pv.ramdisk) > > > > > > > > + b_info->ramdisk = b_info->u.pv.ramdisk; > > > > > > > > + > > > > > > > > + if (!b_info->cmdline && b_info->u.pv.cmdline) > > > > > > > > + b_info->cmdline = b_info->u.pv.cmdline; > > > > > > > > + > > > > > > > > + /* Reset b_info->u.pvh to default values */ > > > > > > > > + memset(_info->u.pvh, 0, sizeof(b_info->u.pvh)); > > > > > > > > > > > > > > I'm afraid that's not correct. The default values for u.pvh are > > > > > > > set > > > > > > > by libxl__domain_build_info_setdefault. > > > > > > > > > > > > I thought that this should be covered by the switch right after > > > > > > the call of > > > > > > libxl__arch_domain_build_info_setdefault. Did I miss anything? > > > > > > > > > > Oh right, libxl__arch_domain_build_info_setdefault is called by > > > > > libxl__domain_build_info_setdefault. > > > > > > > > > > > What I wanted to do here is resetting the union to 0 so you > > > > > > don't get data > > > > > > mangled by the pv fields. > > > > > > > > > > Another possible option I think would be to mark those fields as > > > > > deprecated in the IDL, and libxl__domain_build_info_copy_deprecated > > > > > will take care of copying them to the new place. In fact I think all > > > > > guest types should be using the top level kernel, ramdisk and cmdline > > > > > fields. > > > > > > > > I will have a look at it. > > > > > > > > > > > > > > I'm not specially comfortable with changing the guest type in the > > > > > middle of libxl__domain_build_info_setdefault, but I also don't have a > > > > > much better suggestion apart from using the deprecation helper. > > > > > > I forgot to answer to this bit. I don't think the deprecation helper will > > > do > > > all the jobs. There are still PV specific parameters: slack_memkb, > > > features, > > > e820_host. > > > > Those can be left inside the PV sub-struct and shouldn't be marked as > > deprecated. > > > > > Those are not necessary for Arm, if you don't zero them then you will not > > > initialize the PVH structure with default values. How do you suggest to > > > handle them? > > > > But I guess ARM doesn't use any of those fields (or else they should > > be moved to the pvh sub-struct anyway)? > > Those fields should not be used by Arm. Looking at the current fields in pv, > they all should be zeroed. > > However, I am not sure if we can assume that will always be the case. > If we can assume it, then I think it would just be sufficient to have > b_info->type = LIBXL_DOMAIN_TYPE_PVH in Arm. > > Any thoughts? If we go down this route for ARM I think you should add to xl.cfg that the `type` option only applies to x86, and that there's only one guest type on ARM. IMO it's best to not use type if there's only one type available, so that it can be intruded later on ARM if required. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [Non-DoD Source] [PATCH 2/2] xen/xsm: Add new SILO mode for XSM
> > When SILO is enabled, there would be no page-sharing or event notifications > between unprivileged VMs (no grant tables or event channels). > > Signed-off-by: Xin Li > Acked-by: Daniel De Graaf ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On 10/01/2018 04:17 PM, Wei Liu wrote: > On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: > On 01.10.18 at 16:33, wrote: >>> On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: >>> On 30.09.18 at 23:59, wrote: > flight 128240 xen-unstable real [real] > http://logs.test-lab.xenproject.org/osstest/logs/128240/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail > REGR. vs. >>> 128084 At the first glance libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting >>> domain sched credit: Invalid argument libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot >>> (re-)build domain: -3 might indicate a problem resulting from the switch to credit2 as the default scheduler. But "first glance" here really means what it says - I didn't look (yet) at what exactly libxl tries to do there, in the hope that others may know without much digging. >>> >>> I think this is due to toolstack trying to set the same scheduler >>> parameters for the newly created guest. >>> >>> But in this test, the destination host is using a different scheduler >>> from the source host. Asking for credit scheduler on a credit2 host is >>> wrong. >>> >>> The relevant snippet in guest cfg (JSON) is: >>> >>> "sched_params": { >>> "sched": "credit", >>> "weight": 256, >>> "cap": 0 >>> }, >>> >>> I can't think of a method to fix it off the top of my head though. >> >> So is this something that was specified in the original config? Or >> is it just the current value which gets read and an attempt made >> to re-install. If there was no explicit setting in the guest config, >> shouldn't such a "default" setting be retained by not transferring >> any scheduler specifics during migration? >> > > No setting in guest cfg. Those values are extracted from the hypervisor. > I think we may be able to not send default values to the remote end. Wait, the migration code reads the scheduler parameters -- even if these have not been explicitly set by the admin -- and sends them along with the migration stream? And if the remote scheduler is different, the migration fails? That's not so good. :-) -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 01/10/18 14:29, Jan Beulich wrote: On 01.10.18 at 14:39, wrote: >> On 07/06/18 14:08, Olaf Hering wrote: >>> Add an option to control when vTSC emulation will be activated for a >>> domU with tsc_mode=default. Without such option each TSC access from >>> domU will be emulated, which causes a significant perfomance drop for >>> workloads that make use of rdtsc. >>> >>> One option to avoid the TSC option is to run domUs with tsc_mode=native. >>> This has the drawback that migrating a domU from a "2.3GHz" class host >>> to a "2.4GHz" class host may change the rate at wich the TSC counter >>> increases, the domU may not be prepared for that. >>> >>> With the new option the host admin can decide how a domU should behave >>> when it is migrated across systems of the same class. Since there is >>> always some jitter when Xen calibrates the cpu_khz value, all hosts of >>> the same class will most likely have slightly different values. As a >>> result vTSC emulation is unavoidable. Data collected during the incident >>> which triggered this change showed a jitter of up to 200 KHz across >>> systems of the same class. >> Do you have any further details of the systems involved? If they are >> identical systems, they should all have the same real TSC frequency, and >> its a known issue that Xen isn't very good at working out the >> frequency. TBH, fixing that would be far better overall. > Are you convinced all parts match their nominal frequency without > _any_ deviation? That is the intent of publishing the numbers, yes. > If that was the case, we could indeed use CPUID > leaves 0x15 / 0x16 output, if available. We very much should be doing this. There are also model-specific ways of getting the same data on older processors. > But I very much doubt this. > As an example, here's what bare metal Linux says on my newest > system: > > tsc: Detected 2600.000 MHz processor > tsc: Refined TSC clocksource calibration: 2591.990 MHz > > Xen figures: > > (XEN) Detected 2592.107 MHz processor. > > And then after another re-boot bare metal Linux again > > tsc: Refined TSC clocksource calibration: 2592.008 MHz What is surprising here? The calibration loop is not 100% accurate and cannot be made to be perfect. The fact that Linux and Xen agree is because they basically share the same calibration algorithm - not that the processor is really running at 2592MHz. For one, all calibration options will read slow by the amount of time it takes an interrupt to propagate through the system fabric, and there is basically nothing software can do to account for this. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [Non-DoD Source] [PATCH 1/2] xen/xsm: Introduce new boot parameter xsm
> > Introduce new boot parameter xsm to choose which xsm module is enabled, > and set default to dummy. > > Signed-off-by: Xin Li Acked-by: Daniel De Graaf It might be useful for the commit message to also reference the new Kconfig option; thanks for adding it. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On Mon, Oct 01, 2018 at 09:10:25AM -0600, Jan Beulich wrote: > >>> On 01.10.18 at 16:33, wrote: > > On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: > >> >>> On 30.09.18 at 23:59, wrote: > >> > flight 128240 xen-unstable real [real] > >> > http://logs.test-lab.xenproject.org/osstest/logs/128240/ > >> > > >> > Regressions :-( > >> > > >> > Tests which did not succeed and are blocking, > >> > including tests which could not be run: > >> > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail > >> > REGR. vs. > > 128084 > >> > >> At the first glance > >> > >> libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting > > domain sched credit: Invalid argument > >> libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot > > (re-)build domain: -3 > >> > >> might indicate a problem resulting from the switch to credit2 as the > >> default > >> scheduler. But "first glance" here really means what it says - I didn't > >> look > >> (yet) at what exactly libxl tries to do there, in the hope that others may > >> know without much digging. > > > > I think this is due to toolstack trying to set the same scheduler > > parameters for the newly created guest. > > > > But in this test, the destination host is using a different scheduler > > from the source host. Asking for credit scheduler on a credit2 host is > > wrong. > > > > The relevant snippet in guest cfg (JSON) is: > > > > "sched_params": { > > "sched": "credit", > > "weight": 256, > > "cap": 0 > > }, > > > > I can't think of a method to fix it off the top of my head though. > > So is this something that was specified in the original config? Or > is it just the current value which gets read and an attempt made > to re-install. If there was no explicit setting in the guest config, > shouldn't such a "default" setting be retained by not transferring > any scheduler specifics during migration? > No setting in guest cfg. Those values are extracted from the hypervisor. I think we may be able to not send default values to the remote end. Wei. > Jan > > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
>>> On 01.10.18 at 16:28, wrote: > On Mon, 2018-10-01 at 14:54 +0100, George Dunlap wrote: >> Right, the whole point of idle loop scrubbing is that you *don't* >> syncronously wait for *all* the memory to finish scrubbing before you >> can use part of it. So why is this an issue for you guys -- what >> concrete problem did it cause, that the full amount of memory took 40s >> to finish scrubbing rather than only 8s? > > There is no issue at the moment. Using idle loop to scrub all the memory > is just not viable: it doesn't scale. How long does it currently take > to bootscrub a multi-Tb system? If it takes a few minutes then I fear > that it might take an hour to idle-scrub it. But that's fine, at least in theory, as long as no scrubbing will be skipped when it was actually needed. The system will be slightly more busy until done with the scrubbing, but that's exactly the expected price to pay for the cut down on boot time. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
>>> On 01.10.18 at 16:40, wrote: > On 01/10/18 12:13, Jan Beulich wrote: > On 01.10.18 at 11:58, wrote: >>> After this patch, alloc_heap_pages() is guaranteed to return scrubbed >>> pages to a caller unless MEMF_no_scrub flag was provided. >> >> I also don't understand the point of this: Xen's internal allocations >> have no need to come from scrubbed memory. This in particular >> also puts under question the need to "eagerly scrub all allocated >> pages during boot" (quoted from an earlier paragraph). > > There are ways to share a Xen's page with a guest. So from a security > point of view, there is no guarantee that a page allocated with > alloc_xenheap_pages() will not end up accessible by some guest. But this is the exception, not the rule, and hence the code enabling the sharing is responsible for initializing the page suitably. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
>>> On 01.10.18 at 16:11, wrote: > I think this is the main argument here: what to do about those security > sensitive use cases? Scrubbing everything unconditionally might be a too > radical approach. Would inroducing a new cmdline param be appropriate? Yes, I'm surely fine with this being an optional (and hence command line driven) mode. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
>>> On 01.10.18 at 16:33, wrote: > On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: >> >>> On 30.09.18 at 23:59, wrote: >> > flight 128240 xen-unstable real [real] >> > http://logs.test-lab.xenproject.org/osstest/logs/128240/ >> > >> > Regressions :-( >> > >> > Tests which did not succeed and are blocking, >> > including tests which could not be run: >> > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail >> > REGR. vs. > 128084 >> >> At the first glance >> >> libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting > domain sched credit: Invalid argument >> libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot > (re-)build domain: -3 >> >> might indicate a problem resulting from the switch to credit2 as the default >> scheduler. But "first glance" here really means what it says - I didn't look >> (yet) at what exactly libxl tries to do there, in the hope that others may >> know without much digging. > > I think this is due to toolstack trying to set the same scheduler > parameters for the newly created guest. > > But in this test, the destination host is using a different scheduler > from the source host. Asking for credit scheduler on a credit2 host is > wrong. > > The relevant snippet in guest cfg (JSON) is: > > "sched_params": { > "sched": "credit", > "weight": 256, > "cap": 0 > }, > > I can't think of a method to fix it off the top of my head though. So is this something that was specified in the original config? Or is it just the current value which gets read and an attempt made to re-install. If there was no explicit setting in the guest config, shouldn't such a "default" setting be retained by not transferring any scheduler specifics during migration? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable-smoke test] 128283: tolerable all pass - PUSHED
flight 128283 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/128283/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 5c28a3c6814cc3a054fde133c0f6ef77d80c0412 baseline version: xen 146bb7e1934afc5056179fffbf8e24c0db415f8c Last test of basis 128279 2018-10-01 11:00:41 Z0 days Testing same since 128283 2018-10-01 13:00:52 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Julien Grall Marc Zyngier jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 146bb7e193..5c28a3c681 5c28a3c6814cc3a054fde133c0f6ef77d80c0412 -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 01/10/18 14:57, Boris Ostrovsky wrote: > On 10/1/18 9:50 AM, George Dunlap wrote: >> On 10/01/2018 02:44 PM, Boris Ostrovsky wrote: >>> On 10/1/18 9:12 AM, Andrew Cooper wrote: On 01/10/18 12:13, Jan Beulich wrote: On 01.10.18 at 11:58, wrote: >> Having the allocator return unscrubbed pages is a potential security >> concern: some domain can be given pages with memory contents of another >> domain. This may happen, for example, if a domain voluntarily releases >> its own memory (ballooning being the easiest way for doing this). > And we've always said that in this case it's the domain's responsibility > to scrub the memory of secrets it cares about. Therefore I'm at the > very least missing some background on this change of expectations. You were on the call when this was discussed, along with the synchronous scrubbing in destroydomain. Put simply, the current behaviour is not good enough for a number of security sensitive usecases. The main reason however for doing this is the optimisations it enables, and in particular, not double scrubbing most of our pages. >>> OTOH, it introduces double scrubbing for ballooning because (at least >>> Linux) guests do scrub memory before handing it back to the hypervisor. >> We could add a Xen feature flag which tells the guest balloon driver >> whether the hypervisor will scrub pages (and thus it doesn't need to). > We can, but we are regressing existing guests. There is no regression. How Xen handles its free memory is unrelated to how the guest handles its free memory, and real usecases exist (AMD SEV most obviously) where there deliberately isn't full trust between the guest and the hypervisor. It is suboptimal in the case where Xen and the guest mutually trust each other, but that isn't the only case which exists. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 01/10/18 12:13, Jan Beulich wrote: On 01.10.18 at 11:58, wrote: >> Having the allocator return unscrubbed pages is a potential security >> concern: some domain can be given pages with memory contents of another >> domain. This may happen, for example, if a domain voluntarily releases >> its own memory (ballooning being the easiest way for doing this). > > And we've always said that in this case it's the domain's responsibility > to scrub the memory of secrets it cares about. Therefore I'm at the > very least missing some background on this change of expectations. > >> Change the allocator to always scrub the pages given to it by: >> >> 1. free_xenheap_pages() >> 2. free_domheap_pages() >> 3. online_page() >> 4. init_heap_pages() >> >> Performance testing has shown that on multi-node machines bootscrub >> vastly outperforms idle-loop scrubbing. So instead of marking all pages >> dirty initially, introduce bootscrub_done to track the completion of >> the process and eagerly scrub all allocated pages during boot. > > I'm afraid I'm somewhat lost: There still is active boot time scrubbing, > or at least I can't see how that might be skipped (other than due to > "bootscrub=0"). I was actually expecting this to change at some > point. Am I perhaps simply mis-reading this part of the description? > >> If bootscrub is disabled, then all pages will be marked as dirty right >> away and scrubbed either in idle-loop or eagerly during allocation. >> >> After this patch, alloc_heap_pages() is guaranteed to return scrubbed >> pages to a caller unless MEMF_no_scrub flag was provided. > > I also don't understand the point of this: Xen's internal allocations > have no need to come from scrubbed memory. This in particular > also puts under question the need to "eagerly scrub all allocated > pages during boot" (quoted from an earlier paragraph). There are ways to share a Xen's page with a guest. So from a security point of view, there is no guarantee that a page allocated with alloc_xenheap_pages() will not end up accessible by some guest. -- Thanks, Sergey ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 128240: regressions - FAIL
On Mon, Oct 01, 2018 at 03:04:02AM -0600, Jan Beulich wrote: > >>> On 30.09.18 at 23:59, wrote: > > flight 128240 xen-unstable real [real] > > http://logs.test-lab.xenproject.org/osstest/logs/128240/ > > > > Regressions :-( > > > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail REGR. > > vs. 128084 > > At the first glance > > libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting > domain sched credit: Invalid argument > libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot > (re-)build domain: -3 > > might indicate a problem resulting from the switch to credit2 as the default > scheduler. But "first glance" here really means what it says - I didn't look > (yet) at what exactly libxl tries to do there, in the hope that others may > know without much digging. I think this is due to toolstack trying to set the same scheduler parameters for the newly created guest. But in this test, the destination host is using a different scheduler from the source host. Asking for credit scheduler on a credit2 host is wrong. The relevant snippet in guest cfg (JSON) is: "sched_params": { "sched": "credit", "weight": 256, "cap": 0 }, I can't think of a method to fix it off the top of my head though. Wei. > > Jan > > > > ___ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On Mon, 2018-10-01 at 14:54 +0100, George Dunlap wrote: > On 10/01/2018 02:44 PM, Sergey Dyasli wrote: > > On Mon, 2018-10-01 at 07:38 -0600, Jan Beulich wrote: > > > > > > On 01.10.18 at 15:12, wrote: > > > > > > > > On 01/10/18 12:13, Jan Beulich wrote: > > > > > > > > On 01.10.18 at 11:58, wrote: > > > > > > > > > > > > Having the allocator return unscrubbed pages is a potential security > > > > > > concern: some domain can be given pages with memory contents of > > > > > > another > > > > > > domain. This may happen, for example, if a domain voluntarily > > > > > > releases > > > > > > its own memory (ballooning being the easiest way for doing this). > > > > > > > > > > And we've always said that in this case it's the domain's > > > > > responsibility > > > > > to scrub the memory of secrets it cares about. Therefore I'm at the > > > > > very least missing some background on this change of expectations. > > > > > > > > You were on the call when this was discussed, along with the synchronous > > > > scrubbing in destroydomain. > > > > > > Quite possible, but it has been a while. > > > > > > > Put simply, the current behaviour is not good enough for a number of > > > > security sensitive usecases. > > > > > > Well, I'm looking forward for Sergey to expand on this in the commit > > > message. > > > > > > > The main reason however for doing this is the optimisations it enables, > > > > and in particular, not double scrubbing most of our pages. > > > > > > Well, wait - scrubbing != zeroing (taking into account also what you > > > say further down). > > > > > > > > > Change the allocator to always scrub the pages given to it by: > > > > > > > > > > > > 1. free_xenheap_pages() > > > > > > 2. free_domheap_pages() > > > > > > 3. online_page() > > > > > > 4. init_heap_pages() > > > > > > > > > > > > Performance testing has shown that on multi-node machines bootscrub > > > > > > vastly outperforms idle-loop scrubbing. So instead of marking all > > > > > > pages > > > > > > dirty initially, introduce bootscrub_done to track the completion of > > > > > > the process and eagerly scrub all allocated pages during boot. > > > > > > > > > > I'm afraid I'm somewhat lost: There still is active boot time > > > > > scrubbing, > > > > > or at least I can't see how that might be skipped (other than due to > > > > > "bootscrub=0"). I was actually expecting this to change at some > > > > > point. Am I perhaps simply mis-reading this part of the description? > > > > > > > > No. Sergey tried that, and found a massive perf difference between > > > > scrubbing in the idle loop and scrubbing at boot. (1.2s vs 40s iirc) > > > > > > That's not something you can reasonably compare, imo: For one, > > > it is certainly expected for the background scrubbing to be slower, > > > simply because of other activity on the system. And then 1.2s > > > looks awfully small for a multi-Tb system. Yet it is mainly large > > > systems where the synchronous boot time scrubbing is a problem. > > > > Let me throw in some numbers. > > > > Performance of current idle loop scrubbing is just not good enough: > > on 8 nodes, 32 CPUs and 512GB RAM machine it takes ~40 seconds to scrub > > all the memory instead of ~8 seconds for current bootscrub implementation. > > > > This was measured while synchronously waiting for CPUs to scrub all the > > memory in idle-loop. But scrubbing can happen in background, of course. > > Right, the whole point of idle loop scrubbing is that you *don't* > syncronously wait for *all* the memory to finish scrubbing before you > can use part of it. So why is this an issue for you guys -- what > concrete problem did it cause, that the full amount of memory took 40s > to finish scrubbing rather than only 8s? There is no issue at the moment. Using idle loop to scrub all the memory is just not viable: it doesn't scale. How long does it currently take to bootscrub a multi-Tb system? If it takes a few minutes then I fear that it might take an hour to idle-scrub it. -- Thanks, Sergey ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [freebsd-master test] 128277: all pass - PUSHED
flight 128277 freebsd-master real [real] http://logs.test-lab.xenproject.org/osstest/logs/128277/ Perfect :-) All tests in this flight passed as required version targeted for testing: freebsd 04d432fdc0c15f2da76dac4a9a5caf1aeb051ef0 baseline version: freebsd a62d8e729c6db95f0c2e1de618be0b0796c0a97a Last test of basis 128102 2018-09-26 09:19:09 Z5 days Failing since128168 2018-09-28 09:19:05 Z3 days2 attempts Testing same since 128277 2018-10-01 09:19:04 Z0 days1 attempts People who touched revisions under test: 0mp <0...@freebsd.org> ae allanjude andrew bdrewery brooks bz dim emaste gjb gonzo gordon hselasky imp jhb kib mjg np sef slavash tuexen ygy jobs: build-amd64-freebsd-againpass build-amd64-freebsd pass build-amd64-xen-freebsd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/freebsd.git a62d8e729c6..04d432fdc0c 04d432fdc0c15f2da76dac4a9a5caf1aeb051ef0 -> tested/master ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 10/01/2018 03:00 PM, Jan Beulich wrote: On 01.10.18 at 15:38, wrote: >> On 10/01/2018 12:25 PM, Jan Beulich wrote: >>> I think the main concern >>> was with the way migration of the new value was implemented. But I >>> really have to defer to Andrew for that, irrespective of him not >>> having responded (on the list) to prior pings. >> >> Is Andrew really the only person who knows enough about migration to >> give this the thumbs-up? > > That's not the point here, at least afaic: He had voiced _some_ > concern on an earlier version. In such a case it is, I think, only > appropriate to wait with committing until there was indication > that the concerns were sufficiently addressed (verbally or by > adjustments to the code). Right -- but it's not your job to make sure the migration stuff is properly addressed; it's Wei and Ian's job. Wei's R-b was a statement from him that the code was good; when Andy questioned that, I think it was then *Wei's* job to address the question, not yours or Andy's (or even Olaf's). If Wei says, "I've considered Andy's objections and I think the patch is fine as-is", then it can be checked in (given a reasonable amount of time for Andy to respond); and Wei can own whatever consequences there are. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On Mon, 2018-10-01 at 07:38 -0600, Jan Beulich wrote: > > > > On 01.10.18 at 15:12, wrote: > > > > On 01/10/18 12:13, Jan Beulich wrote: > > > > > > On 01.10.18 at 11:58, wrote: > > > > > > > > Having the allocator return unscrubbed pages is a potential security > > > > concern: some domain can be given pages with memory contents of another > > > > domain. This may happen, for example, if a domain voluntarily releases > > > > its own memory (ballooning being the easiest way for doing this). > > > > > > And we've always said that in this case it's the domain's responsibility > > > to scrub the memory of secrets it cares about. Therefore I'm at the > > > very least missing some background on this change of expectations. > > > > You were on the call when this was discussed, along with the synchronous > > scrubbing in destroydomain. > > Quite possible, but it has been a while. > > > Put simply, the current behaviour is not good enough for a number of > > security sensitive usecases. > > Well, I'm looking forward for Sergey to expand on this in the commit > message. Jan, I think this is the main argument here: what to do about those security sensitive use cases? Scrubbing everything unconditionally might be a too radical approach. Would inroducing a new cmdline param be appropriate? > > > The main reason however for doing this is the optimisations it enables, > > and in particular, not double scrubbing most of our pages. > > Well, wait - scrubbing != zeroing (taking into account also what you > say further down). Andrew, I'm not yet convinced myself about the value that returning always zeroed pages from the allocator provides. Most of the pages are given to guests anyway, and re-zeroing a few pages in the hypervisor doesn't sound too bad. But maybe I'm just not aware of cases where Xen performs large allocations and zeroes them afterwards? -- Thanks, Sergey ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen/vsprintf: Introduce %pd formatter for domains
>>> On 01.10.18 at 15:55, wrote: > On 01/10/18 11:25, Jan Beulich wrote: > On 01.10.18 at 12:23, wrote: >>> On 01/10/18 11:14, Jan Beulich wrote: >>> On 01.10.18 at 12:02, wrote: > On 01/10/18 10:08, Jan Beulich wrote: > On 28.09.18 at 19:22, wrote: >>> +static char *print_domain(char *str, char *end, const struct domain *d) >>> +{ >>> +const char *name = NULL; >>> + >>> +/* Some debugging may have an optionally-NULL pointer. */ >>> +if ( unlikely(!d) ) >>> +return string(str, end, "NULL", -1, -1, 0); >>> + >>> +if ( str < end ) >>> +*str = 'd'; >>> + >>> +switch ( d->domain_id ) >>> +{ >>> +case DOMID_IO: name = "[IO]"; break; >>> +case DOMID_XEN: name = "[XEN]"; break; >>> +case DOMID_COW: name = "[COW]"; break; >>> +case DOMID_IDLE: name = "[IDLE]"; break; >> default: ASSERT_UNREACHABLE(); >> >> ? > No - specifically not in this case. > > This path is used when printing crash information, and falling back to a > number is better behaviour than falling into an infinite loop, > overflowing the primary stack, then taking a #DF (which escalates to > triple fault on AMD), without printing anything useful. Ah, good point. Perhaps worth a brief comment instead of a "default:" then? >>> This incremental diff? >> LGTM, thanks. > > I've committed this now, but its just occurred to me that we couldn't > possibly have had a default case, because that is the common case for > most domains. Oh, indeed, but still, while the "unreachable" part of my suggestion was indeed wrong, but it still could have been ASSERT(d->domain_id < DOMID_FIRST_RESERVED). But I agree this wouldn't have been overly helpful. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 15:38, wrote: > On 10/01/2018 12:25 PM, Jan Beulich wrote: >> I think the main concern >> was with the way migration of the new value was implemented. But I >> really have to defer to Andrew for that, irrespective of him not >> having responded (on the list) to prior pings. > > Is Andrew really the only person who knows enough about migration to > give this the thumbs-up? That's not the point here, at least afaic: He had voiced _some_ concern on an earlier version. In such a case it is, I think, only appropriate to wait with committing until there was indication that the concerns were sufficiently addressed (verbally or by adjustments to the code). It is instead my incomplete knowledge on the various parts of migration that made it undesirable to try to judge in his place, after pings lead no- where. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]
On 01/10/18 14:24, Ian Jackson wrote: > >>> handle_tsc_info has no code to verify that padding is indeed zero. Due >>> to the lack of a version field it is impossible to know if the sender >>> already has the newly introduced vtsc_tolerance field. In the worst >>> case the receiving domU will get an unemulated TSC. >> The lack of padding verification is deliberate, for forwards >> compatibility. Why does the sending code matter? One way or another, >> if the field is 0, the option wasn't present or wasn't configured. >> Neither of these situations affect the decision-making that the >> receiving side needs to perform. > We are talking here about an unused field that is (supposedly?) > always sent as zero ? The spec requires the padding to always be sent as zero, and verify-stream-v2 does check the padding and object to non-zero values. libxc's write_tsc_info() has always fully zeroed the structure, and convert-legacy-stream also behaves the same way. However, I notice this change will break migration from pre 4.6 as write_libxc_tsc_info() now has a mismatched number of parameters to pack and throw an exception. > > AIUI: > > In the new code the semantics of zero is "do not allow any tolerance". > The old code handles this correctly. The issue is with migrations > from new code to old: the tolerance value will silently ignored. Migrating from new to old doesn't remotely work, and whether things explode or not is very context dependent. I'm not sure its worth caring about this case. We never have before. (Getting backwards migration working for emergency cases is on my TODO list, but it is dependent on rewriting the hypervisor interfaces for getting/setting guest state.) ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 10/1/18 9:50 AM, George Dunlap wrote: > On 10/01/2018 02:44 PM, Boris Ostrovsky wrote: >> On 10/1/18 9:12 AM, Andrew Cooper wrote: >>> On 01/10/18 12:13, Jan Beulich wrote: >>> On 01.10.18 at 11:58, wrote: > Having the allocator return unscrubbed pages is a potential security > concern: some domain can be given pages with memory contents of another > domain. This may happen, for example, if a domain voluntarily releases > its own memory (ballooning being the easiest way for doing this). And we've always said that in this case it's the domain's responsibility to scrub the memory of secrets it cares about. Therefore I'm at the very least missing some background on this change of expectations. >>> You were on the call when this was discussed, along with the synchronous >>> scrubbing in destroydomain. >>> >>> Put simply, the current behaviour is not good enough for a number of >>> security sensitive usecases. >>> >>> The main reason however for doing this is the optimisations it enables, >>> and in particular, not double scrubbing most of our pages. >> OTOH, it introduces double scrubbing for ballooning because (at least >> Linux) guests do scrub memory before handing it back to the hypervisor. > We could add a Xen feature flag which tells the guest balloon driver > whether the hypervisor will scrub pages (and thus it doesn't need to). We can, but we are regressing existing guests. Can we except ballooned pages from being scrubbed, and instead have a guest request scrubbing if it has concerns (that Andrew mentioned) about this? -boris > > Andy, wasn't unconditional scrubbing of guest pages also required by > some certification or other? > > -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 01/10/18 14:50, George Dunlap wrote: > On 10/01/2018 02:44 PM, Boris Ostrovsky wrote: >> On 10/1/18 9:12 AM, Andrew Cooper wrote: >>> On 01/10/18 12:13, Jan Beulich wrote: >>> On 01.10.18 at 11:58, wrote: > Having the allocator return unscrubbed pages is a potential security > concern: some domain can be given pages with memory contents of another > domain. This may happen, for example, if a domain voluntarily releases > its own memory (ballooning being the easiest way for doing this). And we've always said that in this case it's the domain's responsibility to scrub the memory of secrets it cares about. Therefore I'm at the very least missing some background on this change of expectations. >>> You were on the call when this was discussed, along with the synchronous >>> scrubbing in destroydomain. >>> >>> Put simply, the current behaviour is not good enough for a number of >>> security sensitive usecases. >>> >>> The main reason however for doing this is the optimisations it enables, >>> and in particular, not double scrubbing most of our pages. >> OTOH, it introduces double scrubbing for ballooning because (at least >> Linux) guests do scrub memory before handing it back to the hypervisor. > We could add a Xen feature flag which tells the guest balloon driver > whether the hypervisor will scrub pages (and thus it doesn't need to). > > Andy, wasn't unconditional scrubbing of guest pages also required by > some certification or other? It was for Common Criteria, and not trusting guests to zero their own pages is by far the simplest way of ensuring that they can't set up a sidechannel. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen/vsprintf: Introduce %pd formatter for domains
On 01/10/18 11:25, Jan Beulich wrote: On 01.10.18 at 12:23, wrote: >> On 01/10/18 11:14, Jan Beulich wrote: >> On 01.10.18 at 12:02, wrote: On 01/10/18 10:08, Jan Beulich wrote: On 28.09.18 at 19:22, wrote: >> +static char *print_domain(char *str, char *end, const struct domain *d) >> +{ >> +const char *name = NULL; >> + >> +/* Some debugging may have an optionally-NULL pointer. */ >> +if ( unlikely(!d) ) >> +return string(str, end, "NULL", -1, -1, 0); >> + >> +if ( str < end ) >> +*str = 'd'; >> + >> +switch ( d->domain_id ) >> +{ >> +case DOMID_IO: name = "[IO]"; break; >> +case DOMID_XEN: name = "[XEN]"; break; >> +case DOMID_COW: name = "[COW]"; break; >> +case DOMID_IDLE: name = "[IDLE]"; break; > default: ASSERT_UNREACHABLE(); > > ? No - specifically not in this case. This path is used when printing crash information, and falling back to a number is better behaviour than falling into an infinite loop, overflowing the primary stack, then taking a #DF (which escalates to triple fault on AMD), without printing anything useful. >>> Ah, good point. Perhaps worth a brief comment instead of a "default:" >>> then? >> This incremental diff? > LGTM, thanks. I've committed this now, but its just occurred to me that we couldn't possibly have had a default case, because that is the common case for most domains. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 10/01/2018 02:44 PM, Sergey Dyasli wrote: > On Mon, 2018-10-01 at 07:38 -0600, Jan Beulich wrote: > On 01.10.18 at 15:12, wrote: >>> >>> On 01/10/18 12:13, Jan Beulich wrote: >>> On 01.10.18 at 11:58, wrote: > > Having the allocator return unscrubbed pages is a potential security > concern: some domain can be given pages with memory contents of another > domain. This may happen, for example, if a domain voluntarily releases > its own memory (ballooning being the easiest way for doing this). And we've always said that in this case it's the domain's responsibility to scrub the memory of secrets it cares about. Therefore I'm at the very least missing some background on this change of expectations. >>> >>> You were on the call when this was discussed, along with the synchronous >>> scrubbing in destroydomain. >> >> Quite possible, but it has been a while. >> >>> Put simply, the current behaviour is not good enough for a number of >>> security sensitive usecases. >> >> Well, I'm looking forward for Sergey to expand on this in the commit >> message. >> >>> The main reason however for doing this is the optimisations it enables, >>> and in particular, not double scrubbing most of our pages. >> >> Well, wait - scrubbing != zeroing (taking into account also what you >> say further down). >> > Change the allocator to always scrub the pages given to it by: > > 1. free_xenheap_pages() > 2. free_domheap_pages() > 3. online_page() > 4. init_heap_pages() > > Performance testing has shown that on multi-node machines bootscrub > vastly outperforms idle-loop scrubbing. So instead of marking all pages > dirty initially, introduce bootscrub_done to track the completion of > the process and eagerly scrub all allocated pages during boot. I'm afraid I'm somewhat lost: There still is active boot time scrubbing, or at least I can't see how that might be skipped (other than due to "bootscrub=0"). I was actually expecting this to change at some point. Am I perhaps simply mis-reading this part of the description? >>> >>> No. Sergey tried that, and found a massive perf difference between >>> scrubbing in the idle loop and scrubbing at boot. (1.2s vs 40s iirc) >> >> That's not something you can reasonably compare, imo: For one, >> it is certainly expected for the background scrubbing to be slower, >> simply because of other activity on the system. And then 1.2s >> looks awfully small for a multi-Tb system. Yet it is mainly large >> systems where the synchronous boot time scrubbing is a problem. > > Let me throw in some numbers. > > Performance of current idle loop scrubbing is just not good enough: > on 8 nodes, 32 CPUs and 512GB RAM machine it takes ~40 seconds to scrub > all the memory instead of ~8 seconds for current bootscrub implementation. > > This was measured while synchronously waiting for CPUs to scrub all the > memory in idle-loop. But scrubbing can happen in background, of course. Right, the whole point of idle loop scrubbing is that you *don't* syncronously wait for *all* the memory to finish scrubbing before you can use part of it. So why is this an issue for you guys -- what concrete problem did it cause, that the full amount of memory took 40s to finish scrubbing rather than only 8s? -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 10/01/2018 02:44 PM, Boris Ostrovsky wrote: > On 10/1/18 9:12 AM, Andrew Cooper wrote: >> On 01/10/18 12:13, Jan Beulich wrote: >> On 01.10.18 at 11:58, wrote: Having the allocator return unscrubbed pages is a potential security concern: some domain can be given pages with memory contents of another domain. This may happen, for example, if a domain voluntarily releases its own memory (ballooning being the easiest way for doing this). >>> And we've always said that in this case it's the domain's responsibility >>> to scrub the memory of secrets it cares about. Therefore I'm at the >>> very least missing some background on this change of expectations. >> You were on the call when this was discussed, along with the synchronous >> scrubbing in destroydomain. >> >> Put simply, the current behaviour is not good enough for a number of >> security sensitive usecases. >> >> The main reason however for doing this is the optimisations it enables, >> and in particular, not double scrubbing most of our pages. > > OTOH, it introduces double scrubbing for ballooning because (at least > Linux) guests do scrub memory before handing it back to the hypervisor. We could add a Xen feature flag which tells the guest balloon driver whether the hypervisor will scrub pages (and thus it doesn't need to). Andy, wasn't unconditional scrubbing of guest pages also required by some certification or other? -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 10/1/18 9:12 AM, Andrew Cooper wrote: > On 01/10/18 12:13, Jan Beulich wrote: > On 01.10.18 at 11:58, wrote: >>> Having the allocator return unscrubbed pages is a potential security >>> concern: some domain can be given pages with memory contents of another >>> domain. This may happen, for example, if a domain voluntarily releases >>> its own memory (ballooning being the easiest way for doing this). >> And we've always said that in this case it's the domain's responsibility >> to scrub the memory of secrets it cares about. Therefore I'm at the >> very least missing some background on this change of expectations. > You were on the call when this was discussed, along with the synchronous > scrubbing in destroydomain. > > Put simply, the current behaviour is not good enough for a number of > security sensitive usecases. > > The main reason however for doing this is the optimisations it enables, > and in particular, not double scrubbing most of our pages. OTOH, it introduces double scrubbing for ballooning because (at least Linux) guests do scrub memory before handing it back to the hypervisor. -boris ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On Mon, 2018-10-01 at 07:38 -0600, Jan Beulich wrote: > > > > On 01.10.18 at 15:12, wrote: > > > > On 01/10/18 12:13, Jan Beulich wrote: > > > > > > On 01.10.18 at 11:58, wrote: > > > > > > > > Having the allocator return unscrubbed pages is a potential security > > > > concern: some domain can be given pages with memory contents of another > > > > domain. This may happen, for example, if a domain voluntarily releases > > > > its own memory (ballooning being the easiest way for doing this). > > > > > > And we've always said that in this case it's the domain's responsibility > > > to scrub the memory of secrets it cares about. Therefore I'm at the > > > very least missing some background on this change of expectations. > > > > You were on the call when this was discussed, along with the synchronous > > scrubbing in destroydomain. > > Quite possible, but it has been a while. > > > Put simply, the current behaviour is not good enough for a number of > > security sensitive usecases. > > Well, I'm looking forward for Sergey to expand on this in the commit > message. > > > The main reason however for doing this is the optimisations it enables, > > and in particular, not double scrubbing most of our pages. > > Well, wait - scrubbing != zeroing (taking into account also what you > say further down). > > > > > Change the allocator to always scrub the pages given to it by: > > > > > > > > 1. free_xenheap_pages() > > > > 2. free_domheap_pages() > > > > 3. online_page() > > > > 4. init_heap_pages() > > > > > > > > Performance testing has shown that on multi-node machines bootscrub > > > > vastly outperforms idle-loop scrubbing. So instead of marking all pages > > > > dirty initially, introduce bootscrub_done to track the completion of > > > > the process and eagerly scrub all allocated pages during boot. > > > > > > I'm afraid I'm somewhat lost: There still is active boot time scrubbing, > > > or at least I can't see how that might be skipped (other than due to > > > "bootscrub=0"). I was actually expecting this to change at some > > > point. Am I perhaps simply mis-reading this part of the description? > > > > No. Sergey tried that, and found a massive perf difference between > > scrubbing in the idle loop and scrubbing at boot. (1.2s vs 40s iirc) > > That's not something you can reasonably compare, imo: For one, > it is certainly expected for the background scrubbing to be slower, > simply because of other activity on the system. And then 1.2s > looks awfully small for a multi-Tb system. Yet it is mainly large > systems where the synchronous boot time scrubbing is a problem. Let me throw in some numbers. Performance of current idle loop scrubbing is just not good enough: on 8 nodes, 32 CPUs and 512GB RAM machine it takes ~40 seconds to scrub all the memory instead of ~8 seconds for current bootscrub implementation. This was measured while synchronously waiting for CPUs to scrub all the memory in idle-loop. But scrubbing can happen in background, of course. -- Thanks, Sergey ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86: deal with firmware setting bogus TSC_ADJUST values
The system Intel have handed me for AVX512 emulator work ("Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F3 12/28/2017") would not come up under Xen - it hung in the middle of Dom0 PCI initialization. As it turned out, Xen's time management did not work because of the firmware setting (only) the boot CPU's TSC_ADJUST MSR to a large negative value (on the order of -2^50). Follow Linux (also shamelessly stealing their comments) in - clearing the register for the boot CPU (we don't have a need for exceptions here yet, as the only exception in Linux is a class of systems Xen doesn't work on anyway as far as I'm aware), - forcing non-negative values uniformly, - syncing the registers within sockets. Linux caps at 0x7fff as well, but their comment saying "as those wreckage the timer as well" does, to me at least, neither really explain the reason nor the particular value chosen. Hence until someone runs into such a system (and can then, hopefully, explain things) I think we should leave out that specific part. In order to avoid making init_percpu_time() depend on running _before_ set_cpu_sibling_map() (and hence the booting CPU _not_ being accounted in socket_cpumask[] yet), move that call slightly earlier in start_secondary(). Signed-off-by: Jan Beulich --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -381,6 +381,8 @@ void start_secondary(void *unused) smp_callin(); +set_cpu_sibling_map(cpu); + init_percpu_time(); setup_secondary_APIC_clock(); @@ -393,7 +395,6 @@ void start_secondary(void *unused) /* This must be done before setting cpu_online_map */ spin_debug_enable(); -set_cpu_sibling_map(cpu); notify_cpu_starting(cpu); /* --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -88,6 +88,9 @@ static bool __read_mostly using_pit; /* Boot timestamp, filled in head.S */ u64 __initdata boot_tsc_stamp; +/* Per-socket TSC_ADJUST values, for secondary cores/threads to sync to. */ +static uint64_t *__read_mostly tsc_adjust; + /* * 32-bit division of integer dividend and integer divisor yielding * 32-bit fractional quotient. @@ -1602,6 +1605,57 @@ void init_percpu_time(void) /* Initial estimate for TSC rate. */ t->tsc_scale = per_cpu(cpu_time, 0).tsc_scale; +if ( tsc_adjust ) +{ +unsigned int socket = cpu_to_socket(smp_processor_id()); +int64_t adj; + +/* For now we don't want to come here for the BSP. */ +ASSERT(system_state >= SYS_STATE_smp_boot); + +rdmsrl(MSR_IA32_TSC_ADJUST, adj); + +/* + * Check whether this CPU is the first in a package to come up. In + * this case do not check the boot value against another package + * because the new package might have been physically hotplugged, + * where TSC_ADJUST is expected to be different. + */ +if ( cpumask_weight(socket_cpumask[socket]) == 1 ) +{ +/* + * On the boot CPU we just force the ADJUST value to 0 if it's non- + * zero (in early_time_init()). We don't do that on non-boot CPUs + * because physical hotplug should have set the ADJUST register to a + * value > 0, so the TSC is in sync with the already running CPUs. + * + * But we always force non-negative ADJUST values. Otherwise the TSC + * deadline timer creates an interrupt storm. + */ +if ( adj < 0 ) +{ +printk(XENLOG_WARNING + "TSC ADJUST set to -%lx on CPU%u - clearing\n", + -adj, smp_processor_id()); +wrmsrl(MSR_IA32_TSC_ADJUST, 0); +adj = 0; +} +tsc_adjust[socket] = adj; +} +else if ( adj != tsc_adjust[socket] ) +{ +static bool __read_mostly warned; + +if ( !warned ) +{ +warned = true; +printk(XENLOG_WARNING + "Differing TSC ADJUST values within socket(s) - fixing all\n"); +} +wrmsrl(MSR_IA32_TSC_ADJUST, tsc_adjust[socket]); +} +} + local_irq_save(flags); now = read_platform_stime(NULL); tsc = rdtsc_ordered(); @@ -1788,6 +1842,15 @@ int __init init_xen_time(void) /* Finish platform timer initialization. */ try_platform_timer_tail(false); +/* + * Setup space to track per-socket TSC_ADJUST values. Don't fiddle with + * values if the TSC is not reported as invariant. Ignore allocation + * failure here - most systems won't need any adjustment anyway. + */ +if ( boot_cpu_has(X86_FEATURE_TSC_ADJUST) && + boot_cpu_has(X86_FEATURE_ITSC) ) +tsc_adjust = xzalloc_array(uint64_t, nr_sockets); + return 0; } @@ -1798,6 +1861,19 @@ void __init early_time_init(void) struct cpu_time *t = _cpu(cpu_time);
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
>>> On 01.10.18 at 15:12, wrote: > On 01/10/18 12:13, Jan Beulich wrote: > On 01.10.18 at 11:58, wrote: >>> Having the allocator return unscrubbed pages is a potential security >>> concern: some domain can be given pages with memory contents of another >>> domain. This may happen, for example, if a domain voluntarily releases >>> its own memory (ballooning being the easiest way for doing this). >> And we've always said that in this case it's the domain's responsibility >> to scrub the memory of secrets it cares about. Therefore I'm at the >> very least missing some background on this change of expectations. > > You were on the call when this was discussed, along with the synchronous > scrubbing in destroydomain. Quite possible, but it has been a while. > Put simply, the current behaviour is not good enough for a number of > security sensitive usecases. Well, I'm looking forward for Sergey to expand on this in the commit message. > The main reason however for doing this is the optimisations it enables, > and in particular, not double scrubbing most of our pages. Well, wait - scrubbing != zeroing (taking into account also what you say further down). >>> Change the allocator to always scrub the pages given to it by: >>> >>> 1. free_xenheap_pages() >>> 2. free_domheap_pages() >>> 3. online_page() >>> 4. init_heap_pages() >>> >>> Performance testing has shown that on multi-node machines bootscrub >>> vastly outperforms idle-loop scrubbing. So instead of marking all pages >>> dirty initially, introduce bootscrub_done to track the completion of >>> the process and eagerly scrub all allocated pages during boot. >> I'm afraid I'm somewhat lost: There still is active boot time scrubbing, >> or at least I can't see how that might be skipped (other than due to >> "bootscrub=0"). I was actually expecting this to change at some >> point. Am I perhaps simply mis-reading this part of the description? > > No. Sergey tried that, and found a massive perf difference between > scrubbing in the idle loop and scrubbing at boot. (1.2s vs 40s iirc) That's not something you can reasonably compare, imo: For one, it is certainly expected for the background scrubbing to be slower, simply because of other activity on the system. And then 1.2s looks awfully small for a multi-Tb system. Yet it is mainly large systems where the synchronous boot time scrubbing is a problem. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 10/01/2018 12:25 PM, Jan Beulich wrote: On 01.10.18 at 12:52, wrote: >> Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to >> avoid TSC emulation"): >>> Am Thu, 13 Sep 2018 09:39:13 +0200 >>> schrieb Olaf Hering : this patch was not applied yet, even after a few "pings". >>> >>> No reaction since months. >>> So scrap that patch, just in case it is still part of someones to-consider >> queue. >> >> I think it would be worth exploring whether this patch could be >> applied without an explicit ack from Andrew. >> >> I confess I ignored all the previous mails because they all started >>Andrew, >> or >>Andrew, Lars, >> so I assumed that you didn't want attention from other >> maintainers/committers. >> >> Now that I look at the thread it is difficult for me to see the wood >> for the trees but I don't see unanswered concerns. > > Problem is - discussion around this was (iirc) happening not only on > the list, but also on irc (including perhaps private chats). It was for > that reason that I made my R-b conditional upon Andrew at least > giving an informal go-ahead (otherwise, together with Wei's R-b, > the patch could have gone in). > > Besides the question of correctness from the perspective of guests > (which imo is not a problem as the feature needs to be actively > enabled, and I think we have no reason to keep admins from > breaking their guests if they really mean to), I agree with this, BTW. > I think the main concern > was with the way migration of the new value was implemented. But I > really have to defer to Andrew for that, irrespective of him not > having responded (on the list) to prior pings. Is Andrew really the only person who knows enough about migration to give this the thumbs-up? Technically migration is in the toolstack, and so Wei should have double-checked that before giving his review; and when a question was raised, Wei (as the relevant maintainer who had given it an R-b) should either have asserted that the code was indeed correct, or withdrawn his R-b and given Olaf feedback to allow him to get it into shape. (This is all based on the history sketched out by Jan above; if there is more to it then of course this analysis may not be correct.) I was only skimming the thread, and intended to weigh in at some point; but I didn't really understand why it was blocked on Andrew in the first place. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
Hi Andrew, On 10/01/2018 02:11 PM, Andrew Cooper wrote: On 01/10/18 13:46, Julien Grall wrote: call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to do SMCCC call, replace all call to the former by the later. Signed-off-by: Julien Grall --- Changes in v3: - Use PSCI_RET where needed --- xen/arch/arm/Makefile| 1 - xen/arch/arm/platforms/exynos5.c | 3 ++- xen/arch/arm/platforms/seattle.c | 4 ++-- xen/arch/arm/psci.c | 37 + xen/arch/arm/smc.S | 21 - xen/include/asm-arm/processor.h | 3 --- 6 files changed, 29 insertions(+), 40 deletions(-) delete mode 100644 xen/arch/arm/smc.S diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index b9b141dc84..37fa8268b3 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -39,7 +39,6 @@ obj-y += processor.o obj-y += psci.o obj-y += setup.o obj-y += shutdown.o -obj-y += smc.o obj-y += smp.o obj-y += smpboot.o obj-y += sysctl.o diff --git a/xen/arch/arm/platforms/exynos5.c b/xen/arch/arm/platforms/exynos5.c index c15ecf80f5..e2c0b7b878 100644 --- a/xen/arch/arm/platforms/exynos5.c +++ b/xen/arch/arm/platforms/exynos5.c @@ -26,6 +26,7 @@ #include #include #include +#include static bool secure_firmware; @@ -249,7 +250,7 @@ static int exynos5_cpu_up(int cpu) iounmap(power); if ( secure_firmware ) -call_smc(SMC_CMD_CPU1BOOT, cpu, 0, 0); +arm_smccc_smc(SMC_CMD_CPU1BOOT, cpu, NULL); return cpu_up_send_sgi(cpu); } diff --git a/xen/arch/arm/platforms/seattle.c b/xen/arch/arm/platforms/seattle.c index 893cc17972..64cc1868c2 100644 --- a/xen/arch/arm/platforms/seattle.c +++ b/xen/arch/arm/platforms/seattle.c @@ -33,12 +33,12 @@ static const char * const seattle_dt_compat[] __initconst = */ static void seattle_system_reset(void) { -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); } static void seattle_system_off(void) { -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); } PLATFORM_START(seattle, "SEATTLE") diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c index 941eec921b..4ae6504f3e 100644 --- a/xen/arch/arm/psci.c +++ b/xen/arch/arm/psci.c @@ -42,42 +42,53 @@ uint32_t smccc_ver; static uint32_t psci_cpu_on_nr; +#define PSCI_RET(res) ((int32_t)(res).a0) + int call_psci_cpu_on(int cpu) { -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), 0); +struct arm_smccc_res res; + +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), + ); + +return PSCI_RET(res.a0); } Sorry if I'm jumping into the middle of a conversation, but why force all callers to manually extract the return value when it is a fixed register? The SMCCC allows up to 4 return value (a0 ... a3). At the moment, most of the caller in Xen only care about one result value. But this will change soon (see OP-TEE support in Xen). Wouldn't it be far easier to do this: #define arcm_smccc_smc(...) \ ({ \ struct arm_smccc_res res; \ \ if ( cpus_have_const_cap(ARM_SMCCC_1_1) ) \ res = arm_smccc_1_1_smc(__VA_ARGS__); \ else\ res = arm_smccc_1_0_smc(__VA_ARGS__); \ \ (int)res.a0;\ We can't possibly cast here. The interpretation of a0 is different from one call to another. }) Which also allows the compiler to optimise out the structure if it isn't read, and also avoids the caller needing to pass a NULL pointer for "I don't want the result". While I understand the value, I don't think this would be correct if we want to implement an interface with the SMCCC. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 14:39, wrote: > On 07/06/18 14:08, Olaf Hering wrote: >> Add an option to control when vTSC emulation will be activated for a >> domU with tsc_mode=default. Without such option each TSC access from >> domU will be emulated, which causes a significant perfomance drop for >> workloads that make use of rdtsc. >> >> One option to avoid the TSC option is to run domUs with tsc_mode=native. >> This has the drawback that migrating a domU from a "2.3GHz" class host >> to a "2.4GHz" class host may change the rate at wich the TSC counter >> increases, the domU may not be prepared for that. >> >> With the new option the host admin can decide how a domU should behave >> when it is migrated across systems of the same class. Since there is >> always some jitter when Xen calibrates the cpu_khz value, all hosts of >> the same class will most likely have slightly different values. As a >> result vTSC emulation is unavoidable. Data collected during the incident >> which triggered this change showed a jitter of up to 200 KHz across >> systems of the same class. > > Do you have any further details of the systems involved? If they are > identical systems, they should all have the same real TSC frequency, and > its a known issue that Xen isn't very good at working out the > frequency. TBH, fixing that would be far better overall. Are you convinced all parts match their nominal frequency without _any_ deviation? If that was the case, we could indeed use CPUID leaves 0x15 / 0x16 output, if available. But I very much doubt this. As an example, here's what bare metal Linux says on my newest system: tsc: Detected 2600.000 MHz processor tsc: Refined TSC clocksource calibration: 2591.990 MHz Xen figures: (XEN) Detected 2592.107 MHz processor. And then after another re-boot bare metal Linux again tsc: Refined TSC clocksource calibration: 2592.008 MHz Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 10/01/2018 02:12 PM, Andrew Cooper wrote: > On 01/10/18 12:13, Jan Beulich wrote: > On 01.10.18 at 11:58, wrote: >>> Having the allocator return unscrubbed pages is a potential security >>> concern: some domain can be given pages with memory contents of another >>> domain. This may happen, for example, if a domain voluntarily releases >>> its own memory (ballooning being the easiest way for doing this). >> And we've always said that in this case it's the domain's responsibility >> to scrub the memory of secrets it cares about. Therefore I'm at the >> very least missing some background on this change of expectations. > > You were on the call when this was discussed, along with the synchronous > scrubbing in destroydomain. > > Put simply, the current behaviour is not good enough for a number of > security sensitive usecases. > > The main reason however for doing this is the optimisations it enables, > and in particular, not double scrubbing most of our pages. All of that should have been in the changelog, at least in summary form, regardless of where else it may have been discussed. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]
Jan Beulich writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation"): > Problem is - discussion around this was (iirc) happening not only on > the list, but also on irc (including perhaps private chats). Hrm. Well, if it didn't happen on the list, it didn't happen. It is often useful and productive, of course, to thrash something out on irc, or ping there, or whatever but: Conclusions from irc (and from in-person conversations, phone calls, or other kinds of un-minuted discussions) must be transferred to email as otherwise they are lost (and the effort of having them is often wasted as the conversation has to be had again). If properly writing up the conclusion from an irc conversation is too hard, pasting a transcript into an email seems a bare minimum. > It was for that reason that I made my R-b conditional upon Andrew at > least giving an informal go-ahead (otherwise, together with Wei's > R-b, the patch could have gone in). Right. Thanks for the clarification. Andrew Cooper writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation"): > [lots of stuff] Thanks for the review. I hope Olaf will be able to address most of it, but: Andrew: > [Olaf:] > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > > The padding is sent as zero in write_tsc_info to the receving host. > > The padding is undefined if the changed code runs as receiver. > > I'm not sure what you mean by this final sentence. Me neither. > > handle_tsc_info has no code to verify that padding is indeed zero. Due > > to the lack of a version field it is impossible to know if the sender > > already has the newly introduced vtsc_tolerance field. In the worst > > case the receiving domU will get an unemulated TSC. > > The lack of padding verification is deliberate, for forwards > compatibility. Why does the sending code matter? One way or another, > if the field is 0, the option wasn't present or wasn't configured. > Neither of these situations affect the decision-making that the > receiving side needs to perform. We are talking here about an unused field that is (supposedly?) always sent as zero ? AIUI: In the new code the semantics of zero is "do not allow any tolerance". The old code handles this correctly. The issue is with migrations from new code to old: the tolerance value will silently ignored. Presumably this is considered preferable to the alternative, which is to extend the migration stream in a deliberately incompatible way so that the migration fails. I think this is what Olaf is trying to say ? > > Signed-off-by: Olaf Hering > > Reviewed-by: Wei Liu (v07/v08) > > Reviewed-by: Jan Beulich (v08) > > I'm still -0.5 for this patch. I can appreciate why you want it, but it > is a gross hack which only works when you don't skew time more than NTP > in the guest can cope with. That surely is why there is a limit to the tolerance. I've asked Olaf to try to quantify an appropriate limit. > My gut feeling is that there will be other > more subtle fallout. That's not particularly convincing to me. But if you could try to be more specific I think this could be usefull addressed in the documentation for the feature ? Olaf, I think the ball is now in your court. I hope you can address Andrew's comments, and maybe mine too. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
Hi, On 10/01/2018 01:46 PM, Julien Grall wrote: PLATFORM_START(seattle, "SEATTLE") diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c index 941eec921b..4ae6504f3e 100644 --- a/xen/arch/arm/psci.c +++ b/xen/arch/arm/psci.c @@ -42,42 +42,53 @@ uint32_t smccc_ver; static uint32_t psci_cpu_on_nr; +#define PSCI_RET(res) ((int32_t)(res).a0) + int call_psci_cpu_on(int cpu) { -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), 0); +struct arm_smccc_res res; + +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), + ); + +return PSCI_RET(res.a0); Hmmm this should have been PSCI_RET(res). I guess this could be fixed if not other change. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator
On 01/10/18 12:13, Jan Beulich wrote: On 01.10.18 at 11:58, wrote: >> Having the allocator return unscrubbed pages is a potential security >> concern: some domain can be given pages with memory contents of another >> domain. This may happen, for example, if a domain voluntarily releases >> its own memory (ballooning being the easiest way for doing this). > And we've always said that in this case it's the domain's responsibility > to scrub the memory of secrets it cares about. Therefore I'm at the > very least missing some background on this change of expectations. You were on the call when this was discussed, along with the synchronous scrubbing in destroydomain. Put simply, the current behaviour is not good enough for a number of security sensitive usecases. The main reason however for doing this is the optimisations it enables, and in particular, not double scrubbing most of our pages. > >> Change the allocator to always scrub the pages given to it by: >> >> 1. free_xenheap_pages() >> 2. free_domheap_pages() >> 3. online_page() >> 4. init_heap_pages() >> >> Performance testing has shown that on multi-node machines bootscrub >> vastly outperforms idle-loop scrubbing. So instead of marking all pages >> dirty initially, introduce bootscrub_done to track the completion of >> the process and eagerly scrub all allocated pages during boot. > I'm afraid I'm somewhat lost: There still is active boot time scrubbing, > or at least I can't see how that might be skipped (other than due to > "bootscrub=0"). I was actually expecting this to change at some > point. Am I perhaps simply mis-reading this part of the description? No. Sergey tried that, and found a massive perf difference between scrubbing in the idle loop and scrubbing at boot. (1.2s vs 40s iirc) Scrubbing at boot has some deliberate optimisations to reduce the pressure on the heap lock, and I expect that is where the performance difference lies. It is an issue which wants looking into irrespective of other changes. > >> If bootscrub is disabled, then all pages will be marked as dirty right >> away and scrubbed either in idle-loop or eagerly during allocation. >> >> After this patch, alloc_heap_pages() is guaranteed to return scrubbed >> pages to a caller unless MEMF_no_scrub flag was provided. > I also don't understand the point of this: Xen's internal allocations > have no need to come from scrubbed memory. This isn't true. Almost every caller re-zeroes an allocated page which is the cause of the double scrubbing in most cases. By having the allocators guarantee to hand out zeroed pages, we can avoid the redundant scrubbing. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
On 01/10/18 13:46, Julien Grall wrote: > call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to > do SMCCC call, replace all call to the former by the later. > > Signed-off-by: Julien Grall > > --- > > Changes in v3: > - Use PSCI_RET where needed > --- > xen/arch/arm/Makefile| 1 - > xen/arch/arm/platforms/exynos5.c | 3 ++- > xen/arch/arm/platforms/seattle.c | 4 ++-- > xen/arch/arm/psci.c | 37 + > xen/arch/arm/smc.S | 21 - > xen/include/asm-arm/processor.h | 3 --- > 6 files changed, 29 insertions(+), 40 deletions(-) > delete mode 100644 xen/arch/arm/smc.S > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > index b9b141dc84..37fa8268b3 100644 > --- a/xen/arch/arm/Makefile > +++ b/xen/arch/arm/Makefile > @@ -39,7 +39,6 @@ obj-y += processor.o > obj-y += psci.o > obj-y += setup.o > obj-y += shutdown.o > -obj-y += smc.o > obj-y += smp.o > obj-y += smpboot.o > obj-y += sysctl.o > diff --git a/xen/arch/arm/platforms/exynos5.c > b/xen/arch/arm/platforms/exynos5.c > index c15ecf80f5..e2c0b7b878 100644 > --- a/xen/arch/arm/platforms/exynos5.c > +++ b/xen/arch/arm/platforms/exynos5.c > @@ -26,6 +26,7 @@ > #include > #include > #include > +#include > > static bool secure_firmware; > > @@ -249,7 +250,7 @@ static int exynos5_cpu_up(int cpu) > iounmap(power); > > if ( secure_firmware ) > -call_smc(SMC_CMD_CPU1BOOT, cpu, 0, 0); > +arm_smccc_smc(SMC_CMD_CPU1BOOT, cpu, NULL); > > return cpu_up_send_sgi(cpu); > } > diff --git a/xen/arch/arm/platforms/seattle.c > b/xen/arch/arm/platforms/seattle.c > index 893cc17972..64cc1868c2 100644 > --- a/xen/arch/arm/platforms/seattle.c > +++ b/xen/arch/arm/platforms/seattle.c > @@ -33,12 +33,12 @@ static const char * const seattle_dt_compat[] __initconst > = > */ > static void seattle_system_reset(void) > { > -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); > } > > static void seattle_system_off(void) > { > -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); > +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); > } > > PLATFORM_START(seattle, "SEATTLE") > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > index 941eec921b..4ae6504f3e 100644 > --- a/xen/arch/arm/psci.c > +++ b/xen/arch/arm/psci.c > @@ -42,42 +42,53 @@ uint32_t smccc_ver; > > static uint32_t psci_cpu_on_nr; > > +#define PSCI_RET(res) ((int32_t)(res).a0) > + > int call_psci_cpu_on(int cpu) > { > -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), > __pa(init_secondary), 0); > +struct arm_smccc_res res; > + > +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), > + ); > + > +return PSCI_RET(res.a0); > } Sorry if I'm jumping into the middle of a conversation, but why force all callers to manually extract the return value when it is a fixed register? Wouldn't it be far easier to do this: #define arcm_smccc_smc(...) \ ({ \ struct arm_smccc_res res; \ \ if ( cpus_have_const_cap(ARM_SMCCC_1_1) ) \ res = arm_smccc_1_1_smc(__VA_ARGS__); \ else\ res = arm_smccc_1_0_smc(__VA_ARGS__); \ \ (int)res.a0;\ }) Which also allows the compiler to optimise out the structure if it isn't read, and also avoids the caller needing to pass a NULL pointer for "I don't want the result". ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [[PATCH v3] 1/4] xen/arm: add SMC wrapper that is compatible with SMCCC v1.0
From: Volodymyr Babchuk Existing SMC wrapper call_smc() allows only 4 parameters and returns only one value. This is enough for existing use in PSCI code, but TEE mediator will need a call that is fully compatible with ARM SMCCC v1.0. This patch adds a wrapper for both arm32 and arm64. In the case of arm32, the wrapper is just an alias to the ARM SMCCC v1.1 as the convention is the same. CC: "Edgar E. Iglesias" Signed-off-by: Volodymyr Babchuk [julien: Rework the wrapper to make it closer to SMCC 1.1 wrapper] Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini --- Changes in v3: - Add Stefano's reviewed-by --- xen/arch/arm/arm64/Makefile | 1 + xen/arch/arm/arm64/asm-offsets.c | 5 xen/arch/arm/arm64/smc.S | 32 + xen/include/asm-arm/smccc.h | 51 +++- 4 files changed, 88 insertions(+), 1 deletion(-) create mode 100644 xen/arch/arm/arm64/smc.S diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile index bb5c610b2a..c4f3a28a0d 100644 --- a/xen/arch/arm/arm64/Makefile +++ b/xen/arch/arm/arm64/Makefile @@ -8,6 +8,7 @@ obj-y += domain.o obj-y += entry.o obj-y += insn.o obj-$(CONFIG_LIVEPATCH) += livepatch.o +obj-y += smc.o obj-y += smpboot.o obj-y += traps.o obj-y += vfp.o diff --git a/xen/arch/arm/arm64/asm-offsets.c b/xen/arch/arm/arm64/asm-offsets.c index 62833d8c8b..280ddb55bf 100644 --- a/xen/arch/arm/arm64/asm-offsets.c +++ b/xen/arch/arm/arm64/asm-offsets.c @@ -10,6 +10,7 @@ #include #include #include +#include #define DEFINE(_sym, _val) \ asm volatile ("\n.ascii\"==>#define " #_sym " %0 /* " #_val " */<==\"" \ @@ -51,6 +52,10 @@ void __dummy__(void) BLANK(); OFFSET(INITINFO_stack, struct init_info, stack); + + BLANK(); + OFFSET(SMCCC_RES_a0, struct arm_smccc_res, a0); + OFFSET(SMCCC_RES_a2, struct arm_smccc_res, a2); } /* diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S new file mode 100644 index 00..b0752be57e --- /dev/null +++ b/xen/arch/arm/arm64/smc.S @@ -0,0 +1,32 @@ +/* + * xen/arch/arm/arm64/smc.S + * + * Wrapper for Secure Monitors Calls + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include + +/* + * void __arm_smccc_1_0_smc(register_t a0, register_t a1, register_t a2, + * register_t a3, register_t a4, register_t a5, + * register_t a6, register_t a7, + * struct arm_smccc_res *res) + */ +ENTRY(__arm_smccc_1_0_smc) +smc #0 +ldr x4, [sp] +cbz x4, 1f /* No need to store the result */ +stp x0, x1, [x4, #SMCCC_RES_a0] +stp x2, x3, [x4, #SMCCC_RES_a2] +1: +ret diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h index 648bef28bd..1ed6cbaa48 100644 --- a/xen/include/asm-arm/smccc.h +++ b/xen/include/asm-arm/smccc.h @@ -207,7 +207,56 @@ struct arm_smccc_res { *___res = (typeof(*___res)){r0, r1, r2, r3};\ } while ( 0 ) -#endif +/* + * The calling convention for arm32 is the same for both SMCCC v1.0 and + * v1.1. + */ +#ifdef CONFIG_ARM_32 +#define arm_smccc_1_0_smc(...) arm_smccc_1_1_smc(__VA_ARGS__) +#else + +void __arm_smccc_1_0_smc(register_t a0, register_t a1, register_t a2, + register_t a3, register_t a4, register_t a5, + register_t a6, register_t a7, + struct arm_smccc_res *res); + +/* Macros to handle variadic parameter for SMCCC v1.0 helper */ +#define __arm_smccc_1_0_smc_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \ +__arm_smccc_1_0_smc(a0, a1, a2, a3, a4, a5, a6, a7, res) + +#define __arm_smccc_1_0_smc_6(a0, a1, a2, a3, a4, a5, a6, res) \ +__arm_smccc_1_0_smc_7(a0, a1, a2, a3, a4, a5, a6, 0, res) + +#define __arm_smccc_1_0_smc_5(a0, a1, a2, a3, a4, a5, res) \ +__arm_smccc_1_0_smc_6(a0, a1, a2, a3, a4, a5, 0, res) + +#define __arm_smccc_1_0_smc_4(a0, a1, a2, a3, a4, res) \ +__arm_smccc_1_0_smc_5(a0, a1, a2, a3, a4, 0, res) + +#define __arm_smccc_1_0_smc_3(a0, a1, a2, a3, res) \ +__arm_smccc_1_0_smc_4(a0, a1, a2, a3, 0, res) + +#define __arm_smccc_1_0_smc_2(a0, a1, a2, res) \ +__arm_smccc_1_0_smc_3(a0, a1, a2, 0, res) + +#define __arm_smccc_1_0_smc_1(a0, a1, res) \ +__arm_smccc_1_0_smc_2(a0, a1, 0, res) + +#define __arm_smccc_1_0_smc_0(a0, res) \ +__arm_smccc_1_0_smc_1(a0, 0, res) + +#define
[Xen-devel] [xen-unstable-smoke test] 128279: tolerable all pass - PUSHED
flight 128279 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/128279/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 146bb7e1934afc5056179fffbf8e24c0db415f8c baseline version: xen edb4724e36256c495a6aa3cf1a12722efe271f9d Last test of basis 128209 2018-09-28 21:06:31 Z2 days Testing same since 128279 2018-10-01 11:00:41 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Wei Liu jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git edb4724e36..146bb7e193 146bb7e1934afc5056179fffbf8e24c0db415f8c -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [[PATCH v3] 2/4] xen/arm: cpufeature: Add helper to check constant caps
Some capababilities are set right during boot and will never change afterwards. At the moment, the function cpu_have_caps will check whether the cap is enabled from the memory. It is possible to avoid the load from the memory by using an ALTERNATIVE. With that the check is just reduced to 1 instruction. Signed-off-by: Julien Grall --- This is the static key for the poor. At some point we might want to introduce something similar to static key in Xen. Changes in v2: - Use unlikely --- xen/include/asm-arm/cpufeature.h | 12 1 file changed, 12 insertions(+) diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h index 3de6b54301..c6cbc2ec84 100644 --- a/xen/include/asm-arm/cpufeature.h +++ b/xen/include/asm-arm/cpufeature.h @@ -63,6 +63,18 @@ static inline bool cpus_have_cap(unsigned int num) return test_bit(num, cpu_hwcaps); } +/* System capability check for constant cap */ +#define cpus_have_const_cap(num) ({ \ +bool __ret; \ +\ +asm volatile (ALTERNATIVE("mov %0, #0", \ + "mov %0, #1", \ + num) \ + : "=r" (__ret)); \ +\ +unlikely(__ret);\ +}) + static inline void cpus_set_cap(unsigned int num) { if (num >= ARM_NCAPS) -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [[PATCH v3] 4/4] xen/arm: Replace call_smc with arm_smccc_smc
call_smc is a subset of arm_smccc_smc. Rather than having 2 methods to do SMCCC call, replace all call to the former by the later. Signed-off-by: Julien Grall --- Changes in v3: - Use PSCI_RET where needed --- xen/arch/arm/Makefile| 1 - xen/arch/arm/platforms/exynos5.c | 3 ++- xen/arch/arm/platforms/seattle.c | 4 ++-- xen/arch/arm/psci.c | 37 + xen/arch/arm/smc.S | 21 - xen/include/asm-arm/processor.h | 3 --- 6 files changed, 29 insertions(+), 40 deletions(-) delete mode 100644 xen/arch/arm/smc.S diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index b9b141dc84..37fa8268b3 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -39,7 +39,6 @@ obj-y += processor.o obj-y += psci.o obj-y += setup.o obj-y += shutdown.o -obj-y += smc.o obj-y += smp.o obj-y += smpboot.o obj-y += sysctl.o diff --git a/xen/arch/arm/platforms/exynos5.c b/xen/arch/arm/platforms/exynos5.c index c15ecf80f5..e2c0b7b878 100644 --- a/xen/arch/arm/platforms/exynos5.c +++ b/xen/arch/arm/platforms/exynos5.c @@ -26,6 +26,7 @@ #include #include #include +#include static bool secure_firmware; @@ -249,7 +250,7 @@ static int exynos5_cpu_up(int cpu) iounmap(power); if ( secure_firmware ) -call_smc(SMC_CMD_CPU1BOOT, cpu, 0, 0); +arm_smccc_smc(SMC_CMD_CPU1BOOT, cpu, NULL); return cpu_up_send_sgi(cpu); } diff --git a/xen/arch/arm/platforms/seattle.c b/xen/arch/arm/platforms/seattle.c index 893cc17972..64cc1868c2 100644 --- a/xen/arch/arm/platforms/seattle.c +++ b/xen/arch/arm/platforms/seattle.c @@ -33,12 +33,12 @@ static const char * const seattle_dt_compat[] __initconst = */ static void seattle_system_reset(void) { -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); } static void seattle_system_off(void) { -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); } PLATFORM_START(seattle, "SEATTLE") diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c index 941eec921b..4ae6504f3e 100644 --- a/xen/arch/arm/psci.c +++ b/xen/arch/arm/psci.c @@ -42,42 +42,53 @@ uint32_t smccc_ver; static uint32_t psci_cpu_on_nr; +#define PSCI_RET(res) ((int32_t)(res).a0) + int call_psci_cpu_on(int cpu) { -return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), 0); +struct arm_smccc_res res; + +arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), + ); + +return PSCI_RET(res.a0); } void call_psci_cpu_off(void) { if ( psci_ver > PSCI_VERSION(0, 1) ) { -int errno; +struct arm_smccc_res res; /* If successfull the PSCI cpu_off call doesn't return */ -errno = call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_CPU_OFF, ); panic("PSCI cpu off failed for CPU%d err=%d\n", smp_processor_id(), - errno); + PSCI_RET(res)); } } void call_psci_system_off(void) { if ( psci_ver > PSCI_VERSION(0, 1) ) -call_smc(PSCI_0_2_FN32_SYSTEM_OFF, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL); } void call_psci_system_reset(void) { if ( psci_ver > PSCI_VERSION(0, 1) ) -call_smc(PSCI_0_2_FN32_SYSTEM_RESET, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL); } static int __init psci_features(uint32_t psci_func_id) { +struct arm_smccc_res res; + if ( psci_ver < PSCI_VERSION(1, 0) ) return PSCI_NOT_SUPPORTED; -return call_smc(PSCI_1_0_FN32_PSCI_FEATURES, psci_func_id, 0, 0); +arm_smccc_smc(PSCI_1_0_FN32_PSCI_FEATURES, psci_func_id, NULL); + +return PSCI_RET(res); } static int __init psci_is_smc_method(const struct dt_device_node *psci) @@ -112,11 +123,11 @@ static void __init psci_init_smccc(void) if ( psci_features(ARM_SMCCC_VERSION_FID) != PSCI_NOT_SUPPORTED ) { -uint32_t ret; +struct arm_smccc_res res; -ret = call_smc(ARM_SMCCC_VERSION_FID, 0, 0, 0); -if ( ret != ARM_SMCCC_NOT_SUPPORTED ) -smccc_ver = ret; +arm_smccc_smc(ARM_SMCCC_VERSION_FID, ); +if ( PSCI_RET(res) != ARM_SMCCC_NOT_SUPPORTED ) +smccc_ver = PSCI_RET(res); } if ( smccc_ver >= SMCCC_VERSION(1, 1) ) @@ -165,6 +176,7 @@ static int __init psci_init_0_2(void) { /* sentinel */ }, }; int ret; +struct arm_smccc_res res; if ( acpi_disabled ) { @@ -186,7 +198,8 @@ static int __init psci_init_0_2(void) } } -psci_ver = call_smc(PSCI_0_2_FN32_PSCI_VERSION, 0, 0, 0); +arm_smccc_smc(PSCI_0_2_FN32_PSCI_VERSION, ); +psci_ver = PSCI_RET(res); /* For the moment, we only support PSCI 0.2 and PSCI 1.x */ if ( psci_ver != PSCI_VERSION(0, 2) && PSCI_VERSION_MAJOR(psci_ver)
[Xen-devel] [[PATCH v3] 0/4] xen/arm: SMCCC fixup and improvement
Hi all, This patch series contains fixup and improvement for the SMCCC subsystem. Cheers, Julien Grall (3): xen/arm: cpufeature: Add helper to check constant caps xen/arm: smccc: Add wrapper to automatically select the calling convention xen/arm: Replace call_smc with arm_smccc_smc Volodymyr Babchuk (1): xen/arm: add SMC wrapper that is compatible with SMCCC v1.0 xen/arch/arm/Makefile| 1 - xen/arch/arm/arm64/Makefile | 1 + xen/arch/arm/arm64/asm-offsets.c | 5 xen/arch/arm/arm64/smc.S | 32 + xen/arch/arm/platforms/exynos5.c | 3 +- xen/arch/arm/platforms/seattle.c | 4 +-- xen/arch/arm/psci.c | 41 ++ xen/arch/arm/smc.S | 21 -- xen/include/asm-arm/cpufeature.h | 15 +- xen/include/asm-arm/processor.h | 3 -- xen/include/asm-arm/smccc.h | 62 +++- 11 files changed, 146 insertions(+), 42 deletions(-) create mode 100644 xen/arch/arm/arm64/smc.S delete mode 100644 xen/arch/arm/smc.S -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [[PATCH v3] 3/4] xen/arm: smccc: Add wrapper to automatically select the calling convention
Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini Reviewed-by: Volodymyr Babchuk --- Changes in v3: - Add Stefano's and Volodymyr's reviewed-by Changes in v2: - Invert the condition - Add missing includes --- xen/arch/arm/psci.c | 4 xen/include/asm-arm/cpufeature.h | 3 ++- xen/include/asm-arm/smccc.h | 11 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c index 3cf5ecf0f3..941eec921b 100644 --- a/xen/arch/arm/psci.c +++ b/xen/arch/arm/psci.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -118,6 +119,9 @@ static void __init psci_init_smccc(void) smccc_ver = ret; } +if ( smccc_ver >= SMCCC_VERSION(1, 1) ) +cpus_set_cap(ARM_SMCCC_1_1); + printk(XENLOG_INFO "Using SMC Calling Convention v%u.%u\n", SMCCC_VERSION_MAJOR(smccc_ver), SMCCC_VERSION_MINOR(smccc_ver)); } diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h index c6cbc2ec84..2d82264427 100644 --- a/xen/include/asm-arm/cpufeature.h +++ b/xen/include/asm-arm/cpufeature.h @@ -44,8 +44,9 @@ #define SKIP_CTXT_SWITCH_SERROR_SYNC 6 #define ARM_HARDEN_BRANCH_PREDICTOR 7 #define ARM_SSBD 8 +#define ARM_SMCCC_1_1 9 -#define ARM_NCAPS 9 +#define ARM_NCAPS 10 #ifndef __ASSEMBLY__ diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h index 1ed6cbaa48..126399dd70 100644 --- a/xen/include/asm-arm/smccc.h +++ b/xen/include/asm-arm/smccc.h @@ -16,6 +16,9 @@ #ifndef __ASM_ARM_SMCCC_H__ #define __ASM_ARM_SMCCC_H__ +#include +#include + #define SMCCC_VERSION_MAJOR_SHIFT16 #define SMCCC_VERSION_MINOR_MASK \ ((1U << SMCCC_VERSION_MAJOR_SHIFT) - 1) @@ -213,6 +216,7 @@ struct arm_smccc_res { */ #ifdef CONFIG_ARM_32 #define arm_smccc_1_0_smc(...) arm_smccc_1_1_smc(__VA_ARGS__) +#define arm_smccc_smc(...) arm_smccc_1_1_smc(__VA_ARGS__) #else void __arm_smccc_1_0_smc(register_t a0, register_t a1, register_t a2, @@ -254,6 +258,13 @@ void __arm_smccc_1_0_smc(register_t a0, register_t a1, register_t a2, #define arm_smccc_1_0_smc(...) \ __arm_smccc_1_0_smc_count(__count_args(__VA_ARGS__), __VA_ARGS__) +#define arm_smccc_smc(...) \ +do {\ +if ( cpus_have_const_cap(ARM_SMCCC_1_1) ) \ +arm_smccc_1_1_smc(__VA_ARGS__); \ +else\ +arm_smccc_1_0_smc(__VA_ARGS__); \ +} while ( 0 ) #endif /* CONFIG_ARM_64 */ #endif /* __ASSEMBLY__ */ -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 07/06/18 14:08, Olaf Hering wrote: > Add an option to control when vTSC emulation will be activated for a > domU with tsc_mode=default. Without such option each TSC access from > domU will be emulated, which causes a significant perfomance drop for > workloads that make use of rdtsc. > > One option to avoid the TSC option is to run domUs with tsc_mode=native. > This has the drawback that migrating a domU from a "2.3GHz" class host > to a "2.4GHz" class host may change the rate at wich the TSC counter > increases, the domU may not be prepared for that. > > With the new option the host admin can decide how a domU should behave > when it is migrated across systems of the same class. Since there is > always some jitter when Xen calibrates the cpu_khz value, all hosts of > the same class will most likely have slightly different values. As a > result vTSC emulation is unavoidable. Data collected during the incident > which triggered this change showed a jitter of up to 200 KHz across > systems of the same class. Do you have any further details of the systems involved? If they are identical systems, they should all have the same real TSC frequency, and its a known issue that Xen isn't very good at working out the frequency. TBH, fixing that would be far better overall. > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > The padding is sent as zero in write_tsc_info to the receving host. > The padding is undefined if the changed code runs as receiver. I'm not sure what you mean by this final sentence. > handle_tsc_info has no code to verify that padding is indeed zero. Due > to the lack of a version field it is impossible to know if the sender > already has the newly introduced vtsc_tolerance field. In the worst > case the receiving domU will get an unemulated TSC. The lack of padding verification is deliberate, for forwards compatibility. Why does the sending code matter? One way or another, if the field is 0, the option wasn't present or wasn't configured. Neither of these situations affect the decision-making that the receiving side needs to perform. > > Signed-off-by: Olaf Hering > Reviewed-by: Wei Liu (v07/v08) > Reviewed-by: Jan Beulich (v08) I'm still -0.5 for this patch. I can appreciate why you want it, but it is a gross hack which only works when you don't skew time more than NTP in the guest can cope with. My gut feeling is that there will be other more subtle fallout. As for the implementation itself, a few trivial comments. > diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7 > index 3bbc96f201..122ae36679 100644 > --- a/docs/man/xen-tscmode.pod.7 > +++ b/docs/man/xen-tscmode.pod.7 > @@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated > > =back > > +If the tsc_mode is set to "default" the decision to emulate TSC can be > +tweaked further with the "vtsc_tolerance_khz" option. > + > To understand this in more detail, the rest of this document must > be read. > > @@ -211,6 +214,19 @@ is emulated. Note that, though emulated, the "apparent" > TSC frequency > will be the TSC frequency of the initial physical machine, even after > migration. > > +Since the calibration of the TSC frequency may not be 100% accurate, the > +exact value of the frequency can change even across reboots. It can change across reboots for other reasons, e.g. firmware settings. I'd phrase this as "Since the calibration of the TSC frequency isn't 100% accurate, the value measured by Xen can vary across reboots". > This means > +also several otherwise identical systems can have a slightly different > +TSC frequency. As a result TSC access will be emulated if a domU is > +migrated from one host to another, identical host. To avoid the > +performance impact of TSC emulation a certain tolerance of the measured > +host TSC frequency can be specified with "vtsc_tolerance_khz". If the > +measured "cpu_khz" value is within the tolerance range, TSC access > +remains native. Otherwise it will be emulated. This allows to migrate > +domUs between identical hardware. If the domU will be migrated to a > +different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system, > +TSC will be emualted to maintain the TSC frequency expected by the domU. > + > For environments where both TSC-safeness AND highest performance > even across migration is a requirement, application code can be specially > modified to use an algorithm explicitly designed into Xen for this purpose. > diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in > index 47d88243b1..995277794f 100644 > --- a/docs/man/xl.cfg.pod.5.in > +++ b/docs/man/xl.cfg.pod.5.in > @@ -1898,6 +1898,16 @@ determined in a similar way to that of B TSC > mode. > > Please see B for more information on this option. > > +=item B > + > +B<(x86 only, relevant only for tsc_mode=default)> > +When a domU is started, the CPU frequency of the host is used by the domU for > +TSC related time
[Xen-devel] [xen-unstable test] 128267: regressions - FAIL
flight 128267 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/128267/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-migrupgrade 22 guest-migrate/src_host/dst_host fail REGR. vs. 128084 test-amd64-i386-migrupgrade 22 guest-migrate/src_host/dst_host fail REGR. vs. 128084 Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-qemuu-debianhvm-amd64 16 guest-localmigrate/x10 fail pass in 128240 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 128084 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 128084 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 128084 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 128084 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 128084 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 128084 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 128084 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 128084 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 128084 test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass version targeted for testing: xen edb4724e36256c495a6aa3cf1a12722efe271f9d baseline version: xen 940185b2f6f343251c2b83bd96e599398cea51ec Last test of basis 128084 2018-09-26 01:51:53 Z5 days Failing since128118 2018-09-27 00:37:03 Z4 days4 attempts
[Xen-devel] [PATCH v2 4/4] x86: support "pv-l1tf=default"
Just like the otherwise similar "xpti=" allows for, to revert back to built-in defaults. Signed-off-by: Jan Beulich --- v2: Split out into separate patch. --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1604,7 +1604,7 @@ certain you don't plan on having PV gues turning it off can reduce the attack surface. ### pv-l1tf (x86) -> `= List of [ , dom0=, domu= ]` +> `= List of [ default, , dom0=, domu= ]` > Default: `false` on believed-unaffected hardware, or in pv-shim mode. > `domu` on believed-affected hardware. --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -252,7 +252,9 @@ static __init int parse_pv_l1tf(const ch break; default: -if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) +if ( !strcmp(s, "default") ) +opt_pv_l1tf_hwdom = opt_pv_l1tf_domu = -1; +else if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) opt_pv_l1tf_hwdom = val; else if ( (val = parse_boolean("domu", s, ss)) >= 0 ) opt_pv_l1tf_domu = val; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 3/4] x86: fix "xpti=" and "pv-l1tf=" yet again
While commit 2a3b34ec47 ("x86/spec-ctrl: Yet more fixes for xpti= parsing") indeed fixed "xpti=dom0", it broke "xpti=no-dom0", in that this then became equivalent to "xpti=no". In particular, the presence of "xpti=" alone on the command line means nothing as to which default is to be overridden; "xpti=no-dom0", for example, ought to have no effect for DomU-s, as this is distinct from both "xpti=no-dom0,domu" and "xpti=no-dom0,no-domu". Signed-off-by: Jan Beulich --- v2: Fix copy-and-paste mistake in parse_pv_l1tf(). Split off log message silencing. Re-base over patches splitting opt_{xpti,pv_l1tf}. --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -232,12 +232,6 @@ static __init int parse_pv_l1tf(const ch const char *ss; int val, rc = 0; -/* Inhibit the defaults as an explicit choice has been given. */ -if ( opt_pv_l1tf_hwdom == -1 ) -opt_pv_l1tf_hwdom = 0; -if ( opt_pv_l1tf_domu == -1 ) -opt_pv_l1tf_domu = 0; - /* Interpret 'pv-l1tf' alone in its positive boolean form. */ if ( *s == '\0' ) opt_pv_l1tf_hwdom = opt_pv_l1tf_domu = 1; @@ -699,12 +693,6 @@ static __init int parse_xpti(const char const char *ss; int val, rc = 0; -/* Inhibit the defaults as an explicit choice has been given. */ -if ( opt_xpti_hwdom == -1 ) -opt_xpti_hwdom = 0; -if ( opt_xpti_domu == -1 ) -opt_xpti_domu = 0; - /* Interpret 'xpti' alone in its positive boolean form. */ if ( *s == '\0' ) opt_xpti_hwdom = opt_xpti_domu = 1; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 2/4] x86: split opt_pv_l1tf
Use separate tracking variables for the hardware domain and DomU-s. No functional change intended, but adjust the comment in init_speculation_mitigations() to match prior as well as resulting code. Signed-off-by: Jan Beulich --- v2: New. --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -143,8 +143,10 @@ static int __init parse_spec_ctrl(const if ( opt_smt < 0 ) opt_smt = 1; -if ( opt_pv_l1tf < 0 ) -opt_pv_l1tf = 0; +if ( opt_pv_l1tf_hwdom < 0 ) +opt_pv_l1tf_hwdom = 0; +if ( opt_pv_l1tf_domu < 0 ) +opt_pv_l1tf_domu = 0; disable_common: opt_rsb_pv = false; @@ -222,7 +224,8 @@ static int __init parse_spec_ctrl(const } custom_param("spec-ctrl", parse_spec_ctrl); -int8_t __read_mostly opt_pv_l1tf = -1; +int8_t __read_mostly opt_pv_l1tf_hwdom = -1; +int8_t __read_mostly opt_pv_l1tf_domu = -1; static __init int parse_pv_l1tf(const char *s) { @@ -230,12 +233,14 @@ static __init int parse_pv_l1tf(const ch int val, rc = 0; /* Inhibit the defaults as an explicit choice has been given. */ -if ( opt_pv_l1tf == -1 ) -opt_pv_l1tf = 0; +if ( opt_pv_l1tf_hwdom == -1 ) +opt_pv_l1tf_hwdom = 0; +if ( opt_pv_l1tf_domu == -1 ) +opt_pv_l1tf_domu = 0; /* Interpret 'pv-l1tf' alone in its positive boolean form. */ if ( *s == '\0' ) -opt_pv_l1tf = OPT_PV_L1TF_DOM0 | OPT_PV_L1TF_DOMU; +opt_pv_l1tf_hwdom = opt_pv_l1tf_domu = 1; do { ss = strchr(s, ','); @@ -245,20 +250,18 @@ static __init int parse_pv_l1tf(const ch switch ( parse_bool(s, ss) ) { case 0: -opt_pv_l1tf = 0; +opt_pv_l1tf_hwdom = opt_pv_l1tf_domu = 0; break; case 1: -opt_pv_l1tf = OPT_PV_L1TF_DOM0 | OPT_PV_L1TF_DOMU; +opt_pv_l1tf_hwdom = opt_pv_l1tf_domu = 1; break; default: if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) -opt_pv_l1tf = ((opt_pv_l1tf & ~OPT_PV_L1TF_DOM0) | - (val ? OPT_PV_L1TF_DOM0 : 0)); +opt_pv_l1tf_hwdom = val; else if ( (val = parse_boolean("domu", s, ss)) >= 0 ) -opt_pv_l1tf = ((opt_pv_l1tf & ~OPT_PV_L1TF_DOMU) | - (val ? OPT_PV_L1TF_DOMU : 0)); +opt_pv_l1tf_domu = val; else if ( *s ) rc = -EINVAL; break; @@ -321,7 +324,7 @@ static void __init print_details(enum in opt_l1d_flush ? " L1D_FLUSH" : ""); /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */ -if ( cpu_has_bug_l1tf || opt_pv_l1tf ) +if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu ) printk(" L1TF: believed%s vulnerable, maxphysaddr L1D %u, CPUID %u" ", Safe address %"PRIx64"\n", cpu_has_bug_l1tf ? "" : " not", @@ -356,8 +359,8 @@ static void __init print_details(enum in xpti_pcid_enabled() ? "" : "out"); printk(" PV L1TF shadowing: Dom0 %s, DomU %s\n", - opt_pv_l1tf & OPT_PV_L1TF_DOM0 ? "enabled" : "disabled", - opt_pv_l1tf & OPT_PV_L1TF_DOMU ? "enabled" : "disabled"); + opt_pv_l1tf_hwdom ? "enabled" : "disabled", + opt_pv_l1tf_domu ? "enabled" : "disabled"); #endif } @@ -889,18 +892,16 @@ void __init init_speculation_mitigations /* * By default, enable PV domU L1TF mitigations on all L1TF-vulnerable - * hardware, except when running in shim mode. + * hardware, except when running in shim mode, and - at least for the + * time being - also excepting the hardware domain. * * In shim mode, SHADOW is expected to be compiled out, and a malicious * guest kernel can only attack the shim Xen, not the host Xen. */ -if ( opt_pv_l1tf == -1 ) -{ -if ( pv_shim || !cpu_has_bug_l1tf ) -opt_pv_l1tf = 0; -else -opt_pv_l1tf = OPT_PV_L1TF_DOMU; -} +if ( opt_pv_l1tf_hwdom == -1 ) +opt_pv_l1tf_hwdom = 0; +if ( opt_pv_l1tf_domu == -1 ) +opt_pv_l1tf_domu = !pv_shim && cpu_has_bug_l1tf; /* * By default, enable L1D_FLUSH on L1TF-vulnerable hardware, unless --- a/xen/include/asm-x86/shadow.h +++ b/xen/include/asm-x86/shadow.h @@ -224,9 +224,8 @@ void pv_l1tf_tasklet(unsigned long data) static inline void pv_l1tf_domain_init(struct domain *d) { -d->arch.pv.check_l1tf = -opt_pv_l1tf & (is_hardware_domain(d) - ? OPT_PV_L1TF_DOM0 : OPT_PV_L1TF_DOMU); +d->arch.pv.check_l1tf = is_hardware_domain(d) ? opt_pv_l1tf_hwdom + : opt_pv_l1tf_domu; #if defined(CONFIG_SHADOW_PAGING) && defined(CONFIG_PV)
[Xen-devel] [PATCH v2 1/4] x86: split opt_xpti
Use separate tracking variables for the hardware domain and DomU-s. No functional change intended. Signed-off-by: Jan Beulich --- v2: New. --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -182,7 +182,7 @@ unsigned int flush_area_local(const void */ invpcid_flush_one(PCID_PV_PRIV, addr); invpcid_flush_one(PCID_PV_USER, addr); -if ( opt_xpti ) +if ( opt_xpti_hwdom || opt_xpti_domu ) { invpcid_flush_one(PCID_PV_PRIV | PCID_PV_XPTI, addr); invpcid_flush_one(PCID_PV_USER | PCID_PV_XPTI, addr); --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -253,8 +253,7 @@ int pv_domain_initialise(struct domain * /* 64-bit PV guest by default. */ d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; -d->arch.pv.xpti = opt_xpti & (is_hardware_domain(d) - ? OPT_XPTI_DOM0 : OPT_XPTI_DOMU); +d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom : opt_xpti_domu; if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid ) switch ( opt_pcid ) --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -789,7 +789,7 @@ static int setup_cpu_root_pgt(unsigned i unsigned int off; int rc; -if ( !opt_xpti ) +if ( !opt_xpti_hwdom && !opt_xpti_domu ) return 0; rpt = alloc_xen_pagetable(); --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -135,8 +135,10 @@ static int __init parse_spec_ctrl(const opt_eager_fpu = 0; -if ( opt_xpti < 0 ) -opt_xpti = 0; +if ( opt_xpti_hwdom < 0 ) +opt_xpti_hwdom = 0; +if ( opt_xpti_domu < 0 ) +opt_xpti_domu = 0; if ( opt_smt < 0 ) opt_smt = 1; @@ -349,8 +351,8 @@ static void __init print_details(enum in opt_eager_fpu ? " EAGER_FPU" : ""); printk(" XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n", - opt_xpti & OPT_XPTI_DOM0 ? "enabled" : "disabled", - opt_xpti & OPT_XPTI_DOMU ? "enabled" : "disabled", + opt_xpti_hwdom ? "enabled" : "disabled", + opt_xpti_domu ? "enabled" : "disabled", xpti_pcid_enabled() ? "" : "out"); printk(" PV L1TF shadowing: Dom0 %s, DomU %s\n", @@ -665,7 +667,8 @@ static __init void l1tf_calculations(uin : (3ul << (paddr_bits - 2; } -int8_t __read_mostly opt_xpti = -1; +int8_t __read_mostly opt_xpti_hwdom = -1; +int8_t __read_mostly opt_xpti_domu = -1; static __init void xpti_init_default(uint64_t caps) { @@ -673,9 +676,19 @@ static __init void xpti_init_default(uin caps = ARCH_CAPABILITIES_RDCL_NO; if ( caps & ARCH_CAPABILITIES_RDCL_NO ) -opt_xpti = 0; +{ +if ( opt_xpti_hwdom < 0 ) +opt_xpti_hwdom = 0; +if ( opt_xpti_domu < 0 ) +opt_xpti_domu = 0; +} else -opt_xpti = OPT_XPTI_DOM0 | OPT_XPTI_DOMU; +{ +if ( opt_xpti_hwdom < 0 ) +opt_xpti_hwdom = 1; +if ( opt_xpti_domu < 0 ) +opt_xpti_domu = 1; +} } static __init int parse_xpti(const char *s) @@ -684,12 +697,14 @@ static __init int parse_xpti(const char int val, rc = 0; /* Inhibit the defaults as an explicit choice has been given. */ -if ( opt_xpti == -1 ) -opt_xpti = 0; +if ( opt_xpti_hwdom == -1 ) +opt_xpti_hwdom = 0; +if ( opt_xpti_domu == -1 ) +opt_xpti_domu = 0; /* Interpret 'xpti' alone in its positive boolean form. */ if ( *s == '\0' ) -opt_xpti = OPT_XPTI_DOM0 | OPT_XPTI_DOMU; +opt_xpti_hwdom = opt_xpti_domu = 1; do { ss = strchr(s, ','); @@ -699,22 +714,20 @@ static __init int parse_xpti(const char switch ( parse_bool(s, ss) ) { case 0: -opt_xpti = 0; +opt_xpti_hwdom = opt_xpti_domu = 0; break; case 1: -opt_xpti = OPT_XPTI_DOM0 | OPT_XPTI_DOMU; +opt_xpti_hwdom = opt_xpti_domu = 1; break; default: if ( !strcmp(s, "default") ) -opt_xpti = -1; +opt_xpti_hwdom = opt_xpti_domu = -1; else if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) -opt_xpti = (opt_xpti & ~OPT_XPTI_DOM0) | - (val ? OPT_XPTI_DOM0 : 0); +opt_xpti_hwdom = val; else if ( (val = parse_boolean("domu", s, ss)) >= 0 ) -opt_xpti = (opt_xpti & ~OPT_XPTI_DOMU) | - (val ? OPT_XPTI_DOMU : 0); +opt_xpti_domu = val; else if ( *s ) rc = -EINVAL; break; @@ -870,8 +883,7 @@ void __init init_speculation_mitigations
Re: [Xen-devel] [PATCH V2] x86/altp2m: propagate ept.ad changes to all active altp2ms
On 10/1/18 2:23 PM, George Dunlap wrote: > On 10/01/2018 12:11 PM, Razvan Cojocaru wrote: >> >> >> On 10/1/18 1:39 PM, Jan Beulich wrote: >> On 01.10.18 at 11:58, wrote: Changes since V1: - Removed unnecessary p2m_lock() in p2m_init_altp2m_ept(). >>> >>> This was a step in the right direction, but ... >>> static void ept_enable_pml(struct p2m_domain *p2m) { +struct domain *d = p2m->domain; + /* Domain must have been paused */ -ASSERT(atomic_read(>domain->pause_count)); +ASSERT(atomic_read(>pause_count)); /* * No need to return whether vmx_domain_enable_pml has succeeded, as * ept_p2m_type_to_flags will do the check, and write protection will be * used if PML is not enabled. */ -if ( vmx_domain_enable_pml(p2m->domain) ) +if ( vmx_domain_enable_pml(d) ) return; /* Enable EPT A/D bit for PML */ -p2m->ept.ad = 1; -vmx_domain_update_eptp(p2m->domain); +ept_set_ad_sync(p2m, true); + +vmx_domain_update_eptp(d); } static void ept_disable_pml(struct p2m_domain *p2m) { +struct domain *d = p2m->domain; + /* Domain must have been paused */ -ASSERT(atomic_read(>domain->pause_count)); +ASSERT(atomic_read(>pause_count)); -vmx_domain_disable_pml(p2m->domain); +vmx_domain_disable_pml(d); /* Disable EPT A/D bit */ -p2m->ept.ad = 0; -vmx_domain_update_eptp(p2m->domain); +ept_set_ad_sync(p2m, false); + +vmx_domain_update_eptp(d); } >>> >>> These two functions used to be called with the p2m lock held, >>> while now they aren't anymore. Afaict this introduces a race >>> where the opposite ept_set_ad_sync() may be called before >>> an original one was follow by the respective >>> vmx_domain_update_eptp(), resulting in the A/D enable bit >>> being set the wrong way round in the end. >>> >>> I realize that George did already point out that this is sort of >>> ugly a situation, but the fixing of the issue here shouldn't >>> introduce a new race. What's wrong with retaining the >>> host p2m lock in p2m_{en,dis}able_hardware_log_dirty()? >>> ept_set_ad_sync() then simply wouldn't acquire/release that >>> one, but just the altp2m ones. >> >> That is fine with be, in fact the whole change has been prompted by >> George's remark that "there would something a bit funny here about >> grabbing the main p2m lock in p2m.c, and then grabbing altp2m locks >> within the function". If, after these comments, he doesn't mind the >> scenario then I'll do that in V3. > > I think I would rather grab the main p2m locks in > ept_{enable,disable}_pml(). Wouldn't hurt to have an > ASSERT(p2m_is_locked_by_me()) in ept_set_ad_sync() as well. I'll do that. Thanks! Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] null scheduler bug
On 09/27/2018 06:06 PM, Dario Faggioli wrote: On Thu, 2018-09-27 at 16:09 +0100, Julien Grall wrote: Hi, Hi Dario, On 09/27/2018 03:32 PM, Dario Faggioli wrote: On Thu, 2018-09-27 at 15:15 +0200, Milan Boberic wrote: In one of your e-mail, you wrote: "Well, our implementation of RCU requires that, from time to time, the various physical CPUs of your box become idle, or get an interrupt, or go executing inside Xen (for hypercalls, vmexits, etc). In fact, a CPU going through Xen is what allow us to tell that it reached a so- called 'quiescent state', which in turns is necessary for declaring a so- called 'RCU grace period' over." I don't quite agree with you on the definition of "quiescent state" here. Hehe... I was trying to be both quick and accurate. It's more than possible that I failed. :-) To take the domain example, we want to wait until all the CPU has stopped using the pointer (an hypercall could race put_domain). I'm not sure what you mean with "an hypercall could race put_domain". I meant that another CPU get a pointer on the domain until the domain is effectively removed from the list by _domain_destroy. What we want is to wait until all the CPUs that are involved in the grace period, have gone through rcupdate.c:cpu_quiet(), or have become idle. Which is what I meant but in a more convoluted way. Receiving an interrupt, or experiencing a context switch, or even going idle, it's "just" how it happens that these CPUs have their chance to go through cpu_quiet(). It is in this sense that I meant that those events are used as markers of a quiescent state. And "wfi=native" (in particular in combination with the null scheduler, but I guess also with other ones, at least to a certain extent) makes figuring out the "or have become idle" part tricky. That is the problem here, isn't it? That's correct. That pointer will not be in-use if the CPU is in kernel-mode/user-mode or in the idle loop. Am I correct? Right. So, we want that all the CPUs that were in Xen to have either left Xen at least once or, if they're still there and have never left, that must be because they've become idle. And currently we treat all the CPUs that have not told the RCU subsystem that they're idle (via rcu_idle_enter()) as busy, without distinguishing between the ones that are busy in Xen from the one which are busy in guest (kernel or user) mode. So I am wondering whether we could: - Mark any CPU in kernel-mode/user-mode quiet Right. We'd need something like a rcu_guest_enter()/rcu_guest_exit() (or a rcu_xen_exit()/rcu_xen_enter()), which works for all combination of arches and guest types. It looks to me too that this would help in this case, as the vCPU that stays in guest mode because of wfi=idle would be counted as quiet, and we won't have to wait for it. - Raise a RCU_SOFTIRQ in call_rcu? Mmm... what would be the point of this? To check if the RCU has work to do. You may call_rcu with all the other CPUs quiet. Or do you expect this to be done in rcu_guest_{enter,exit}()? With that solution, it may even be possible to avoid the timer in the idle loop. Not sure. The timer is there to deal with the case when a CPU which has a callback queued wants to go idle. It may have quiesced already, but if there are others which have not, either: 1) we let it go idle, but then the callback will run only when it wakes up from idle which, without the timer, could be far ahead in time; 2) we don't let it go idle, but we waste resources; 3) we let it go idle and keep the timer. :-) Oh right. But anyway, even if it would not let us get rid of the timer, it seems like it could be nicer than any other approaches. I accept help/suggestions about the "let's intercept guest-Xen and Xen-guest transitions, and track that inside RCU code. The best place for Arm to have those calls would be in: - enter_hypervisor_head(): This is called after exiting the guest and before doing any other work - leave_hypervisor_head(): This is called just before returning to the guest. I am CCing Stefano work e-mail. When I spoke with him he was interested to put some effort towards fixing the bug. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 0/4] x86: fix "xpti=" and "pv-l1tf=" yet again
The original patch under this title as well as the involved variables were split up for v2, hopefully addressing the main (yet vague) review concerns on v1. 1: split opt_xpti 2: split opt_pv_l1tf 3: fix "xpti=" and "pv-l1tf=" yet again 4: support "pv-l1tf=default" Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [distros-debian-sid test] 75332: trouble: blocked/broken
flight 75332 distros-debian-sid real [real] http://osstest.xensource.com/osstest/logs/75332/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-pvopsbroken build-i386 broken build-amd64-pvopsbroken build-armhf broken build-amd64 broken build-i386-pvops broken Tests which did not succeed, but are not blocking: test-amd64-amd64-i386-sid-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-i386-i386-sid-netboot-pvgrub 1 build-check(1) blocked n/a test-armhf-armhf-armhf-sid-netboot-pygrub 1 build-check(1)blocked n/a test-amd64-i386-amd64-sid-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-amd64-sid-netboot-pvgrub 1 build-check(1)blocked n/a build-armhf 4 host-install(4) broken like 75277 build-armhf-pvops 4 host-install(4) broken like 75277 build-i3864 host-install(4) broken like 75277 build-i386-pvops 4 host-install(4) broken like 75277 build-amd64 4 host-install(4) broken like 75277 build-amd64-pvops 4 host-install(4) broken like 75277 baseline version: flight 75277 jobs: build-amd64 broken build-armhf broken build-i386 broken build-amd64-pvopsbroken build-armhf-pvopsbroken build-i386-pvops broken test-amd64-amd64-amd64-sid-netboot-pvgrubblocked test-amd64-i386-i386-sid-netboot-pvgrub blocked test-amd64-i386-amd64-sid-netboot-pygrub blocked test-armhf-armhf-armhf-sid-netboot-pygrubblocked test-amd64-amd64-i386-sid-netboot-pygrub blocked sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xensource.com/osstest/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/3] [not-for-unstable] xen/arm: vgic-v3: Delay the initialization of the domain information
Hi Andrew, On 10/01/2018 10:53 AM, Andrew Cooper wrote: On 01/10/18 10:43, Julien Grall wrote: Hi, On 09/29/2018 12:48 AM, Andrew Cooper wrote: On 29/09/18 00:45, Stefano Stabellini wrote: On Sat, 29 Sep 2018, Andrew Cooper wrote: On 28/09/18 21:35, Julien Grall wrote: On 09/28/2018 12:11 AM, Stefano Stabellini wrote: On Wed, 26 Sep 2018, Julien Grall wrote: Hi Stefano, On 09/25/2018 09:45 PM, Stefano Stabellini wrote: On Tue, 4 Sep 2018, Andrew Cooper wrote: On 04/09/18 20:35, Julien Grall wrote: Hi, On 09/04/2018 08:21 PM, Julien Grall wrote: A follow-up patch will require to know the number of vCPUs when initializating the vGICv3 domain structure. However this information is not available at domain creation. This is only known once XEN_DOMCTL_max_vpus is called for that domain. In order to get the max vCPUs around, delay the domain part of the vGIC v3 initialization until the first vCPU of the domain is initialized. Signed-off-by: Julien Grall --- Cc: Andrew Cooper This is nasty but I can't find a better way for Xen 4.11 and older. This is not necessary for unstable as the number of vCPUs is known at domain creation. Andrew, I have CCed you to know whether you have a better idea where to place this call on Xen 4.11 and older. I just noticed that d->max_vcpus is initialized after arch_domain_create. So without this patch on Xen 4.12, it will not work. This is getting nastier because arch_domain_init is the one initialize the value returned by dom0_max_vcpus. So I am not entirely sure what to do here. The positioning after arch_domain_create() is unfortunate, but I couldn’t manage better with ARM's current behaviour and Jan's insistence that the allocation of d->vcpu was common. I'd prefer if the dependency could be broken and the allocation moved earlier. One option might be to have an arch_check_domainconfig() (or similar?) which is called very early on and can sanity check the values, including cross-checking the vgic and max_vcpus settings? It could even be responsible for mutating XEN_DOMCTL_CONFIG_GIC_NATIVE into the correct real value. As for your patch here, its a gross hack, but its probably the best which can be done. *Sighs* If that is what we have to do, it is as ugly as hell, but that is what we'll do. This is the best we can do with the current code base. I think it would be worth reworking the code to make it nicer. I will add it in my TODO list. My only suggestion to marginally improve it would be instead of: + if ( v->vcpu_id == 0 ) + { + rc = vgic_v3_real_domain_init(d); + if ( rc ) + return rc; + } to check on d->arch.vgic.rdist_regions instead: if ( d->arch.vgic.rdist_regions == NULL ) { // initialize domain I would prefer to keep v->vcpu_id == 0 just in case we end up to re-order the allocation in the future. I was suggesting to check on (rdist_regions == NULL) exactly for potential re-ordering, in case in the future we end up calling vcpu_vgic_init differently and somehow vcpu_init(vcpu1) is done before before vcpu_init(vcpu0). Ideally we would like a way to check that vgic_v3_real_domain_init has been called before and I thought rdist_regions == NULL could do just that... What I meant by re-ordering is we manage to allocate the re-distributors before the vCPUs are created but still need vgic_v3_real_domain_init for other purpose. But vCPU initialization is potentially other issue. Anyway. both way have drawbacks. Yet I still prefer checking on the vCPU. It less likely vCPU0 will not be the first one initialized. With the exception of the idle domain, all vcpus are strictly allocated in packed ascending order. Loads of other stuff will break if that changed, so I wouldn't worry about it. Furthermore, there is no obvious reason for this behaviour to ever change. OK, let's go with Julien's patch. We need a new tag for this, something like: Acked-but-disliked-by: Stefano Stabellini Do bear in mind that this patch is only for 4.11 and earlier. I've already fixed staging (i.e. 4.12) when it comes to knowing d->max_vcpus :) I thought we agreed that patch is necessary for 4.12 as d->max_vcpus is initialized after arch_domain_init? Oh right. I am not planning to do the rework in short term. Did you do more work on around domain_create recently? There are multiple related patch series out on xen-devel atm, but I expect I need to spin a new version of each of them. I'll see if I have some time to put towards it. Are you happy in principle with the arch_check_domainconfig() plan? I am happy in principle. If you don't have time to work on it, I will try to have a look before Xen 4.12. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 12:52, wrote: > Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to > avoid TSC emulation"): >> Am Thu, 13 Sep 2018 09:39:13 +0200 >> schrieb Olaf Hering : >> > this patch was not applied yet, even after a few "pings". >> >> No reaction since months. >> So scrap that patch, just in case it is still part of someones to-consider > queue. > > I think it would be worth exploring whether this patch could be > applied without an explicit ack from Andrew. > > I confess I ignored all the previous mails because they all started >Andrew, > or >Andrew, Lars, > so I assumed that you didn't want attention from other > maintainers/committers. > > Now that I look at the thread it is difficult for me to see the wood > for the trees but I don't see unanswered concerns. Problem is - discussion around this was (iirc) happening not only on the list, but also on irc (including perhaps private chats). It was for that reason that I made my R-b conditional upon Andrew at least giving an informal go-ahead (otherwise, together with Wei's R-b, the patch could have gone in). Besides the question of correctness from the perspective of guests (which imo is not a problem as the feature needs to be actively enabled, and I think we have no reason to keep admins from breaking their guests if they really mean to), I think the main concern was with the way migration of the new value was implemented. But I really have to defer to Andrew for that, irrespective of him not having responded (on the list) to prior pings. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V2] x86/altp2m: propagate ept.ad changes to all active altp2ms
On 10/01/2018 12:11 PM, Razvan Cojocaru wrote: > > > On 10/1/18 1:39 PM, Jan Beulich wrote: > On 01.10.18 at 11:58, wrote: >>> Changes since V1: >>> - Removed unnecessary p2m_lock() in p2m_init_altp2m_ept(). >> >> This was a step in the right direction, but ... >> >>> static void ept_enable_pml(struct p2m_domain *p2m) >>> { >>> +struct domain *d = p2m->domain; >>> + >>> /* Domain must have been paused */ >>> -ASSERT(atomic_read(>domain->pause_count)); >>> +ASSERT(atomic_read(>pause_count)); >>> >>> /* >>> * No need to return whether vmx_domain_enable_pml has succeeded, as >>> * ept_p2m_type_to_flags will do the check, and write protection will >>> be >>> * used if PML is not enabled. >>> */ >>> -if ( vmx_domain_enable_pml(p2m->domain) ) >>> +if ( vmx_domain_enable_pml(d) ) >>> return; >>> >>> /* Enable EPT A/D bit for PML */ >>> -p2m->ept.ad = 1; >>> -vmx_domain_update_eptp(p2m->domain); >>> +ept_set_ad_sync(p2m, true); >>> + >>> +vmx_domain_update_eptp(d); >>> } >>> >>> static void ept_disable_pml(struct p2m_domain *p2m) >>> { >>> +struct domain *d = p2m->domain; >>> + >>> /* Domain must have been paused */ >>> -ASSERT(atomic_read(>domain->pause_count)); >>> +ASSERT(atomic_read(>pause_count)); >>> >>> -vmx_domain_disable_pml(p2m->domain); >>> +vmx_domain_disable_pml(d); >>> >>> /* Disable EPT A/D bit */ >>> -p2m->ept.ad = 0; >>> -vmx_domain_update_eptp(p2m->domain); >>> +ept_set_ad_sync(p2m, false); >>> + >>> +vmx_domain_update_eptp(d); >>> } >> >> These two functions used to be called with the p2m lock held, >> while now they aren't anymore. Afaict this introduces a race >> where the opposite ept_set_ad_sync() may be called before >> an original one was follow by the respective >> vmx_domain_update_eptp(), resulting in the A/D enable bit >> being set the wrong way round in the end. >> >> I realize that George did already point out that this is sort of >> ugly a situation, but the fixing of the issue here shouldn't >> introduce a new race. What's wrong with retaining the >> host p2m lock in p2m_{en,dis}able_hardware_log_dirty()? >> ept_set_ad_sync() then simply wouldn't acquire/release that >> one, but just the altp2m ones. > > That is fine with be, in fact the whole change has been prompted by > George's remark that "there would something a bit funny here about > grabbing the main p2m lock in p2m.c, and then grabbing altp2m locks > within the function". If, after these comments, he doesn't mind the > scenario then I'll do that in V3. I think I would rather grab the main p2m locks in ept_{enable,disable}_pml(). Wouldn't hurt to have an ASSERT(p2m_is_locked_by_me()) in ept_set_ad_sync() as well. Thanks, -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v12 8/9] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
Hi Paul, On 10/01/2018 11:51 AM, Paul Durrant wrote: -Original Message- From: Julien Grall [mailto:julien.gr...@arm.com] Sent: 01 October 2018 11:37 To: Paul Durrant ; xen-devel@lists.xenproject.org Cc: Jun Nakajima ; George Dunlap ; Jan Beulich ; Andrew Cooper ; Stefano Stabellini Subject: Re: [PATCH v12 8/9] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Hi Paul, On 09/27/2018 03:33 PM, Paul Durrant wrote: The name 'iommu_use_hap_pt' suggests that that P2M table is in use as the domain's IOMMU pagetable which, prior to this patch, is not strictly true since the macro did not test whether the domain actually has IOMMU mappings. Well, I think we assume that iommu_use_hap_pt() will only be called when need_iommu(d) == 1. Yes. At the end, you will still need to check need_iommu(d) outside because iommu_use_hap_pt() may now return 0 for domain without IOMMU or for domain not sharing p2m with the IOMMU. Do you expect an improvement in long-term? Yes. The ultimate goal is the patch to split up need_iommu() into separate macros for 'domain has IOMMU pt' and 'domain needs explicit sync of IOMMU pt' and this seems like a logical step. For 'need explicit sync of IOMMU pt' to true then 'has IOMMU pt' clearly needs to be true, so I wanted the macro for 'sharing IOMMU pt' to have the same predicate. Make senses. It didn't occur to me that was in preparation of the next patch. For the Arm bits: Acked-by: Julien Grall Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel