Re: [PATCH v2 1/3] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
On 21.10.2021 11:58, Jan Beulich wrote: > x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC > mode (physical vs clustered) depends on iommu_intremap, that variable > needs to be set to off as soon as we know we can't / won't enable > interrupt remapping, i.e. in particular when parsing of the respective > ACPI tables failed. Move the turning off of iommu_intremap from AMD > specific code into acpi_iommu_init(), accompanying it by clearing of > iommu_enable. > > Take the opportunity and also fully skip ACPI table parsing logic on > VT-d when both "iommu=off" and "iommu=no-intremap" are in effect anyway, > like was already the case for AMD. > > The tag below only references the commit uncovering a pre-existing > anomaly. > > Fixes: d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier") > Reported-by: Andrew Cooper > Signed-off-by: Jan Beulich Ouch, forgot to Cc Kevin; now added. Jan > --- > While the change here deals with apic_x2apic_probe() as called from > x2apic_bsp_setup(), the check_x2apic_preenabled() path looks to be > similarly affected. That call occurs before acpi_boot_init(), which is > what calls acpi_iommu_init(). The ordering in setup.c is in part > relatively fragile, which is why for the moment I'm still hesitant to > move the generic_apic_probe() call down. Plus I don't have easy access > to a suitable system to test this case. Thoughts? > --- > v2: Treat iommu_enable and iommu_intremap as separate options. > > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -183,9 +183,6 @@ int __init acpi_ivrs_init(void) > { > int rc; > > -if ( !iommu_enable && !iommu_intremap ) > -return 0; > - > rc = amd_iommu_get_supported_ivhd_type(); > if ( rc < 0 ) > return rc; > @@ -193,10 +190,7 @@ int __init acpi_ivrs_init(void) > ivhd_type = rc; > > if ( (amd_iommu_detect_acpi() !=0) || (iommu_found() == 0) ) > -{ > -iommu_intremap = iommu_intremap_off; > return -ENODEV; > -} > > iommu_init_ops = &_iommu_init_ops; > > --- a/xen/drivers/passthrough/vtd/dmar.c > +++ b/xen/drivers/passthrough/vtd/dmar.c > @@ -777,11 +777,7 @@ static int __init acpi_parse_dmar(struct > dmar = (struct acpi_table_dmar *)table; > dmar_flags = dmar->flags; > > -if ( !iommu_enable && !iommu_intremap ) > -{ > -ret = -EINVAL; > -goto out; > -} > +ASSERT(iommu_enable || iommu_intremap); > > if ( !dmar->width ) > { > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -41,6 +41,24 @@ enum iommu_intremap __read_mostly iommu_ > bool __read_mostly iommu_intpost; > #endif > > +void __init acpi_iommu_init(void) > +{ > +int ret; > + > +if ( !iommu_enable && !iommu_intremap ) > +return; > + > +ret = acpi_dmar_init(); > +if ( ret == -ENODEV ) > +ret = acpi_ivrs_init(); > + > +if ( ret ) > +{ > +iommu_enable = false; > +iommu_intremap = iommu_intremap_off; > +} > +} > + > int __init iommu_hardware_setup(void) > { > struct IO_APIC_route_entry **ioapic_entries = NULL; > --- a/xen/include/asm-x86/acpi.h > +++ b/xen/include/asm-x86/acpi.h > @@ -141,16 +141,10 @@ extern u32 x86_acpiid_to_apicid[]; > extern u32 pmtmr_ioport; > extern unsigned int pmtmr_width; > > +void acpi_iommu_init(void); > int acpi_dmar_init(void); > int acpi_ivrs_init(void); > > -static inline int acpi_iommu_init(void) > -{ > -int ret = acpi_dmar_init(); > - > -return ret == -ENODEV ? acpi_ivrs_init() : ret; > -} > - > void acpi_mmcfg_init(void); > > /* Incremented whenever we transition through S3. Value is 1 during boot. */ > >
[PATCH 1/2] drm/i915/gem: stop using PAGE_KERNEL_IO
PAGE_KERNEL_IO is only defined for x86 and nowadays is the same as PAGE_KERNEL. It was different for some time, OR'ing a `_PAGE_IOMAP` flag in commit be43d72835ba ("x86: add _PAGE_IOMAP pte flag for IO mappings"). This got removed in commit f955371ca9d3 ("x86: remove the Xen-specific _PAGE_IOMAP PTE flag"), so today they are just the same. This is the same that was done in commit ac96b5566926 ("io-mapping.h: s/PAGE_KERNEL_IO/PAGE_KERNEL/"). There is a subsequent commit with 'Fixes: ac96b5566926 ("io-mapping.h: s/PAGE_KERNEL_IO/PAGE_KERNEL/")' - but that is not relevant here since is it's actually fixing the different names for pgprot_writecombine(), which we also don't have today since all archs expose pgprot_writecombine(). Microblaze, mentioned in that discussion, gained pgprot_writecombine() in commit 97ccedd793ac ("microblaze: Provide pgprot_device/writecombine macros for nommu"). So, just use PAGE_KERNEL, and just use pgprot_writecombine(). Signed-off-by: Lucas De Marchi Reviewed-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20211020090625.1037517-1-lucas.demar...@intel.com --- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 8eb1c3a6fc9c..68fe1837ef54 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -289,7 +289,7 @@ static void *i915_gem_object_map_page(struct drm_i915_gem_object *obj, pgprot = PAGE_KERNEL; break; case I915_MAP_WC: - pgprot = pgprot_writecombine(PAGE_KERNEL_IO); + pgprot = pgprot_writecombine(PAGE_KERNEL); break; } @@ -333,7 +333,7 @@ static void *i915_gem_object_map_pfn(struct drm_i915_gem_object *obj, i = 0; for_each_sgt_daddr(addr, iter, obj->mm.pages) pfns[i++] = (iomap + addr) >> PAGE_SHIFT; - vaddr = vmap_pfn(pfns, n_pfn, pgprot_writecombine(PAGE_KERNEL_IO)); + vaddr = vmap_pfn(pfns, n_pfn, pgprot_writecombine(PAGE_KERNEL)); if (pfns != stack) kvfree(pfns); -- 2.33.1
[PATCH 0/2] Nuke PAGE_KERNEL_IO
Last user of PAGE_KERNEL_IO is the i915 driver. While removing it from there as we seek to bring the driver to other architectures, Daniel suggested that we could finish the cleanup and remove it altogether, through the tip tree. So here I'm sending both commits needed for that. Lucas De Marchi (2): drm/i915/gem: stop using PAGE_KERNEL_IO x86/mm: nuke PAGE_KERNEL_IO arch/x86/include/asm/fixmap.h | 2 +- arch/x86/include/asm/pgtable_types.h | 7 --- arch/x86/mm/ioremap.c | 2 +- arch/x86/xen/setup.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 4 ++-- include/asm-generic/fixmap.h | 2 +- 6 files changed, 6 insertions(+), 13 deletions(-) -- 2.33.1
[PATCH 2/2] x86/mm: nuke PAGE_KERNEL_IO
PAGE_KERNEL_IO is only defined for x86 and nowadays is the same as PAGE_KERNEL. It was different for some time, OR'ing a `_PAGE_IOMAP` flag in commit be43d72835ba ("x86: add _PAGE_IOMAP pte flag for IO mappings"). This got removed in commit f955371ca9d3 ("x86: remove the Xen-specific _PAGE_IOMAP PTE flag"), so today they are just the same. With the last users outside arch/x86 being remove we can now remove PAGE_KERNEL_IO. Signed-off-by: Lucas De Marchi --- arch/x86/include/asm/fixmap.h| 2 +- arch/x86/include/asm/pgtable_types.h | 7 --- arch/x86/mm/ioremap.c| 2 +- arch/x86/xen/setup.c | 2 +- include/asm-generic/fixmap.h | 2 +- 5 files changed, 4 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index d0dcefb5cc59..5e186a69db10 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -173,7 +173,7 @@ static inline void __set_fixmap(enum fixed_addresses idx, * supported for MMIO addresses, so make sure that the memory encryption * mask is not part of the page attributes. */ -#define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE +#define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_NOCACHE /* * Early memremap routines used for in-place encryption. The mappings created diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 40497a9020c6..a87224767ff3 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -199,10 +199,6 @@ enum page_cache_mode { #define __PAGE_KERNEL_WP(__PP|__RW| 0|___A|__NX|___D| 0|___G| __WP) -#define __PAGE_KERNEL_IO __PAGE_KERNEL -#define __PAGE_KERNEL_IO_NOCACHE __PAGE_KERNEL_NOCACHE - - #ifndef __ASSEMBLY__ #define __PAGE_KERNEL_ENC (__PAGE_KERNEL| _ENC) @@ -223,9 +219,6 @@ enum page_cache_mode { #define PAGE_KERNEL_LARGE_EXEC __pgprot_mask(__PAGE_KERNEL_LARGE_EXEC | _ENC) #define PAGE_KERNEL_VVAR __pgprot_mask(__PAGE_KERNEL_VVAR | _ENC) -#define PAGE_KERNEL_IO __pgprot_mask(__PAGE_KERNEL_IO) -#define PAGE_KERNEL_IO_NOCACHE __pgprot_mask(__PAGE_KERNEL_IO_NOCACHE) - #endif /* __ASSEMBLY__ */ /* xwr */ diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 026031b3b782..3102dda4b152 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -243,7 +243,7 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size, * make sure the memory encryption attribute is enabled in the * resulting mapping. */ - prot = PAGE_KERNEL_IO; + prot = PAGE_KERNEL; if ((io_desc.flags & IORES_MAP_ENCRYPTED) || encrypted) prot = pgprot_encrypted(prot); diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 8bfc10330107..5dc0771a50f3 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -435,7 +435,7 @@ static unsigned long __init xen_set_identity_and_remap_chunk( for (pfn = start_pfn; pfn <= max_pfn_mapped && pfn < end_pfn; pfn++) (void)HYPERVISOR_update_va_mapping( (unsigned long)__va(pfn << PAGE_SHIFT), - mfn_pte(pfn, PAGE_KERNEL_IO), 0); + mfn_pte(pfn, PAGE_KERNEL), 0); return remap_pfn; } diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h index 8cc7b09c1bc7..f1b0c6f3d0be 100644 --- a/include/asm-generic/fixmap.h +++ b/include/asm-generic/fixmap.h @@ -54,7 +54,7 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr) #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_NOCACHE #endif #ifndef FIXMAP_PAGE_IO -#define FIXMAP_PAGE_IO PAGE_KERNEL_IO +#define FIXMAP_PAGE_IO PAGE_KERNEL #endif #ifndef FIXMAP_PAGE_CLEAR #define FIXMAP_PAGE_CLEAR __pgprot(0) -- 2.33.1
RE: [for-4.16] Re: [PATCH v4] xen/arm: vgic: Ignore write access to ICPENDR*
Hi, > -Original Message- > From: Julien Grall > Sent: 2021年10月22日 1:16 > To: Hongda Deng ; xen-devel@lists.xenproject.org; > sstabell...@kernel.org > Cc: Bertrand Marquis ; Wei Chen > ; Ian Jackson > Subject: Re: [for-4.16] Re: [PATCH v4] xen/arm: vgic: Ignore write access to > ICPENDR* > > > > On 21/10/2021 16:14, Julien Grall wrote: > > On the previous version, we discussed to include the patch for 4.16. So > > please tag it with for-4.16 and CC the Release Manager (Ian). This will > > help him to track what's outstanding for the release. > > > > On 21/10/2021 13:03, Hongda Deng wrote: > >> Currently, Xen will return IO unhandled when guests write ICPENDR* > >> virtual registers, which will raise a data abort inside the guest. > >> For Linux guest, these virtual registers will not be accessed. But > >> for Zephyr, these virtual registers will be accessed during the > >> initialization. Zephyr guest will get an IO data abort and crash. > >> Emulating ICPENDR is not easy with the existing vGIC, this patch > >> reworks the emulation to ignore write access to ICPENDR* virtual > >> registers and print a message about whether they are already pending > >> instead of returning unhandled. > >> More details can be found at [1]. > >> > >> [1] https://github.com/zephyrproject-rtos/zephyr/blob/eaf6cf745df3807e6e > >> cc941c3a30de6c179ae359/drivers/interrupt_controller/intc_gicv3.c#L274 > >> > >> Signed-off-by: Hongda Deng > > > > While I agree the Reviewed-by from Bertrand should be dropped, the > > Release-acked-by from Ian is simply a way to say he is happy to include > > the patch for 4.16. So this should have been retain. > > > > The patch looks good to me, so I can add Ian's tag on commit: > > > > Reviewed-by: Julien Grall > > Committed. > > Cheers, > > -- > Julien Grall Thank you ! I just learned that I should add "Reviewed-by" and " Release-acked-by" tags based on previous patches, sorry for that, I will keep it in mind. Cheers, --- Hongda
[xen-unstable test] 165712: regressions - FAIL
flight 165712 xen-unstable real [real] flight 165727 xen-unstable real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/165712/ http://logs.test-lab.xenproject.org/osstest/logs/165727/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-examine 4 memdisk-try-append fail REGR. vs. 165699 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail like 165699 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 165699 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 165699 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165699 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165699 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 165699 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 165699 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 165699 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 165699 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 165699 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 165699 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165699 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 165699 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-check
[qemu-mainline test] 165721: regressions - FAIL
flight 165721 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/165721/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3866 xen-buildfail REGR. vs. 165682 build-i386-xsm6 xen-buildfail REGR. vs. 165682 build-armhf 6 xen-buildfail REGR. vs. 165682 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-coresched-i386-xl 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-i386-xl-pvshim 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-shadow 1 build-check(1) blocked n/a test-amd64-i386-xl-vhd1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165682 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165682 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165682 test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14
[PATCH 1/3] automation: add x86_64 alpine 3.12 test-artifact
From: Stefano Stabellini It is the same as the existing ARM64 alpine 3.12 test-artifact. It is used to export an Alpine rootfs for Dom0 used for testing. Also add the exporting job to build.yaml so that the binaries can be used during gitlab-ci runs. Signed-off-by: Stefano Stabellini --- automation/gitlab-ci/build.yaml | 13 .../tests-artifacts/alpine/3.12.dockerfile| 68 +++ 2 files changed, 81 insertions(+) create mode 100644 automation/tests-artifacts/alpine/3.12.dockerfile diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml index d177da1710..76b73beead 100644 --- a/automation/gitlab-ci/build.yaml +++ b/automation/gitlab-ci/build.yaml @@ -596,3 +596,16 @@ qemu-system-aarch64-5.2.0-arm64-export: tags: - arm64 + +# x86_64 test artifacts + +alpine-3.12-rootfs-export: + stage: build + image: registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12 + script: +- mkdir binaries && cp /initrd.tar.gz binaries/initrd.tar.gz + artifacts: +paths: + - binaries/initrd.tar.gz + tags: +- x86_64 diff --git a/automation/tests-artifacts/alpine/3.12.dockerfile b/automation/tests-artifacts/alpine/3.12.dockerfile new file mode 100644 index 00..fbc26b9e36 --- /dev/null +++ b/automation/tests-artifacts/alpine/3.12.dockerfile @@ -0,0 +1,68 @@ +FROM alpine:3.12 +LABEL maintainer.name="The Xen Project" \ + maintainer.email="xen-devel@lists.xenproject.org" + +ENV USER root + +RUN mkdir /build +WORKDIR /build + +RUN \ + # apk + apk update && \ + \ + # xen runtime deps + apk add musl && \ + apk add openrc && \ + apk add busybox && \ + apk add sudo && \ + apk add dbus && \ + apk add bash && \ + apk add python2 && \ + # gettext for Xen < 4.13 + apk add gettext && \ + apk add zlib && \ + apk add ncurses && \ + apk add texinfo && \ + apk add yajl && \ + apk add libaio && \ + apk add xz-dev && \ + apk add util-linux && \ + apk add argp-standalone && \ + apk add libfdt && \ + apk add glib && \ + apk add pixman && \ + apk add curl && \ + apk add udev && \ + \ + # Xen + cd / && \ + # Minimal ramdisk environment in case of cpio output + rc-update add udev && \ + rc-update add udev-trigger && \ + rc-update add udev-settle && \ + rc-update add networking sysinit && \ + rc-update add loopback sysinit && \ + rc-update add bootmisc boot && \ + rc-update add devfs sysinit && \ + rc-update add dmesg sysinit && \ + rc-update add hostname boot && \ + rc-update add hwclock boot && \ + rc-update add hwdrivers sysinit && \ + rc-update add killprocs shutdown && \ + rc-update add modloop sysinit && \ + rc-update add modules boot && \ + rc-update add mount-ro shutdown && \ + rc-update add savecache shutdown && \ + rc-update add sysctl boot && \ + rc-update add local default && \ + cp -a /sbin/init /init && \ + echo "ttyS0" >> /etc/securetty && \ + echo "hvc0" >> /etc/securetty && \ + echo "ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100" >> /etc/inittab && \ + echo "hvc0::respawn:/sbin/getty -L hvc0 115200 vt100" >> /etc/inittab && \ + passwd -d "root" root && \ + \ + # Create rootfs + cd / && \ + tar cvzf /initrd.tar.gz bin dev etc home init lib mnt opt root sbin usr var -- 2.17.1
[PATCH 3/3] automation: add a QEMU based x86_64 Dom0/DomU test
From: Stefano Stabellini Introduce a test based on QEMU to run Xen, Dom0 and start a DomU. This is similar to the existing qemu-alpine-arm64.sh script and test. The only differences are: - use Debian's qemu-system-x86_64 (on ARM we build our own) - use ipxe instead of u-boot and ImageBuilder Signed-off-by: Stefano Stabellini --- automation/gitlab-ci/test.yaml | 24 +++ automation/scripts/qemu-alpine-x86_64.sh | 92 2 files changed, 116 insertions(+) create mode 100644 automation/scripts/qemu-alpine-x86_64.sh diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml index 91a10febbf..c1d67ec4b5 100644 --- a/automation/gitlab-ci/test.yaml +++ b/automation/gitlab-ci/test.yaml @@ -47,6 +47,30 @@ qemu-alpine-arm64-gcc: - /^coverity-tested\/.*/ - /^stable-.*/ +qemu-alpine-x86_64-gcc: + stage: test + image: registry.gitlab.com/xen-project/xen/${CONTAINER} + variables: +CONTAINER: debian:unstable + script: +- ./automation/scripts/qemu-alpine-x86_64.sh 2>&1 | tee qemu-smoke-arm64.log + dependencies: +- alpine-3.12-gcc +- alpine-3.12-rootfs-export +- kernel-5.10.74-export + artifacts: +paths: + - smoke.serial + - '*.log' +when: always + tags: +- x86_64 + except: +- master +- smoke +- /^coverity-tested\/.*/ +- /^stable-.*/ + qemu-smoke-arm64-gcc: stage: test image: registry.gitlab.com/xen-project/xen/${CONTAINER} diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh new file mode 100644 index 00..41b05210d6 --- /dev/null +++ b/automation/scripts/qemu-alpine-x86_64.sh @@ -0,0 +1,92 @@ +#!/bin/bash + +set -ex + +apt-get -qy update +apt-get -qy install --no-install-recommends qemu-system-x86 \ +cpio \ +curl \ +busybox-static + +# DomU Busybox +cd binaries +mkdir -p initrd +mkdir -p initrd/bin +mkdir -p initrd/sbin +mkdir -p initrd/etc +mkdir -p initrd/dev +mkdir -p initrd/proc +mkdir -p initrd/sys +mkdir -p initrd/lib +mkdir -p initrd/var +mkdir -p initrd/mnt +cp /bin/busybox initrd/bin/busybox +initrd/bin/busybox --install initrd/bin +echo "#!/bin/sh + +mount -t proc proc /proc +mount -t sysfs sysfs /sys +mount -t devtmpfs devtmpfs /dev +/bin/sh" > initrd/init +chmod +x initrd/init +cd initrd +find . | cpio --create --format='newc' | gzip > ../initrd.cpio.gz +cd .. + +mkdir -p rootfs +cd rootfs +tar xvzf ../initrd.tar.gz +mkdir proc +mkdir run +mkdir srv +mkdir sys +rm var/run +cp -ar ../dist/install/* . +mv ../initrd.cpio.gz ./root +cp ../bzImage ./root +echo "name=\"test\" +memory=512 +vcpus=1 +kernel=\"/root/bzImage\" +ramdisk=\"/root/initrd.cpio.gz\" +extra=\"console=hvc0 root=/dev/ram0 rdinit=/bin/sh\" +" > root/test.cfg +echo "#!/bin/bash + +export LD_LIBRARY_PATH=/usr/local/lib +bash /etc/init.d/xencommons start + +xl list + +xl create -c /root/test.cfg + +" > etc/local.d/xen.start +chmod +x etc/local.d/xen.start +echo "rc_verbose=yes" >> etc/rc.conf +find . |cpio -H newc -o|gzip > ../xen-rootfs.cpio.gz +cd ../.. + +cat >> binaries/pxelinux.0 <<- EOF +#!ipxe + +kernel xen console=com1 +module bzImage console=hvc0 +module xen-rootfs.cpio.gz +boot +EOF + +# Run the test +rm -f smoke.serial +set +e +timeout -k 1 720 \ +qemu-system-x86_64 \ +-cpu qemu64,+svm \ +-m 2G -smp 2 \ +-monitor none -serial stdio \ +-nographic \ +-device virtio-net-pci,netdev=n0 \ +-netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial + +set -e +(grep -q "Domain-0" smoke.serial && grep -q "BusyBox" smoke.serial) || exit 1 +exit 0 -- 2.17.1
[PATCH 2/3] automation: Linux 5.10.74 test-artifact
From: Stefano Stabellini Build a 5.10 kernel to be used as Dom0 and DomU kernel for testing. This is almost the same as the existing ARM64 recipe for Linux 5.9, the only differences are: - upgrade to latest 5.10.x stable - force Xen modules to built-in (on ARM it was already done by defconfig) Also add the exporting job to build.yaml so that the binary can be used during gitlab-ci runs. Signed-off-by: Stefano Stabellini --- automation/gitlab-ci/build.yaml | 11 ++ .../tests-artifacts/kernel/5.10.74.dockerfile | 38 +++ 2 files changed, 49 insertions(+) create mode 100644 automation/tests-artifacts/kernel/5.10.74.dockerfile diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml index 76b73beead..0034c50950 100644 --- a/automation/gitlab-ci/build.yaml +++ b/automation/gitlab-ci/build.yaml @@ -609,3 +609,14 @@ alpine-3.12-rootfs-export: - binaries/initrd.tar.gz tags: - x86_64 + +kernel-5.10.74-export: + stage: build + image: registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:5.10.74 + script: +- mkdir binaries && cp /bzImage binaries/bzImage + artifacts: +paths: + - binaries/bzImage + tags: +- x86_64 diff --git a/automation/tests-artifacts/kernel/5.10.74.dockerfile b/automation/tests-artifacts/kernel/5.10.74.dockerfile new file mode 100644 index 00..f2dbbecf74 --- /dev/null +++ b/automation/tests-artifacts/kernel/5.10.74.dockerfile @@ -0,0 +1,38 @@ +FROM debian:unstable +LABEL maintainer.name="The Xen Project" \ + maintainer.email="xen-devel@lists.xenproject.org" + +ENV DEBIAN_FRONTEND=noninteractive +ENV LINUX_VERSION=5.10.74 +ENV USER root + +RUN mkdir /build +WORKDIR /build + +# build depends +RUN apt-get update && \ +apt-get --quiet --yes install \ +build-essential \ +libssl-dev \ +bc \ +curl \ +flex \ +bison \ +libelf-dev \ +&& \ +\ +# Build the kernel +curl -fsSLO https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-"$LINUX_VERSION".tar.xz && \ +tar xvJf linux-"$LINUX_VERSION".tar.xz && \ +cd linux-"$LINUX_VERSION" && \ +make defconfig && \ +make xen.config && \ +cp .config .config.orig && \ +cat .config.orig | grep XEN | grep =m |sed 's/=m/=y/g' >> .config && \ +make -j$(nproc) bzImage && \ +cp arch/x86/boot/bzImage / && \ +cd /build && \ +rm -rf linux-"$LINUX_VERSION"* && \ +apt-get autoremove -y && \ +apt-get clean && \ +rm -rf /var/lib/apt/lists* /tmp/* /var/tmp/* -- 2.17.1
[PATCH 0/3] automation: introduce an x86_64 Dom0/DomU test
Hi all, This small patch series introduces a new QEMU-based test to Gitlab-CI. It uses QEMU to emulate an x86_64 machine and run Xen, Dom0 and start a DomU. It is very similar to the existing qemu-alpine-arm64-gcc test but based on x86_64 rather than ARM64. I think it is important because all the other x86_64 gitlab-ci tests we currently have are more narrow and based on XTF. This would be the first end-to-end x86_64 test in gitlab-ci. To make it happen, we need an Alpine Linux rootfs and a Linux kernel. The first two patches introduce them to gitlab-ci. Note that actually nothing will get build during gitlab-ci runs, it has already been done beforehand and uploaded as containers. They only import *existing* containers and binaries into a gitlab-ci run, thus, they cannot fail. The risk to the release of the first two patches is as close to zero as possible. The last patch is the one introducing a new test. This one can fail. However, it is a new test at the end of the pipeline: it doesn't impact the existing tests. In the worst case, the test will fail and the whole pipeline will be marked as "failed" but looking at the jobs all the other will continue to be marked as successful. In short, if it fails, we can simply ignore it. Also, at the moment it is actually succeeding. Cheers, Stefano Stefano Stabellini (3): automation: add x86_64 alpine 3.12 test-artifact automation: Linux 5.10.74 test-artifact automation: add a QEMU based x86_64 Dom0/DomU test automation/gitlab-ci/build.yaml| 24 ++ automation/gitlab-ci/test.yaml | 24 ++ automation/scripts/qemu-alpine-x86_64.sh | 92 ++ automation/tests-artifacts/alpine/3.12.dockerfile | 68 .../tests-artifacts/kernel/5.10.74.dockerfile | 38 + 5 files changed, 246 insertions(+) create mode 100644 automation/scripts/qemu-alpine-x86_64.sh create mode 100644 automation/tests-artifacts/alpine/3.12.dockerfile create mode 100644 automation/tests-artifacts/kernel/5.10.74.dockerfile
[xen-unstable-smoke test] 165719: tolerable all pass - PUSHED
flight 165719 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/165719/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 23ec1ebc8acbfd2bf06f6085a776f0db923f9fa9 baseline version: xen 98f60e5de00baf650c574c9352bb19aedb082dea Last test of basis 165708 2021-10-21 11:01:40 Z0 days Testing same since 165719 2021-10-21 18:00:27 Z0 days1 attempts People who touched revisions under test: Hongda Deng jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 98f60e5de0..23ec1ebc8a 23ec1ebc8acbfd2bf06f6085a776f0db923f9fa9 -> smoke
[ovmf test] 165714: all pass - PUSHED
flight 165714 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/165714/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 2f286930a8280f4d817460020110009938f695b6 baseline version: ovmf 305fd6bee0bfe1602163d1f8841954f84aa31b68 Last test of basis 165701 2021-10-21 03:40:20 Z0 days Testing same since 165714 2021-10-21 15:40:13 Z0 days1 attempts People who touched revisions under test: Abdul Lateef Attar jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 305fd6bee0..2f286930a8 2f286930a8280f4d817460020110009938f695b6 -> xen-tested-master
Re: [XEN][RFC PATCH 10/13] xen/arm: Implement device tree node addition functionalities
Hi Vikram, On 02/09/2021 07:06, Vikram Garhwal wrote: Introduce domctl XEN_DOMCTL_addfpga to add a device-tree node through device XEN_DOMCTL_* are hypercalls to manage a given domain. However, here you are modifying the system. This is similar to managing PCI devices, except here we are dealing with platforms devices. In the case of PCI devices, we are using PHYSDEVOP_*. If we only expect the toolstack to the call (e.g. the kernel won't use it directly), then it would be better to use SYSCTL_* as they are not stable (so we have more flexibility to modify the layout). Furthermore, rather than introduciong 2 new sub-hypercalls for XEN_SYSCTL_* I would introduce a single sub-hypercall that takes a command (e.g. add/remove). Regarding the name, in theory this feature is not FPGA specific. So I would prefer a more generic name, maybe XEN_DOMCTL_dt_overlay? tree overlay. This works with a device tree overlay(.dtbo) as input. Add check_pfdt() to do sanity check on the dtbo. From my experience with libfdt, the library tends to consider the DTB to be sound. So we would have to trust what the toolstack is provided us. So I think we need to make clear that this is basic sanity check and there are no expectation that we will be able to deal with a random blob. Also, added overlay_get_node_info() to get the node's full name with path. This comes handy when checking node for duplication. Each time a overlay node is added, a new fdt(memcpy of device_tree_flattened) is created and updated with overlay node. This updated fdt is further unflattened to a dt_host_new. Next, it checks if overlay node already exists in the dt_host. If overlay node doesn't exist then find the overlay node in dt_host_new, find the overlay node's parent in dt_host and add the node as child under parent in the dt_host. The node is attached as the last node under target parent. Finally, add IRQs, add device to IOMMUs, set permissions and map MMIO for the overlay node. When a node is added using overlay, a new entry is allocated in the overlay_track to keep the track of memory allocation due to addition of overlay node. This is helpful for freeing the memory allocated when a device tree node is removed with domctl XEN_DOMCTL_delfpga domctl. Signed-off-by: Vikram Garhwal --- xen/arch/arm/domctl.c | 262 ++ xen/common/device_tree.c | 54 + xen/include/public/domctl.h | 7 ++ xen/include/xen/device_tree.h | 1 + 4 files changed, 324 insertions(+) diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c index 5986934..0ac635f 100644 --- a/xen/arch/arm/domctl.c +++ b/xen/arch/arm/domctl.c @@ -15,6 +15,8 @@ #include #include #include +/* Included for FPGA dt add. */ We don't usually write done why an header is included. So I would remove this comment. +#include #include #include #include @@ -68,6 +70,61 @@ static int handle_vuart_init(struct domain *d, return rc; } +static int check_pfdt(void *pfdt, uint32_t pfdt_size) +{ +if ( fdt_totalsize(pfdt) != pfdt_size ) +{ +printk(XENLOG_ERR "Partial FDT is not a valid Flat Device Tree\n"); +return -EFAULT; I think -EINVAL is more suitable. +} + +if ( fdt_check_header(pfdt) ) +{ +printk(XENLOG_ERR "Partial FDT is not a valid Flat Device Tree\n"); The printk() is exactly the same as above. So how about combining the two check? Furthemore, from the commit message, the pfdt is an overlay fdt. So shouldn't we have write "The overlay FDT" rather than "Partial FDT"? +return -EFAULT; +} + +return 0; +} + +static void overlay_get_node_info(void *fdto, char *node_full_path) +{ +int fragment; + +/* + * Handle overlay nodes. But for now we are just handling one node. Is this a limitation you plan to handle before this series is going to be committed? + */ +fdt_for_each_subnode(fragment, fdto, 0) +{ +int target; +int overlay; +int subnode; +const char *target_path; + +target = overlay_get_target(device_tree_flattened, fdto, fragment, +_path); +overlay = fdt_subnode_offset(fdto, fragment, "__overlay__"); + +fdt_for_each_subnode(subnode, fdto, overlay) +{ +const char *node_name = fdt_get_name(fdto, subnode, NULL); +int node_name_len = strlen(node_name); +int target_path_len = strlen(target_path); + +memcpy(node_full_path, target_path, target_path_len); Looking at the caller (handle_add_fpga_overlay()), the node_full_path is a fixed array. What does guarantee that the array is large enough to target_path? + +node_full_path[target_path_len] = '/'; + +memcpy(node_full_path + target_path_len + 1, node_name, + node_name_len); + +node_full_path[target_path_len + 1 +
[qemu-mainline test] 165703: regressions - FAIL
flight 165703 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/165703/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3866 xen-buildfail REGR. vs. 165682 build-i386-xsm6 xen-buildfail REGR. vs. 165682 build-armhf 6 xen-buildfail REGR. vs. 165682 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-coresched-i386-xl 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-i386-xl-pvshim 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-shadow 1 build-check(1) blocked n/a test-amd64-i386-xl-vhd1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165682 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165682 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165682 test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14
[linux-linus test] 165700: tolerable FAIL - PUSHED
flight 165700 linux-linus real [real] flight 165715 linux-linus real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/165700/ http://logs.test-lab.xenproject.org/osstest/logs/165715/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-arm64-arm64-xl-credit1 8 xen-bootfail pass in 165715-retest test-amd64-amd64-freebsd11-amd64 21 guest-start/freebsd.repeat fail pass in 165715-retest Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-credit1 15 migrate-support-check fail in 165715 never pass test-arm64-arm64-xl-credit1 16 saverestore-support-check fail in 165715 never pass test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 165693 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 165693 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165693 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 165693 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165693 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 165693 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 165693 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165693 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass version targeted for testing: linux2f111a6fd5b5297b4e92f53798ca086f7c7d33a4 baseline version: linux8e37395c3a5dceff62a5010ebbbc107f4145935c Last test of basis 165693 2021-10-20 16:40:19 Z1 days Testing same since 165700 2021-10-21 03:27:38 Z0 days1 attempts People who touched revisions under test: Christoph Hellwig Gerald Schaefer Hamza
Re: [for-4.16] Re: [PATCH v4] xen/arm: vgic: Ignore write access to ICPENDR*
On 21/10/2021 16:14, Julien Grall wrote: On the previous version, we discussed to include the patch for 4.16. So please tag it with for-4.16 and CC the Release Manager (Ian). This will help him to track what's outstanding for the release. On 21/10/2021 13:03, Hongda Deng wrote: Currently, Xen will return IO unhandled when guests write ICPENDR* virtual registers, which will raise a data abort inside the guest. For Linux guest, these virtual registers will not be accessed. But for Zephyr, these virtual registers will be accessed during the initialization. Zephyr guest will get an IO data abort and crash. Emulating ICPENDR is not easy with the existing vGIC, this patch reworks the emulation to ignore write access to ICPENDR* virtual registers and print a message about whether they are already pending instead of returning unhandled. More details can be found at [1]. [1] https://github.com/zephyrproject-rtos/zephyr/blob/eaf6cf745df3807e6e cc941c3a30de6c179ae359/drivers/interrupt_controller/intc_gicv3.c#L274 Signed-off-by: Hongda Deng While I agree the Reviewed-by from Bertrand should be dropped, the Release-acked-by from Ian is simply a way to say he is happy to include the patch for 4.16. So this should have been retain. The patch looks good to me, so I can add Ian's tag on commit: Reviewed-by: Julien Grall Committed. Cheers, -- Julien Grall
[ovmf test] 165701: all pass - PUSHED
flight 165701 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/165701/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 305fd6bee0bfe1602163d1f8841954f84aa31b68 baseline version: ovmf 6893865b3010bb6192f732643c4b8ba026726d07 Last test of basis 165690 2021-10-20 13:40:00 Z1 days Testing same since 165701 2021-10-21 03:40:20 Z0 days1 attempts People who touched revisions under test: IanX Kuo jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 6893865b30..305fd6bee0 305fd6bee0bfe1602163d1f8841954f84aa31b68 -> xen-tested-master
Re: [PATCH 12/12] xen/x86: add hypercall performance counters for hvm, correct pv
On 15.10.2021 14:51, Juergen Gross wrote: > The HVM hypercall handler is missing incrementing the per hypercall > counters. Add that. > > The counters for PV are handled wrong, as they are not using > perf_incra() with the number of the hypercall as index, but are > incrementing the total number of hypercalls only. Fix that. Why do you say "total number"? Isn't it that all accounting goes into set_trap_table's slot, effectively making that slot a "total number" despite not being labeled that way? Also this fix renders largely redundant the calls_to_multicall counter. Could I talk you into deleting that at the same time? (As to the "not fully redundant": I consider it suspicious that this counter gets incremented at the bottom of the function, not at the top.) Finally I take it that with the Kconfig setting being under DEBUG, we don't consider security supported builds with PERF_COUNTERS enabled. Otherwise as a prereq I would think perfc_incra() would need teaching of array_index_nospec(). In any event, preferably with at least the description slightly adjusted, Reviewed-by: Jan Beulich Jan
[for-4.16] Re: [PATCH v4] xen/arm: vgic: Ignore write access to ICPENDR*
Hello Hongda, On the previous version, we discussed to include the patch for 4.16. So please tag it with for-4.16 and CC the Release Manager (Ian). This will help him to track what's outstanding for the release. On 21/10/2021 13:03, Hongda Deng wrote: Currently, Xen will return IO unhandled when guests write ICPENDR* virtual registers, which will raise a data abort inside the guest. For Linux guest, these virtual registers will not be accessed. But for Zephyr, these virtual registers will be accessed during the initialization. Zephyr guest will get an IO data abort and crash. Emulating ICPENDR is not easy with the existing vGIC, this patch reworks the emulation to ignore write access to ICPENDR* virtual registers and print a message about whether they are already pending instead of returning unhandled. More details can be found at [1]. [1] https://github.com/zephyrproject-rtos/zephyr/blob/eaf6cf745df3807e6e cc941c3a30de6c179ae359/drivers/interrupt_controller/intc_gicv3.c#L274 Signed-off-by: Hongda Deng While I agree the Reviewed-by from Bertrand should be dropped, the Release-acked-by from Ian is simply a way to say he is happy to include the patch for 4.16. So this should have been retain. The patch looks good to me, so I can add Ian's tag on commit: Reviewed-by: Julien Grall Cheers, -- Julien Grall
Re: (subset) [PATCH 0/9] block: reviewed add_disk() error handling set
On Fri, 15 Oct 2021 16:30:19 -0700, Luis Chamberlain wrote: > Jens, > > I had last split up patches into 7 groups, but at this point now > most changes are merged except a few more drivers. Instead of creating > a new patch set for each of the 7 groups I'm creating 3 new groups of > patches now: > > [...] Applied, thanks! [3/9] dm: add add_disk() error handling commit: e7089f65dd51afeda5eb760506b5950d95f9ec29 [4/9] bcache: add error handling support for add_disk() commit: 2961c3bbcaec0ed7fb7b9a465b3796f37f2294e5 [5/9] xen-blkfront: add error handling support for add_disk() commit: 293a7c528803321479593d42d0898bb5a9769db1 [6/9] m68k/emu/nfblock: add error handling support for add_disk() commit: 21fd880d3da7564bab68979417cab7408e4f9642 [7/9] um/drivers/ubd_kern: add error handling support for add_disk() commit: 66638f163a2b5c5b462ca38525129b14a20117eb [8/9] rnbd: add error handling support for add_disk() commit: 2e9e31bea01997450397d64da43b6675e0adb9e3 [9/9] mtd: add add_disk() error handling commit: 83b863f4a3f0de4ece7802d9121fed0c3e64145f Best regards, -- Jens Axboe
Re: [RFC PATCH 02/10] accel: Use qemu_security_policy_taint(), mark KVM and Xen as safe
It's been a while... Daniel P. Berrangé writes: > On Thu, Sep 09, 2021 at 01:20:16AM +0200, Philippe Mathieu-Daudé wrote: >> Add the AccelClass::secure_policy_supported field to classify >> safe (within security boundary) vs unsafe accelerators. >> >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> include/qemu/accel.h | 5 + >> accel/kvm/kvm-all.c | 1 + >> accel/xen/xen-all.c | 1 + >> softmmu/vl.c | 3 +++ >> 4 files changed, 10 insertions(+) >> >> diff --git a/include/qemu/accel.h b/include/qemu/accel.h >> index 4f4c283f6fc..895e30be0de 100644 >> --- a/include/qemu/accel.h >> +++ b/include/qemu/accel.h >> @@ -44,6 +44,11 @@ typedef struct AccelClass { >> hwaddr start_addr, hwaddr size); >> #endif >> bool *allowed; >> +/* >> + * Whether the accelerator is withing QEMU security policy boundary. >> + * See: https://www.qemu.org/contribute/security-process/ >> + */ >> +bool secure_policy_supported; > > The security handling policy is a high level concept that is > open to variation over time and also by downstream distro > vendors. Moreover, the concept of "tainting" isn't limited to "because security". > At a code level we should be dealing in a more fundamental > concept. At an accelerator level we should really jsut > declare whether or not the accelerator impl is considered > to be secure against malicious guest code. > > eg > > /* Whether this accelerator is secure against execution > * of malciious guest machine code */ > bool secure; What I'd like to see is a separation of "assertions", like "not meant to serve as security boundary", and policy. Yes, this is vague. Take it as food for thought. >> /* >> * Array of global properties that would be applied when specific >> * accelerator is chosen. It works like MachineClass.compat_props >> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c >> index 0125c17edb8..eb6b9e44df2 100644 >> --- a/accel/kvm/kvm-all.c >> +++ b/accel/kvm/kvm-all.c >> @@ -3623,6 +3623,7 @@ static void kvm_accel_class_init(ObjectClass *oc, void >> *data) >> ac->init_machine = kvm_init; >> ac->has_memory = kvm_accel_has_memory; >> ac->allowed = _allowed; >> +ac->secure_policy_supported = true; >> >> object_class_property_add(oc, "kernel-irqchip", "on|off|split", >> NULL, kvm_set_kernel_irqchip, >> diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c >> index 69aa7d018b2..57867af5faf 100644 >> --- a/accel/xen/xen-all.c >> +++ b/accel/xen/xen-all.c >> @@ -198,6 +198,7 @@ static void xen_accel_class_init(ObjectClass *oc, void >> *data) >> ac->setup_post = xen_setup_post; >> ac->allowed = _allowed; >> ac->compat_props = g_ptr_array_new(); >> +ac->secure_policy_supported = true; >> >> compat_props_add(ac->compat_props, compat, G_N_ELEMENTS(compat)); >> >> diff --git a/softmmu/vl.c b/softmmu/vl.c >> index 92c05ac97ee..e4f94e159c3 100644 >> --- a/softmmu/vl.c >> +++ b/softmmu/vl.c >> @@ -2388,6 +2388,9 @@ static int do_configure_accelerator(void *opaque, >> QemuOpts *opts, Error **errp) >> return 0; >> } >> >> +qemu_security_policy_taint(!ac->secure_policy_supported, >> + "%s accelerator", acc); > > We need this information to be introspectable, becuase stuff printed > to stderr is essentially opaque to libvirt and mgmt apps above. > > We don't have a convenient "query-accel" command but I think this > could possibly fit into 'query-target'. ie the TargetInfo struct > gain a field: > > > ## > # @TargetInfo: > # > # Information describing the QEMU target. > # > # @arch: the target architecture > # @secure: Whether the currently active accelerator for this target > # is secure against execution of malicous guest code > # > # Since: 1.2 > ## > { 'struct': 'TargetInfo', > 'data': { 'arch': 'SysEmuTarget', > 'secure': 'bool'} } My preferred means of introspection is QAPI schema introspection. For QMP, that's query-qmp-schema. For CLI, it doesn't exist, yet. If it did, then it would tell us that (QAPIfied) -accel takes an argument @accel of a certain enumeration type. We could then tack suitable feature flags to the enumeration type's values. If we make the feature flags "special", i.e. known to QAPI, we can then tie them to policy, like special feature flag 'deprecated' is tied to policy configured with -compat deprecated-{input,output}=... Alternatively, leave policy to the management application. QAPI schema feature flags plus policy are is not a *complete* solution, just like feature flag 'deprecated' and -compat are not a complete solution for handling use of deprecated interfaces: we can and do deprecate usage that isn't tied to a syntactic element in the QAPI schema. Example: commit a9b305ba291 deprecated use of socket chardev option wait together with server=true. It is, however, a solution for a
Re: [PATCH 10/12] xen/x86: call hypercall handlers via switch statement
On 15.10.2021 14:51, Juergen Gross wrote: > Instead of using a function table use the generated switch statement > macros for calling the appropriate hypercall handlers. > > This is beneficial to performance and avoids speculation issues. > > With calling the handlers using the correct number of parameters now > it is possible to do the parameter register clobbering in the NDEBUG > case after returning from the handler. This in turn removes the only > users of hypercall_args_table[] which can be removed now. "removed" reads misleading to me: You really replace it by new tables, using script-generated initializers. Also it looks like you're doubling the data, as the same sets were previously used by pv64/hvm64 and pv32/hvm32 respectively. > --- a/xen/arch/x86/hvm/hypercall.c > +++ b/xen/arch/x86/hvm/hypercall.c > @@ -108,56 +108,10 @@ long hvm_physdev_op(int cmd, > XEN_GUEST_HANDLE_PARAM(void) arg) > return compat_physdev_op(cmd, arg); > } > > -#define HYPERCALL(x) \ > -[ __HYPERVISOR_ ## x ] = { (hypercall_fn_t *) do_ ## x, \ > - (hypercall_fn_t *) do_ ## x } > - > -#define HVM_CALL(x) \ > -[ __HYPERVISOR_ ## x ] = { (hypercall_fn_t *) hvm_ ## x, \ > - (hypercall_fn_t *) hvm_ ## x } > - > -#define COMPAT_CALL(x) \ > -[ __HYPERVISOR_ ## x ] = { (hypercall_fn_t *) do_ ## x, \ > - (hypercall_fn_t *) compat_ ## x } > - > -static const struct { > -hypercall_fn_t *native, *compat; > -} hvm_hypercall_table[] = { > -HVM_CALL(memory_op), > -COMPAT_CALL(multicall), > -#ifdef CONFIG_GRANT_TABLE > -HVM_CALL(grant_table_op), > -#endif > -HYPERCALL(vm_assist), > -COMPAT_CALL(vcpu_op), > -HVM_CALL(physdev_op), > -COMPAT_CALL(xen_version), > -HYPERCALL(console_io), > -HYPERCALL(event_channel_op), > -COMPAT_CALL(sched_op), > -COMPAT_CALL(set_timer_op), > -COMPAT_CALL(xsm_op), > -HYPERCALL(hvm_op), > -HYPERCALL(sysctl), > -HYPERCALL(domctl), > -#ifdef CONFIG_ARGO > -COMPAT_CALL(argo_op), > -#endif > -COMPAT_CALL(platform_op), > -#ifdef CONFIG_PV > -COMPAT_CALL(mmuext_op), > -#endif > -HYPERCALL(xenpmu_op), > -COMPAT_CALL(dm_op), > -#ifdef CONFIG_HYPFS > -HYPERCALL(hypfs_op), > +#ifndef NDEBUG > +static unsigned char hypercall_args_64[] = hypercall_args_hvm64; > +static unsigned char hypercall_args_32[] = hypercall_args_hvm32; Irrespective of this being debugging-only: Const? > @@ -239,33 +176,11 @@ int hvm_hypercall(struct cpu_user_regs *regs) > HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu(%lx, %lx, %lx, %lx, %lx)", > eax, rdi, rsi, rdx, r10, r8); > > -#ifndef NDEBUG > -/* Deliberately corrupt parameter regs not used by this hypercall. */ > -switch ( hypercall_args_table[eax].native ) > -{ > -case 0: rdi = 0xdeadbeefdeadf00dUL; fallthrough; > -case 1: rsi = 0xdeadbeefdeadf00dUL; fallthrough; > -case 2: rdx = 0xdeadbeefdeadf00dUL; fallthrough; > -case 3: r10 = 0xdeadbeefdeadf00dUL; fallthrough; > -case 4: r8 = 0xdeadbeefdeadf00dUL; > -} > -#endif > - > -regs->rax = hvm_hypercall_table[eax].native(rdi, rsi, rdx, r10, r8); > +call_handlers_hvm64(eax, regs->rax, rdi, rsi, rdx, r10, r8); > > #ifndef NDEBUG > -if ( !curr->hcall_preempted ) > -{ > -/* Deliberately corrupt parameter regs used by this hypercall. */ > -switch ( hypercall_args_table[eax].native ) > -{ > -case 5: regs->r8 = 0xdeadbeefdeadf00dUL; fallthrough; > -case 4: regs->r10 = 0xdeadbeefdeadf00dUL; fallthrough; > -case 3: regs->rdx = 0xdeadbeefdeadf00dUL; fallthrough; > -case 2: regs->rsi = 0xdeadbeefdeadf00dUL; fallthrough; > -case 1: regs->rdi = 0xdeadbeefdeadf00dUL; > -} > -} > +if ( !curr->hcall_preempted && regs->rax != -ENOSYS ) > +clobber_regs(regs, hypercall_args_64[eax]); I'm not fundamentally opposed, but sadly -ENOSYS comes back also in undue situations, e.g. various hypercalls still produce this for "unknown sub-function". Hence the weakened clobbering wants at least mentioning, perhaps also justifying, in the description. > @@ -55,4 +42,34 @@ compat_common_vcpu_op( > > #endif /* CONFIG_COMPAT */ > > +#ifndef NDEBUG Hmm, I was actuall hoping for the conditional to actually live ... > +static inline void clobber_regs(struct cpu_user_regs *regs, > +unsigned int nargs) > +{ ... here and ... > +/* Deliberately corrupt used parameter regs. */ > +switch ( nargs ) > +{ > +case 5: regs->r8 = 0xdeadbeefdeadf00dUL; fallthrough; > +case 4: regs->r10 = 0xdeadbeefdeadf00dUL; fallthrough; > +case 3: regs->rdx =
[xen-unstable test] 165699: tolerable FAIL - PUSHED
flight 165699 xen-unstable real [real] flight 165710 xen-unstable real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/165699/ http://logs.test-lab.xenproject.org/osstest/logs/165710/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install fail pass in 165710-retest Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail REGR. vs. 165692 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 165692 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 165692 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165692 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165692 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 165692 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 165692 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 165692 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 165692 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 165692 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 165692 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165692 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 165692 test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-check
Re: [XEN PATCH v7 42/51] build: grab common EFI source files in arch specific dir
On 21.10.2021 15:54, Anthony PERARD wrote: > On Thu, Oct 21, 2021 at 01:24:27PM +0200, Jan Beulich wrote: >> On 21.10.2021 13:03, Anthony PERARD wrote: >>> On Mon, Oct 18, 2021 at 10:48:26AM +0200, Jan Beulich wrote: On 15.10.2021 18:29, Anthony PERARD wrote: > On Thu, Oct 14, 2021 at 10:51:44AM +0200, Jan Beulich wrote: >> On 24.08.2021 12:50, Anthony PERARD wrote: >>> obj-y += boot.init.o pe.init.o ebmalloc.o runtime.o >>> obj-$(CONFIG_ACPI) += efi-dom0.init.o >>> + >>> +$(obj)/%.c: common/efi/%.c >>> + $(Q)cp -f $< $@ >> >> In case both trees are on the same file system, trying to hardlink first >> would seem desirable. When copying, I think you should also pass -p. > > I don't know if doing an hardlink is a good thing to do, I'm not sure of > the kind of issue this could bring. As for -p, I don't think it's a good > idea to copy the mode, ownership, and timestamps of the source file, I'd > rather have the timestamps that Make expect, e.i. "now". Why would "now" be correct (or expected) in any way? The cloned file is no different from the original. Nevertheless I agree that -p is not ideal; it's just that the more fine grained option to preserve just the timestamp is non-standard afaik. You could try that first and fall back to -p ... Otherwise, failing hard linking and using "cp -p", I'm afraid I'd prefer symlinking despite the arguments against it that you name in the description. >>> >>> I guess I'm missing something, is there a reason to keep/copy the >>> timestamps of the original files? >> >> Avoidance of confusion is my main aim here. I certainly would be puzzled >> to see what looks like a source file to have a time stamp much newer than >> expected. > > So, there isn't really anything to do with the timestamps :-). I guess > we could keep using symbolic links, but force update the link at every > build. > > I've tried that: > $(obj)/%.c: $(abs_srctree)/common/efi/%.c FORCE > $(Q)ln -nsf $< $@ > > and make seems happy. The link command run every time (due to adding > FORCE), but the `CC` command isn't, so that seems good. The recipe that > would run the `CC` command check if the prerequisite are newer than the > target using $? so it doesn't matter if the rule that update the source > file as run or not. Looks okay to me. One additional consideration, though: Linux puts a "source" link in the build tree. If we did so as well, then that could be the only absolute symlink that's needed. Links like the one you suggest could be relative ones into source/. But I guess this could as well be left as a future exercise, in case anyone cares to limit the number of absolute symlinks. Jan
Re: [XEN PATCH v7 49/51] build: adding out-of-tree support to the xen build
On Mon, Oct 18, 2021 at 12:36:45PM +0200, Jan Beulich wrote: > On 18.10.2021 12:28, Juergen Gross wrote: > > On 18.10.21 11:51, Anthony PERARD wrote: > >> On Mon, Oct 18, 2021 at 11:02:20AM +0200, Jan Beulich wrote: > >>> Oh, now I'm curious as to the why here. I thought use of $(srctree) > >>> ought to be the exception. > >> > >> In Linux, the use of $(srctree) is indeed the exception. This is because > >> we have VPATH=$(srctree), so when `make` look for a prerequisite or a > >> target it will look first in the current directory and then in > >> $(srctree). That works fine as long as the source tree only have sources > >> and no built files. > >> > >> But if we want to be able to build the pv-shim without the linkfarm and > >> thus using out-of-tree build, we are going to need the ability to build > >> from a non-clean source tree. I don't think another way is possible. > > > > Is there any reason (apart from historical ones) to build the hypervisor > > in $(srctree)? > > > > I could see several advantages to build it in another directory as soon > > as the build system has this capability: > > > > - possibility to have a simple build target for building multiple archs > >(assuming the cross-tools are available), leading to probably less > >problems with breaking the build of "the other" architecture we are > >normally not working with (and in future with e.g. Risc-V being added > >this will be even more important) > > > > - possibility to have a debug and a non-debug build in parallel (in fact > >at least at SUSE we are working around that by building those with an > >intermediate "make clean" for being able to package both variants) > > > > - make clean for the hypervisor part would be just a "rm -r" > > I fully agree, yet ... > > > Yes, this would require us (the developers) to maybe change some habits, > > but I think this would be better than working around the issues by > > adding $(srctree) all over the build system. > > ... developers' habits would only be my second concern here (and if that > had been the only one, then I would not see this as a reason speaking > against the change, but as said I've never been building from the root, > and I've also been building sort of out-of-tree all the time). Yet while > writing this reply I came to realize that my primary concern was wrong: > People would not need to adjust their spec files (or alike), at least > not as long as they consume only files living under dist/. > > So, Anthony - thoughts about making the default in-tree Xen build > actually build into, say, build/xen/? I don't think I should be the one answering this question. It would probably be fairly easy to do. But, I've already got some unfinished patches which add the ability to do in-tree and out-of-tree at the same time, so I'll propose them and we can decide whether they are worth it or if they are too much. That is, after we have at least the ability to do out-of-tree build. Cheers, -- Anthony PERARD
[xen-unstable-smoke test] 165708: tolerable all pass - PUSHED
flight 165708 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/165708/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 98f60e5de00baf650c574c9352bb19aedb082dea baseline version: xen 118da371d1ff5d8432fa299544b1ea5e7e3710f0 Last test of basis 165691 2021-10-20 16:01:40 Z0 days Testing same since 165708 2021-10-21 11:01:40 Z0 days1 attempts People who touched revisions under test: Julien Grall jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 118da371d1..98f60e5de0 98f60e5de00baf650c574c9352bb19aedb082dea -> smoke
Re: [XEN PATCH v7 42/51] build: grab common EFI source files in arch specific dir
On Thu, Oct 21, 2021 at 01:24:27PM +0200, Jan Beulich wrote: > On 21.10.2021 13:03, Anthony PERARD wrote: > > On Mon, Oct 18, 2021 at 10:48:26AM +0200, Jan Beulich wrote: > >> On 15.10.2021 18:29, Anthony PERARD wrote: > >>> On Thu, Oct 14, 2021 at 10:51:44AM +0200, Jan Beulich wrote: > On 24.08.2021 12:50, Anthony PERARD wrote: > > obj-y += boot.init.o pe.init.o ebmalloc.o runtime.o > > obj-$(CONFIG_ACPI) += efi-dom0.init.o > > + > > +$(obj)/%.c: common/efi/%.c > > + $(Q)cp -f $< $@ > > In case both trees are on the same file system, trying to hardlink first > would seem desirable. When copying, I think you should also pass -p. > >>> > >>> I don't know if doing an hardlink is a good thing to do, I'm not sure of > >>> the kind of issue this could bring. As for -p, I don't think it's a good > >>> idea to copy the mode, ownership, and timestamps of the source file, I'd > >>> rather have the timestamps that Make expect, e.i. "now". > >> > >> Why would "now" be correct (or expected) in any way? The cloned file is no > >> different from the original. Nevertheless I agree that -p is not ideal; > >> it's just that the more fine grained option to preserve just the timestamp > >> is non-standard afaik. You could try that first and fall back to -p ... > >> Otherwise, failing hard linking and using "cp -p", I'm afraid I'd prefer > >> symlinking despite the arguments against it that you name in the > >> description. > > > > I guess I'm missing something, is there a reason to keep/copy the > > timestamps of the original files? > > Avoidance of confusion is my main aim here. I certainly would be puzzled > to see what looks like a source file to have a time stamp much newer than > expected. So, there isn't really anything to do with the timestamps :-). I guess we could keep using symbolic links, but force update the link at every build. I've tried that: $(obj)/%.c: $(abs_srctree)/common/efi/%.c FORCE $(Q)ln -nsf $< $@ and make seems happy. The link command run every time (due to adding FORCE), but the `CC` command isn't, so that seems good. The recipe that would run the `CC` command check if the prerequisite are newer than the target using $? so it doesn't matter if the rule that update the source file as run or not. Thanks, -- Anthony PERARD
Re: Tentative fix for "out of PoD memory" issue
On Thu, Oct 21, 2021 at 01:53:06PM +0200, Juergen Gross wrote: > Marek, > > could you please test whether the attached patch is fixing your > problem? Sure. In fact, I made a similar patch in the meantime (attached) to experiment with this a bit. > BTW, I don't think this couldn't happen before kernel 5.15. I guess > my modification to use a kernel thread instead of a workqueue just > made the issue more probable. I think you are right here. But this all looks still a bit weird. 1. baseline: 5.10.61 (before using kernel thread - which was backported to stable branches). Here the startup completes successfully (no "out of PoD memory" issue) with memory=270MB. 2. 5.10.61 with added boot delay patch: The delay is about 18s and the guest boot successfully. 3. 5.10.71 with "xen/balloon: fix cancelled balloon action" but without delay patch: The domain is killed during startup (in the middle of fsck, I think) 4. 5.10.74 with delay patch: The delay is about 19s and the guest boot successfully. Now the weird part: with memory=270MB with the delay patch, the balloon delay _fails_ - state=BP_ECANCELED, and credit is -19712 at that time. In both thread and workqueue balloon variants. Yet, it isn't killed (*). But with 5.10.61, even without the delay patch, the guest starts successfully in the end. Also, I think there was some implicit wait for initial balloon down before. That was the main motivation for 197ecb3802c0 "xen/balloon: add runtime control for scrubbing ballooned out pages" - because that initial balloon down held the system startup for some long time. Sadly, I can't find my notes from debugging that (especially if I had written down a stacktrace _where_ exactly it was waiting)... > I couldn't reproduce the crash you are seeing, but the introduced > wait was 4.2 seconds on my test system (a PVH guest with 2 GB of > memory, maxmem 6 GB). I'm testing it on a much more aggressive setting: - memory: 270 MB (the minimal that is sufficient to boot the system) - maxmem: 4 GB The default settings in Qubes are: - memory: 400 MB - maxmem: 4 GB That should explains why it happens on Qubes way more often than elsewhere. (*) At some point during system boot, qubes memory manager kicks in and the VM gets more memory. But it's rather late, and definitely after it is killed with "out of PoD memory" in other cases. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab From 947818c731094a952d4955e99a23ef336daf7ab9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?= Date: Thu, 21 Oct 2021 01:10:21 +0200 Subject: [PATCH] WIP: xen/balloon: wait for initial balloon down before starting userspace MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Organization: Invisible Things Lab Cc: Marek Marczykowski-Górecki When HVM/PVH guest with maxmem > memory, a populate on demand feature is used. This allows the guest to see up to 'maxmem' memory, but when it tries to use more than 'memory', it is crashed. Balloon driver should prevent that by ballooning down the guest before it tries to use too much memory. Unfortunately, this was done asynchronously and it wasn't really guaranteed to be quick enough. And indeed, with recent kernel versions, the initial balloon down process is slower and guests with small initial 'memory' are crashed frequently by Xen. Fix this by adding late init call that waits for the initial balloon down to complete, before allowing any userspace to run. If that initial balloon down fails, it is very likely that guest will be killed (as soon as it will really use all the memory that something has allocated) - print a message about that to aid diagnosing issues. Signed-off-by: Marek Marczykowski-Górecki (cherry picked from commit 9a226e669c918c98ee603ee30a1798da6434a423) --- drivers/xen/balloon.c | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index b57b2067ecbf..c2a4e25a14dc 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -56,6 +56,7 @@ #include #include #include +#include #include #include @@ -136,6 +137,8 @@ static DEFINE_MUTEX(balloon_mutex); struct balloon_stats balloon_stats; EXPORT_SYMBOL_GPL(balloon_stats); +static DECLARE_COMPLETION(initial_balloon); + /* We increase/decrease in batches which fit in a page */ static xen_pfn_t frame_list[PAGE_SIZE / sizeof(xen_pfn_t)]; @@ -501,7 +504,6 @@ static void balloon_process(struct work_struct *work) enum bp_state state = BP_DONE; long credit; - do { mutex_lock(_mutex); @@ -526,6 +528,15 @@ static void balloon_process(struct work_struct *work) state = update_schedule(state); + if (credit >= 0) + complete(_balloon); + else if (state == BP_ECANCELED) { + if (!completion_done(_balloon) &&
Re: xen/arm: Missing appropriate locking for the IOMMU (WAS Re: [PATCH v5 02/11] xen/arm: Add PHYSDEVOP_pci_device_(*add/remove) support for ARM)
Hi Julien, > On 21 Oct 2021, at 14:47, Julien Grall wrote: > > > > On 21/10/2021 14:15, Bertrand Marquis wrote: >> Hi Julien, > > Hi Bertand, > >>> On 21 Oct 2021, at 10:28, Julien Grall wrote: >>> >>> Hi all, >>> >>> While going through the passthrough code. I noticed that we don't have a >>> common lock for the IOMMU between the PCI and DT code. >>> >>> This is going to be an issue given it would technically be possible to add >>> a PCI device while assigning a DT. >>> >>> Rahul, Bertrand, Oleksandr, can you have a look at the issue? >> Yes we can have a look at this. >> Right now pci device add is done by dom0 so I do not think we have an issue >> in practice unless I wrongly understood something > This will depend on the XSM policy. With the default one, then yes I agree > that only dom0 (we don't support hardware domain) can add PCI device. > > However, this restriction doesn't really matter here. You would be relying on > dom0 to do the locking and AFAIK this doesn't exist. Instead, the admin would > have to ensure that two don't happen at the same time. > > Anyway, I think Xen should take care of preventing concurrent IOMMU > operations rather than relying on external subsystem (e.g. dom0) to do it. At > least the Arm SMMU driver will rely the generic locking to modify atomically > internal list. Agree, was just trying to make sure I understood the problem correctly. We will check that. Cheers Bertrand > > Cheers, > > -- > Julien Grall
Re: xen/arm: Missing appropriate locking for the IOMMU (WAS Re: [PATCH v5 02/11] xen/arm: Add PHYSDEVOP_pci_device_(*add/remove) support for ARM)
On 21/10/2021 14:15, Bertrand Marquis wrote: Hi Julien, Hi Bertand, On 21 Oct 2021, at 10:28, Julien Grall wrote: Hi all, While going through the passthrough code. I noticed that we don't have a common lock for the IOMMU between the PCI and DT code. This is going to be an issue given it would technically be possible to add a PCI device while assigning a DT. Rahul, Bertrand, Oleksandr, can you have a look at the issue? Yes we can have a look at this. Right now pci device add is done by dom0 so I do not think we have an issue in practice unless I wrongly understood something This will depend on the XSM policy. With the default one, then yes I agree that only dom0 (we don't support hardware domain) can add PCI device. However, this restriction doesn't really matter here. You would be relying on dom0 to do the locking and AFAIK this doesn't exist. Instead, the admin would have to ensure that two don't happen at the same time. Anyway, I think Xen should take care of preventing concurrent IOMMU operations rather than relying on external subsystem (e.g. dom0) to do it. At least the Arm SMMU driver will rely the generic locking to modify atomically internal list. Cheers, -- Julien Grall
Re: Ping: [PATCH] x86/xstate: reset cached register values on resume
On Mon, Oct 18, 2021 at 10:21:28AM +0200, Jan Beulich wrote: > On 24.08.2021 23:11, Andrew Cooper wrote: > > On 18/08/2021 13:44, Andrew Cooper wrote: > >> On 18/08/2021 12:30, Marek Marczykowski-Górecki wrote: > >>> set_xcr0() and set_msr_xss() use cached value to avoid setting the > >>> register to the same value over and over. But suspend/resume implicitly > >>> reset the registers and since percpu areas are not deallocated on > >>> suspend anymore, the cache gets stale. > >>> Reset the cache on resume, to ensure the next write will really hit the > >>> hardware. Choose value 0, as it will never be a legitimate write to > >>> those registers - and so, will force write (and cache update). > >>> > >>> Note the cache is used io get_xcr0() and get_msr_xss() too, but: > >>> - set_xcr0() is called few lines below in xstate_init(), so it will > >>> update the cache with appropriate value > >>> - get_msr_xss() is not used anywhere - and thus not before any > >>> set_msr_xss() that will fill the cache > >>> > >>> Fixes: aca2a985a55a "xen: don't free percpu areas during suspend" > >>> Signed-off-by: Marek Marczykowski-Górecki > >>> > >> I'd prefer to do this differently. As I said in the thread, there are > >> other registers such as MSR_TSC_AUX which fall into the same category, > >> and I'd like to make something which works systematically. > > > > Ok - after some searching, I think we have problems with: > > > > cpu/common.c:47:DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks); > > cpu/common.c:120:static DEFINE_PER_CPU(uint64_t, msr_misc_features); > > msr.c:35:DEFINE_PER_CPU(uint32_t, tsc_aux); > > xstate.c:36:static DEFINE_PER_CPU(uint64_t, xcr0); > > xstate.c:79:static DEFINE_PER_CPU(uint64_t, xss); > > > > There is also: > > > > traps.c:100:DEFINE_PER_CPU(uint64_t, efer); > > > > which we *almost* handle correctly, but fail to update the cache on the > > BSP out of S3. > > > > > > For the APIC, I think we have issues with: > > > > irq.c:1083:static DEFINE_PER_CPU(struct pending_eoi, > > pending_eoi[NR_DYNAMIC_VECTORS]); > > > > because we don't defer S3 until all pending EOIs are complete. > > As your planned more extensive rework appears to not have made much > progress yet, may I suggest that we go with Marek's fix for 4.16, > with the one adjustment I suggested alongside giving my R-b? I think that's the only viable solution in order to avoid shipping a broken 4.16 so we should go ahead with it. Thanks, Roger.
Re: xen/arm: Missing appropriate locking for the IOMMU (WAS Re: [PATCH v5 02/11] xen/arm: Add PHYSDEVOP_pci_device_(*add/remove) support for ARM)
Hi Julien, > On 21 Oct 2021, at 10:28, Julien Grall wrote: > > Hi all, > > While going through the passthrough code. I noticed that we don't have a > common lock for the IOMMU between the PCI and DT code. > > This is going to be an issue given it would technically be possible to add a > PCI device while assigning a DT. > > Rahul, Bertrand, Oleksandr, can you have a look at the issue? Yes we can have a look at this. Right now pci device add is done by dom0 so I do not think we have an issue in practice unless I wrongly understood something. But for sure in theory yes we need to look at this. Cheers Bertrand > > Cheers, > > On 06/10/2021 18:40, Rahul Singh wrote: >> Hardware domain is in charge of doing the PCI enumeration and will >> discover the PCI devices and then will communicate to XEN via hyper >> call PHYSDEVOP_pci_device_add(..) to add the PCI devices in XEN. >> Also implement PHYSDEVOP_pci_device_remove(..) to remove the PCI device. >> As most of the code for PHYSDEVOP_pci_device_* is the same between x86 >> and ARM, move the code to a common file to avoid duplication. >> There are other PHYSDEVOP_pci_device_* operations to add PCI devices. >> Currently implemented PHYSDEVOP_pci_device_remove(..) and >> PHYSDEVOP_pci_device_add(..) only as those are minimum required to >> support PCI passthrough on ARM. >> Signed-off-by: Rahul Singh >> --- >> Change in v5: >> - Move the pci_physdev_op() stub to xen/arch/arm/physdev.c. >> Change in v4: >> - Move file commom/physdev.c to drivers/pci/physdev.c >> - minor comments. >> Change in v3: Fixed minor comment. >> Change in v2: >> - Add support for PHYSDEVOP_pci_device_remove() >> - Move code to common code >> --- >> --- >> xen/arch/arm/physdev.c| 6 ++- >> xen/arch/x86/physdev.c| 52 +-- >> xen/arch/x86/x86_64/physdev.c | 2 +- >> xen/drivers/pci/Makefile | 1 + >> xen/drivers/pci/physdev.c | 80 +++ >> xen/include/public/arch-arm.h | 4 +- >> xen/include/xen/hypercall.h | 4 ++ >> 7 files changed, 96 insertions(+), 53 deletions(-) >> create mode 100644 xen/drivers/pci/physdev.c >> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c >> index e91355fe22..f9aa274dda 100644 >> --- a/xen/arch/arm/physdev.c >> +++ b/xen/arch/arm/physdev.c >> @@ -8,13 +8,17 @@ >> #include >> #include >> #include >> -#include >> +#include >> int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >> { >> +#ifdef CONFIG_HAS_PCI >> +return pci_physdev_op(cmd, arg); >> +#else >> gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd); >> return -ENOSYS; >> +#endif >> } >>/* >> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c >> index 23465bcd00..ea38be8b79 100644 >> --- a/xen/arch/x86/physdev.c >> +++ b/xen/arch/x86/physdev.c >> @@ -12,7 +12,7 @@ >> #include >> #include >> #include >> -#include >> +#include >> #include >> #include >> #include >> @@ -480,54 +480,6 @@ ret_t do_physdev_op(int cmd, >> XEN_GUEST_HANDLE_PARAM(void) arg) >> break; >> } >> -case PHYSDEVOP_pci_device_add: { >> -struct physdev_pci_device_add add; >> -struct pci_dev_info pdev_info; >> -nodeid_t node; >> - >> -ret = -EFAULT; >> -if ( copy_from_guest(, arg, 1) != 0 ) >> -break; >> - >> -pdev_info.is_extfn = !!(add.flags & XEN_PCI_DEV_EXTFN); >> -if ( add.flags & XEN_PCI_DEV_VIRTFN ) >> -{ >> -pdev_info.is_virtfn = 1; >> -pdev_info.physfn.bus = add.physfn.bus; >> -pdev_info.physfn.devfn = add.physfn.devfn; >> -} >> -else >> -pdev_info.is_virtfn = 0; >> - >> -if ( add.flags & XEN_PCI_DEV_PXM ) >> -{ >> -uint32_t pxm; >> -size_t optarr_off = offsetof(struct physdev_pci_device_add, >> optarr) / >> -sizeof(add.optarr[0]); >> - >> -if ( copy_from_guest_offset(, arg, optarr_off, 1) ) >> -break; >> - >> -node = pxm_to_node(pxm); >> -} >> -else >> -node = NUMA_NO_NODE; >> - >> -ret = pci_add_device(add.seg, add.bus, add.devfn, _info, node); >> -break; >> -} >> - >> -case PHYSDEVOP_pci_device_remove: { >> -struct physdev_pci_device dev; >> - >> -ret = -EFAULT; >> -if ( copy_from_guest(, arg, 1) != 0 ) >> -break; >> - >> -ret = pci_remove_device(dev.seg, dev.bus, dev.devfn); >> -break; >> -} >> - >> case PHYSDEVOP_prepare_msix: >> case PHYSDEVOP_release_msix: { >> struct physdev_pci_device dev; >> @@ -663,7 +615,7 @@ ret_t do_physdev_op(int cmd, >> XEN_GUEST_HANDLE_PARAM(void) arg) >> } >>default: >> -ret = -ENOSYS; >> +ret = pci_physdev_op(cmd, arg); >> break; >> } >> diff --git
[PATCH net-next v2 01/12] net: xen: use eth_hw_addr_set()
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski --- CC: wei@kernel.org CC: p...@xen.org CC: boris.ostrov...@oracle.com CC: jgr...@suse.com CC: sstabell...@kernel.org CC: xen-devel@lists.xenproject.org --- drivers/net/xen-netback/interface.c | 6 -- drivers/net/xen-netfront.c | 4 +++- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index c58996c1e230..fe8e21ad8ed9 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -494,6 +494,9 @@ static const struct net_device_ops xenvif_netdev_ops = { struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, unsigned int handle) { + static const u8 dummy_addr[ETH_ALEN] = { + 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, + }; int err; struct net_device *dev; struct xenvif *vif; @@ -551,8 +554,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid, * stolen by an Ethernet bridge for STP purposes. * (FE:FF:FF:FF:FF:FF) */ - eth_broadcast_addr(dev->dev_addr); - dev->dev_addr[0] &= ~0x01; + eth_hw_addr_set(dev, dummy_addr); netif_carrier_off(dev); diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index e31b98403f31..57437e4b8a94 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -2157,6 +2157,7 @@ static int talk_to_netback(struct xenbus_device *dev, unsigned int max_queues = 0; struct netfront_queue *queue = NULL; unsigned int num_queues = 1; + u8 addr[ETH_ALEN]; info->netdev->irq = 0; @@ -2170,11 +2171,12 @@ static int talk_to_netback(struct xenbus_device *dev, "feature-split-event-channels", 0); /* Read mac addr. */ - err = xen_net_read_mac(dev, info->netdev->dev_addr); + err = xen_net_read_mac(dev, addr); if (err) { xenbus_dev_fatal(dev, err, "parsing %s/mac", dev->nodename); goto out_unlocked; } + eth_hw_addr_set(info->netdev, addr); info->netback_has_xdp_headroom = xenbus_read_unsigned(info->xbdev->otherend, "feature-xdp-headroom", 0); -- 2.31.1
[PATCH v4] xen/arm: vgic: Ignore write access to ICPENDR*
Currently, Xen will return IO unhandled when guests write ICPENDR* virtual registers, which will raise a data abort inside the guest. For Linux guest, these virtual registers will not be accessed. But for Zephyr, these virtual registers will be accessed during the initialization. Zephyr guest will get an IO data abort and crash. Emulating ICPENDR is not easy with the existing vGIC, this patch reworks the emulation to ignore write access to ICPENDR* virtual registers and print a message about whether they are already pending instead of returning unhandled. More details can be found at [1]. [1] https://github.com/zephyrproject-rtos/zephyr/blob/eaf6cf745df3807e6e cc941c3a30de6c179ae359/drivers/interrupt_controller/intc_gicv3.c#L274 Signed-off-by: Hongda Deng --- Changes since v3: * Commit message modification * Change "goto write_ignore_32" to "goto write_ignore" to avoid double checking dabt.size * Delete data.size check in vgic_v3_rdistr_sgi_mmio_write to avoid double check in __vgic_v3_distr_common_mmio_write for SGI * Declare flags, p, v_target within the loop to reduce their scope * Use the same vgic_get_target_vcpu(v, irq) to get v_target for SPI, PPI and SGI * Code principle modification Changes since v2: * Avoid to print messages when there is no pending interrupt * Add helper vgic_check_inflight_irqs_pending to check pending status * Print a message for each interrupt separately Changes since v1: * Check pending states by going through vcpu->arch.vgic.inflight_irqs instead of checking hardware registers --- xen/arch/arm/vgic-v2.c | 10 ++ xen/arch/arm/vgic-v3.c | 17 - xen/arch/arm/vgic.c| 28 xen/include/asm-arm/vgic.h | 2 ++ 4 files changed, 44 insertions(+), 13 deletions(-) diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c index b2da886adc..589c033eda 100644 --- a/xen/arch/arm/vgic-v2.c +++ b/xen/arch/arm/vgic-v2.c @@ -481,10 +481,12 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info, case VRANGE32(GICD_ICPENDR, GICD_ICPENDRN): if ( dabt.size != DABT_WORD ) goto bad_width; -printk(XENLOG_G_ERR - "%pv: vGICD: unhandled word write %#"PRIregister" to ICPENDR%d\n", - v, r, gicd_reg - GICD_ICPENDR); -return 0; +rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICPENDR, DABT_WORD); +if ( rank == NULL ) goto write_ignore; + +vgic_check_inflight_irqs_pending(v->domain, v, rank->index, r); + +goto write_ignore; case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): if ( dabt.size != DABT_WORD ) goto bad_width; diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index cb5a70c42e..65bb7991a6 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -817,10 +817,12 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v, case VRANGE32(GICD_ICPENDR, GICD_ICPENDRN): if ( dabt.size != DABT_WORD ) goto bad_width; -printk(XENLOG_G_ERR - "%pv: %s: unhandled word write %#"PRIregister" to ICPENDR%d\n", - v, name, r, reg - GICD_ICPENDR); -return 0; +rank = vgic_rank_offset(v, 1, reg - GICD_ICPENDR, DABT_WORD); +if ( rank == NULL ) goto write_ignore; + +vgic_check_inflight_irqs_pending(v->domain, v, rank->index, r); + +goto write_ignore; case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): if ( dabt.size != DABT_WORD ) goto bad_width; @@ -986,11 +988,8 @@ static int vgic_v3_rdistr_sgi_mmio_write(struct vcpu *v, mmio_info_t *info, info, gicr_reg, r); case VREG32(GICR_ICPENDR0): -if ( dabt.size != DABT_WORD ) goto bad_width; -printk(XENLOG_G_ERR - "%pv: vGICR: SGI: unhandled word write %#"PRIregister" to ICPENDR0\n", - v, r); -return 0; +return __vgic_v3_distr_common_mmio_write("vGICR: SGI", v, + info, gicr_reg, r); case VREG32(GICR_IGRPMODR0): /* We do not implement security extensions for guests, write ignore */ diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 8f9400a519..83386cf3d5 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -726,6 +726,34 @@ unsigned int vgic_max_vcpus(unsigned int domctl_vgic_version) } } +void vgic_check_inflight_irqs_pending(struct domain *d, struct vcpu *v, + unsigned int rank, uint32_t r) +{ +const unsigned long mask = r; +unsigned int i; + +for_each_set_bit( i, , 32 ) +{ +struct pending_irq *p; +struct vcpu *v_target; +unsigned long flags; +unsigned int irq = i + 32 * rank; + +v_target = vgic_get_target_vcpu(v, irq); + +spin_lock_irqsave(_target->arch.vgic.lock, flags); + +p =
Tentative fix for "out of PoD memory" issue
Marek, could you please test whether the attached patch is fixing your problem? BTW, I don't think this couldn't happen before kernel 5.15. I guess my modification to use a kernel thread instead of a workqueue just made the issue more probable. I couldn't reproduce the crash you are seeing, but the introduced wait was 4.2 seconds on my test system (a PVH guest with 2 GB of memory, maxmem 6 GB). Juergen From 3ee35f6f110e2258ec94f0d1397fac8c26b41761 Mon Sep 17 00:00:00 2001 From: Juergen Gross To: linux-ker...@vger.kernel.org Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: xen-devel@lists.xenproject.org Date: Thu, 21 Oct 2021 12:51:06 +0200 Subject: [PATCH] xen/balloon: add late_initcall_sync() for initial ballooning done MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When running as PVH or HVM guest with actual memory < max memory the hypervisor is using "populate on demand" in order to allow the guest to balloon down from its maximum memory size. For this to work correctly the guest must not touch more memory pages than its target memory size as otherwise the PoD cache will be exhausted and the guest is crashed as a result of that. In extreme cases ballooning down might not be finished today before the init process is started, which can consume lots of memory. In order to avoid random boot crashes in such cases, add a late init call to wait for ballooning down having finished for PVH/HVM guests. Reported-by: Marek Marczykowski-Górecki Signed-off-by: Juergen Gross --- drivers/xen/balloon.c | 20 1 file changed, 20 insertions(+) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 3a50f097ed3e..d19b851c3d3b 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -765,3 +765,23 @@ static int __init balloon_init(void) return 0; } subsys_initcall(balloon_init); + +static int __init balloon_wait_finish(void) +{ + if (!xen_domain()) + return -ENODEV; + + /* PV guests don't need to wait. */ + if (xen_pv_domain() || !current_credit()) + return 0; + + pr_info("Waiting for initial ballooning down having finished.\n"); + + while (current_credit()) + schedule_timeout_interruptible(HZ / 10); + + pr_info("Initial ballooning down finished.\n"); + + return 0; +} +late_initcall_sync(balloon_wait_finish); -- 2.26.2 OpenPGP_0xB0DE9DD628BF132F.asc Description: OpenPGP public key OpenPGP_signature Description: OpenPGP digital signature
[libvirt test] 165702: regressions - FAIL
flight 165702 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/165702/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-amd64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777 build-arm64-libvirt 6 libvirt-buildfail REGR. vs. 151777 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a version targeted for testing: libvirt 2772162316dc4938cd60d19e14dabe4371188c05 baseline version: libvirt 2c846fa6bcc11929c9fb857a22430fb9945654ad Last test of basis 151777 2020-07-10 04:19:19 Z 468 days Failing since151818 2020-07-11 04:18:52 Z 467 days 453 attempts Testing same since 165702 2021-10-21 04:18:54 Z0 days1 attempts People who touched revisions under test: Adolfo Jayme Barrientos Aleksandr Alekseev Aleksei Zakharov Andika Triwidada Andrea Bolognani Ani Sinha Balázs Meskó Barrett Schonefeld Bastian Germann Bastien Orivel BiaoXiang Ye Bihong Yu Binfeng Wu Bjoern Walk Boris Fiuczynski Brian Turek Bruno Haible Chris Mayo Christian Borntraeger Christian Ehrhardt Christian Kirbach Christian Schoenebeck Cole Robinson Collin Walling Cornelia Huck Cédric Bosdonnat Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé Didik Supriadi dinglimin Dmytro Linkin Eiichi Tsukata Eric Farman Erik Skultety Fabian Affolter Fabian Freyer Fabiano Fidêncio Fangge Jin Farhan Ali Fedora Weblate Translation Gavi Teitz gongwei Guoyi Tu Göran Uddeborg Halil Pasic Han Han Hao Wang Hela Basa Helmut Grohne Hiroki Narukawa Ian Wienand Jakob Meng Jamie Strandboge Jamie Strandboge Jan Kuparinen jason lee Jean-Baptiste Holcroft Jia Zhou Jianan Gao Jim Fehlig Jin Yan Jinsheng Zhang Jiri Denemark John Ferlan Jonathan Watt Jonathon Jongsma Julio Faracco Justin Gatzen Ján Tomko Kashyap Chamarthy Kevin Locke Kristina Hanicova Laine Stump Laszlo Ersek Lee Yarwood Lei Yang Liao Pingfang Lin Ma Lin Ma Lin Ma Liu Yiding Luke Yue Luyao Zhong Marc Hartmayer Marc-André Lureau Marek Marczykowski-Górecki Markus Schade Martin Kletzander Masayoshi Mizuma Matej Cepl Matt Coleman Matt Coleman Mauro Matteo Cascella Meina Li Michal Privoznik Michał Smyk Milo Casagrande Moshe Levi Muha Aliss Nathan Neal Gompa Nick Chevsky Nick Shyrokovskiy Nickys Music Group Nico Pache Nikolay Shirokovskiy Olaf Hering Olesya Gerasimenko Orion Poplawski Pany Patrick Magauran Paulo de Rezende Pinatti Pavel Hrdina Peng Liang Peter Krempa Pino Toscano Pino Toscano Piotr Drąg Prathamesh Chavan Praveen K Paladugu Richard W.M. Jones Ricky Tigg Robin Lee Roman Bogorodskiy Roman Bolshakov Ryan Gahagan Ryan Schmidt Sam Hartman Scott Shambarger Sebastian Mitterle SeongHyun Jo Shalini Chellathurai Saroja Shaojun Yang Shi Lei simmon Simon Chopin Simon Gaiser Simon Rowe Stefan Bader Stefan Berger Stefan Berger Stefan Hajnoczi Stefan Hajnoczi Szymon Scholz Thomas Huth Tim Wiederhake Tomáš Golembiovský Tomáš Janoušek Tuguoyi Victor Toso Ville Skyttä
Re: [XEN PATCH v7 42/51] build: grab common EFI source files in arch specific dir
On 21.10.2021 13:03, Anthony PERARD wrote: > On Mon, Oct 18, 2021 at 10:48:26AM +0200, Jan Beulich wrote: >> On 15.10.2021 18:29, Anthony PERARD wrote: >>> On Thu, Oct 14, 2021 at 10:51:44AM +0200, Jan Beulich wrote: On 24.08.2021 12:50, Anthony PERARD wrote: > --- a/xen/arch/arm/efi/Makefile > +++ b/xen/arch/arm/efi/Makefile > @@ -1,4 +1,10 @@ > CFLAGS-y += -fshort-wchar > +CFLAGS-y += -I$(srctree)/common/efi Perhaps another opportunity for -iquote? >>> >>> Yes. >>> > obj-y += boot.init.o pe.init.o ebmalloc.o runtime.o > obj-$(CONFIG_ACPI) += efi-dom0.init.o > + > +$(obj)/%.c: common/efi/%.c > + $(Q)cp -f $< $@ In case both trees are on the same file system, trying to hardlink first would seem desirable. When copying, I think you should also pass -p. >>> >>> I don't know if doing an hardlink is a good thing to do, I'm not sure of >>> the kind of issue this could bring. As for -p, I don't think it's a good >>> idea to copy the mode, ownership, and timestamps of the source file, I'd >>> rather have the timestamps that Make expect, e.i. "now". >> >> Why would "now" be correct (or expected) in any way? The cloned file is no >> different from the original. Nevertheless I agree that -p is not ideal; >> it's just that the more fine grained option to preserve just the timestamp >> is non-standard afaik. You could try that first and fall back to -p ... >> Otherwise, failing hard linking and using "cp -p", I'm afraid I'd prefer >> symlinking despite the arguments against it that you name in the >> description. > > I guess I'm missing something, is there a reason to keep/copy the > timestamps of the original files? Avoidance of confusion is my main aim here. I certainly would be puzzled to see what looks like a source file to have a time stamp much newer than expected. Jan
Re: [XEN PATCH v7 42/51] build: grab common EFI source files in arch specific dir
On Mon, Oct 18, 2021 at 10:48:26AM +0200, Jan Beulich wrote: > On 15.10.2021 18:29, Anthony PERARD wrote: > > On Thu, Oct 14, 2021 at 10:51:44AM +0200, Jan Beulich wrote: > >> On 24.08.2021 12:50, Anthony PERARD wrote: > >>> --- a/xen/arch/arm/efi/Makefile > >>> +++ b/xen/arch/arm/efi/Makefile > >>> @@ -1,4 +1,10 @@ > >>> CFLAGS-y += -fshort-wchar > >>> +CFLAGS-y += -I$(srctree)/common/efi > >> > >> Perhaps another opportunity for -iquote? > > > > Yes. > > > >>> obj-y += boot.init.o pe.init.o ebmalloc.o runtime.o > >>> obj-$(CONFIG_ACPI) += efi-dom0.init.o > >>> + > >>> +$(obj)/%.c: common/efi/%.c > >>> + $(Q)cp -f $< $@ > >> > >> In case both trees are on the same file system, trying to hardlink first > >> would seem desirable. When copying, I think you should also pass -p. > > > > I don't know if doing an hardlink is a good thing to do, I'm not sure of > > the kind of issue this could bring. As for -p, I don't think it's a good > > idea to copy the mode, ownership, and timestamps of the source file, I'd > > rather have the timestamps that Make expect, e.i. "now". > > Why would "now" be correct (or expected) in any way? The cloned file is no > different from the original. Nevertheless I agree that -p is not ideal; > it's just that the more fine grained option to preserve just the timestamp > is non-standard afaik. You could try that first and fall back to -p ... > Otherwise, failing hard linking and using "cp -p", I'm afraid I'd prefer > symlinking despite the arguments against it that you name in the > description. I guess I'm missing something, is there a reason to keep/copy the timestamps of the original files? > Might be good to have someone else's view here as well. Indeed. -- Anthony PERARD
Re: [PATCH v4 1/6] x86/PVH: improve Dom0 memory size calculation
On 29.09.2021 15:13, Jan Beulich wrote: > @@ -337,53 +336,65 @@ unsigned long __init dom0_compute_nr_pag > avail -= d->max_vcpus - 1; > > /* Reserve memory for iommu_dom0_init() (rough estimate). */ > -if ( is_iommu_enabled(d) ) > +if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough ) > { > unsigned int s; > > for ( s = 9; s < BITS_PER_LONG; s += 9 ) > -avail -= max_pdx >> s; > +iommu_pages += max_pdx >> s; > + > +avail -= iommu_pages; > +} > + > +nr_pages = get_memsize(_size, avail); > + > +/* > + * If allocation isn't specified, reserve 1/16th of available memory for > + * things like DMA buffers. This reservation is clamped to a maximum of > + * 128MB. > + */ > +if ( !nr_pages ) > +{ > +nr_pages = avail - (pv_shim ? pv_shim_mem(avail) > +: min(avail / 16, 128UL << (20 - PAGE_SHIFT))); > +if ( paging_mode_enabled(d) ) > +/* > + * Temporary workaround message until internal (paging) memory > + * accounting required to build a pvh dom0 is improved. > + */ > +printk("WARNING: PVH dom0 without dom0_mem set is still > unstable. " > + "If you get crashes during boot, try adding a dom0_mem > parameter\n"); > } > > -need_paging = is_hvm_domain(d) && > -(!iommu_use_hap_pt(d) || !paging_mode_hap(d)); > -for ( ; ; need_paging = false ) > +if ( paging_mode_enabled(d) || opt_dom0_shadow ) > { > -nr_pages = get_memsize(_size, avail); > -min_pages = get_memsize(_min_size, avail); > -max_pages = get_memsize(_max_size, avail); > +unsigned long cpu_pages; > > /* > - * If allocation isn't specified, reserve 1/16th of available memory > - * for things like DMA buffers. This reservation is clamped to a > - * maximum of 128MB. > + * Clamp according to min/max limits and available memory > + * (preliminary). > */ > -if ( !nr_pages ) > -{ > -nr_pages = avail - (pv_shim ? pv_shim_mem(avail) > - : min(avail / 16, 128UL << (20 - > PAGE_SHIFT))); Just FYI that I've noticed only now that moving this only up is not enough; the same also ... > -if ( is_hvm_domain(d) && !need_paging ) > -/* > - * Temporary workaround message until internal (paging) > memory > - * accounting required to build a pvh dom0 is improved. > - */ > -printk("WARNING: PVH dom0 without dom0_mem set is still > unstable. " > - "If you get crashes during boot, try adding a > dom0_mem parameter\n"); > -} > - > - > -/* Clamp according to min/max limits and available memory. */ > -nr_pages = max(nr_pages, min_pages); > -nr_pages = min(nr_pages, max_pages); > +nr_pages = max(nr_pages, get_memsize(_min_size, avail)); > +nr_pages = min(nr_pages, get_memsize(_max_size, avail)); > nr_pages = min(nr_pages, avail); > > -if ( !need_paging ) > -break; > +cpu_pages = dom0_paging_pages(d, nr_pages); > > -/* Reserve memory for shadow or HAP. */ > -avail -= dom0_paging_pages(d, nr_pages); > +if ( !iommu_use_hap_pt(d) ) > +avail -= cpu_pages; > +else if ( cpu_pages > iommu_pages ) > +avail -= cpu_pages - iommu_pages; > } > > +nr_pages = get_memsize(_size, avail); ... is needed here, or else things won't work e.g. without any "dom0_mem=". I'll introduce a helper function ... Jan
Re: [PATCH for-4.16] tools/xenstored: Ignore domain we were unable to restore
Thanks everyone, committed. Ian.
[PATCH v2 3/3] AMD/IOMMU: iommu_enable vs iommu_intremap
The two are really meant to be independent settings; iov_supports_xt() using || instead of && was simply wrong. The corrected check is, however, redundant, just like the (correct) one in iov_detect(): These hook functions are unreachable without acpi_ivrs_init() installing the iommu_init_ops pointer, which it does only upon success. (Unlike for VT-d there is no late clearing of iommu_enable due to quirks, and any possible clearing of iommu_intremap happens only after iov_supports_xt() has run.) Signed-off-by: Jan Beulich --- In fact in iov_detect() it could be iommu_enable alone which gets checked, but this felt overly aggressive to me. Instead I'm getting the impression that the function may wrongly not get called when "iommu=off" but interrupt remapping is in use: We'd not get the interrupt handler installed, and hence interrupt remapping related events would never get reported. (Same on VT-d, FTAOD.) For iov_supports_xt() the question is whether, like VT-d's intel_iommu_supports_eim(), it shouldn't rather check iommu_intremap alone (in which case it would need to remain a check rather than getting converted to ASSERT()). --- v2: New. --- a/xen/drivers/passthrough/amd/iommu_intr.c +++ b/xen/drivers/passthrough/amd/iommu_intr.c @@ -731,8 +731,7 @@ bool __init iov_supports_xt(void) { unsigned int apic; -if ( !iommu_enable || !iommu_intremap ) -return false; +ASSERT(iommu_enable || iommu_intremap); if ( amd_iommu_prepare(true) ) return false; --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -199,8 +199,7 @@ int __init acpi_ivrs_init(void) static int __init iov_detect(void) { -if ( !iommu_enable && !iommu_intremap ) -return 0; +ASSERT(iommu_enable || iommu_intremap); if ( (init_done ? amd_iommu_init_late() : amd_iommu_init(false)) != 0 )
[PATCH v2 2/3] x86/APIC: avoid iommu_supports_x2apic() on error path
The value it returns may change from true to false in case iommu_enable_x2apic() fails and, as a side effect, clears iommu_intremap (as can happen at least on AMD). Latch the return value from the first invocation to replace the second one. Signed-off-by: Jan Beulich --- v2: New. --- a/xen/arch/x86/apic.c +++ b/xen/arch/x86/apic.c @@ -865,6 +865,7 @@ void x2apic_ap_setup(void) void __init x2apic_bsp_setup(void) { struct IO_APIC_route_entry **ioapic_entries = NULL; +bool iommu_x2apic; const char *orig_name; if ( !cpu_has_x2apic ) @@ -880,7 +881,8 @@ void __init x2apic_bsp_setup(void) printk("x2APIC: Already enabled by BIOS: Ignoring cmdline disable.\n"); } -if ( iommu_supports_x2apic() ) +iommu_x2apic = iommu_supports_x2apic(); +if ( iommu_x2apic ) { if ( (ioapic_entries = alloc_ioapic_entries()) == NULL ) { @@ -933,8 +935,11 @@ void __init x2apic_bsp_setup(void) printk("Switched to APIC driver %s\n", genapic.name); restore_out: -/* iommu_x2apic_enabled cannot be used here in the error case. */ -if ( iommu_supports_x2apic() ) +/* + * iommu_x2apic_enabled and iommu_supports_x2apic() cannot be used here + * in the error case. + */ +if ( iommu_x2apic ) { /* * NB: do not use raw mode when restoring entries if the iommu has
[PATCH v2 1/3] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC mode (physical vs clustered) depends on iommu_intremap, that variable needs to be set to off as soon as we know we can't / won't enable interrupt remapping, i.e. in particular when parsing of the respective ACPI tables failed. Move the turning off of iommu_intremap from AMD specific code into acpi_iommu_init(), accompanying it by clearing of iommu_enable. Take the opportunity and also fully skip ACPI table parsing logic on VT-d when both "iommu=off" and "iommu=no-intremap" are in effect anyway, like was already the case for AMD. The tag below only references the commit uncovering a pre-existing anomaly. Fixes: d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier") Reported-by: Andrew Cooper Signed-off-by: Jan Beulich --- While the change here deals with apic_x2apic_probe() as called from x2apic_bsp_setup(), the check_x2apic_preenabled() path looks to be similarly affected. That call occurs before acpi_boot_init(), which is what calls acpi_iommu_init(). The ordering in setup.c is in part relatively fragile, which is why for the moment I'm still hesitant to move the generic_apic_probe() call down. Plus I don't have easy access to a suitable system to test this case. Thoughts? --- v2: Treat iommu_enable and iommu_intremap as separate options. --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -183,9 +183,6 @@ int __init acpi_ivrs_init(void) { int rc; -if ( !iommu_enable && !iommu_intremap ) -return 0; - rc = amd_iommu_get_supported_ivhd_type(); if ( rc < 0 ) return rc; @@ -193,10 +190,7 @@ int __init acpi_ivrs_init(void) ivhd_type = rc; if ( (amd_iommu_detect_acpi() !=0) || (iommu_found() == 0) ) -{ -iommu_intremap = iommu_intremap_off; return -ENODEV; -} iommu_init_ops = &_iommu_init_ops; --- a/xen/drivers/passthrough/vtd/dmar.c +++ b/xen/drivers/passthrough/vtd/dmar.c @@ -777,11 +777,7 @@ static int __init acpi_parse_dmar(struct dmar = (struct acpi_table_dmar *)table; dmar_flags = dmar->flags; -if ( !iommu_enable && !iommu_intremap ) -{ -ret = -EINVAL; -goto out; -} +ASSERT(iommu_enable || iommu_intremap); if ( !dmar->width ) { --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -41,6 +41,24 @@ enum iommu_intremap __read_mostly iommu_ bool __read_mostly iommu_intpost; #endif +void __init acpi_iommu_init(void) +{ +int ret; + +if ( !iommu_enable && !iommu_intremap ) +return; + +ret = acpi_dmar_init(); +if ( ret == -ENODEV ) +ret = acpi_ivrs_init(); + +if ( ret ) +{ +iommu_enable = false; +iommu_intremap = iommu_intremap_off; +} +} + int __init iommu_hardware_setup(void) { struct IO_APIC_route_entry **ioapic_entries = NULL; --- a/xen/include/asm-x86/acpi.h +++ b/xen/include/asm-x86/acpi.h @@ -141,16 +141,10 @@ extern u32 x86_acpiid_to_apicid[]; extern u32 pmtmr_ioport; extern unsigned int pmtmr_width; +void acpi_iommu_init(void); int acpi_dmar_init(void); int acpi_ivrs_init(void); -static inline int acpi_iommu_init(void) -{ -int ret = acpi_dmar_init(); - -return ret == -ENODEV ? acpi_ivrs_init() : ret; -} - void acpi_mmcfg_init(void); /* Incremented whenever we transition through S3. Value is 1 during boot. */
[PATCH v2 0/3] x86/IOMMU: enabled / intremap handling
In the course of reading the response to v1 (patch 1 only) I realized that not only that patch needs further adjustment, but that also further changes are needed (and there's likely yet more amiss). 1: x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing 2: x86/APIC: avoid iommu_supports_x2apic() on error path 3: AMD/IOMMU: iommu_enable vs iommu_intremap Jan
Re: [XEN][RFC PATCH 08/13] xen/iommu: Introduce iommu_remove_dt_devices function
Hi Vikram, On 02/09/2021 07:05, Vikram Garhwal wrote: iommu_remove_dt_device function is introduced for supporting dynamic programming i.e. adding and removing a node during runtime. When user removes the device node, iommu_remove_dt_device() removes the device entry from smmu-masters too using following steps: 1. Find if SMMU master exists for the device node. 2. Remove the SMMU master. I would prefer if this patch is split in two: * Part 1: Add the generic helper * Part 2: Implement the callback for the SMMU Signed-off-by: Vikram Garhwal --- xen/drivers/passthrough/arm/smmu.c| 53 +++ xen/drivers/passthrough/device_tree.c | 30 xen/include/xen/iommu.h | 2 ++ 3 files changed, 85 insertions(+) diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c index c9dfc4c..7b615bc 100644 --- a/xen/drivers/passthrough/arm/smmu.c +++ b/xen/drivers/passthrough/arm/smmu.c @@ -816,6 +816,17 @@ static int insert_smmu_master(struct arm_smmu_device *smmu, return 0; } +static int remove_smmu_master(struct arm_smmu_device *smmu, + struct arm_smmu_master *master) +{ + if (!(smmu->masters.rb_node)) + return -ENOENT; + + rb_erase(>node, >masters); + + return 0; +} + static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu, struct device *dev, struct iommu_fwspec *fwspec) @@ -853,6 +864,31 @@ static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu, return insert_smmu_master(smmu, master); } +static int arm_smmu_dt_remove_device_legacy(struct arm_smmu_device *smmu, +struct device *dev) +{ + struct arm_smmu_master *master; + struct device_node *dev_node = dev_get_dev_node(dev); + int ret; + + master = find_smmu_master(smmu, dev_node); + if (master == NULL) { + dev_err(dev, + "No registrations found for master device %s\n", + dev_node->name); + return -EINVAL; + } + + ret = remove_smmu_master(smmu, master); Can you add a comment why you are not clearing the dev_node->is_protected? + + if (ret) + return ret; + + master->of_node = NULL; NIT: This is a bit pointless to do given that you are freeing master right after. + kfree(master); + return 0; +} + static int register_smmu_master(struct arm_smmu_device *smmu, struct device *dev, struct of_phandle_args *masterspec) @@ -876,6 +912,22 @@ static int register_smmu_master(struct arm_smmu_device *smmu, fwspec); } +static int arm_smmu_dt_remove_device_generic(u8 devfn, struct device *dev) +{ + struct arm_smmu_device *smmu; + struct iommu_fwspec *fwspec; + + fwspec = dev_iommu_fwspec_get(dev); + if (fwspec == NULL) + return -ENXIO; + + smmu = find_smmu(fwspec->iommu_dev); + if (smmu == NULL) + return -ENXIO; + + return arm_smmu_dt_remove_device_legacy(smmu, dev); +} + static int arm_smmu_dt_add_device_generic(u8 devfn, struct device *dev) { struct arm_smmu_device *smmu; @@ -2876,6 +2928,7 @@ static const struct iommu_ops arm_smmu_iommu_ops = { .init = arm_smmu_iommu_domain_init, .hwdom_init = arm_smmu_iommu_hwdom_init, .add_device = arm_smmu_dt_add_device_generic, +.remove_device = arm_smmu_dt_remove_device_generic, .teardown = arm_smmu_iommu_domain_teardown, .iotlb_flush = arm_smmu_iotlb_flush, .iotlb_flush_all = arm_smmu_iotlb_flush_all, diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c index 98f2aa0..37f4945 100644 --- a/xen/drivers/passthrough/device_tree.c +++ b/xen/drivers/passthrough/device_tree.c @@ -127,6 +127,36 @@ int iommu_release_dt_devices(struct domain *d) return 0; } +int iommu_remove_dt_device(struct dt_device_node *np) +{ +const struct iommu_ops *ops = iommu_get_ops(); +struct device *dev = dt_to_dev(np); +int rc = 1; You set rc to 1 but AFAICT the value is never used. + +if ( !ops ) +return -EINVAL; It would be better to return -EOPNOSUPP. + +if ( iommu_dt_device_is_assigned(np) ) +return -EPERM; + +/* + * The driver which supports generic IOMMU DT bindings must have + * these callback implemented. + */ I think we should make ops->remove_device optional. +if ( !ops->remove_device ) +return -EINVAL; It would be better to return -EOPNOSUPP. + +/* + * Remove master device from the IOMMU if latter is present and available. + */ +rc = ops->remove_device(0, dev); This
xen/arm: Missing appropriate locking for the IOMMU (WAS Re: [PATCH v5 02/11] xen/arm: Add PHYSDEVOP_pci_device_(*add/remove) support for ARM)
Hi all, While going through the passthrough code. I noticed that we don't have a common lock for the IOMMU between the PCI and DT code. This is going to be an issue given it would technically be possible to add a PCI device while assigning a DT. Rahul, Bertrand, Oleksandr, can you have a look at the issue? Cheers, On 06/10/2021 18:40, Rahul Singh wrote: Hardware domain is in charge of doing the PCI enumeration and will discover the PCI devices and then will communicate to XEN via hyper call PHYSDEVOP_pci_device_add(..) to add the PCI devices in XEN. Also implement PHYSDEVOP_pci_device_remove(..) to remove the PCI device. As most of the code for PHYSDEVOP_pci_device_* is the same between x86 and ARM, move the code to a common file to avoid duplication. There are other PHYSDEVOP_pci_device_* operations to add PCI devices. Currently implemented PHYSDEVOP_pci_device_remove(..) and PHYSDEVOP_pci_device_add(..) only as those are minimum required to support PCI passthrough on ARM. Signed-off-by: Rahul Singh --- Change in v5: - Move the pci_physdev_op() stub to xen/arch/arm/physdev.c. Change in v4: - Move file commom/physdev.c to drivers/pci/physdev.c - minor comments. Change in v3: Fixed minor comment. Change in v2: - Add support for PHYSDEVOP_pci_device_remove() - Move code to common code --- --- xen/arch/arm/physdev.c| 6 ++- xen/arch/x86/physdev.c| 52 +-- xen/arch/x86/x86_64/physdev.c | 2 +- xen/drivers/pci/Makefile | 1 + xen/drivers/pci/physdev.c | 80 +++ xen/include/public/arch-arm.h | 4 +- xen/include/xen/hypercall.h | 4 ++ 7 files changed, 96 insertions(+), 53 deletions(-) create mode 100644 xen/drivers/pci/physdev.c diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c index e91355fe22..f9aa274dda 100644 --- a/xen/arch/arm/physdev.c +++ b/xen/arch/arm/physdev.c @@ -8,13 +8,17 @@ #include #include #include -#include +#include int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { +#ifdef CONFIG_HAS_PCI +return pci_physdev_op(cmd, arg); +#else gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd); return -ENOSYS; +#endif } /* diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c index 23465bcd00..ea38be8b79 100644 --- a/xen/arch/x86/physdev.c +++ b/xen/arch/x86/physdev.c @@ -12,7 +12,7 @@ #include #include #include -#include +#include #include #include #include @@ -480,54 +480,6 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) break; } -case PHYSDEVOP_pci_device_add: { -struct physdev_pci_device_add add; -struct pci_dev_info pdev_info; -nodeid_t node; - -ret = -EFAULT; -if ( copy_from_guest(, arg, 1) != 0 ) -break; - -pdev_info.is_extfn = !!(add.flags & XEN_PCI_DEV_EXTFN); -if ( add.flags & XEN_PCI_DEV_VIRTFN ) -{ -pdev_info.is_virtfn = 1; -pdev_info.physfn.bus = add.physfn.bus; -pdev_info.physfn.devfn = add.physfn.devfn; -} -else -pdev_info.is_virtfn = 0; - -if ( add.flags & XEN_PCI_DEV_PXM ) -{ -uint32_t pxm; -size_t optarr_off = offsetof(struct physdev_pci_device_add, optarr) / -sizeof(add.optarr[0]); - -if ( copy_from_guest_offset(, arg, optarr_off, 1) ) -break; - -node = pxm_to_node(pxm); -} -else -node = NUMA_NO_NODE; - -ret = pci_add_device(add.seg, add.bus, add.devfn, _info, node); -break; -} - -case PHYSDEVOP_pci_device_remove: { -struct physdev_pci_device dev; - -ret = -EFAULT; -if ( copy_from_guest(, arg, 1) != 0 ) -break; - -ret = pci_remove_device(dev.seg, dev.bus, dev.devfn); -break; -} - case PHYSDEVOP_prepare_msix: case PHYSDEVOP_release_msix: { struct physdev_pci_device dev; @@ -663,7 +615,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) } default: -ret = -ENOSYS; +ret = pci_physdev_op(cmd, arg); break; } diff --git a/xen/arch/x86/x86_64/physdev.c b/xen/arch/x86/x86_64/physdev.c index 0a50cbd4d8..e3cbd5ebcb 100644 --- a/xen/arch/x86/x86_64/physdev.c +++ b/xen/arch/x86/x86_64/physdev.c @@ -9,7 +9,7 @@ EMIT_FILE; #include #include #include -#include +#include #define do_physdev_op compat_physdev_op diff --git a/xen/drivers/pci/Makefile b/xen/drivers/pci/Makefile index a98035df4c..972c923db0 100644 --- a/xen/drivers/pci/Makefile +++ b/xen/drivers/pci/Makefile @@ -1 +1,2 @@ obj-y += pci.o +obj-y += physdev.o diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c new file mode 100644 index 00..4f3e1a96c0 --- /dev/null +++ b/xen/drivers/pci/physdev.c
Re: [XEN][RFC PATCH 06/13] device tree: Add dt_print_node_names()
Hi Vikram, On 02/09/2021 07:05, Vikram Garhwal wrote: Add dt_print_node_names() to print all nodes under a dt_device_node. dt_print_node_names() takes a dt_device_node type input and prints the node name of all the subsequent nodes. This is added for debugging purpose for device tree overlays. I can't find any use of this function in the rest of series. While I understand it was helpful for you, I am not entirely convinced that we should add it in Xen with no-use. The code is simple enough for anyone to re-implement it if needed. Cheers, Signed-off-by: Vikram Garhwal --- xen/common/device_tree.c | 10 ++ xen/include/xen/device_tree.h | 5 + 2 files changed, 15 insertions(+) diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c index cda21be..bfe3191 100644 --- a/xen/common/device_tree.c +++ b/xen/common/device_tree.c @@ -308,6 +308,16 @@ struct dt_device_node *dt_find_node_by_path(const char *path) return np; } +void dt_print_node_names(struct dt_device_node *dt) +{ +struct dt_device_node *np; + +dt_for_each_device_node(dt, np) +dt_dprintk("Node name: %s Full name %s\n", np->name, np->full_name); + +return; +} + int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen, struct dt_device_node **node) { diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h index a4e98a7..dcd96b4 100644 --- a/xen/include/xen/device_tree.h +++ b/xen/include/xen/device_tree.h @@ -483,6 +483,11 @@ struct dt_device_node *dt_find_node_by_path(const char *path); int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen, struct dt_device_node **node); +/* + * Prints all node names. + */ +void dt_print_node_names(struct dt_device_node *dt); + /** * dt_get_parent - Get a node's parent if any * @node: Node to get parent -- Julien Grall
[linux-5.4 test] 165696: tolerable FAIL - PUSHED
flight 165696 linux-5.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/165696/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 165616 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 165616 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 165616 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 165616 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 165616 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 165616 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 165616 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 165616 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 165616 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 165616 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 165616 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 165616 test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass version targeted for testing: linux4f508aa9dd3bde7a1c5e4e6de72abb8a03fd504a baseline version: linux
Re: [PATCH v2] PCI/MSI: Re-add checks for skip masking MSI-X on Xen PV
On 10/20/21 16:03, Jason Andryuk wrote: > Hi, Marc, > > Adding Juergen and Boris since this involves Xen. > > On Wed, Oct 20, 2021 at 8:51 AM Marc Zyngier wrote: >> On Tue, 19 Oct 2021 22:48:19 +0100, >> Josef Johansson wrote: >>> From: Josef Johansson >>> >>> >>> PCI/MSI: Re-add checks for skip masking MSI-X on Xen PV >>> >>> commit fcacdfbef5a1 ("PCI/MSI: Provide a new set of mask and unmask >>> functions") introduce functions pci_msi_update_mask() and >>> pci_msix_write_vector_ctrl() that is missing checks for >>> pci_msi_ignore_mask that exists in commit 446a98b19fd6 ("PCI/MSI: Use >>> new mask/unmask functions"). Add them back since it is >>> causing severe lockups in amdgpu drivers under Xen during boot. >>> >>> As explained in commit 1a519dc7a73c ("PCI/MSI: Skip masking MSI-X >>> on Xen PV"), when running as Xen PV guest, masking MSI-X is a >>> responsibility of the hypervisor. >>> >>> Fixes: fcacdfbef5a1 ("PCI/MSI: Provide a new set of mask and unmask >>> functions") >>> Suggested-by: Jason Andryuk >>> Signed-off-by: Josef Johansson >>> >> [...] >> >>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >>> index 0099a00af361..355b791e382f 100644 >>> --- a/drivers/pci/msi.c >>> +++ b/drivers/pci/msi.c >>> @@ -148,6 +148,9 @@ static noinline void pci_msi_update_mask(struct >>> msi_desc *desc, u32 clear, u32 s >>> raw_spinlock_t *lock = >dev->msi_lock; >>> unsigned long flags; >>> >>> + if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual) >>> + return; >>> + >> I'd rather be consistent, and keep the check outside of >> pci_msi_update_mask(), just like we do in __pci_msi_mask_desc(). >> Something like this instead: >> >> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >> index 0099a00af361..6c69eab304ce 100644 >> --- a/drivers/pci/msi.c >> +++ b/drivers/pci/msi.c >> @@ -420,7 +420,8 @@ static void __pci_restore_msi_state(struct pci_dev *dev) >> arch_restore_msi_irqs(dev); >> >> pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, ); >> - pci_msi_update_mask(entry, 0, 0); >> + if (!(pci_msi_ignore_mask || desc->msi_attrib.is_virtual)) >> + pci_msi_update_mask(entry, 0, 0); >> control &= ~PCI_MSI_FLAGS_QSIZE; >> control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE; >> pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control); >> >> But the commit message talks about MSI-X, and the above is MSI >> only. Is Xen messing with the former, the latter, or both? > My understanding is pci_msi_ignore_mask covers both MSI and MSI-X for Xen. Please let me know if I should go ahead and try it out and send in a v3 of the patch. I'm watching for further discussion right now, just to be clear. >>> raw_spin_lock_irqsave(lock, flags); >>> desc->msi_mask &= ~clear; >>> desc->msi_mask |= set; >>> @@ -181,6 +184,9 @@ static void pci_msix_write_vector_ctrl(struct msi_desc >>> *desc, u32 ctrl) >>> { >>> void __iomem *desc_addr = pci_msix_desc_addr(desc); >>> >>> + if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual) >>> + return; >>> + >>> writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL); >>> } >> I have similar reservations for this one. > The problem here is some of the changes in commit 446a98b19fd6 > ("PCI/MSI: Use new mask/unmask functions") bypass the checks in > __pci_msi_mask_desc/__pci_msi_unmask_desc. I've wondered if it would > be cleaner to push all the `if (pci_msi_ignore_mask)` checks down to > the place of the writes. That keeps dropping the write local to the > write and leaves the higher level code consistent between the regular > and Xen PV cases. I don't know where checking > desc->msi_attrib.is_virtual is appropriate. > > Regards, > Jason
Re: [PATCH] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
On 21.10.2021 09:21, Jan Beulich wrote: > On 20.10.2021 22:01, Andrew Cooper wrote: >> However, I don't think skipping parsing is a sensible move. Intremap is >> utterly mandatory if during boot, we discover that our APIC ID is >254, >> and iommu=no-intremap must be ignored in this case, or if the MADT says >> we have CPUs beyond that limit and the user hasn't specified nr_cpus=1 >> or equivalent. > > Reading this made me realize that the change breaks other behavior. > The conditional really needs to be iommu_enable || iommu_intremap - > at least AMD code added in support for x2APIC already treats the > latter to not be a sub-option of the former (iov_supports_xt(), > acpi_ivrs_init()), and e.g. intel_iommu_supports_eim() also checks > the latter alone. Actually the check in iov_supports_xt() is wrong - it needs to use &&, not ||. I'll make a 2nd patch ... Jan
Re: [PATCH] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
On 20.10.2021 22:01, Andrew Cooper wrote: > On 20/10/2021 11:36, Jan Beulich wrote: >> x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC >> mode (physical vs clustered) depends on iommu_intremap, that variable >> needs to be set to off as soon as we know we can't / won't enable the >> IOMMU, i.e. in particular when >> - parsing of the respective ACPI tables failed, >> - "iommu=off" is in effect, but not "iommu=no-intremap". >> Move the turning off of iommu_intremap from AMD specific code into >> acpi_iommu_init(), accompanying it by clearing of iommu_enable. >> >> Take the opportunity and also skip ACPI table parsing altogether when >> "iommu=off" is in effect anyway. >> >> Reported-by: Andrew Cooper >> Signed-off-by: Jan Beulich >> --- >> I've deliberately not added a Fixes: tag here, as I'm of the opinion >> that d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier") only >> uncovered a pre-existing anomaly. > > I agree it uncovered a pre-existing issue, but that doesn't mean a Fixes > tag should be omitted. That change very concretely regressed booting on > some systems in their pre-existing configuration. > > The commit message needs to spell out a link to d8bd82327b0f, but it's > fine to say "that commit broke it by violating an unexpected ordering > dependency, but isn't really the source of the bug". > >> This particularly applies to the "iommu=off" aspect. > > There should be at least two Fixes tags, but I suspect trying to trace > the history of this mess is not a productive use of time. > >> (This now allows me to remove an item from my TODO >> list: I was meaning to figure out why one of my systems wouldn't come >> up properly with "iommu=off", and I had never thought of combining this >> with "no-intremap".) >> >> Arguably "iommu=off" should turn off subordinate features in common >> IOMMU code, but doing so in parse_iommu_param() would be wrong (as >> there might be multiple "iommu=" to parse). This could be placed in >> iommu_supports_x2apic(), but see the next item. > > I don't think we make any claim or implication that passing the same > option twice works. The problem here is the nested structure of > options, and the variable doing double duty. > >> >> While the change here deals with apic_x2apic_probe() as called from >> x2apic_bsp_setup(), the check_x2apic_preenabled() path looks to be >> similarly affected. That call occurs before acpi_boot_init(), which is >> what calls acpi_iommu_init(). The ordering in setup.c is in part >> relatively fragile, which is why for the moment I'm still hesitant to >> move the generic_apic_probe() call down. Plus I don't have easy access >> to a suitable system to test this case. Thoughts? > > I've written these thoughts before, but IOMMU handling it a catastrophic > mess. It needs burning to the ground and redoing from scratch. > > We currently have two ways of turning on the IOMMU, depending on what > firmware does, and plenty ways of crashing Xen with cmdline options > which should work, and the legacy xAPIC startup routine is after > interrupts have been enabled, leading to an attempt to rewrite > interrupts in place to remap them. This in particular has lead to XSAs > due to trusting registers which can't be trusted, and the rewrite is > impossible to do safely. > > The correct order is: > 1) Parse DMAR/IVRS (but do not configure anything), MADT, current APIC > setting and cmdline arguments > 2) Figure out whether we want to be in xAPIC or x2APIC mode, and whether > we need intremap. Change the LAPIC setting > 3) Configure the IOMMUs. In particular, their interrupt needs to be > after step 2 Leaving aside check_x2apic_preenabled(), all of this is already the case afaict, almost at least: We do acpi_boot_init(), later x2apic_bsp_setup(), and yet later iommu_setup(). The only issue might be inside x2apic_bsp_setup(), where we call iommu_enable_x2apic() before switching to x2APIC mode. Yet we avoid setting up IOMMU interrupts during this early stage. Hence I think, as expressed, that the question really is whether we can safely defer check_x2apic_preenabled() by a little bit. > 4) Start up Xen's general IRQ infrastructure. > > It's a fair chunk of work, but it will vastly simplify the boot logic > and let us delete the infinite flushing loops out of the IOMMU logic, > and we don't need any logic which has to second guess itself based on > what happened earlier on boot. > >> --- a/xen/drivers/passthrough/x86/iommu.c >> +++ b/xen/drivers/passthrough/x86/iommu.c >> @@ -41,6 +41,23 @@ enum iommu_intremap __read_mostly iommu_ >> bool __read_mostly iommu_intpost; >> #endif >> >> +void __init acpi_iommu_init(void) >> +{ >> +if ( iommu_enable ) >> +{ >> +int ret = acpi_dmar_init(); >> + >> +if ( ret == -ENODEV ) >> +ret = acpi_ivrs_init(); >> + >> +if ( ret ) >> +iommu_enable = false; >> +} >> + >> +if ( !iommu_enable ) >> +
Re: [PATCH for-4.16] tools/xenstored: Ignore domain we were unable to restore
> On 20 Oct 2021, at 15:45, Julien Grall wrote: > > From: Julien Grall > > Commit 939775cfd3 "handle dying domains in live update" was meant to > handle gracefully dying domain. However, the @releaseDomain watch > will end up to be sent as soon as we finished to restore Xenstored > state. > > This may be before Xen reports the domain to be dying (such as if > the guest decided to revoke access to the xenstore page). Consequently > daemon like xenconsoled will not clean-up the domain and it will be > left as a zombie. > > To avoid the problem, mark the connection as ignored. This also > requires to tweak conn_can_write() and conn_can_read() to prevent > dereferencing a NULL pointer (the interface will not mapped). > > The check conn->is_ignored was originally added after the callbacks > because the helpers for a socket connection may close the fd. However, > ignore_connection() will close a socket connection directly. So it is > fine to do the re-order. > > Signed-off-by: Julien Grall Reviewed-by: Luca Fancellu > > --- > > This issue was originally found when developping commit 939775cfd3. > Unfortunately, due to some miscommunication the wrong patch was sent > upstream. The approach is re-using the one we discussed back then. > > This was tested by modifying Linux to revoke the Xenstore grant during > boot. Without this patch, the domain will be left after Live-Update. Note > that on a basic setup (only xenconsoled and xl watch @releaseDomain), > the domain may be cleaned on the next domain shutdown/start. > > Xenstore Live-Update is so far a tech preview feature. But I would still > like to request this patch to be included in 4.16 as this was meant to > be part of the original work. > --- > tools/xenstore/xenstored_core.c | 8 > tools/xenstore/xenstored_core.h | 1 + > tools/xenstore/xenstored_domain.c | 21 - > 3 files changed, 17 insertions(+), 13 deletions(-) > > diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c > index 0d4c73d6e20c..91d093a12ea6 100644 > --- a/tools/xenstore/xenstored_core.c > +++ b/tools/xenstore/xenstored_core.c > @@ -338,10 +338,10 @@ static int destroy_conn(void *_conn) > > static bool conn_can_read(struct connection *conn) > { > - if (!conn->funcs->can_read(conn)) > + if (conn->is_ignored) > return false; > > - if (conn->is_ignored) > + if (!conn->funcs->can_read(conn)) > return false; > > /* > @@ -356,7 +356,7 @@ static bool conn_can_read(struct connection *conn) > > static bool conn_can_write(struct connection *conn) > { > - return conn->funcs->can_write(conn) && !conn->is_ignored; > + return !conn->is_ignored && conn->funcs->can_write(conn); > } > > /* This function returns index inside the array if succeed, -1 if fail */ > @@ -1466,7 +1466,7 @@ static struct { > * > * All watches, transactions, buffers will be freed. > */ > -static void ignore_connection(struct connection *conn) > +void ignore_connection(struct connection *conn) > { > struct buffered_data *out, *tmp; > > diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h > index 258f6ff38279..07d861d92499 100644 > --- a/tools/xenstore/xenstored_core.h > +++ b/tools/xenstore/xenstored_core.h > @@ -206,6 +206,7 @@ struct node *read_node(struct connection *conn, const > void *ctx, > > struct connection *new_connection(const struct interface_funcs *funcs); > struct connection *get_connection_by_id(unsigned int conn_id); > +void ignore_connection(struct connection *conn); > void check_store(void); > void corrupt(struct connection *conn, const char *fmt, ...); > > diff --git a/tools/xenstore/xenstored_domain.c > b/tools/xenstore/xenstored_domain.c > index 47e9107c144e..d03c7d93a9e7 100644 > --- a/tools/xenstore/xenstored_domain.c > +++ b/tools/xenstore/xenstored_domain.c > @@ -268,14 +268,7 @@ void check_domains(void) > domain->shutdown = true; > notify = 1; > } > - /* > - * On Restore, we may have been unable to remap the > - * interface and the port. As we don't know whether > - * this was because of a dying domain, we need to > - * check if the interface and port are still valid. > - */ > - if (!dominfo.dying && domain->port && > - domain->interface) > + if (!dominfo.dying) > continue; > } > if (domain->conn) { > @@ -1303,6 +1296,17 @@ void read_state_connection(const void *ctx, const void > *state) > if (!domain) > barf("domain allocation error"); > > + conn = domain->conn; > + > + /* > + * We may not have been able to restore the domain (for > +
Re: [PATCH for-4.16] tools/xenstored: Ignore domain we were unable to restore
On 20.10.21 16:45, Julien Grall wrote: From: Julien Grall Commit 939775cfd3 "handle dying domains in live update" was meant to handle gracefully dying domain. However, the @releaseDomain watch will end up to be sent as soon as we finished to restore Xenstored state. This may be before Xen reports the domain to be dying (such as if the guest decided to revoke access to the xenstore page). Consequently daemon like xenconsoled will not clean-up the domain and it will be left as a zombie. To avoid the problem, mark the connection as ignored. This also requires to tweak conn_can_write() and conn_can_read() to prevent dereferencing a NULL pointer (the interface will not mapped). The check conn->is_ignored was originally added after the callbacks because the helpers for a socket connection may close the fd. However, ignore_connection() will close a socket connection directly. So it is fine to do the re-order. Signed-off-by: Julien Grall Reviewed-by: Juergen Gross Juergen OpenPGP_0xB0DE9DD628BF132F.asc Description: OpenPGP public key OpenPGP_signature Description: OpenPGP digital signature