Re: [PATCH] arm/smmu: Complete SMR masking support
Hi Michal,

> On 4 Sep 2024, at 1:43 PM, Michal Orzel wrote:
>
> SMR masking support allows deriving a mask either using a 2-cell iommu
> specifier (per master) or the stream-match-mask SMMU dt property (global
> config). Even though the mask is stored in the fwid when adding a
> device (in arm_smmu_dt_xlate_generic()), we still set it to 0 when
> allocating SMEs (in arm_smmu_master_alloc_smes()). So in the end, we
> always ignore the mask when programming the SMRn registers. This leads to
> SMMU failures. Fix it by completing the support.
>
> A bit of history:
> Linux support for SMR allocation was mainly done with:
> 58a7399d ("iommu/arm-smmu: Intelligent SMR allocation")
> 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support")
>
> Taking the mask into account in arm_smmu_master_alloc_smes() was added
> as part of the second commit, although quite hidden in the thicket of
> other changes. We backported only the first patch with 0435784cc75d
> ("xen/arm: smmuv1: Intelligent SMR allocation") but the changes to take
> the mask into account were missed.
>
> Signed-off-by: Michal Orzel

Reviewed-by: Rahul Singh

Regards,
Rahul

> ---
>  xen/drivers/passthrough/arm/smmu.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
> index f2cee82f553a..4c8a446754cc 100644
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -1619,19 +1619,21 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
> 	spin_lock(&smmu->stream_map_lock);
> 	/* Figure out a viable stream map entry allocation */
> 	for_each_cfg_sme(cfg, i, idx, fwspec->num_ids) {
> +		uint16_t mask = (fwspec->ids[i] >> SMR_MASK_SHIFT) & SMR_MASK_MASK;
> +
> 		if (idx != INVALID_SMENDX) {
> 			ret = -EEXIST;
> 			goto out_err;
> 		}
>
> -		ret = arm_smmu_find_sme(smmu, fwspec->ids[i], 0);
> +		ret = arm_smmu_find_sme(smmu, fwspec->ids[i], mask);
> 		if (ret < 0)
> 			goto out_err;
>
> 		idx = ret;
> 		if (smrs && smmu->s2crs[idx].count == 0) {
> 			smrs[idx].id = fwspec->ids[i];
> -			smrs[idx].mask = 0; /* We don't currently share SMRs */
> +			smrs[idx].mask = mask;
> 			smrs[idx].valid = true;
> 		}
> 		smmu->s2crs[idx].count++;
> --
> 2.25.1
>
Re: [PATCH] xen/arm: vITS: add #msi-cells property
Hi Stewart,

> On 2 Aug 2024, at 7:26 PM, Stewart Hildebrand wrote:
>
> Non-PCI platform devices may use the ITS. Dom0 Linux drivers for such
> devices are failing to register IRQs due to a missing #msi-cells
> property. Add the missing #msi-cells property.
>
> Signed-off-by: Stewart Hildebrand

Reviewed-by: Rahul Singh

Regards,
Rahul

> ---
> See Linux dc4dae00d82f ("Docs: dt: add #msi-cells to GICv3 ITS binding")
> ---
>  xen/arch/arm/gic-v3-its.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 8afcd9783bc8..55bed3fe87d0 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -951,6 +951,10 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d,
>      if ( res )
>          return res;
>
> +    res = fdt_property_cell(fdt, "#msi-cells", 1);
> +    if ( res )
> +        return res;
> +
>      if ( its->phandle )
>      {
>          res = fdt_property_cell(fdt, "phandle", its->phandle);
>
> base-commit: 984cb316cb27b53704c607e640a7dd2763b898ab
> --
> 2.45.2
>
Re: [PATCH v1 1/1] xen/arm: smmuv3: Mark more init-only functions with __init
Hi Edgar, > On 22 May 2024, at 2:28 PM, Edgar E. Iglesias > wrote: > > From: "Edgar E. Iglesias" > > Move more functions that are only called at init to > the .init.text section. > > Signed-off-by: Edgar E. Iglesias Acked-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul
Re: [PATCH v9 04/15] xen/bitops: put __ffs() into linux compatible header
Hi Oleksii,

> On 6 May 2024, at 11:15 AM, Oleksii Kurochko wrote:
>
> The mentioned macros exist only for Linux compatibility purposes.
>
> The patch defines __ffs() in terms of Xen bitops, and it is safe to
> define it this way (as ffsl(x) - 1) considering that __ffs() was
> defined as __builtin_ctzl(x), which has undefined behavior when x=0,
> so it is assumed that such cases are not encountered in the current code.
>
> To avoid including Xen headers in the library files, __ffs() and __ffz()
> were defined locally in find-next-bit.c.
>
> Except for its usage in find-next-bit.c, only one usage of __ffs()
> remains, in smmu-v3.c. It seems that __ffs() could be changed to
> ffsl(x) - 1 in this file, but to keep smmu-v3.c close to Linux it was
> decided to just define __ffs() in xen/linux-compat.h and include it in
> smmu-v3.c.
>
> Signed-off-by: Oleksii Kurochko
> Acked-by: Shawn Anastasio
> Reviewed-by: Jan Beulich

For SMMUv3 changes:

Acked-by: Rahul Singh

Regards,
Rahul
Re: [PATCH] arm/vpci: make prefetchable mem 64 bit
Hi Stewart, > On 24 Apr 2024, at 5:27 PM, Stewart Hildebrand > wrote: > > The vPCI prefetchable memory range is >= 4GB, so the memory space flags > should be set to 64-bit. See IEEE Std 1275-1994 [1] for a definition of > the field. > > [1] https://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf > > Signed-off-by: Stewart Hildebrand Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH v6 2/5] xen/vpci: move xen_domctl_createdomain vPCI flag to common
Hi Stewart,

> On 29 Nov 2023, at 9:25 pm, Stewart Hildebrand wrote:
>
> On 11/14/23 04:11, Jan Beulich wrote:
>> On 13.11.2023 23:21, Stewart Hildebrand wrote:
>>> @@ -709,10 +710,17 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>          return -EINVAL;
>>>      }
>>>
>>> +    if ( vpci && !hvm )
>>> +    {
>>> +        dprintk(XENLOG_INFO, "vPCI requested for non-HVM guest\n");
>>> +        return -EINVAL;
>>> +    }
>>> +
>>>      return 0;
>>>  }
>>
>> As said on the v5 thread, I think my comment was misguided (I'm sorry)
>> and this wants keeping in common code as you had it.
>
> I'll move it back to xen/common/domain.c.

No worries. I tested this patch and observed a build failure when compiling for the "x86_64" arch with the "CONFIG_HVM=n" option:

x86_64-linux-gnu-ld -melf_x86_64 -T arch/x86/xen.lds -N prelink.o --build-id=sha1 \
    ./common/symbols-dummy.o -o ./.xen-syms.0
x86_64-linux-gnu-ld: prelink.o: in function `arch_iommu_hwdom_init':
(.init.text+0x2192b): undefined reference to `vpci_is_mmcfg_address'
(.init.text+0x2192b): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vpci_is_mmcfg_address'
x86_64-linux-gnu-ld: (.init.text+0x21947): undefined reference to `vpci_is_mmcfg_address'
(.init.text+0x21947): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vpci_is_mmcfg_address'
x86_64-linux-gnu-ld: prelink.o: in function `do_physdev_op':
(.text.do_physdev_op+0x6db): undefined reference to `register_vpci_mmcfg_handler'
(.text.do_physdev_op+0x6db): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `register_vpci_mmcfg_handler'
x86_64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `vpci_is_mmcfg_address' isn't defined
x86_64-linux-gnu-ld: final link failed: bad value

Regards,
Rahul

>
>>
>>> --- a/xen/include/public/arch-x86/xen.h
>>> +++ b/xen/include/public/arch-x86/xen.h
>>> @@ -283,15 +283,16 @@ struct xen_arch_domainconfig {
>>>  #define XEN_X86_EMU_PIT (1U<<_XEN_X86_EMU_PIT)
>>>  #define _XEN_X86_EMU_USE_PIRQ 9
>>>  #define XEN_X86_EMU_USE_PIRQ (1U<<_XEN_X86_EMU_USE_PIRQ)
>>> -#define _XEN_X86_EMU_VPCI 10
>>> -#define XEN_X86_EMU_VPCI (1U<<_XEN_X86_EMU_VPCI)
>>> +/*
>>> + * Note: bit 10 was previously used for a XEN_X86_EMU_VPCI flag. This bit should
>>> + * not be re-used without careful consideration.
>>> + */
>>
>> I think a multi-line comment is drawing overly much attention to this.
>> How about "Note: Bit 10 was previously used for XEN_X86_EMU_VPCI. Re-use
>> with care." which I think fits in a single line comment.
>
> Sounds good.
>
>>
>> Jan
Re: [XEN PATCH 10/10] arm/smmu: address violation of MISRA C:2012 Rule 8.2
Hi Bertrand,

> On 16 Oct 2023, at 2:31 pm, Bertrand Marquis wrote:
>
> Hi Julien,
>
>> On 16 Oct 2023, at 11:07, Julien Grall wrote:
>>
>> Hi,
>>
>> On 13/10/2023 16:24, Federico Serafini wrote:
>>> Add missing parameter names, no functional change.
>>> Signed-off-by: Federico Serafini
>>> ---
>>>  xen/drivers/passthrough/arm/smmu.c | 6 +++---
>>
>> This file is using the Linux coding style because it is imported from Linux.
>> I was under the impression we would exclude such a file for now.
>>
>> Looking at exclude-list.json, it doesn't seem to be present. I think this
>> patch should be replaced with adding a line in exclude-list.json.
>
> I think that during one of the discussions we said that this file already
> deviated quite a lot from its status in Linux and we wanted to turn it to Xen
> coding style in the future, hence it is not listed in the exclude file.
> In the end, having a working smmu might be critical in a safety context, so it
> will make sense to also check this part of Xen.
>
> @Rahul: do you agree ?

Yes, you are right: the current SMMUv3 code already deviates quite a lot from its status in Linux, because Xen handles the command queue in a different way than Linux does. More detail can be found at the start of the SMMUv3 file. I am pasting it here also:

 * 5. Latest version of the Linux SMMUv3 code implements the commands queue
 *    access functions based on atomic operations implemented in Linux.
 *    Atomic functions used by the commands queue access functions are not
 *    implemented in XEN therefore we decided to port the earlier version
 *    of the code. Atomic operations are introduced to fix the bottleneck of
 *    the SMMU command queue insertion operation. A new algorithm for
 *    inserting commands into the queue is introduced, which is
 *    lock-free on the fast-path.
 *    Consequence of reverting the patch is that the command queue insertion
 *    will be slow for large systems as a spinlock will be used to serialize
 *    accesses from all CPUs to the single queue supported by the hardware.
 *    Once the proper atomic operations are available in XEN the driver
 *    can be updated.

Anyway, because of the above, backporting SMMUv3 Linux driver patches to Xen is not straightforward. If converting smmu-v3.c to the Xen coding style helps in a safety context, I am okay with that.

Regards,
Rahul
Re: [PATCH -next] xen/evtchn: Remove unused function declaration xen_set_affinity_evtchn()
Hi Yue,

> On 1 Aug 2023, at 3:54 pm, Yue Haibing wrote:
>
> Commit 67473b8194bc ("xen/events: Remove disfunct affinity spreading")
> left this unused declaration behind.
>
> Signed-off-by: Yue Haibing

Reviewed-by: Rahul Singh

Regards,
Rahul Singh

> ---
>  include/xen/events.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/include/xen/events.h b/include/xen/events.h
> index 95970a2f7695..95d5e28de324 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -75,7 +75,6 @@ void evtchn_put(evtchn_port_t evtchn);
>
>  void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector);
>  void rebind_evtchn_irq(evtchn_port_t evtchn, int irq);
> -int xen_set_affinity_evtchn(struct irq_desc *desc, unsigned int tcpu);
>
>  static inline void notify_remote_via_evtchn(evtchn_port_t port)
>  {
> --
> 2.34.1
>
Re: [PATCH v3] xen/evtchn: Introduce new IOCTL to bind static evtchn
Hi Juergen,

> On 27 Jul 2023, at 6:54 am, Juergen Gross wrote:
>
> On 18.07.23 13:31, Rahul Singh wrote:
>> Xen 4.17 supports the creation of static evtchns. To allow user space
>> applications to bind static evtchns, introduce a new ioctl
>> "IOCTL_EVTCHN_BIND_STATIC". The existing IOCTL does more than binding;
>> that's why we need to introduce the new IOCTL to only bind the static
>> event channels.
>> Static evtchns are to be available for use during the lifetime of the
>> guest. When the application exits, __unbind_from_irq() ends up being
>> called from the release() file operation, because of which static
>> evtchns were getting closed. To avoid closing a static event channel,
>> add the new bool variable "is_static" in "struct irq_info" to mark the
>> event channel static when creating it.
>> Also, take this opportunity to remove the open-coded version of the
>> evtchn close in drivers/xen/evtchn.c and use xen_evtchn_close().
>> Signed-off-by: Rahul Singh
>
> Pushed to xen/tip.git for-linus-6.5a

Thanks.

Regards,
Rahul
Re: [PATCH v2] vpci: add permission checks to map_range()
Hi Roger, > On 26 Jul 2023, at 3:01 pm, Roger Pau Monne wrote: > > Just like it's done for the XEN_DOMCTL_memory_mapping hypercall, add > the permissions checks to vPCI map_range(), which is used to map the > BARs into the domain p2m. > > Adding those checks requires that for x86 PVH hardware domain builder > the permissions are set before initializing the IOMMU, or else > attempts to initialize vPCI done as part of IOMMU device setup will > fail due to missing permissions to create the BAR mappings. > > While moving the call to dom0_setup_permissions() convert the panic() > used for error handling to a printk, the caller will already panic if > required. > > Fixes: 9c244fdef7e7 ('vpci: add header handlers') > Signed-off-by: Roger Pau Monné I tested the patch on ARM board with vPCI enabled everything works. Reviewed-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul
Re: [PATCH v8 05/13] vpci/header: implement guest BAR register handlers
Hi Volodymyr, > On 20 Jul 2023, at 1:32 am, Volodymyr Babchuk > wrote: > > From: Oleksandr Andrushchenko > > Add relevant vpci register handlers when assigning PCI device to a domain > and remove those when de-assigning. This allows having different > handlers for different domains, e.g. hwdom and other guests. > > Emulate guest BAR register values: this allows creating a guest view > of the registers and emulates size and properties probe as it is done > during PCI device enumeration by the guest. > > All empty, IO and ROM BARs for guests are emulated by returning 0 on > reads and ignoring writes: this BARs are special with this respect as > their lower bits have special meaning, so returning default ~0 on read > may confuse guest OS. > > Memory decoding is initially disabled when used by guests in order to > prevent the BAR being placed on top of a RAM region. > > Signed-off-by: Oleksandr Andrushchenko > --- > > Since v6: > - unify the writing of the PCI_COMMAND register on the > error path into a label > - do not introduce bar_ignore_access helper and open code > - s/guest_bar_ignore_read/empty_bar_read > - update error message in guest_bar_write > - only setup empty_bar_read for IO if !x86 > Since v5: > - make sure that the guest set address has the same page offset > as the physical address on the host > - remove guest_rom_{read|write} as those just implement the default > behaviour of the registers not being handled > - adjusted comment for struct vpci.addr field > - add guest handlers for BARs which are not handled and will otherwise > return ~0 on read and ignore writes. 
The BARs are special with this > respect as their lower bits have special meaning, so returning ~0 > doesn't seem to be right > Since v4: > - updated commit message > - s/guest_addr/guest_reg > Since v3: > - squashed two patches: dynamic add/remove handlers and guest BAR > handler implementation > - fix guest BAR read of the high part of a 64bit BAR (Roger) > - add error handling to vpci_assign_device > - s/dom%pd/%pd > - blank line before return > Since v2: > - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code > has been eliminated from being built on x86 > Since v1: > - constify struct pci_dev where possible > - do not open code is_system_domain() > - simplify some code3. simplify > - use gdprintk + error code instead of gprintk > - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT, > so these do not get compiled for x86 > - removed unneeded is_system_domain check > - re-work guest read/write to be much simpler and do more work on write > than read which is expected to be called more frequently > - removed one too obvious comment > --- > xen/drivers/vpci/header.c | 156 +++--- > xen/include/xen/vpci.h| 3 + > 2 files changed, 130 insertions(+), 29 deletions(-) > > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > index 2780fcae72..5dc9b5338b 100644 > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -457,6 +457,71 @@ static void cf_check bar_write( > pci_conf_write32(pdev->sbdf, reg, val); > } > > +static void cf_check guest_bar_write(const struct pci_dev *pdev, > + unsigned int reg, uint32_t val, void > *data) > +{ > +struct vpci_bar *bar = data; > +bool hi = false; > +uint64_t guest_reg = bar->guest_reg; > + > +if ( bar->type == VPCI_BAR_MEM64_HI ) > +{ > +ASSERT(reg > PCI_BASE_ADDRESS_0); > +bar--; > +hi = true; > +} > +else > +{ > +val &= PCI_BASE_ADDRESS_MEM_MASK; > +val |= bar->type == VPCI_BAR_MEM32 ? 
PCI_BASE_ADDRESS_MEM_TYPE_32 > + : PCI_BASE_ADDRESS_MEM_TYPE_64; > +val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0; > +} > + > +guest_reg &= ~(0xull << (hi ? 32 : 0)); > +guest_reg |= (uint64_t)val << (hi ? 32 : 0); > + > +guest_reg &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK; > + > +/* > + * Make sure that the guest set address has the same page offset > + * as the physical address on the host or otherwise things won't work as > + * expected. > + */ > +if ( (guest_reg & (~PAGE_MASK & PCI_BASE_ADDRESS_MEM_MASK)) != > + (bar->addr & ~PAGE_MASK) ) > +{ > +gprintk(XENLOG_WARNING, > +"%pp: ignored BAR %zu write attempting to change page > offset\n", > +&pdev->sbdf, bar - pdev->vpci->header.bars + hi); > +return; > +} > + > +bar->guest_reg = guest_reg; > +} > + > +static uint32_t cf_check guest_bar_read(const struct pci_dev *pdev, > +unsigned int reg, void *data) > +{ > +const struct vpci_bar *bar = data; > +bool hi = false; > + > +if ( bar->type == VPCI_BAR_MEM64_HI ) > +{ > +ASSERT(reg > PCI_BASE_ADDRESS_0); > +bar--; > +
Re: [PATCH v2 3/3] [FUTURE] xen/arm: enable vPCI for domUs
Hi Stewart, > On 21 Jul 2023, at 5:54 am, Stewart Hildebrand > wrote: > > On 7/7/23 07:04, Rahul Singh wrote: >> Hi Stewart, >> >>> On 7 Jul 2023, at 2:47 am, Stewart Hildebrand >>> wrote: >>> >>> Remove is_hardware_domain check in has_vpci, and select >>> HAS_VPCI_GUEST_SUPPORT >>> in Kconfig. >>> >>> [1] >>> https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html >>> >>> Signed-off-by: Stewart Hildebrand >>> --- >>> As the tag implies, this patch is not intended to be merged (yet). >>> >>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the >>> upstream >>> code base. It will be used by the vPCI series [1]. This patch is intended >>> to be >>> merged as part of the vPCI series. >>> >>> v1->v2: >>> * new patch >>> --- >>> xen/arch/arm/Kconfig | 1 + >>> xen/arch/arm/include/asm/domain.h | 2 +- >>> 2 files changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig >>> index 4e0cc421ad48..75dfa2f5a82d 100644 >>> --- a/xen/arch/arm/Kconfig >>> +++ b/xen/arch/arm/Kconfig >>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH >>> depends on ARM_64 >>> select HAS_PCI >>> select HAS_VPCI >>> + select HAS_VPCI_GUEST_SUPPORT >> >> I tested this series on top of "SMMU handling for PCIe Passthrough on ARM” >> series on the N1SDP board >> and observe the SMMUv3 fault. > > Thanks for testing this. After a great deal of tinkering, I can reproduce the > SMMU fault. > > (XEN) smmu: /axi/smmu@fd80: Unhandled context fault: fsr=0x402, > iova=0xf9030040, fsynr=0x12, cb=0 > >> Enable the Kconfig option PCI_PASSTHROUGH, ARM_SMMU_V3,HAS_ITS and >> "iommu=on”, >> "pci_passthrough_enabled=on" cmd line parameter and after that, there is an >> SMMU fault >> for the ITS doorbell register access from the PCI devices. >> >> As there is no upstream support for ARM for vPCI MSI/MSI-X handling because >> of that SMMU fault is observed. 
>> >> Linux Kernel will set the ITS doorbell register( physical address of >> doorbell register as IOMMU is not enabled in Kernel) >> in PCI config space to set up the MSI-X interrupts, but there is no mapping >> in SMMU page tables because of that SMMU >> fault is observed. To fix this we need to map the ITS doorbell register to >> SMMU page tables to avoid the fault. >> >> We can fix this after setting the mapping for the ITS doorbell offset in the >> ITS code. >> >> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c >> index 299b384250..8227a7a74b 100644 >> --- a/xen/arch/arm/vgic-v3-its.c >> +++ b/xen/arch/arm/vgic-v3-its.c >> @@ -682,6 +682,18 @@ static int its_handle_mapd(struct virt_its *its, >> uint64_t *cmdptr) >> BIT(size, UL), valid); >> if ( ret && valid ) >> return ret; >> + >> +if ( is_iommu_enabled(its->d) ) { >> +ret = map_mmio_regions(its->d, >> gaddr_to_gfn(its->doorbell_address), >> + PFN_UP(ITS_DOORBELL_OFFSET), >> + maddr_to_mfn(its->doorbell_address)); >> +if ( ret < 0 ) >> +{ >> +printk(XENLOG_ERR "GICv3: Map ITS translation register d%d >> failed.\n", >> +its->d->domain_id); >> +return ret; >> +} >> +} >> } > > Thank you, this resolves the SMMU fault. If it's okay, I will include this > patch in the next revision of the SMMU series (I see your Signed-off-by is > already in the attachment). Yes, you can include this patch in your next version. > >> Also as per Julien's request, I tried to set up the IOMMU for the PCI device >> without >> "pci_passthroigh_enable=on" and without HAS_VPCI everything works as expected >> after applying below patches. >> >> To test enable kconfig options HAS_PCI, ARM_SMMU_V3 and HAS_ITS and add below >> patches to make it work. >> >>• Set the mapping for the ITS doorbell offset in the ITS code when iommu >> is enabled. 
Also, if we want to support adding PCI devices to the IOMMU without PCI passthrough support (without HAS_VPCI and the cmd line option "pci-passthrough-enabled=on"), as suggested by Julien, we also need the two patches below:
>>	• Reverted the patch that added the support for pci_passthrough_on.
>>	• Allow MMIO mapping of the ECAM space to dom0 when vPCI is not enabled; as of now, MMIO
>>	  mapping for ECAM is based on pci_passthrough_enabled. We need this patch if we want to
>>	  avoid enabling HAS_VPCI.
>>
>> Please find the attached patches in case you want to test at your end.
>>
>> Regards,
>> Rahul
[PATCH v3] xen/evtchn: Introduce new IOCTL to bind static evtchn
Xen 4.17 supports the creation of static evtchns. To allow user space application to bind static evtchns introduce new ioctl "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding that’s why we need to introduce the new IOCTL to only bind the static event channels. Static evtchns to be available for use during the lifetime of the guest. When the application exits, __unbind_from_irq() ends up being called from release() file operations because of that static evtchns are getting closed. To avoid closing the static event channel, add the new bool variable "is_static" in "struct irq_info" to mark the event channel static when creating the event channel to avoid closing the static evtchn. Also, take this opportunity to remove the open-coded version of the evtchn close in drivers/xen/evtchn.c file and use xen_evtchn_close(). Signed-off-by: Rahul Singh --- v3: * Remove the open-coded version of the evtchn close in drivers/xen/evtchn.c v2: * Use bool in place u8 to define is_static variable. * Avoid closing the static evtchns in error path. --- drivers/xen/events/events_base.c | 16 +-- drivers/xen/evtchn.c | 35 include/uapi/xen/evtchn.h| 9 include/xen/events.h | 11 +- 4 files changed, 50 insertions(+), 21 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c7715f8bd452..3bdd5b59661d 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -112,6 +112,7 @@ struct irq_info { unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI. */ raw_spinlock_t lock; + bool is_static; /* Is event channel static */ union { unsigned short virq; @@ -815,15 +816,6 @@ static void xen_free_irq(unsigned irq) irq_free_desc(irq); } -static void xen_evtchn_close(evtchn_port_t port) -{ - struct evtchn_close close; - - close.port = port; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); -} - /* Not called for lateeoi events. 
*/ static void event_handler_exit(struct irq_info *info) { @@ -982,7 +974,8 @@ static void __unbind_from_irq(unsigned int irq) unsigned int cpu = cpu_from_irq(irq); struct xenbus_device *dev; - xen_evtchn_close(evtchn); + if (!info->is_static) + xen_evtchn_close(evtchn); switch (type_from_irq(irq)) { case IRQT_VIRQ: @@ -1574,7 +1567,7 @@ int xen_set_irq_priority(unsigned irq, unsigned priority) } EXPORT_SYMBOL_GPL(xen_set_irq_priority); -int evtchn_make_refcounted(evtchn_port_t evtchn) +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static) { int irq = get_evtchn_to_irq(evtchn); struct irq_info *info; @@ -1590,6 +1583,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn) WARN_ON(info->refcnt != -1); info->refcnt = 1; + info->is_static = is_static; return 0; } diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index c99415a70051..9139a7364df5 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -366,10 +366,10 @@ static int evtchn_resize_ring(struct per_user_data *u) return 0; } -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port, + bool is_static) { struct user_evtchn *evtchn; - struct evtchn_close close; int rc = 0; /* @@ -402,14 +402,14 @@ static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) if (rc < 0) goto err; - rc = evtchn_make_refcounted(port); + rc = evtchn_make_refcounted(port, is_static); return rc; err: /* bind failed, should close the port now */ - close.port = port; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); + if (!is_static) + xen_evtchn_close(port); + del_evtchn(u, evtchn); return rc; } @@ -456,7 +456,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_virq.port); + rc = evtchn_bind_to_user(u, bind_virq.port, false); if (rc == 0) rc = bind_virq.port; break; @@ -482,7 +482,7 @@ static long evtchn_ioctl(struct file *file, 
if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_interdomain.local_port); + rc = evtchn_bind_to_user(u, bind_interdomain.local_port, false);
Re: [PATCH v3] xen/arm: pci: fix check in pci_check_bar()
Hi Stewart, > On 12 Jul 2023, at 2:52 pm, Stewart Hildebrand > wrote: > > When mapping BARs for vPCI, it's valid for a BAR mfn_t start to equal the BAR > mfn_t end (i.e. start == end) since end is inclusive. However, pci_check_bar() > currently returns false in this case, which results in Xen not mapping the BAR > in the guest 2nd stage page tables. In this example boot log, Linux has mapped > the BARs in the 1st stage, but since Xen did not map them in the 2nd stage, > Linux encounters a data abort and panics: > > [2.593300] pci :00:00.0: BAR 0: assigned [mem 0x50008000-0x50008fff] > [2.593682] pci :00:00.0: BAR 2: assigned [mem 0x50009000-0x50009fff] > [2.594066] pci :00:00.0: BAR 4: assigned [mem 0x5000a000-0x5000afff] > ... > [2.810502] virtio-pci :00:00.0: enabling device ( -> 0002) > (XEN) :00:00.0: not mapping BAR [50008, 50008] invalid position > (XEN) :00:00.0: not mapping BAR [50009, 50009] invalid position > (XEN) :00:00.0: not mapping BAR [5000a, 5000a] invalid position > [2.817502] virtio-pci :00:00.0: virtio_pci: leaving for legacy driver > [2.817853] virtio-pci :00:00.0: enabling bus mastering > (XEN) arch/arm/traps.c:1992:d0v0 HSR=0x0093010045 pc=0x889507d4 > gva=0x8c46d012 gpa=0x0050008012 > [2.818397] Unable to handle kernel ttbr address size fault at virtual > address 8c46d012 > ... > > Adjust the end physical address e to account for the full page when converting > from mfn, at which point s and e cannot be equal, so drop the equality check > in > the condition. > > Note that adjusting e to account for the full page also increases the accuracy > of the subsequent is_bar_valid check. > > Fixes: cc80e2bab0d0 ("xen/pci: replace call to is_memory_hole to > pci_check_bar") > Signed-off-by: Stewart Hildebrand > Reviewed-by: Roger Pau Monné I tested the patch on N1SDP board everything works. Reviewed-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul
Re: [PATCH v2] xen/evtchn: Introduce new IOCTL to bind static evtchn
Hi , > On 9 Jul 2023, at 1:10 am, Stefano Stabellini wrote: > > On Fri, 30 Jun 2023, Rahul Singh wrote: >> Hi Oleksandr, >> >> Thanks for reviewing the code. >> >>> On 29 Jun 2023, at 7:06 pm, Oleksandr Tyshchenko >>> wrote: >>> >>> >>> >>> On 29.06.23 18:46, Rahul Singh wrote: >>> >>> Hello Rahul >>> >>> >>>> Xen 4.17 supports the creation of static evtchns. To allow user space >>>> application to bind static evtchns introduce new ioctl >>>> "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding >>>> that’s why we need to introduce the new IOCTL to only bind the static >>>> event channels. >>>> >>>> Also, static evtchns to be available for use during the lifetime of the >>>> guest. When the application exits, __unbind_from_irq() ends up being >>>> called from release() file operations because of that static evtchns >>>> are getting closed. To avoid closing the static event channel, add the >>>> new bool variable "is_static" in "struct irq_info" to mark the event >>>> channel static when creating the event channel to avoid closing the >>>> static evtchn. >>>> >>>> Signed-off-by: Rahul Singh >>>> --- >>>> v2: >>>> * Use bool in place u8 to define is_static variable. >>>> * Avoid closing the static evtchns in error path. >>> >>> >>> Patch looks good to me, just a nit (question) below. >>> >>> >>>> --- >>>> drivers/xen/events/events_base.c | 7 +-- >>>> drivers/xen/evtchn.c | 30 ++ >>>> include/uapi/xen/evtchn.h| 9 + >>>> include/xen/events.h | 2 +- >>>> 4 files changed, 37 insertions(+), 11 deletions(-) >>>> >>>> diff --git a/drivers/xen/events/events_base.c >>>> b/drivers/xen/events/events_base.c >>>> index c7715f8bd452..5d3b5c7cfe64 100644 >>>> --- a/drivers/xen/events/events_base.c >>>> +++ b/drivers/xen/events/events_base.c >>>> @@ -112,6 +112,7 @@ struct irq_info { >>>> unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ >>>> u64 eoi_time; /* Time in jiffies when to EOI. 
*/ >>>> raw_spinlock_t lock; >>>> + bool is_static; /* Is event channel static */ >>>> >>>> union { >>>> unsigned short virq; >>>> @@ -982,7 +983,8 @@ static void __unbind_from_irq(unsigned int irq) >>>> unsigned int cpu = cpu_from_irq(irq); >>>> struct xenbus_device *dev; >>>> >>>> - xen_evtchn_close(evtchn); >>>> + if (!info->is_static) >>>> + xen_evtchn_close(evtchn); >>>> >>>> switch (type_from_irq(irq)) { >>>> case IRQT_VIRQ: >>>> @@ -1574,7 +1576,7 @@ int xen_set_irq_priority(unsigned irq, unsigned >>>> priority) >>>> } >>>> EXPORT_SYMBOL_GPL(xen_set_irq_priority); >>>> >>>> -int evtchn_make_refcounted(evtchn_port_t evtchn) >>>> +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static) >>>> { >>>> int irq = get_evtchn_to_irq(evtchn); >>>> struct irq_info *info; >>>> @@ -1590,6 +1592,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn) >>>> WARN_ON(info->refcnt != -1); >>>> >>>> info->refcnt = 1; >>>> + info->is_static = is_static; >>>> >>>> return 0; >>>> } >>>> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c >>>> index c99415a70051..e6d2303478b2 100644 >>>> --- a/drivers/xen/evtchn.c >>>> +++ b/drivers/xen/evtchn.c >>>> @@ -366,7 +366,8 @@ static int evtchn_resize_ring(struct per_user_data *u) >>>> return 0; >>>> } >>>> >>>> -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t >>>> port) >>>> +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t >>>> port, >>>> + bool is_static) >>>> { >>>> struct user_evtchn *evtchn; >>>> struct evtchn_close close; >>>> @@ -402,14 +403,16 @@ static int evtchn_bind_to_user(struct per_user_data >>>> *u, evtchn_port_t port) >>>> if (rc < 0) >>>> goto err; >>>> >>>> - rc = evtchn_make_refcounted(port); >>>> + rc = evtchn_make_refcounted(port, is_static); >>>> return rc; >>>> >>>> err: >>>> /* bind failed, should close the port now */ >>>> - close.port = port; >>>> - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) >>>> - BUG(); >>>> + if (!is_static) { >>> >>> 
>>> I think now "struct evtchn_close close;" can be placed here as it is not >>> used outside of this block. >>> >>> Also this block looks like an open-coded version of xen_evtchn_close() >>> defined at events_base.c, so maybe it is worth making xen_evtchn_close() >>> static inline and placing it into events.h, then calling helper here? >>> Please note, I will be ok either way. >> >> Make sense. I will modify the patch as per your request in the next version. >> I will wait for other maintainers to review the patch before sending the >> next version. > > I don't have any further comments. Thanks for the update. I will send the next version. Regards, Rahul
Re: [PATCH v2 3/3] [FUTURE] xen/arm: enable vPCI for domUs
Hi Stewart, > On 7 Jul 2023, at 2:47 am, Stewart Hildebrand > wrote: > > Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT > in Kconfig. > > [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html > > Signed-off-by: Stewart Hildebrand > --- > As the tag implies, this patch is not intended to be merged (yet). > > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream > code base. It will be used by the vPCI series [1]. This patch is intended to > be > merged as part of the vPCI series. > > v1->v2: > * new patch > --- > xen/arch/arm/Kconfig | 1 + > xen/arch/arm/include/asm/domain.h | 2 +- > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > index 4e0cc421ad48..75dfa2f5a82d 100644 > --- a/xen/arch/arm/Kconfig > +++ b/xen/arch/arm/Kconfig > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH > depends on ARM_64 > select HAS_PCI > select HAS_VPCI > + select HAS_VPCI_GUEST_SUPPORT

I tested this series on top of the "SMMU handling for PCIe Passthrough on ARM" series on the N1SDP board and observed an SMMUv3 fault. I enabled the Kconfig options PCI_PASSTHROUGH, ARM_SMMU_V3 and HAS_ITS together with the "iommu=on" and "pci_passthrough_enabled=on" command line parameters; after that, there is an SMMU fault for the ITS doorbell register access from the PCI devices.

The fault is observed because there is no upstream support for vPCI MSI/MSI-X handling on Arm. The Linux kernel programs the ITS doorbell register (the physical address of the doorbell register, since the IOMMU is not enabled in the kernel) into the PCI config space to set up the MSI-X interrupts, but there is no mapping for that address in the SMMU page tables, hence the fault. To fix this, we need to map the ITS doorbell register in the SMMU page tables. We can do that by setting up the mapping for the ITS doorbell offset in the ITS code.
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 299b384250..8227a7a74b 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -682,6 +682,18 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
                            BIT(size, UL), valid);
     if ( ret && valid )
         return ret;
+
+    if ( is_iommu_enabled(its->d) )
+    {
+        ret = map_mmio_regions(its->d, gaddr_to_gfn(its->doorbell_address),
+                               PFN_UP(ITS_DOORBELL_OFFSET),
+                               maddr_to_mfn(its->doorbell_address));
+        if ( ret < 0 )
+        {
+            printk(XENLOG_ERR "GICv3: Map ITS translation register d%d failed.\n",
+                   its->d->domain_id);
+            return ret;
+        }
+    }
 }

Also, as per Julien's request, I tried to set up the IOMMU for the PCI device without "pci_passthrough_enabled=on" and without HAS_VPCI, and everything works as expected after applying the patches below. To test, enable the Kconfig options HAS_PCI, ARM_SMMU_V3 and HAS_ITS and add the below patches to make it work.
• Set the mapping for the ITS doorbell offset in the ITS code when the IOMMU is enabled.
• Revert the patch that added the support for pci_passthrough_on.
• Allow MMIO mapping of the ECAM space to dom0 when vPCI is not enabled; as of now, the MMIO mapping for the ECAM is gated on pci_passthrough_enabled. We need this patch if we want to avoid enabling HAS_VPCI.
Please find the attached patches in case you want to test at your end.
Regards, Rahul > default n > help > This option enables PCI device passthrough > diff --git a/xen/arch/arm/include/asm/domain.h > b/xen/arch/arm/include/asm/domain.h > index 1a13965a26b8..6e016b00bae1 100644 > --- a/xen/arch/arm/include/asm/domain.h > +++ b/xen/arch/arm/include/asm/domain.h > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {} > > #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag) > > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); > }) > +#define has_vpci(d)({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); }) > > struct arch_vcpu_io { > struct instr_details dabt_instr; /* when the instruction is decoded */ > -- > 2.41.0 > > 0001-Revert-xen-arm-Add-cmdline-boot-option-pci-passthrou.patch Description: 0001-Revert-xen-arm-Add-cmdline-boot-option-pci-passthrou.patch 0002-xen-arm-Fix-mapping-for-PCI-bridge-mmio-region.patch Description: 0002-xen-arm-Fix-mapping-for-PCI-bridge-mmio-region.patch 0003-xen-arm-Map-ITS-dorrbell-register-to-IOMMU-page-tabl.patch Description: 0003-xen-arm-Map-ITS-dorrbell-register-to-IOMMU-page-tabl.patch
Re: [XEN PATCH v3 2/3] xen/drivers/passthrough/arm/smmu-v3.c: fix violations of MISRA C:2012 Rule 3.1
Hi, > On 4 Jul 2023, at 4:51 pm, Jan Beulich wrote: > > On 29.06.2023 16:52, Luca Fancellu wrote: >> >> >>> On 29 Jun 2023, at 11:06, Nicola Vetrini wrote: >>> >>> In the file `xen/drivers/passthrough/arm/smmu-v3.c' there are a few >>> occurrences >> >> here you use a different character to enclose the file path (` vs ‘) may I >> suggest to >> use only (‘)? >> >>> of nested '//' character sequences inside C-style comment blocks, which >>> violate >>> Rule 3.1. >>> >>> The patch aims to resolve those by replacing the nested comments with >>> equivalent constructs that do not violate the rule. >>> >>> Signed-off-by: Nicola Vetrini >> >> You are missing the “---“ here, meaning that the lines below are part of the >> commit message and I’m sure you don’t want that. >> >> Also here, may I suggest to use this commit title instead? >> “xen/arm: smmuv3: Fix violations of MISRA C:2012 Rule 3.1” > > Just to mention it: Personally I'm averse to such double subject prefixes. > Why would (here) "xen/smmuv3: " not be sufficient (and entirely unambiguous)? With the changes suggested above. Acked-by: Rahul Singh Regards, Rahul
Re: [PATCH v2] xen/evtchn: Introduce new IOCTL to bind static evtchn
Hi Oleksandr, Thanks for reviewing the code. > On 29 Jun 2023, at 7:06 pm, Oleksandr Tyshchenko > wrote: > > > > On 29.06.23 18:46, Rahul Singh wrote: > > Hello Rahul > > >> Xen 4.17 supports the creation of static evtchns. To allow user space >> application to bind static evtchns introduce new ioctl >> "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding >> that’s why we need to introduce the new IOCTL to only bind the static >> event channels. >> >> Also, static evtchns to be available for use during the lifetime of the >> guest. When the application exits, __unbind_from_irq() ends up being >> called from release() file operations because of that static evtchns >> are getting closed. To avoid closing the static event channel, add the >> new bool variable "is_static" in "struct irq_info" to mark the event >> channel static when creating the event channel to avoid closing the >> static evtchn. >> >> Signed-off-by: Rahul Singh >> --- >> v2: >> * Use bool in place u8 to define is_static variable. >> * Avoid closing the static evtchns in error path. > > > Patch looks good to me, just a nit (question) below. > > >> --- >> drivers/xen/events/events_base.c | 7 +-- >> drivers/xen/evtchn.c | 30 ++ >> include/uapi/xen/evtchn.h| 9 + >> include/xen/events.h | 2 +- >> 4 files changed, 37 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/xen/events/events_base.c >> b/drivers/xen/events/events_base.c >> index c7715f8bd452..5d3b5c7cfe64 100644 >> --- a/drivers/xen/events/events_base.c >> +++ b/drivers/xen/events/events_base.c >> @@ -112,6 +112,7 @@ struct irq_info { >> unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ >> u64 eoi_time; /* Time in jiffies when to EOI. 
*/ >> raw_spinlock_t lock; >> + bool is_static; /* Is event channel static */ >> >> union { >> unsigned short virq; >> @@ -982,7 +983,8 @@ static void __unbind_from_irq(unsigned int irq) >> unsigned int cpu = cpu_from_irq(irq); >> struct xenbus_device *dev; >> >> - xen_evtchn_close(evtchn); >> + if (!info->is_static) >> + xen_evtchn_close(evtchn); >> >> switch (type_from_irq(irq)) { >> case IRQT_VIRQ: >> @@ -1574,7 +1576,7 @@ int xen_set_irq_priority(unsigned irq, unsigned >> priority) >> } >> EXPORT_SYMBOL_GPL(xen_set_irq_priority); >> >> -int evtchn_make_refcounted(evtchn_port_t evtchn) >> +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static) >> { >> int irq = get_evtchn_to_irq(evtchn); >> struct irq_info *info; >> @@ -1590,6 +1592,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn) >> WARN_ON(info->refcnt != -1); >> >> info->refcnt = 1; >> + info->is_static = is_static; >> >> return 0; >> } >> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c >> index c99415a70051..e6d2303478b2 100644 >> --- a/drivers/xen/evtchn.c >> +++ b/drivers/xen/evtchn.c >> @@ -366,7 +366,8 @@ static int evtchn_resize_ring(struct per_user_data *u) >> return 0; >> } >> >> -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) >> +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port, >> + bool is_static) >> { >> struct user_evtchn *evtchn; >> struct evtchn_close close; >> @@ -402,14 +403,16 @@ static int evtchn_bind_to_user(struct per_user_data >> *u, evtchn_port_t port) >> if (rc < 0) >> goto err; >> >> - rc = evtchn_make_refcounted(port); >> + rc = evtchn_make_refcounted(port, is_static); >> return rc; >> >> err: >> /* bind failed, should close the port now */ >> - close.port = port; >> - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) >> - BUG(); >> + if (!is_static) { > > > I think now "struct evtchn_close close;" can be placed here as it is not > used outside of this block. 
> > Also this block looks like an open-coded version of xen_evtchn_close() > defined at events_base.c, so maybe it is worth making xen_evtchn_close() > static inline and placing it into events.h, then calling helper here? > Please note, I will be ok either way. Make sense. I will modify the patch as per your request in the next version. I will wait for other maintainers to review the patch before sending the next version. Regards, Rahul
[PATCH v2] xen/evtchn: Introduce new IOCTL to bind static evtchn
Xen 4.17 supports the creation of static evtchns. To allow a user space application to bind static evtchns, introduce a new ioctl "IOCTL_EVTCHN_BIND_STATIC". The existing IOCTLs do more than binding, which is why we need a new IOCTL that only binds the static event channels.

Also, static evtchns have to be available for use during the lifetime of the guest. When the application exits, __unbind_from_irq() ends up being called from the release() file operation, and as a result the static evtchns get closed. To avoid this, add a new bool field "is_static" in "struct irq_info" that marks the event channel as static when it is created, so that a static evtchn is not closed.

Signed-off-by: Rahul Singh --- v2: * Use bool in place of u8 to define the is_static variable. * Avoid closing the static evtchns in the error path. --- drivers/xen/events/events_base.c | 7 +-- drivers/xen/evtchn.c | 30 ++ include/uapi/xen/evtchn.h| 9 + include/xen/events.h | 2 +- 4 files changed, 37 insertions(+), 11 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c7715f8bd452..5d3b5c7cfe64 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -112,6 +112,7 @@ struct irq_info { unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI.
*/ raw_spinlock_t lock; + bool is_static; /* Is event channel static */ union { unsigned short virq; @@ -982,7 +983,8 @@ static void __unbind_from_irq(unsigned int irq) unsigned int cpu = cpu_from_irq(irq); struct xenbus_device *dev; - xen_evtchn_close(evtchn); + if (!info->is_static) + xen_evtchn_close(evtchn); switch (type_from_irq(irq)) { case IRQT_VIRQ: @@ -1574,7 +1576,7 @@ int xen_set_irq_priority(unsigned irq, unsigned priority) } EXPORT_SYMBOL_GPL(xen_set_irq_priority); -int evtchn_make_refcounted(evtchn_port_t evtchn) +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static) { int irq = get_evtchn_to_irq(evtchn); struct irq_info *info; @@ -1590,6 +1592,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn) WARN_ON(info->refcnt != -1); info->refcnt = 1; + info->is_static = is_static; return 0; } diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index c99415a70051..e6d2303478b2 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -366,7 +366,8 @@ static int evtchn_resize_ring(struct per_user_data *u) return 0; } -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port, + bool is_static) { struct user_evtchn *evtchn; struct evtchn_close close; @@ -402,14 +403,16 @@ static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) if (rc < 0) goto err; - rc = evtchn_make_refcounted(port); + rc = evtchn_make_refcounted(port, is_static); return rc; err: /* bind failed, should close the port now */ - close.port = port; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); + if (!is_static) { + close.port = port; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + } del_evtchn(u, evtchn); return rc; } @@ -456,7 +459,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_virq.port); + rc = evtchn_bind_to_user(u, bind_virq.port, false); if (rc == 
0) rc = bind_virq.port; break; @@ -482,7 +485,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_interdomain.local_port); + rc = evtchn_bind_to_user(u, bind_interdomain.local_port, false); if (rc == 0) rc = bind_interdomain.local_port; break; @@ -507,7 +510,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, alloc_unbound.port); + rc = evtchn_bind_to_user(u, alloc_unbound.port, false); if (rc == 0) rc = alloc_unbound.port; break; @@ -536,6 +539,17 @@ static long evtchn_ioctl(struct file *file, bre
Re: [PATCH v4 0/7] SMMU handling for PCIe Passthrough on ARM
Hi Julien, > On 25 Jun 2023, at 1:56 pm, Julien Grall wrote: > > Hi, > > On 15/06/2023 22:05, Stewart Hildebrand wrote: >> On 6/7/23 03:19, Julien Grall wrote: >>> On 07/06/2023 04:02, Stewart Hildebrand wrote: This series introduces SMMU handling for PCIe passthrough on ARM. These patches are independent from (and don't depend on) the vPCI reference counting/locking work in progress, and should be able to be upstreamed independently. >>> >>> Can you clarify how this code was tested? Does this require code not yet >>> upstreamed? >> I'm testing the series standalone (+ config changes) by using a PCI device >> in dom0, and also in combination with the vPCI series [3] [4] for >> passthrough to a domU. >> Here are some more details on the test cases I'm using. > > Thanks that's helpful! One comment about the first test case. > >> 1. Using the PCI device in dom0 with the pci-passthrough=yes arg. In this >> case a couple of additional config changes [1] [2] are needed to select >> CONFIG_HAS_PCI=y, CONFIG_HAS_VPCI=y, and make has_vpci() return true. Aside >> from this series itself and the config changes, nothing else >> not-yet-upstreamed is required for this test case. It is on my TODO list to >> upstream these config changes, which I think will be useful on their own, >> not necessarily as part of any other series. > > I find a bit confusing that the IOMMU support for dom0 is gated behind > 'pci-passthrough'. I was expecting that the iommu would also be properly > configured for PCI if we using 'iommu=yes'. As per my understanding Xen can configure the iommus for PCI device without "pci-passthrough” enabled if we follow below path: 1) PCI host bridge is already enumerated and powered on in firmware before Xen boot 2) PCI devices are scanned in Xen. 
(https://gitlab.com/xen-project/people/bmarquis/xen-arm-poc/-/commit/bce463e1588a45e1bfdf59fc0d5f88b16604e439) 3) After scanning the PCI devices, add them to the IOMMU (iommu_add_device()). If the PCI host bridge is not enumerated, then we need "pci-passthrough" enabled to allow Linux to do the enumeration and to inform Xen via the PHYSDEVOP_pci_device_add hypercall to add the PCI devices in Xen. This is implemented as part of the PCI passthrough feature. Regards, Rahul
Re: [PATCH] pci: fix pci_get_pdev() to always account for the segment
Hi Roger, > On 18 May 2023, at 11:57 am, Roger Pau Monne wrote: > > When a domain parameter is provided to pci_get_pdev() the search > function would match against the bdf, without taking the segment into > account. > > Fix this and also account for the passed segment. > > Fixes: 8cf6e0738906 ('PCI: simplify (and thus correct) > pci_get_pdev{,_by_domain}()') > Signed-off-by: Roger Pau Monné I think the correct fixes tag is: Fixes: a37f9ea7a651 ("PCI: fold pci_get_pdev{,_by_domain}()") With that: Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH 2/2] xen/arm: smmuv3: Advertise coherent table walk if supported
Hi Michal, On 12 May 2023, at 3:35 pm, Michal Orzel wrote: At the moment, even in case of a SMMU being I/O coherent, we clean the updated PT as a result of not advertising the coherency feature. SMMUv3 coherency feature means that page table walks, accesses to memory structures and queues are I/O coherent (refer ARM IHI 0070 E.A, 3.15). Follow the same steps that were done for SMMU v1,v2 driver by the commit: 080dcb781e1bc3bb22f55a9dfdecb830ccbabe88 The same restrictions apply, meaning that in order to advertise coherent table walk platform feature, all the SMMU devices need to report coherency feature. This is because the page tables (we are sharing them with CPU) are populated before any device assignment and in case of a device being behind non-coherent SMMU, we would have to scan the tables and clean the cache. It is to be noted that the SBSA/BSA (refer ARM DEN0094C 1.0C, section D) requires that all SMMUv3 devices support I/O coherency. Signed-off-by: Michal Orzel Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH 2/2] xen/arm: smmuv3: Advertise coherent table walk if supported
Hi Michal, > On 12 May 2023, at 3:35 pm, Michal Orzel wrote: > > At the moment, even in case of a SMMU being I/O coherent, we clean the > updated PT as a result of not advertising the coherency feature. SMMUv3 > coherency feature means that page table walks, accesses to memory > structures and queues are I/O coherent (refer ARM IHI 0070 E.A, 3.15). > > Follow the same steps that were done for SMMU v1,v2 driver by the commit: > 080dcb781e1bc3bb22f55a9dfdecb830ccbabe88 > > The same restrictions apply, meaning that in order to advertise coherent > table walk platform feature, all the SMMU devices need to report coherency > feature. This is because the page tables (we are sharing them with CPU) > are populated before any device assignment and in case of a device being > behind non-coherent SMMU, we would have to scan the tables and clean > the cache. > > It is to be noted that the SBSA/BSA (refer ARM DEN0094C 1.0C, section D) > requires that all SMMUv3 devices support I/O coherency. > > Signed-off-by: Michal Orzel > --- > There are very few platforms out there with SMMUv3 but I have never seen > a SMMUv3 that is not I/O coherent. > --- > xen/drivers/passthrough/arm/smmu-v3.c | 24 +++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/xen/drivers/passthrough/arm/smmu-v3.c > b/xen/drivers/passthrough/arm/smmu-v3.c > index bf053cdb6d5c..2adaad0fa038 100644 > --- a/xen/drivers/passthrough/arm/smmu-v3.c > +++ b/xen/drivers/passthrough/arm/smmu-v3.c > @@ -2526,6 +2526,15 @@ static const struct dt_device_match > arm_smmu_of_match[] = { > }; > > /* Start of Xen specific code. */ > + > +/* > + * Platform features. It indicates the list of features supported by all > + * SMMUs. Actually we only care about coherent table walk, which in case of > + * SMMUv3 is implied by the overall coherency feature (refer ARM IHI 0070 > E.A, > + * section 3.15 and SMMU_IDR0.COHACC bit description). 
> + */ > +static uint32_t platform_features = ARM_SMMU_FEAT_COHERENCY; > + > static int __must_check arm_smmu_iotlb_flush_all(struct domain *d) > { > struct arm_smmu_xen_domain *xen_domain = dom_iommu(d)->arch.priv; > @@ -2708,8 +2717,12 @@ static int arm_smmu_iommu_xen_domain_init(struct > domain *d) > INIT_LIST_HEAD(&xen_domain->contexts); > > dom_iommu(d)->arch.priv = xen_domain; > - return 0; > > + /* Coherent walk can be enabled only when all SMMUs support it. */ > + if (platform_features & ARM_SMMU_FEAT_COHERENCY) > + iommu_set_feature(d, IOMMU_FEAT_COHERENT_WALK); > + > + return 0; > } > > static void arm_smmu_iommu_xen_domain_teardown(struct domain *d) > @@ -2738,6 +2751,7 @@ static __init int arm_smmu_dt_init(struct > dt_device_node *dev, > const void *data) > { > int rc; > + const struct arm_smmu_device *smmu; > > /* >* Even if the device can't be initialized, we don't want to > @@ -2751,6 +2765,14 @@ static __init int arm_smmu_dt_init(struct > dt_device_node *dev, > > iommu_set_ops(&arm_smmu_iommu_ops); > > + /* Find the just added SMMU and retrieve its features. */ > + smmu = arm_smmu_get_by_dev(dt_to_dev(dev)); > + > + /* It would be a bug not to find the SMMU we just added. */ > + BUG_ON(!smmu); > + > + platform_features &= smmu->features; > + > return 0; > } > > -- > 2.25.1 >
Re: [PATCH 1/2] xen/arm: smmuv3: Constify arm_smmu_get_by_dev() parameter
Hi Michal, > On 12 May 2023, at 3:35 pm, Michal Orzel wrote: > > This function does not modify its parameter 'dev' and it is not supposed > to do it. Therefore, constify it. > > Signed-off-by: Michal Orzel Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH] xen/evtchn: Introduce new IOCTL to bind static evtchn
Hi Stefano, Thanks for the review.

On 6 May 2023, at 1:52 am, Stefano Stabellini wrote: On Fri, 28 Apr 2023, Rahul Singh wrote: Xen 4.17 supports the creation of static evtchns. To allow user space application to bind static evtchns introduce new ioctl "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding that’s why we need to introduce the new IOCTL to only bind the static event channels. Also, static evtchns to be available for use during the lifetime of the guest. When the application exits, __unbind_from_irq() end up being called from release() fop because of that static evtchns are getting closed. To avoid closing the static event channel, add the new bool variable "is_static" in "struct irq_info" to mark the event channel static when creating the event channel to avoid closing the static evtchn. Signed-off-by: Rahul Singh I think the patch is OK but evtchn_bind_to_user on the error path calls EVTCHNOP_close. Could that be a problem for static evtchns? I wonder if we need to skip that EVTCHNOP_close call too. err: /* bind failed, should close the port now */ close.port = port; if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) BUG(); del_evtchn(u, evtchn);

Yes, we need to avoid closing the static event channel in the error path as well. I will fix this in the next version. Regards, Rahul
Re: [XEN v6 04/12] xen/arm: smmu: Use writeq_relaxed_non_atomic() for writing to SMMU_CBn_TTBR0
Hi Ayan, On 4 May 2023, at 8:37 am, Michal Orzel wrote: On 28/04/2023 19:55, Ayan Kumar Halder wrote: Refer ARM IHI 0062D.c ID070116 (SMMU 2.0 spec), 17-360, 17.3.9, SMMU_CBn_TTBR0 is a 64 bit register. Thus, one can use writeq_relaxed_non_atomic() to write to it instead of invoking writel_relaxed() twice for lower half and upper half of the register. This also helps us as p2maddr is 'paddr_t' (which may be u32 in future). Thus, one can assign p2maddr to a 64 bit register and do the bit manipulations on it, to generate the value for SMMU_CBn_TTBR0. Signed-off-by: Ayan Kumar Halder Reviewed-by: Stefano Stabellini --- Changes from - v1 - 1. Extracted the patch from "[XEN v1 8/9] xen/arm: Other adaptations required to support 32bit paddr". Use writeq_relaxed_non_atomic() to write u64 register in a non-atomic fashion. v2 - 1. Added R-b. v3 - 1. No changes. v4 - 1. Reordered the R-b. No further changes. (This patch can be committed independent of the series). v5 - Used 'uint64_t' instead of u64. As the change looked trivial to me, I retained the R-b. 
 xen/drivers/passthrough/arm/smmu.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 79281075ba..fb8bef5f69 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -499,8 +499,7 @@ enum arm_smmu_s2cr_privcfg {
 #define ARM_SMMU_CB_SCTLR 0x0
 #define ARM_SMMU_CB_RESUME 0x8
 #define ARM_SMMU_CB_TTBCR2 0x10
-#define ARM_SMMU_CB_TTBR0_LO 0x20
-#define ARM_SMMU_CB_TTBR0_HI 0x24
+#define ARM_SMMU_CB_TTBR0 0x20
 #define ARM_SMMU_CB_TTBCR 0x30
 #define ARM_SMMU_CB_S1_MAIR0 0x38
 #define ARM_SMMU_CB_FSR 0x58
@@ -1083,6 +1082,7 @@ static void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr,
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 {
     u32 reg;
+    uint64_t reg64;
     bool stage1;
     struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
     struct arm_smmu_device *smmu = smmu_domain->smmu;
@@ -1177,12 +1177,13 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
     dev_notice(smmu->dev, "d%u: p2maddr 0x%"PRIpaddr"\n",
                smmu_domain->cfg.domain->domain_id, p2maddr);
-    reg = (p2maddr & ((1ULL << 32) - 1));
-    writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
-    reg = (p2maddr >> 32);
+    reg64 = p2maddr;
+
     if (stage1)
-        reg |= ARM_SMMU_CB_ASID(cfg) << TTBRn_HI_ASID_SHIFT;
-    writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_HI);
+        reg64 |= (((uint64_t) (ARM_SMMU_CB_ASID(cfg) << TTBRn_HI_ASID_SHIFT))
+< < 32);

I think << should be aligned to the second '(' above.

Reviewed-by: Michal Orzel

With the Michal comment fixed. Reviewed-by: Rahul Singh

Regards, Rahul
Re: [PATCH] xen/arm: pci: fix -Wtype-limits warning in pci-host-common.c
Hi Stewart, On 3 May 2023, at 8:18 pm, Stewart Hildebrand wrote: When building with EXTRA_CFLAGS_XEN_CORE="-Wtype-limits", we observe the following warning: arch/arm/pci/pci-host-common.c: In function ‘pci_host_common_probe’: arch/arm/pci/pci-host-common.c:238:26: warning: comparison is always false due to limited range of data type [-Wtype-limits] 238 | if ( bridge->segment < 0 ) | ^ This is due to bridge->segment being an unsigned type. Fix it by introducing a new variable of signed type to use in the condition. Signed-off-by: Stewart Hildebrand Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH] xen/evtchn: Introduce new IOCTL to bind static evtchn
Hi Ayan, On 28 Apr 2023, at 2:30 pm, Ayan Kumar Halder wrote: Hi Rahul, On 28/04/2023 13:36, Rahul Singh wrote: CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email. Xen 4.17 supports the creation of static evtchns. To allow user space application to bind static evtchns introduce new ioctl "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding that’s why we need to introduce the new IOCTL to only bind the static event channels. Also, static evtchns to be available for use during the lifetime of the guest. When the application exits, __unbind_from_irq() end up being called from release() fop because of that static evtchns are getting closed. To avoid closing the static event channel, add the new bool variable "is_static" in "struct irq_info" to mark the event channel static when creating the event channel to avoid closing the static evtchn. Signed-off-by: Rahul Singh --- drivers/xen/events/events_base.c | 7 +-- drivers/xen/evtchn.c | 22 +- include/uapi/xen/evtchn.h| 9 + include/xen/events.h | 2 +- 4 files changed, 32 insertions(+), 8 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c7715f8bd452..31f2d3634ad5 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -112,6 +112,7 @@ struct irq_info { unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI. */ raw_spinlock_t lock; + u8 is_static; /* Is event channel static */ I think we should avoid u8/u16/u32 and instead use uint8_t/uint16_t/uint32_t. However in this case, you can use bool. Make sense. I will change to bool in next patch. Regards, Rahul
[PATCH] xen/evtchn: Introduce new IOCTL to bind static evtchn
Xen 4.17 supports the creation of static evtchns. To allow a user space application to bind static evtchns, introduce a new ioctl "IOCTL_EVTCHN_BIND_STATIC". The existing IOCTLs do more than binding, which is why we need a new IOCTL that only binds the static event channels.

Also, static evtchns have to be available for use during the lifetime of the guest. When the application exits, __unbind_from_irq() ends up being called from the release() file operation, and as a result the static evtchns get closed. To avoid this, add a new bool field "is_static" in "struct irq_info" that marks the event channel as static when it is created, so that a static evtchn is not closed.

Signed-off-by: Rahul Singh --- drivers/xen/events/events_base.c | 7 +-- drivers/xen/evtchn.c | 22 +- include/uapi/xen/evtchn.h| 9 + include/xen/events.h | 2 +- 4 files changed, 32 insertions(+), 8 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c7715f8bd452..31f2d3634ad5 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -112,6 +112,7 @@ struct irq_info { unsigned int irq_epoch: /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI.
*/ raw_spinlock_t lock; + u8 is_static; /* Is event channel static */ union { unsigned short virq; @@ -982,7 +983,8 @@ static void __unbind_from_irq(unsigned int irq) unsigned int cpu = cpu_from_irq(irq); struct xenbus_device *dev; - xen_evtchn_close(evtchn); + if (!info->is_static) + xen_evtchn_close(evtchn); switch (type_from_irq(irq)) { case IRQT_VIRQ: @@ -1574,7 +1576,7 @@ int xen_set_irq_priority(unsigned irq, unsigned priority) } EXPORT_SYMBOL_GPL(xen_set_irq_priority); -int evtchn_make_refcounted(evtchn_port_t evtchn) +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static) { int irq = get_evtchn_to_irq(evtchn); struct irq_info *info; @@ -1590,6 +1592,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn) WARN_ON(info->refcnt != -1); info->refcnt = 1; + info->is_static = is_static; return 0; } diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index c99415a70051..47681d4c696b 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -366,7 +366,8 @@ static int evtchn_resize_ring(struct per_user_data *u) return 0; } -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port, + bool is_static) { struct user_evtchn *evtchn; struct evtchn_close close; @@ -402,7 +403,7 @@ static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port) if (rc < 0) goto err; - rc = evtchn_make_refcounted(port); + rc = evtchn_make_refcounted(port, is_static); return rc; err: @@ -456,7 +457,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_virq.port); + rc = evtchn_bind_to_user(u, bind_virq.port, false); if (rc == 0) rc = bind_virq.port; break; @@ -482,7 +483,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, bind_interdomain.local_port); + rc = evtchn_bind_to_user(u, bind_interdomain.local_port, false); if (rc == 0) rc = bind_interdomain.local_port; break; @@ 
-507,7 +508,7 @@ static long evtchn_ioctl(struct file *file, if (rc != 0) break; - rc = evtchn_bind_to_user(u, alloc_unbound.port); + rc = evtchn_bind_to_user(u, alloc_unbound.port, false); if (rc == 0) rc = alloc_unbound.port; break; @@ -536,6 +537,17 @@ static long evtchn_ioctl(struct file *file, break; } + case IOCTL_EVTCHN_BIND_STATIC: { + struct ioctl_evtchn_bind bind; + + rc = -EFAULT; + if (copy_from_user(&bind, uarg, sizeof(bind))) + break; + + rc = evtchn_bind_to_user(u, bind.port, true); + break; + } + case IOCTL_EVTCHN_NOTIFY: { struct ioctl_evtchn_notify notify; struct user_evtchn *evtchn; diff --git a/include/uapi/xen/evtchn.h b/include/uapi/xen/evtchn.h index 7fbf732f168f..aef2b75f3413 100644 -
Re: [PATCH v2] xen/arm: smmuv3: mark arm_smmu_disable_pasid __maybe_unused
Hi Stewart, > On 15 Dec 2022, at 3:10 pm, Stewart Hildebrand > wrote: > > On 12/15/22 09:51, Julien Grall wrote: >> Hi Stewart, >> >> On 15/12/2022 14:11, Stewart Hildebrand wrote: >>> On 12/15/22 06:34, Julien Grall wrote: Hi Stewart, I was about to commit this patch when I noticed the placement of the attribute doesn't match what we are usually doing in Xen. On 13/12/2022 18:18, Stewart Hildebrand wrote: > When building with clang 12 and CONFIG_ARM_SMMU_V3=y, we observe the > following build error: > > drivers/passthrough/arm/smmu-v3.c:1408:20: error: unused function > 'arm_smmu_disable_pasid' [-Werror,-Wunused-function] > static inline void arm_smmu_disable_pasid(struct arm_smmu_master *master) > { } > ^ > > arm_smmu_disable_pasid is not currently called from anywhere in Xen, but > it is inside a section of code guarded by CONFIG_PCI_ATS, which may be > helpful in the future if the PASID feature is to be implemented. Add the > attribute __maybe_unused to the function. > > Signed-off-by: Stewart Hildebrand > --- > v1->v2: > Add __maybe_unused attribute instead of removing > --- >xen/drivers/passthrough/arm/smmu-v3.c | 2 ++ >1 file changed, 2 insertions(+) > > diff --git a/xen/drivers/passthrough/arm/smmu-v3.c > b/xen/drivers/passthrough/arm/smmu-v3.c > index 9c9f4630090e..0cdc862f96d1 100644 > --- a/xen/drivers/passthrough/arm/smmu-v3.c > +++ b/xen/drivers/passthrough/arm/smmu-v3.c > @@ -1376,6 +1376,7 @@ static int arm_smmu_enable_pasid(struct > arm_smmu_master *master) >return 0; >} > > +__maybe_unused >static void arm_smmu_disable_pasid(struct arm_smmu_master *master) The attribute should be placed after "void". I.e.: static void __maybe_unused arm_smmu_disable_pasid(...) >>> >>> I had initially tried placing it exactly where you suggest in the first >>> draft of v2 of this patch. However, the line would then exceed 72 >>> characters (actual 81 characters): >> >> This doesn't change the problem here but the limit is 80 characters per >> line rather than 72. 
>> >>> >>> static void __maybe_unused arm_smmu_disable_pasid(struct arm_smmu_master >>> *master) >>> >>> So I found myself juggling with how best to wrap it. How about a newline >>> after the __maybe_unused attribute? >>> >>> static void __maybe_unused >>> arm_smmu_disable_pasid(struct arm_smmu_master *master) >>> >>> and similarly for the 2nd occurrence: >>> >>> static inline void __maybe_unused >>> arm_smmu_disable_pasid(struct arm_smmu_master *master) { } >>> >>> There is precedent for this style of wrapping in xen/common/sched/credit2.c. >> >> Ah! I didn't realize the line would have been too long. In this case, >> the newline after __maybe_unused is the way to go. > > Ok, I will send a v3 with this change. > > Rahul - may I retain your R-b tag in v3? Yes you can retain my R-b. Regards, Rahul
Re: [PATCH v2] xen/arm: smmuv3: mark arm_smmu_disable_pasid __maybe_unused
Hi Stewart, > On 13 Dec 2022, at 6:18 pm, Stewart Hildebrand > wrote: > > When building with clang 12 and CONFIG_ARM_SMMU_V3=y, we observe the > following build error: > > drivers/passthrough/arm/smmu-v3.c:1408:20: error: unused function > 'arm_smmu_disable_pasid' [-Werror,-Wunused-function] > static inline void arm_smmu_disable_pasid(struct arm_smmu_master *master) { } > ^ > > arm_smmu_disable_pasid is not currently called from anywhere in Xen, but > it is inside a section of code guarded by CONFIG_PCI_ATS, which may be > helpful in the future if the PASID feature is to be implemented. Add the > attribute __maybe_unused to the function. > > Signed-off-by: Stewart Hildebrand Reviewed-by: Rahul Singh Regards, Rahul
Re: [PATCH] xen/arm: smmuv3: remove unused function
Hi Julien, > On 12 Dec 2022, at 4:07 pm, Julien Grall wrote: > > Hi Stewart, > > On 12/12/2022 16:00, Stewart Hildebrand wrote: >> When building with clang 12 and CONFIG_ARM_SMMU_V3=y, we observe the >> following build error: >> drivers/passthrough/arm/smmu-v3.c:1408:20: error: unused function >> 'arm_smmu_disable_pasid' [-Werror,-Wunused-function] >> static inline void arm_smmu_disable_pasid(struct arm_smmu_master *master) { } >>^ >> Remove the function. >> Signed-off-by: Stewart Hildebrand >> --- >> There is also a definition of arm_smmu_disable_pasid() just above, >> guarded by #ifdef CONFIG_PCI_ATS. Should this one be removed too? It >> might be nice to keep this definition for ease of backporting patches >> from Linux, but if we ever plan on supporting PCI_ATS in Xen this may >> need to be re-visited. > > Given the function is not called at all, I think this is a bit odd to remove > the stub but leave the implementation when CONFIG_PCI_ATS is defined. > > Rahul, do you plan to use it in the PCI passthrough code? If yes, then I > would consider to use __maybe_unused. No, this function will not be used in the PCI passthrough code, but when we merged the SMMUv3 code from Linux we decided to keep this code gated by CONFIG_PCI_ATS so that anyone who wants to implement the PASID feature in the future can use these functions. I also agree with Julien; we should consider using __maybe_unused. Regards, Rahul
Re: [RFC PATCH 00/21] Add SMMUv3 Stage 1 Support for XEN guests
Hi Stefano, Julien, > On 5 Dec 2022, at 9:43 pm, Stefano Stabellini wrote: > > On Sat, 3 Dec 2022, Julien Grall wrote: >> On 01/12/2022 16:02, Rahul Singh wrote: >>> This patch series is sent as RFC to get the initial feedback from the >>> community. This patch series consists of 21 patches which is a big number >>> for >>> the reviewer to review the patches but to understand the feature end-to-end >>> we >>> thought of sending this as a big series. Once we will get initial feedback, >>> we >>> will divide the series into a small number of patches for review. >> >> From the cover letter, it is not clear to me what sort of input you are >> expecting for the RFC. Is this about the design itself? >> >> If so, I think it would be more helpful to write a high-level document on >> how >> you plan to emulate the vIOMMU in Xen. So there is one place to >> read/agree/verify rather than trying to collate all the information from the >> 20+ patches. >> >> Briefly skimming through I think the main things that need to be addressed in >> order of priority: >> - How to secure the vIOMMU >> - 1 vs multiple vIOMMU >> >> The questions are very similar to the vITS because the SMMUv3 is based on a >> queue. And given you are selling this feature as a security one, I don't >> think >> we can go forward with the review without any understanding/agreement on what >> needs to be implemented in order to have a safe/secure vIOMMU. > > I think we are all aligned here, but let me try to clarify further.
> > As the vIOMMU is exposed to the guest, and exposing a queue-based > interface to the guest is not simple, it would be good to clarify in a > document the following points: > > - how is the queue exposed to the guest > - how are guest inputs sanitized > - how do the virtual queue resources map to the physical queue > resources > - lifecycle of the resource mappings > - any memory allocations triggered by guest actions and their lifecycle > > It is difficult to extrapolate these details from 21 patches. Having these > key details written down in the 0/21 email would greatly help with the > review. It would make the review go a lot faster. Ack. I will send the design doc by next week; it will include all the requested information. Regards, Rahul
Re: [RFC PATCH 04/21] xen/arm: vIOMMU: add generic vIOMMU framework
Hi Julien, > On 5 Dec 2022, at 3:20 pm, Julien Grall wrote: > > On 05/12/2022 14:25, Michal Orzel wrote: > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h > index 1528ced509..33d32835e7 100644 > --- a/xen/include/public/arch-arm.h > +++ b/xen/include/public/arch-arm.h > @@ -297,10 +297,14 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t); > #define XEN_DOMCTL_CONFIG_TEE_NONE 0 > #define XEN_DOMCTL_CONFIG_TEE_OPTEE 1 > > +#define XEN_DOMCTL_CONFIG_VIOMMU_NONE 0 > + > struct xen_arch_domainconfig { > /* IN/OUT */ > uint8_t gic_version; > /* IN */ > +uint8_t viommu_type; this should be uint16_t and not uint8_t >>> >>> I will modify the viommu_type to uint16_t in >>> "arch/arm/include/asm/viommu.h" and will >>> also fix the viommu_type everywhere to uint16_t. >> Also I think that you need to bump XEN_DOMCTL_INTERFACE_VERSION due to the >> change >> in struct xen_arch_domainconfig. > > We only need to bump the domctl version once per release. So if this is the > first modification of domctl.h in 4.18 then yes. > > That said, I am not sure whether this is necessary here as you are using a > padding. > > @Rahul, BTW, I think you may need to regenerate the bindings for OCaml and Go. Ack. I will check this before sending the v2. Regards, Rahul
Re: [RFC PATCH 00/21] Add SMMUv3 Stage 1 Support for XEN guests
Hi Michal, > On 6 Dec 2022, at 9:33 am, Michal Orzel wrote: > > Hi Rahul, > > On 02/12/2022 11:59, Michal Orzel wrote: >> Hi Rahul, >> >> On 01/12/2022 17:02, Rahul Singh wrote: >>> >>> >>> The SMMUv3 supports two stages of translation. Each stage of translation >>> can be >>> independently enabled. An incoming address is logically translated from VA >>> to >>> IPA in stage 1, then the IPA is input to stage 2 which translates the IPA to >>> the output PA. >>> >>> Stage 1 is intended to be used by a software entity to provide isolation or >>> translation to buffers within the entity, for example DMA isolation within >>> an >>> OS. Stage 2 is intended to be available in systems supporting the >>> Virtualization Extensions and is intended to virtualize device DMA to guest >>> VM >>> address spaces. When both stage 1 and stage 2 are enabled, the translation >>> configuration is called nested. >>> >>> Stage 1 translation support is required to provide isolation between >>> different >>> devices within OS. XEN already supports Stage 2 translation but there is no >>> support for Stage 1 translation. The goal of this work is to support Stage 1 >>> translation for XEN guests. Stage 1 has to be configured within the guest to >>> provide isolation. >>> >>> We cannot trust the guest OS to control the SMMUv3 hardware directly as >>> compromised guest OS can corrupt the SMMUv3 configuration and make the >>> system >>> vulnerable. The guest gets the ownership of the stage 1 page tables and also >>> owns stage 1 configuration structures. The XEN handles the root >>> configuration >>> structure (for security reasons), including the stage 2 configuration. >>> >>> XEN will emulate the SMMUv3 hardware and exposes the virtual SMMUv3 to the >>> guest. Guest can use the native SMMUv3 driver to configure the stage 1 >>> translation. When the guest configures the SMMUv3 for Stage 1, XEN will trap >>> the access and configure hardware. 
>>> >>> SMMUv3 Driver(Guest OS) -> Configure the Stage-1 translation -> >>> XEN trap access -> XEN SMMUv3 driver configure the HW. >>> >>> SMMUv3 driver has to be updated to support the Stage-1 translation support >>> based on work done by the KVM team to support Nested Stage translation: >>> https://github.com/eauger/linux/commits/v5.11-stallv12-2stage-v14 >>> https://lwn.net/Articles/852299/ >>> >>> As the stage 1 translation is configured by XEN on behalf of the guest, >>> translation faults encountered during the translation process need to be >>> propagated up to the guest and re-injected into the guest. When the guest >>> invalidates stage 1 related caches, invalidations must be forwarded to the >>> SMMUv3 hardware. >>> >>> This patch series is sent as RFC to get the initial feedback from the >>> community. This patch series consists of 21 patches which is a big number >>> for >>> the reviewer to review the patches but to understand the feature end-to-end >>> we >>> thought of sending this as a big series. Once we will get initial feedback, >>> we >>> will divide the series into a small number of patches for review.
>> >> Due to the very limited availability of the board we have, that is equipped >> with >> DMA platform devices and SMMUv3 (I know that you tested PCI use case >> thoroughly), >> I managed for now to do the testing on dom0 only. >> >> By commenting out the code in Linux responsible for setting up Xen SWIOTLB >> DMA ops, I was able >> to successfully verify the nested SMMU working properly for DMA platform >> devices on the >> example of using ZDM
Re: [RFC PATCH 08/21] xen/arm: vsmmuv3: Add support for registers emulation
Hi Julien, > On 3 Dec 2022, at 9:16 pm, Julien Grall wrote: > > Hi Rahul, > > I have only skimmed through the patch so far. > > On 01/12/2022 16:02, Rahul Singh wrote: >> static int vsmmuv3_mmio_write(struct vcpu *v, mmio_info_t *info, >>register_t r, void *priv) >> { >> +struct virt_smmu *smmu = priv; >> +uint64_t reg; >> +uint32_t reg32; >> + >> +switch ( info->gpa & 0x ) >> +{ >> +case VREG32(ARM_SMMU_CR0): > > > Shouldn't this code (and all the other register emulations) be protected for > concurrent access in some way? Yes, I agree I will add the lock for register emulations in next v2. > > >> +reg32 = smmu->cr[0]; >> +vreg_reg32_update(®32, r, info); >> +smmu->cr[0] = reg32; >> +smmu->cr0ack = reg32 & ~CR0_RESERVED; > > Looking at the use. I think it doesn't look necessary to have a copy of cr0 > with just the reserved bit(s) unset. Instead, it would be better to clear the > bit(s) when reading it. Ack. Regards, Rahul
Re: [RFC PATCH 06/21] xen/domctl: Add XEN_DOMCTL_CONFIG_VIOMMU_* and viommu config param
Hi Jan, > On 2 Dec 2022, at 8:45 am, Jan Beulich wrote: > > On 01.12.2022 17:02, Rahul Singh wrote: >> Add new viommu_type field and field values XEN_DOMCTL_CONFIG_VIOMMU_NONE >> XEN_DOMCTL_CONFIG_VIOMMU_SMMUV3 in xen_arch_domainconfig to >> enable/disable vIOMMU support for domains. >> >> Also add viommu="N" parameter to xl domain configuration to enable the >> vIOMMU for the domains. Currently, only the "smmuv3" type is supported >> for ARM. >> >> Signed-off-by: Rahul Singh >> --- >> docs/man/xl.cfg.5.pod.in | 11 +++ >> tools/golang/xenlight/helpers.gen.go | 2 ++ >> tools/golang/xenlight/types.gen.go | 1 + >> tools/include/libxl.h| 5 + >> tools/libs/light/libxl_arm.c | 13 + >> tools/libs/light/libxl_types.idl | 6 ++ >> tools/xl/xl_parse.c | 9 + >> 7 files changed, 47 insertions(+) > > This diffstat taken together with the title makes me assume that e.g. ... > >> --- a/tools/libs/light/libxl_arm.c >> +++ b/tools/libs/light/libxl_arm.c >> @@ -179,6 +179,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, >> return ERROR_FAIL; >> } >> >> +switch (d_config->b_info.arch_arm.viommu_type) { >> +case LIBXL_VIOMMU_TYPE_NONE: >> +config->arch.viommu_type = XEN_DOMCTL_CONFIG_VIOMMU_NONE; > > ... this constant doesn't exist yet, and hence this would fail to build > at this point in the series. I notice, however, that the constants are > introduced in earlier patches. Perhaps the title here wants re-wording? Yes, I will fix the commit msg. Regards, Rahul
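Based on the xl.cfg documentation hunk this patch adds, a guest configuration enabling the virtual SMMUv3 might look like the fragment below (illustrative and untested; the option name and accepted values are taken from the patch description, which says only "smmuv3" is currently supported on Arm):

```
# xl guest configuration fragment (illustrative)
name = "domU1"
viommu = "smmuv3"   # request a virtual SMMUv3; omit the option to disable
```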
Re: [RFC PATCH 05/21] xen/arm: vsmmuv3: Add dummy support for virtual SMMUv3 for guests
Hi Michal, > On 5 Dec 2022, at 8:33 am, Michal Orzel wrote: > > Hi Rahul, > > On 01/12/2022 17:02, Rahul Singh wrote: >> >> >> domain_viommu_init() will be called during domain creation and will add >> the dummy trap handler for virtual IOMMUs for guests. >> >> A host IOMMU list will be created when host IOMMU devices are probed >> and this list will be used to create the IOMMU device tree node for >> dom0. For dom0, 1-1 mapping will be established between vIOMMU in dom0 >> and physical IOMMU. >> >> For domUs, the 1-N mapping will be established between domU and physical >> IOMMUs. A new area has been reserved in the arm guest physical map at >> which the emulated vIOMMU node is created in the device tree. >> >> Also set the vIOMMU type to vSMMUv3 to enable vIOMMU framework to call >> vSMMUv3 domain creation/destroy functions. >> >> Signed-off-by: Rahul Singh >> --- >> xen/arch/arm/domain.c | 3 +- >> xen/arch/arm/include/asm/domain.h | 4 + >> xen/arch/arm/include/asm/viommu.h | 20 >> xen/drivers/passthrough/Kconfig| 8 ++ >> xen/drivers/passthrough/arm/Makefile | 1 + >> xen/drivers/passthrough/arm/smmu-v3.c | 7 ++ >> xen/drivers/passthrough/arm/viommu.c | 30 ++ >> xen/drivers/passthrough/arm/vsmmu-v3.c | 124 + >> xen/drivers/passthrough/arm/vsmmu-v3.h | 20 >> xen/include/public/arch-arm.h | 7 +- >> 10 files changed, 222 insertions(+), 2 deletions(-) >> create mode 100644 xen/drivers/passthrough/arm/vsmmu-v3.c >> create mode 100644 xen/drivers/passthrough/arm/vsmmu-v3.h >> >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c >> index 2a85209736..9a2b613500 100644 >> --- a/xen/arch/arm/domain.c >> +++ b/xen/arch/arm/domain.c >> @@ -692,7 +692,8 @@ int arch_sanitise_domain_config(struct >> xen_domctl_createdomain *config) >> return -EINVAL; >> } >> >> -if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE ) >> +if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE && >> + config->arch.viommu_type != viommu_get_type() ) >> { >> 
dprintk(XENLOG_INFO, >> "vIOMMU type requested not supported by the platform or >> Xen\n"); >> diff --git a/xen/arch/arm/include/asm/domain.h >> b/xen/arch/arm/include/asm/domain.h >> index 2ce6764322..8eb4eb5fd6 100644 >> --- a/xen/arch/arm/include/asm/domain.h >> +++ b/xen/arch/arm/include/asm/domain.h >> @@ -114,6 +114,10 @@ struct arch_domain >> void *tee; >> #endif >> >> +#ifdef CONFIG_VIRTUAL_IOMMU >> +struct list_head viommu_list; /* List of virtual IOMMUs */ >> +#endif >> + >> } __cacheline_aligned; >> >> struct arch_vcpu >> diff --git a/xen/arch/arm/include/asm/viommu.h >> b/xen/arch/arm/include/asm/viommu.h >> index 7cd3818a12..4785877e2a 100644 >> --- a/xen/arch/arm/include/asm/viommu.h >> +++ b/xen/arch/arm/include/asm/viommu.h >> @@ -5,9 +5,21 @@ >> #ifdef CONFIG_VIRTUAL_IOMMU >> >> #include >> +#include >> #include >> #include >> >> +extern struct list_head host_iommu_list; >> + >> +/* data structure for each hardware IOMMU */ >> +struct host_iommu { >> +struct list_head entry; >> +const struct dt_device_node *dt_node; >> +paddr_t addr; >> +paddr_t size; >> +uint32_t irq; > You want this to be int and not unsigned. The reason is ... > >> +}; >> + >> struct viommu_ops { >> /* >> * Called during domain construction if toolstack requests to enable >> @@ -35,6 +47,8 @@ struct viommu_desc { >> int domain_viommu_init(struct domain *d, uint16_t viommu_type); >> int viommu_relinquish_resources(struct domain *d); >> uint16_t viommu_get_type(void); >> +void add_to_host_iommu_list(paddr_t addr, paddr_t size, >> +const struct dt_device_node *node); >> >> #else >> >> @@ -56,6 +70,12 @@ static inline int viommu_relinquish_resources(struct >> domain *d) >> return 0; >> } >> >> +static inline void add_to_host_iommu_list(paddr_t addr, paddr_t size, >> + const struct dt_device_node *node) >> +{ >> +return; >> +} >> + >> #endif /* CONFIG_VIRTUAL_IOMMU */ >> >> #endif /* __ARCH_ARM_VIOMMU_H__ */ >> diff --git a/xen/drivers/passthrou
Re: [RFC PATCH 04/21] xen/arm: vIOMMU: add generic vIOMMU framework
Hi Michal, > On 5 Dec 2022, at 8:26 am, Michal Orzel wrote: > > Hi Rahul, > > On 01/12/2022 17:02, Rahul Singh wrote: >> >> >> This patch adds basic framework for vIOMMU. >> >> Signed-off-by: Rahul Singh >> --- >> xen/arch/arm/domain.c| 17 +++ >> xen/arch/arm/domain_build.c | 3 ++ >> xen/arch/arm/include/asm/viommu.h| 70 >> xen/drivers/passthrough/Kconfig | 6 +++ >> xen/drivers/passthrough/arm/Makefile | 1 + >> xen/drivers/passthrough/arm/viommu.c | 48 +++ >> xen/include/public/arch-arm.h| 4 ++ >> 7 files changed, 149 insertions(+) >> create mode 100644 xen/arch/arm/include/asm/viommu.h >> create mode 100644 xen/drivers/passthrough/arm/viommu.c >> >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c >> index 38e22f12af..2a85209736 100644 >> --- a/xen/arch/arm/domain.c >> +++ b/xen/arch/arm/domain.c >> @@ -37,6 +37,7 @@ >> #include >> #include >> #include >> +#include >> #include >> >> #include "vpci.h" >> @@ -691,6 +692,13 @@ int arch_sanitise_domain_config(struct >> xen_domctl_createdomain *config) >> return -EINVAL; >> } >> >> +if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE ) >> +{ >> +dprintk(XENLOG_INFO, >> +"vIOMMU type requested not supported by the platform or >> Xen\n"); > Maybe a simpler message like for TEE would be better: "Unsupported vIOMMU > type" Ack. 
> >> +return -EINVAL; >> +} >> + >> return 0; >> } >> >> @@ -783,6 +791,9 @@ int arch_domain_create(struct domain *d, >> if ( (rc = domain_vpci_init(d)) != 0 ) >> goto fail; >> >> +if ( (rc = domain_viommu_init(d, config->arch.viommu_type)) != 0 ) >> +goto fail; >> + >> return 0; >> >> fail: >> @@ -998,6 +1009,7 @@ static int relinquish_memory(struct domain *d, struct >> page_list_head *list) >> enum { >> PROG_pci = 1, >> PROG_tee, >> +PROG_viommu, >> PROG_xen, >> PROG_page, >> PROG_mapping, >> @@ -1048,6 +1060,11 @@ int domain_relinquish_resources(struct domain *d) >> if (ret ) >> return ret; >> >> +PROGRESS(viommu): >> +ret = viommu_relinquish_resources(d); >> +if (ret ) >> +return ret; >> + >> PROGRESS(xen): >> ret = relinquish_memory(d, &d->xenpage_list); >> if ( ret ) >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index bd30d3798c..abbaf37a2e 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -27,6 +27,7 @@ >> #include >> #include >> #include >> +#include >> #include >> >> #include >> @@ -3858,6 +3859,7 @@ void __init create_domUs(void) >> struct domain *d; >> struct xen_domctl_createdomain d_cfg = { >> .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, >> +.arch.viommu_type = viommu_get_type(), >> .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, >> /* >> * The default of 1023 should be sufficient for guests because >> @@ -4052,6 +4054,7 @@ void __init create_dom0(void) >> printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n"); >> dom0_cfg.arch.tee_type = tee_get_type(); >> dom0_cfg.max_vcpus = dom0_max_vcpus(); >> +dom0_cfg.arch.viommu_type = viommu_get_type(); >> >> if ( iommu_enabled ) >> dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; >> diff --git a/xen/arch/arm/include/asm/viommu.h >> b/xen/arch/arm/include/asm/viommu.h >> new file mode 100644 >> index 00..7cd3818a12 >> --- /dev/null >> +++ b/xen/arch/arm/include/asm/viommu.h >> @@ -0,0 +1,70 @@ >> +/* SPDX-License-Identifier: 
(GPL-2.0-or-later OR BSD-2-Clause) */ >> +#ifndef __ARCH_ARM_VIOMMU_H__ >> +#define __ARCH_ARM_VIOMMU_H__ >> + >> +#ifdef CONFIG_VIRTUAL_IOMMU >> + >> +#include >> +#include >> +#include >> + >> +struct viommu_ops { >> +/* >> + * Called during domain construction if toolstack requests to enable >> + * vIOMMU support. >> + */ >> +int (*domain_init)(struct domain *d); >> + >> +
Re: [RFC PATCH 04/21] xen/arm: vIOMMU: add generic vIOMMU framework
Hi Julien, > On 3 Dec 2022, at 9:54 pm, Julien Grall wrote: > > Hi Rahul, > > On 01/12/2022 16:02, Rahul Singh wrote: >> This patch adds basic framework for vIOMMU. >> Signed-off-by: Rahul Singh >> --- >> xen/arch/arm/domain.c| 17 +++ >> xen/arch/arm/domain_build.c | 3 ++ >> xen/arch/arm/include/asm/viommu.h| 70 >> xen/drivers/passthrough/Kconfig | 6 +++ >> xen/drivers/passthrough/arm/Makefile | 1 + >> xen/drivers/passthrough/arm/viommu.c | 48 +++ >> xen/include/public/arch-arm.h| 4 ++ >> 7 files changed, 149 insertions(+) >> create mode 100644 xen/arch/arm/include/asm/viommu.h >> create mode 100644 xen/drivers/passthrough/arm/viommu.c >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c >> index 38e22f12af..2a85209736 100644 >> --- a/xen/arch/arm/domain.c >> +++ b/xen/arch/arm/domain.c >> @@ -37,6 +37,7 @@ >> #include >> #include >> #include >> +#include >> #include >>#include "vpci.h" >> @@ -691,6 +692,13 @@ int arch_sanitise_domain_config(struct >> xen_domctl_createdomain *config) >> return -EINVAL; >> } >> +if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE ) >> +{ >> +dprintk(XENLOG_INFO, >> +"vIOMMU type requested not supported by the platform or >> Xen\n"); >> +return -EINVAL; >> +} >> + >> return 0; >> } >> @@ -783,6 +791,9 @@ int arch_domain_create(struct domain *d, >> if ( (rc = domain_vpci_init(d)) != 0 ) >> goto fail; >> +if ( (rc = domain_viommu_init(d, config->arch.viommu_type)) != 0 ) >> +goto fail; >> + >> return 0; >>fail: >> @@ -998,6 +1009,7 @@ static int relinquish_memory(struct domain *d, struct >> page_list_head *list) >> enum { >> PROG_pci = 1, >> PROG_tee, >> +PROG_viommu, >> PROG_xen, >> PROG_page, >> PROG_mapping, >> @@ -1048,6 +1060,11 @@ int domain_relinquish_resources(struct domain *d) >> if (ret ) >> return ret; >> +PROGRESS(viommu): >> +ret = viommu_relinquish_resources(d); >> +if (ret ) >> +return ret; > > I would have expected us to relinquish the vIOMMU resources *before* we > detach the devices. 
So can you explain the ordering? I think we first need to detach the device, which will set the stage 1 and stage 2 translation to bypass/abort, and then remove the vIOMMU. Is there anything that makes you feel relinquishing the vIOMMU after detach is not right? > >> + >> PROGRESS(xen): >> ret = relinquish_memory(d, &d->xenpage_list); >> if ( ret ) >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index bd30d3798c..abbaf37a2e 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -27,6 +27,7 @@ >> #include >> #include >> #include >> +#include >> #include >>#include >> @@ -3858,6 +3859,7 @@ void __init create_domUs(void) >> struct domain *d; >> struct xen_domctl_createdomain d_cfg = { >> .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, >> +.arch.viommu_type = viommu_get_type(), > > I don't think the vIOMMU should be enabled for dom0less domUs by default. I am not enabling the vIOMMU by default. If vIOMMU is disabled via the command line or config option then viommu_get_type() will return XEN_DOMCTL_CONFIG_VIOMMU_NONE and in that case domain_viommu_init() will return 0 without doing anything. > >> .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, >> /* >> * The default of 1023 should be sufficient for guests because >> @@ -4052,6 +4054,7 @@ void __init create_dom0(void) >> printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n"); >> dom0_cfg.arch.tee_type = tee_get_type(); >> dom0_cfg.max_vcpus = dom0_max_vcpus(); >> +dom0_cfg.arch.viommu_type = viommu_get_type(); > > Same here. > >>if ( iommu_enabled ) >> dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; >> diff --git a/xen/arch/arm/include/asm/viommu.h >> b/xen/arch/arm/include/asm/viommu.h >> new file mode 100644 >> index 00..7cd3818a12 >> --- /dev/null >> +++ b/xen/arch/arm/include/asm/viomm
Re: [RFC PATCH 04/21] xen/arm: vIOMMU: add generic vIOMMU framework
Hi Jan, > On 2 Dec 2022, at 8:39 am, Jan Beulich wrote: > > On 01.12.2022 17:02, Rahul Singh wrote: >> --- a/xen/drivers/passthrough/Kconfig >> +++ b/xen/drivers/passthrough/Kconfig >> @@ -35,6 +35,12 @@ config IPMMU_VMSA >> (H3 ES3.0, M3-W+, etc) or Gen4 SoCs which IPMMU hardware supports stage 2 >> translation table format and is able to use CPU's P2M table as is. >> >> +config VIRTUAL_IOMMU >> + bool "Virtual IOMMU Support (UNSUPPORTED)" if UNSUPPORTED >> + default n >> + help >> + Support virtual IOMMU infrastructure to implement vIOMMU. > > Is simply "virtual" specific enough in the name? Seeing that there are > multiple IOMMU flavors for Arm, and judging from the titles of subsequent > patches, you're implementing a virtualized form of only one variant. I agree with you; I will remove "virtual" in the next version. > > Also, nit: Please omit "default n" here - it leads to a needless > line in the resulting .config, which in addition prevents the prompt > from appearing for user selection when someone later enables > UNSUPPORTED in their config and then runs e.g. "make oldconfig". But > perhaps you anyway really mean > > config VIRTUAL_IOMMU > bool "Virtual IOMMU Support (UNSUPPORTED)" > depends on UNSUPPORTED > help > Support virtual IOMMU infrastructure to implement vIOMMU. > > ? > > Note (nit again) the slightly altered indentation I'm also using in > the alternative suggestion. > I will modify as below: config VIRTUAL_IOMMU bool "Virtual IOMMU Support (UNSUPPORTED)" depends on UNSUPPORTED help Support IOMMU infrastructure to implement different variants of virtual IOMMUs. Regards, Rahul
[RFC PATCH 14/21] xen/arm: vIOMMU: IOMMU device tree node for dom0
XEN will create an IOMMU device tree node in the device tree to enable the dom0 to discover the virtual SMMUv3 during dom0 boot. IOMMU device tree node will only be created when cmdline option viommu is enabled. Signed-off-by: Rahul Singh --- xen/arch/arm/domain_build.c | 94 +++ xen/arch/arm/include/asm/viommu.h | 1 + 2 files changed, 95 insertions(+) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index a5295e8c3e..b82121beb5 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2233,6 +2233,95 @@ int __init make_chosen_node(const struct kernel_info *kinfo) return res; } +#ifdef CONFIG_VIRTUAL_IOMMU +static int make_hwdom_viommu_node(const struct kernel_info *kinfo) +{ +uint32_t len; +int res; +char buf[24]; +void *fdt = kinfo->fdt; +const void *prop = NULL; +const struct dt_device_node *iommu = NULL; +struct host_iommu *iommu_data; +gic_interrupt_t intr; + +if ( list_empty(&host_iommu_list) ) +return 0; + +list_for_each_entry( iommu_data, &host_iommu_list, entry ) +{ +if ( iommu_data->hwdom_node_created ) +return 0; + +iommu = iommu_data->dt_node; + +snprintf(buf, sizeof(buf), "iommu@%"PRIx64, iommu_data->addr); + +res = fdt_begin_node(fdt, buf); +if ( res ) +return res; + +prop = dt_get_property(iommu, "compatible", &len); +if ( !prop ) +{ +res = -FDT_ERR_XEN(ENOENT); +return res; +} + +res = fdt_property(fdt, "compatible", prop, len); +if ( res ) +return res; + +if ( iommu->phandle ) +{ +res = fdt_property_cell(fdt, "phandle", iommu->phandle); +if ( res ) +return res; +} + +/* Use the same reg regions as the IOMMU node in host DTB. 
*/ +prop = dt_get_property(iommu, "reg", &len); +if ( !prop ) +{ +printk(XENLOG_ERR "vIOMMU: Can't find IOMMU reg property.\n"); +res = -FDT_ERR_XEN(ENOENT); +return res; +} + +res = fdt_property(fdt, "reg", prop, len); +if ( res ) +return res; + +prop = dt_get_property(iommu, "#iommu-cells", &len); +if ( !prop ) +{ +res = -FDT_ERR_XEN(ENOENT); +return res; +} + +res = fdt_property(fdt, "#iommu-cells", prop, len); +if ( res ) +return res; + +res = fdt_property_string(fdt, "interrupt-names", "combined"); +if ( res ) +return res; + +set_interrupt(intr, iommu_data->irq, 0xf, DT_IRQ_TYPE_LEVEL_HIGH); + +res = fdt_property_interrupts(kinfo, &intr, 1); +if ( res ) +return res; + +iommu_data->hwdom_node_created = true; + +fdt_end_node(fdt); +} + +return res; +} +#endif + int __init map_irq_to_domain(struct domain *d, unsigned int irq, bool need_mapping, const char *devname) { @@ -2587,6 +2676,11 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo, if ( dt_match_node(timer_matches, node) ) return make_timer_node(kinfo); +#ifdef CONFIG_VIRTUAL_IOMMU +if ( device_get_class(node) == DEVICE_IOMMU && is_viommu_enabled() ) +return make_hwdom_viommu_node(kinfo); +#endif + /* Skip nodes used by Xen */ if ( dt_device_used_by(node) == DOMID_XEN ) { diff --git a/xen/arch/arm/include/asm/viommu.h b/xen/arch/arm/include/asm/viommu.h index 4de4cceeda..e6018f435b 100644 --- a/xen/arch/arm/include/asm/viommu.h +++ b/xen/arch/arm/include/asm/viommu.h @@ -19,6 +19,7 @@ struct host_iommu { paddr_t addr; paddr_t size; uint32_t irq; +bool hwdom_node_created; }; struct viommu_ops { -- 2.25.1
[RFC PATCH 21/21] xen/arm: vIOMMU: Modify the partial device tree for dom0less
To configure an IOMMU in the guest for passthrough devices, the user needs to copy the unmodified "iommus" property from the host device tree to the partial device tree. To enable the guest Linux kernel to configure the IOMMU correctly, replace the phandle in the partial device tree with the virtual IOMMU phandle when the "iommus" property is set. Signed-off-by: Rahul Singh --- xen/arch/arm/domain_build.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 7cd99a6771..afb3e76409 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3235,7 +3235,35 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo, return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0; } -static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt, +static void modify_pfdt_node(void *pfdt, int nodeoff) +{ +int proplen, i, rc; +const fdt32_t *prop; +fdt32_t *prop_c; + +prop = fdt_getprop(pfdt, nodeoff, "iommus", &proplen); +if ( !prop ) +return; + +prop_c = xzalloc_bytes(proplen); + +for ( i = 0; i < proplen / 8; ++i ) +{ +prop_c[i * 2] = cpu_to_fdt32(GUEST_PHANDLE_VSMMUV3); +prop_c[i * 2 + 1] = prop[i * 2 + 1]; +} + +rc = fdt_setprop(pfdt, nodeoff, "iommus", prop_c, proplen); +if ( rc ) +{ +dprintk(XENLOG_ERR, "Can't set the iommus property in partial FDT"); +return; +} + +return; +} + +static int __init scan_pfdt_node(struct kernel_info *kinfo, void *pfdt, int nodeoff, uint32_t address_cells, uint32_t size_cells, bool scan_passthrough_prop) @@ -3261,6 +3289,7 @@ static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt, node_next = fdt_first_subnode(pfdt, nodeoff); while ( node_next > 0 ) { +modify_pfdt_node(pfdt, node_next); scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells, scan_passthrough_prop); node_next = fdt_next_subnode(pfdt, node_next); -- 2.25.1
[RFC PATCH 20/21] libxl/arm: vIOMMU: Modify the partial device tree for iommus
To configure an IOMMU in the guest for passthrough devices, the user needs to copy the unmodified "iommus" property from the host device tree to the partial device tree. To enable the guest Linux kernel to configure the IOMMU correctly, replace the phandle in the partial device tree with the virtual IOMMU phandle when the "iommus" property is set. Signed-off-by: Rahul Singh --- tools/libs/light/libxl_arm.c | 47 +++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index f2bb7d40dc..16d068404f 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -1167,6 +1167,41 @@ static int copy_partial_fdt(libxl__gc *gc, void *fdt, void *pfdt) return 0; } +static int modify_partial_fdt(libxl__gc *gc, void *pfdt) +{ +int nodeoff, proplen, i, r; +const fdt32_t *prop; +fdt32_t *prop_c; + +nodeoff = fdt_path_offset(pfdt, "/passthrough"); +if (nodeoff < 0) +return nodeoff; + +for (nodeoff = fdt_first_subnode(pfdt, nodeoff); + nodeoff >= 0; + nodeoff = fdt_next_subnode(pfdt, nodeoff)) { + +prop = fdt_getprop(pfdt, nodeoff, "iommus", &proplen); +if (!prop) +continue; + +prop_c = libxl__zalloc(gc, proplen); + +for (i = 0; i < proplen / 8; ++i) { +prop_c[i * 2] = cpu_to_fdt32(GUEST_PHANDLE_VSMMUV3); +prop_c[i * 2 + 1] = prop[i * 2 + 1]; +} + +r = fdt_setprop(pfdt, nodeoff, "iommus", prop_c, proplen); +if (r) { +LOG(ERROR, "Can't set the iommus property in partial FDT"); +return r; +} +} + +return 0; +} + #else static int check_partial_fdt(libxl__gc *gc, void *fdt, size_t size) @@ -1185,6 +1220,13 @@ static int copy_partial_fdt(libxl__gc *gc, void *fdt, void *pfdt) return -FDT_ERR_INTERNAL; } +static int modify_partial_fdt(libxl__gc *gc, void *pfdt) +{ +LOG(ERROR, "partial device tree not supported"); + +return ERROR_FAIL; +} + #endif /* ENABLE_PARTIAL_DEVICE_TREE */ #define FDT_MAX_SIZE (1<<20) @@ -1307,8 +1349,11 @@ next_resize: if (d_config->num_pcidevs) FDT( make_vpci_node(gc, fdt, ainfo, dom) ); -if 
(info->arch_arm.viommu_type == LIBXL_VIOMMU_TYPE_SMMUV3) +if (info->arch_arm.viommu_type == LIBXL_VIOMMU_TYPE_SMMUV3) { FDT( make_vsmmuv3_node(gc, fdt, ainfo, dom) ); +if (pfdt) +FDT( modify_partial_fdt(gc, pfdt) ); +} iommu_created = false; for (i = 0; i < d_config->num_disks; i++) { -- 2.25.1
[RFC PATCH 19/21] xen/arm: vsmmuv3: Add support to send stage-1 event to guest
Stage-1 translation is handled by guest, therefore stage-1 fault has to be forwarded to guest. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/smmu-v3.c | 48 -- xen/drivers/passthrough/arm/vsmmu-v3.c | 45 xen/drivers/passthrough/arm/vsmmu-v3.h | 12 +++ 3 files changed, 103 insertions(+), 2 deletions(-) diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c index c4b4a5d86d..e17debc456 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.c +++ b/xen/drivers/passthrough/arm/smmu-v3.c @@ -871,7 +871,6 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return 0; } -__maybe_unused static struct arm_smmu_master * arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) { @@ -892,10 +891,51 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) return NULL; } +static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) +{ + int ret; + struct arm_smmu_master *master; + u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]); + + switch (FIELD_GET(EVTQ_0_ID, evt[0])) { + case EVT_ID_TRANSLATION_FAULT: + break; + case EVT_ID_ADDR_SIZE_FAULT: + break; + case EVT_ID_ACCESS_FAULT: + break; + case EVT_ID_PERMISSION_FAULT: + break; + default: + return -EOPNOTSUPP; + } + + /* Stage-2 event */ + if (evt[1] & EVTQ_1_S2) + return -EFAULT; + + mutex_lock(&smmu->streams_mutex); + master = arm_smmu_find_master(smmu, sid); + if (!master) { + ret = -EINVAL; + goto out_unlock; + } + + ret = arm_vsmmu_handle_evt(master->domain->d, smmu->dev, evt); + if (ret) { + ret = -EINVAL; + goto out_unlock; + } + +out_unlock: + mutex_unlock(&smmu->streams_mutex); + return ret; +} + /* IRQ and event handlers */ static void arm_smmu_evtq_tasklet(void *dev) { - int i; + int i, ret; struct arm_smmu_device *smmu = dev; struct arm_smmu_queue *q = &smmu->evtq.q; struct arm_smmu_ll_queue *llq = &q->llq; @@ -905,6 +945,10 @@ static void arm_smmu_evtq_tasklet(void *dev) while (!queue_remove_raw(q, evt)) { u8 id = FIELD_GET(EVTQ_0_ID, 
evt[0]); + ret = arm_smmu_handle_evt(smmu, evt); + if (!ret) + continue; + dev_info(smmu->dev, "event 0x%02x received:\n", id); for (i = 0; i < ARRAY_SIZE(evt); ++i) dev_info(smmu->dev, "\t0x%016llx\n", diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index b280b70da0..cd8b62d806 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -102,6 +102,7 @@ struct arm_vsmmu_queue { struct virt_smmu { struct domain *d; struct list_head viommu_list; +paddr_t addr; uint8_t sid_split; uint32_tfeatures; uint32_tcr[3]; @@ -236,6 +237,49 @@ void arm_vsmmu_send_event(struct virt_smmu *smmu, return; } +static struct virt_smmu *vsmmuv3_find_by_addr(struct domain *d, paddr_t paddr) +{ +struct virt_smmu *smmu; + +list_for_each_entry( smmu, &d->arch.viommu_list, viommu_list ) +{ +if ( smmu->addr == paddr ) +return smmu; +} + +return NULL; +} + +int arm_vsmmu_handle_evt(struct domain *d, struct device *dev, uint64_t *evt) +{ +int ret; +struct virt_smmu *smmu; + +if ( is_hardware_domain(d) ) +{ +paddr_t paddr; +/* Base address */ +ret = dt_device_get_address(dev_to_dt(dev), 0, &paddr, NULL); +if ( ret ) +return -EINVAL; + +smmu = vsmmuv3_find_by_addr(d, paddr); +if ( !smmu ) +return -ENODEV; +} +else +{ +smmu = list_entry(d->arch.viommu_list.next, + struct virt_smmu, viommu_list); +} + +ret = arm_vsmmu_write_evtq(smmu, evt); +if ( ret ) +arm_vsmmu_inject_irq(smmu, true, GERROR_EVTQ_ABT_ERR); + +return 0; +} + static int arm_vsmmu_find_ste(struct virt_smmu *smmu, uint32_t sid, uint64_t *ste) { @@ -737,6 +781,7 @@ static int vsmmuv3_init_single(struct domain *d, paddr_t addr, smmu->d = d; smmu->virq = virq; +smmu->addr = addr; smmu->cmdq.q_base = FIELD_PREP(Q_BASE_LOG2SIZE, SMMU_CMDQS); smmu->cmdq.ent_size = CMDQ_ENT_DWORDS * DWORDS_BYTES; smmu->evtq.q_base = FIELD_PREP(Q_BASE_LOG2SIZE, SMMU_EVTQS); diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.h b/xen/drivers/passthrough/
[RFC PATCH 18/21] xen/arm: iommu: skip the iommu-map property for PCI devices
The current code skips the IOMMU-specific properties for non-PCI devices when handling the dom0 node, but there is no support for skipping the IOMMU-specific properties for PCI devices. This patch adds support to skip the IOMMU-specific properties for PCI devices as well. Signed-off-by: Rahul Singh --- xen/arch/arm/domain_build.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 8e85fb7854..7cd99a6771 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1112,9 +1112,18 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo, * Use "iommu_node" as an indicator of the master device which properties * should be skipped. */ -iommu_node = dt_parse_phandle(node, "iommus", 0); -if ( iommu_node && device_get_class(iommu_node) != DEVICE_IOMMU ) -iommu_node = NULL; +if ( dt_device_type_is_equal(node, "pci") ) +{ +iommu_node = dt_parse_phandle(node, "iommu-map", 1); +if ( iommu_node && device_get_class(iommu_node) != DEVICE_IOMMU ) +iommu_node = NULL; +} +else +{ +iommu_node = dt_parse_phandle(node, "iommus", 0); +if ( iommu_node && device_get_class(iommu_node) != DEVICE_IOMMU ) +iommu_node = NULL; +} dt_for_each_property_node (node, prop) { -- 2.25.1
[RFC PATCH 17/21] xen/arm: vsmmuv3: Alloc virq for virtual SMMUv3
Allocate and reserve a virq for the event queue and global error, used to send events to guests. Also modify libxl to accommodate the newly defined virq. Signed-off-by: Rahul Singh --- tools/libs/light/libxl_arm.c | 24 ++-- xen/arch/arm/domain_build.c| 11 +++ xen/drivers/passthrough/arm/vsmmu-v3.c | 10 ++ 3 files changed, 43 insertions(+), 2 deletions(-) diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index 00fcbd466c..f2bb7d40dc 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -66,8 +66,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, { uint32_t nr_spis = 0; unsigned int i; -uint32_t vuart_irq, virtio_irq = 0; -bool vuart_enabled = false, virtio_enabled = false; +uint32_t vuart_irq, virtio_irq = 0, vsmmu_irq = 0; +bool vuart_enabled = false, virtio_enabled = false, vsmmu_enabled = false; uint64_t virtio_mmio_base = GUEST_VIRTIO_MMIO_BASE; uint32_t virtio_mmio_irq = GUEST_VIRTIO_MMIO_SPI_FIRST; @@ -81,6 +81,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, vuart_enabled = true; } +if (d_config->num_pcidevs || d_config->b_info.device_tree) { +nr_spis += (GUEST_VSMMU_SPI - 32) + 1; +vsmmu_irq = GUEST_VSMMU_SPI; +vsmmu_enabled = true; +} + for (i = 0; i < d_config->num_disks; i++) { libxl_device_disk *disk = &d_config->disks[i]; @@ -136,6 +142,11 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, return ERROR_FAIL; } +if (vsmmu_enabled && irq == vsmmu_irq) { +LOG(ERROR, "Physical IRQ %u conflicting with vSMMUv3 SPI\n", irq); +return ERROR_FAIL; +} + if (irq < 32) continue; @@ -837,6 +848,7 @@ static int make_vsmmuv3_node(libxl__gc *gc, void *fdt, { int res; const char *name = GCSPRINTF("iommu@%llx", GUEST_VSMMUV3_BASE); +gic_interrupt intr; res = fdt_begin_node(fdt, name); if (res) return res; @@ -855,6 +867,14 @@ static int make_vsmmuv3_node(libxl__gc *gc, void *fdt, res = fdt_property_cell(fdt, "#iommu-cells", 1); if (res) return res; +res = fdt_property_string(fdt, "interrupt-names", "combined"); +if 
(res) return res; + +set_interrupt(intr, GUEST_VSMMU_SPI, 0xf, DT_IRQ_TYPE_LEVEL_HIGH); + +res = fdt_property_interrupts(gc, fdt, &intr, 1); +if (res) return res; + res = fdt_end_node(fdt); if (res) return res; diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 29f00b18ec..8e85fb7854 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2329,6 +2329,7 @@ static int __init make_vsmmuv3_node(const struct kernel_info *kinfo) char buf[24]; __be32 reg[GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS]; __be32 *cells; +gic_interrupt_t intr; void *fdt = kinfo->fdt; snprintf(buf, sizeof(buf), "iommu@%llx", GUEST_VSMMUV3_BASE); @@ -2359,6 +2360,16 @@ static int __init make_vsmmuv3_node(const struct kernel_info *kinfo) if ( res ) return res; +res = fdt_property_string(fdt, "interrupt-names", "combined"); +if ( res ) +return res; + +set_interrupt(intr, GUEST_VSMMU_SPI, 0xf, DT_IRQ_TYPE_LEVEL_HIGH); + +res = fdt_property_interrupts(kinfo, &intr, 1); +if ( res ) +return res; + res = fdt_end_node(fdt); return res; diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index 031c1f74b6..b280b70da0 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -728,6 +728,7 @@ static const struct mmio_handler_ops vsmmuv3_mmio_handler = { static int vsmmuv3_init_single(struct domain *d, paddr_t addr, paddr_t size, uint32_t virq) { +int ret; struct virt_smmu *smmu; smmu = xzalloc(struct virt_smmu); @@ -743,12 +744,21 @@ static int vsmmuv3_init_single(struct domain *d, paddr_t addr, spin_lock_init(&smmu->cmd_queue_lock); +ret = vgic_reserve_virq(d, virq); +if ( !ret ) +goto out; + register_mmio_handler(d, &vsmmuv3_mmio_handler, addr, size, smmu); /* Register the vIOMMU to be able to clean it up later. 
*/ list_add_tail(&smmu->viommu_list, &d->arch.viommu_list); return 0; + +out: +xfree(smmu); +vgic_free_virq(d, virq); +return ret; } int domain_vsmmuv3_init(struct domain *d) -- 2.25.1
[RFC PATCH 16/21] arm/libxl: vsmmuv3: Emulated SMMUv3 device tree node in libxl
libxl will create an Emulated SMMUv3 device tree node in the device tree to enable the guest OS to discover the virtual SMMUv3 during guest boot. Emulated SMMUv3 device tree node will only be created when "viommu=smmuv3" is set in xl domain configuration. Signed-off-by: Rahul Singh --- tools/libs/light/libxl_arm.c | 39 1 file changed, 39 insertions(+) diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index b8eff10a41..00fcbd466c 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -831,6 +831,36 @@ static int make_vpl011_uart_node(libxl__gc *gc, void *fdt, return 0; } +static int make_vsmmuv3_node(libxl__gc *gc, void *fdt, + const struct arch_info *ainfo, + struct xc_dom_image *dom) +{ +int res; +const char *name = GCSPRINTF("iommu@%llx", GUEST_VSMMUV3_BASE); + +res = fdt_begin_node(fdt, name); +if (res) return res; + +res = fdt_property_compat(gc, fdt, 1, "arm,smmu-v3"); +if (res) return res; + +res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, +GUEST_ROOT_SIZE_CELLS, 1, GUEST_VSMMUV3_BASE, +GUEST_VSMMUV3_SIZE); +if (res) return res; + +res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_VSMMUV3); +if (res) return res; + +res = fdt_property_cell(fdt, "#iommu-cells", 1); +if (res) return res; + +res = fdt_end_node(fdt); +if (res) return res; + +return 0; +} + static int make_vpci_node(libxl__gc *gc, void *fdt, const struct arch_info *ainfo, struct xc_dom_image *dom) @@ -872,6 +902,12 @@ static int make_vpci_node(libxl__gc *gc, void *fdt, GUEST_VPCI_PREFETCH_MEM_SIZE); if (res) return res; +if (res) return res; + +res = fdt_property_values(gc, fdt, "iommu-map", 4, 0, + GUEST_PHANDLE_VSMMUV3, 0, 0x1); +if (res) return res; + res = fdt_end_node(fdt); if (res) return res; @@ -1251,6 +1287,9 @@ next_resize: if (d_config->num_pcidevs) FDT( make_vpci_node(gc, fdt, ainfo, dom) ); +if (info->arch_arm.viommu_type == LIBXL_VIOMMU_TYPE_SMMUV3) +FDT( make_vsmmuv3_node(gc, fdt, ainfo, dom) ); + iommu_created = false; 
for (i = 0; i < d_config->num_disks; i++) { libxl_device_disk *disk = &d_config->disks[i]; -- 2.25.1
[RFC PATCH 15/21] xen/arm: vsmmuv3: Emulated SMMUv3 device tree node for dom0less
XEN will create an Emulated SMMUv3 device tree node in the device tree to enable the dom0less domains to discover the virtual SMMUv3 during boot. Emulated SMMUv3 device tree node will only be created when cmdline option vsmmuv3 is enabled. Signed-off-by: Rahul Singh --- xen/arch/arm/domain_build.c | 52 +++ xen/include/public/device_tree_defs.h | 1 + 2 files changed, 53 insertions(+) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index b82121beb5..29f00b18ec 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2322,6 +2322,49 @@ static int make_hwdom_viommu_node(const struct kernel_info *kinfo) } #endif +#ifdef CONFIG_VIRTUAL_ARM_SMMU_V3 +static int __init make_vsmmuv3_node(const struct kernel_info *kinfo) +{ +int res; +char buf[24]; +__be32 reg[GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS]; +__be32 *cells; +void *fdt = kinfo->fdt; + +snprintf(buf, sizeof(buf), "iommu@%llx", GUEST_VSMMUV3_BASE); + +res = fdt_begin_node(fdt, buf); +if ( res ) +return res; + +res = fdt_property_string(fdt, "compatible", "arm,smmu-v3"); +if ( res ) +return res; + +/* Create reg property */ +cells = ®[0]; +dt_child_set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS, + GUEST_VSMMUV3_BASE, GUEST_VSMMUV3_SIZE); +res = fdt_property(fdt, "reg", reg, + (GUEST_ROOT_ADDRESS_CELLS + + GUEST_ROOT_SIZE_CELLS) * sizeof(*reg)); +if ( res ) +return res; + +res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_VSMMUV3); +if ( res ) +return res; + +res = fdt_property_cell(fdt, "#iommu-cells", 1); +if ( res ) +return res; + +res = fdt_end_node(fdt); + +return res; +} +#endif + int __init map_irq_to_domain(struct domain *d, unsigned int irq, bool need_mapping, const char *devname) { @@ -3395,6 +3438,15 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo) goto err; } +#ifdef CONFIG_VIRTUAL_ARM_SMMU_V3 +if ( is_viommu_enabled() ) +{ +ret = make_vsmmuv3_node(kinfo); +if ( ret ) +goto err; +} +#endif + ret = 
fdt_end_node(kinfo->fdt); if ( ret < 0 ) goto err; diff --git a/xen/include/public/device_tree_defs.h b/xen/include/public/device_tree_defs.h index 9e80d0499d..7846a0425c 100644 --- a/xen/include/public/device_tree_defs.h +++ b/xen/include/public/device_tree_defs.h @@ -14,6 +14,7 @@ */ #define GUEST_PHANDLE_GIC (65000) #define GUEST_PHANDLE_IOMMU (GUEST_PHANDLE_GIC + 1) +#define GUEST_PHANDLE_VSMMUV3 (GUEST_PHANDLE_IOMMU + 1) #define GUEST_ROOT_ADDRESS_CELLS 2 #define GUEST_ROOT_SIZE_CELLS 2 -- 2.25.1
[RFC PATCH 13/21] xen/arm: vsmmuv3: Add "iommus" property node for dom0 devices
"iommus" property will be added for dom0 devices to virtual IOMMU node to enable the dom0 linux kernel to configure the IOMMU Signed-off-by: Rahul Singh --- xen/arch/arm/domain_build.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index abbaf37a2e..a5295e8c3e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1172,9 +1172,12 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo, continue; } -if ( iommu_node ) +/* + * Expose IOMMU specific properties to hwdom when vIOMMU is + * enabled. + */ +if ( iommu_node && !is_viommu_enabled() ) { -/* Don't expose IOMMU specific properties to hwdom */ if ( dt_property_name_is_equal(prop, "iommus") ) continue; -- 2.25.1
[RFC PATCH 12/21] xen/arm: vsmmuv3: Add support for event queue and global error
The event queue is used to send events to the guest when events/faults occur. Add support for the event queue to send events to the guest. Global errors in the SMMUv3 hardware are reported via the smmu_gerror and smmu_gerrorn registers. Add support for the global error registers to forward global errors to the guest. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/smmu-v3.h | 20 +++ xen/drivers/passthrough/arm/vsmmu-v3.c | 163 - xen/include/public/arch-arm.h | 5 +- 3 files changed, 183 insertions(+), 5 deletions(-) diff --git a/xen/drivers/passthrough/arm/smmu-v3.h b/xen/drivers/passthrough/arm/smmu-v3.h index 50a050408b..b598cdeb72 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.h +++ b/xen/drivers/passthrough/arm/smmu-v3.h @@ -348,6 +348,26 @@ #define EVTQ_0_ID GENMASK_ULL(7, 0) +#define EVT_ID_BAD_STREAMID0x02 +#define EVT_ID_BAD_STE 0x04 +#define EVT_ID_TRANSLATION_FAULT 0x10 +#define EVT_ID_ADDR_SIZE_FAULT 0x11 +#define EVT_ID_ACCESS_FAULT0x12 +#define EVT_ID_PERMISSION_FAULT0x13 + +#define EVTQ_0_SSV (1UL << 11) +#define EVTQ_0_SSIDGENMASK_ULL(31, 12) +#define EVTQ_0_SID GENMASK_ULL(63, 32) +#define EVTQ_1_STAGGENMASK_ULL(15, 0) +#define EVTQ_1_STALL (1UL << 31) +#define EVTQ_1_PnU (1UL << 33) +#define EVTQ_1_InD (1UL << 34) +#define EVTQ_1_RnW (1UL << 35) +#define EVTQ_1_S2 (1UL << 39) +#define EVTQ_1_CLASS GENMASK_ULL(41, 40) +#define EVTQ_1_TT_READ (1UL << 44) +#define EVTQ_2_ADDRGENMASK_ULL(63, 0) +#define EVTQ_3_IPA GENMASK_ULL(51, 12) /* PRI queue */ #define PRIQ_ENT_SZ_SHIFT 4 #define PRIQ_ENT_DWORDS((1 << PRIQ_ENT_SZ_SHIFT) >> 3) diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index 5188181929..031c1f74b6 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -43,6 +43,7 @@ extern const struct viommu_desc __read_mostly *cur_viommu; /* Helper Macros */ #define smmu_get_cmdq_enabled(x)FIELD_GET(CR0_CMDQEN, x) +#define smmu_get_evtq_enabled(x)FIELD_GET(CR0_EVTQEN, x) #define 
smmu_cmd_get_command(x) FIELD_GET(CMDQ_0_OP, x) #define smmu_cmd_get_sid(x) FIELD_GET(CMDQ_PREFETCH_0_SID, x) #define smmu_get_ste_s1cdmax(x) FIELD_GET(STRTAB_STE_0_S1CDMAX, x) @@ -51,6 +52,35 @@ extern const struct viommu_desc __read_mostly *cur_viommu; #define smmu_get_ste_s1ctxptr(x)FIELD_PREP(STRTAB_STE_0_S1CTXPTR_MASK, \ FIELD_GET(STRTAB_STE_0_S1CTXPTR_MASK, x)) +/* event queue entry */ +struct arm_smmu_evtq_ent { +/* Common fields */ +uint8_t opcode; +uint32_tsid; + +/* Event-specific fields */ +union { +struct { +uint32_t ssid; +bool ssv; +} c_bad_ste_streamid; + +struct { +bool stall; +uint16_t stag; +uint32_t ssid; +bool ssv; +bool s2; +uint64_t addr; +bool rnw; +bool pnu; +bool ind; +uint8_t class; +uint64_t addr2; +} f_translation; +}; +}; + /* stage-1 translation configuration */ struct arm_vsmmu_s1_trans_cfg { paddr_t s1ctxptr; @@ -81,6 +111,7 @@ struct virt_smmu { uint32_tstrtab_base_cfg; uint64_tstrtab_base; uint32_tirq_ctrl; +uint32_tvirq; uint64_tgerror_irq_cfg0; uint64_tevtq_irq_cfg0; struct arm_vsmmu_queue evtq, cmdq; @@ -88,6 +119,12 @@ struct virt_smmu { }; /* Queue manipulation functions */ +static bool queue_full(struct arm_vsmmu_queue *q) +{ +return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) && + Q_WRP(q, q->prod) != Q_WRP(q, q->cons); +} + static bool queue_empty(struct arm_vsmmu_queue *q) { return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) && @@ -100,11 +137,105 @@ static void queue_inc_cons(struct arm_vsmmu_queue *q) q->cons = Q_OVF(q->cons) | Q_WRP(q, cons) | Q_IDX(q, cons); } +static void queue_inc_prod(struct arm_vsmmu_queue *q) +{ +u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1; +q->prod = Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod); +} + static void dump_smmu_command(uint64_t *command) { gdprintk(XENLOG_ERR, "cmd 0x%02llx: %016lx %016lx\n", smmu_cmd_get_command(command[0]), command[0], command[1]); } + +static void arm_vsmmu_inject_irq(struct virt_smmu *smmu, bool is_gerror, +uint32_t gerror_err) +{ +uint32_t new_gerro
[RFC PATCH 11/21] xen/arm: vsmmuv3: Attach Stage-1 configuration to SMMUv3 hardware
Attach the Stage-1 configuration to device STE to support nested translation for the guests. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/smmu-v3.c | 79 ++ xen/drivers/passthrough/arm/smmu-v3.h | 1 + xen/drivers/passthrough/arm/vsmmu-v3.c | 18 ++ xen/include/xen/iommu.h| 14 + 4 files changed, 112 insertions(+) diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c index 4f96fdb92f..c4b4a5d86d 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.c +++ b/xen/drivers/passthrough/arm/smmu-v3.c @@ -2754,6 +2754,37 @@ static struct arm_smmu_device *arm_smmu_get_by_dev(struct device *dev) return NULL; } +static struct iommu_domain *arm_smmu_get_domain_by_sid(struct domain *d, + u32 sid) +{ + int i; + unsigned long flags; + struct iommu_domain *io_domain; + struct arm_smmu_domain *smmu_domain; + struct arm_smmu_master *master; + struct arm_smmu_xen_domain *xen_domain = dom_iommu(d)->arch.priv; + + /* +* Loop through the &xen_domain->contexts to locate a context +* assigned to this SMMU +*/ + list_for_each_entry(io_domain, &xen_domain->contexts, list) { + smmu_domain = to_smmu_domain(io_domain); + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + for (i = 0; i < master->num_streams; i++) { + if (sid != master->streams[i].id) + continue; + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + return io_domain; + } + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + } + return NULL; +} + static struct iommu_domain *arm_smmu_get_domain(struct domain *d, struct device *dev) { @@ -2909,6 +2940,53 @@ static void arm_smmu_iommu_xen_domain_teardown(struct domain *d) xfree(xen_domain); } +static int arm_smmu_attach_guest_config(struct domain *d, u32 sid, + struct iommu_guest_config *cfg) +{ + int ret = -EINVAL; + unsigned long flags; + struct arm_smmu_master *master; + struct arm_smmu_domain *smmu_domain; + struct arm_smmu_xen_domain 
*xen_domain = dom_iommu(d)->arch.priv; + struct iommu_domain *io_domain = arm_smmu_get_domain_by_sid(d, sid); + + if (!io_domain) + return -ENODEV; + + smmu_domain = to_smmu_domain(io_domain); + + spin_lock(&xen_domain->lock); + + switch (cfg->config) { + case ARM_SMMU_DOMAIN_ABORT: + smmu_domain->abort = true; + break; + case ARM_SMMU_DOMAIN_BYPASS: + smmu_domain->abort = false; + break; + case ARM_SMMU_DOMAIN_NESTED: + /* Enable Nested stage translation. */ + smmu_domain->stage = ARM_SMMU_DOMAIN_NESTED; + smmu_domain->s1_cfg.s1ctxptr = cfg->s1ctxptr; + smmu_domain->s1_cfg.s1fmt = cfg->s1fmt; + smmu_domain->s1_cfg.s1cdmax = cfg->s1cdmax; + smmu_domain->abort = false; + break; + default: + goto out; + } + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) + arm_smmu_install_ste_for_dev(master); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + ret = 0; +out: + spin_unlock(&xen_domain->lock); + return ret; +} + static const struct iommu_ops arm_smmu_iommu_ops = { .page_sizes = PAGE_SIZE_4K, .init = arm_smmu_iommu_xen_domain_init, @@ -2921,6 +2999,7 @@ static const struct iommu_ops arm_smmu_iommu_ops = { .unmap_page = arm_iommu_unmap_page, .dt_xlate = arm_smmu_dt_xlate, .add_device = arm_smmu_add_device, + .attach_guest_config = arm_smmu_attach_guest_config }; static __init int arm_smmu_dt_init(struct dt_device_node *dev, diff --git a/xen/drivers/passthrough/arm/smmu-v3.h b/xen/drivers/passthrough/arm/smmu-v3.h index e270fe05e0..50a050408b 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.h +++ b/xen/drivers/passthrough/arm/smmu-v3.h @@ -393,6 +393,7 @@ enum arm_smmu_domain_stage { ARM_SMMU_DOMAIN_S2, ARM_SMMU_DOMAIN_NESTED, ARM_SMMU_DOMAIN_BYPASS, + ARM_SMMU_DOMAIN_ABORT, }; /* Xen specific code. */ diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c
[RFC PATCH 05/21] xen/arm: vsmmuv3: Add dummy support for virtual SMMUv3 for guests
domain_viommu_init() will be called during domain creation and will add the dummy trap handler for virtual IOMMUs for guests. A host IOMMU list will be created when host IOMMU devices are probed and this list will be used to create the IOMMU device tree node for dom0. For dom0, 1-1 mapping will be established between vIOMMU in dom0 and physical IOMMU. For domUs, the 1-N mapping will be established between domU and physical IOMMUs. A new area has been reserved in the arm guest physical map at which the emulated vIOMMU node is created in the device tree. Also set the vIOMMU type to vSMMUv3 to enable vIOMMU framework to call vSMMUv3 domain creation/destroy functions. Signed-off-by: Rahul Singh --- xen/arch/arm/domain.c | 3 +- xen/arch/arm/include/asm/domain.h | 4 + xen/arch/arm/include/asm/viommu.h | 20 xen/drivers/passthrough/Kconfig| 8 ++ xen/drivers/passthrough/arm/Makefile | 1 + xen/drivers/passthrough/arm/smmu-v3.c | 7 ++ xen/drivers/passthrough/arm/viommu.c | 30 ++ xen/drivers/passthrough/arm/vsmmu-v3.c | 124 + xen/drivers/passthrough/arm/vsmmu-v3.h | 20 xen/include/public/arch-arm.h | 7 +- 10 files changed, 222 insertions(+), 2 deletions(-) create mode 100644 xen/drivers/passthrough/arm/vsmmu-v3.c create mode 100644 xen/drivers/passthrough/arm/vsmmu-v3.h diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 2a85209736..9a2b613500 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -692,7 +692,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config) return -EINVAL; } -if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE ) +if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE && + config->arch.viommu_type != viommu_get_type() ) { dprintk(XENLOG_INFO, "vIOMMU type requested not supported by the platform or Xen\n"); diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h index 2ce6764322..8eb4eb5fd6 100644 --- a/xen/arch/arm/include/asm/domain.h +++ 
b/xen/arch/arm/include/asm/domain.h @@ -114,6 +114,10 @@ struct arch_domain void *tee; #endif +#ifdef CONFIG_VIRTUAL_IOMMU +struct list_head viommu_list; /* List of virtual IOMMUs */ +#endif + } __cacheline_aligned; struct arch_vcpu diff --git a/xen/arch/arm/include/asm/viommu.h b/xen/arch/arm/include/asm/viommu.h index 7cd3818a12..4785877e2a 100644 --- a/xen/arch/arm/include/asm/viommu.h +++ b/xen/arch/arm/include/asm/viommu.h @@ -5,9 +5,21 @@ #ifdef CONFIG_VIRTUAL_IOMMU #include +#include #include #include +extern struct list_head host_iommu_list; + +/* data structure for each hardware IOMMU */ +struct host_iommu { +struct list_head entry; +const struct dt_device_node *dt_node; +paddr_t addr; +paddr_t size; +uint32_t irq; +}; + struct viommu_ops { /* * Called during domain construction if toolstack requests to enable @@ -35,6 +47,8 @@ struct viommu_desc { int domain_viommu_init(struct domain *d, uint16_t viommu_type); int viommu_relinquish_resources(struct domain *d); uint16_t viommu_get_type(void); +void add_to_host_iommu_list(paddr_t addr, paddr_t size, +const struct dt_device_node *node); #else @@ -56,6 +70,12 @@ static inline int viommu_relinquish_resources(struct domain *d) return 0; } +static inline void add_to_host_iommu_list(paddr_t addr, paddr_t size, + const struct dt_device_node *node) +{ +return; +} + #endif /* CONFIG_VIRTUAL_IOMMU */ #endif /* __ARCH_ARM_VIOMMU_H__ */ diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig index 19924fa2de..4c725f5f67 100644 --- a/xen/drivers/passthrough/Kconfig +++ b/xen/drivers/passthrough/Kconfig @@ -41,6 +41,14 @@ config VIRTUAL_IOMMU help Support virtual IOMMU infrastructure to implement vIOMMU. +config VIRTUAL_ARM_SMMU_V3 + bool "ARM Ltd. Virtual SMMUv3 Support (UNSUPPORTED)" if UNSUPPORTED + depends on ARM_SMMU_V3 && VIRTUAL_IOMMU + help +Support for implementations of the virtual ARM System MMU architecture +version 3. 
Virtual SMMUv3 is an unsupported feature and should not be used +in production. + endif config IOMMU_FORCE_PT_SHARE diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile index 4cc54f3f4d..e758a9d6aa 100644 --- a/xen/drivers/passthrough/arm/Makefile +++ b/xen/drivers/passthrough/arm/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_ARM_SMMU) += smmu.o obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o obj-$(CONFIG_ARM_SMMU_V3) += smmu-v3.o obj-$(CONFIG_VIRTUAL_IOMMU) += viommu.o +obj-$(CONFIG_VIRTUAL_ARM_SMMU_V3) += vsmmu-v3.o diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v
[RFC PATCH 10/21] xen/arm: vsmmuv3: Add support for command CMD_CFGI_STE
CMD_CFGI_STE is used to invalidate/validate the STE. Emulated vSMMUv3 driver in XEN will read the STE from the guest memory space and capture the Stage-1 configuration required to support nested translation. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/vsmmu-v3.c | 148 + 1 file changed, 148 insertions(+) diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index cc651a2dc8..916b97b8a2 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -44,6 +44,21 @@ extern const struct viommu_desc __read_mostly *cur_viommu; /* Helper Macros */ #define smmu_get_cmdq_enabled(x)FIELD_GET(CR0_CMDQEN, x) #define smmu_cmd_get_command(x) FIELD_GET(CMDQ_0_OP, x) +#define smmu_cmd_get_sid(x) FIELD_GET(CMDQ_PREFETCH_0_SID, x) +#define smmu_get_ste_s1cdmax(x) FIELD_GET(STRTAB_STE_0_S1CDMAX, x) +#define smmu_get_ste_s1fmt(x) FIELD_GET(STRTAB_STE_0_S1FMT, x) +#define smmu_get_ste_s1stalld(x)FIELD_GET(STRTAB_STE_1_S1STALLD, x) +#define smmu_get_ste_s1ctxptr(x)FIELD_PREP(STRTAB_STE_0_S1CTXPTR_MASK, \ +FIELD_GET(STRTAB_STE_0_S1CTXPTR_MASK, x)) + +/* stage-1 translation configuration */ +struct arm_vsmmu_s1_trans_cfg { +paddr_t s1ctxptr; +uint8_t s1fmt; +uint8_t s1cdmax; +boolbypassed; /* translation is bypassed */ +boolaborted; /* translation is aborted */ +}; /* virtual smmu queue */ struct arm_vsmmu_queue { @@ -90,6 +105,138 @@ static void dump_smmu_command(uint64_t *command) gdprintk(XENLOG_ERR, "cmd 0x%02llx: %016lx %016lx\n", smmu_cmd_get_command(command[0]), command[0], command[1]); } +static int arm_vsmmu_find_ste(struct virt_smmu *smmu, uint32_t sid, + uint64_t *ste) +{ +paddr_t addr, strtab_base; +struct domain *d = smmu->d; +uint32_t log2size; +int strtab_size_shift; +int ret; + +log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, smmu->strtab_base_cfg); + +if ( sid >= (1 << MIN(log2size, SMMU_IDR1_SIDSIZE)) ) +return -EINVAL; + +if ( smmu->features & STRTAB_BASE_CFG_FMT_2LVL ) +{ +int idx, 
max_l2_ste, span; +paddr_t l1ptr, l2ptr; +uint64_t l1std; + +strtab_size_shift = MAX(5, (int)log2size - smmu->sid_split - 1 + 3); +strtab_base = smmu->strtab_base & STRTAB_BASE_ADDR_MASK & +~GENMASK_ULL(strtab_size_shift, 0); +idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS; +l1ptr = (paddr_t)(strtab_base + idx * sizeof(l1std)); + +ret = access_guest_memory_by_ipa(d, l1ptr, &l1std, + sizeof(l1std), false); +if ( ret ) +{ +gdprintk(XENLOG_ERR, + "Could not read L1PTR at 0X%"PRIx64"\n", l1ptr); +return ret; +} + +span = FIELD_GET(STRTAB_L1_DESC_SPAN, l1std); +if ( !span ) +{ +gdprintk(XENLOG_ERR, "Bad StreamID span\n"); +return -EINVAL; +} + +max_l2_ste = (1 << span) - 1; +l2ptr = FIELD_PREP(STRTAB_L1_DESC_L2PTR_MASK, +FIELD_GET(STRTAB_L1_DESC_L2PTR_MASK, l1std)); +idx = sid & ((1 << smmu->sid_split) - 1); +if ( idx > max_l2_ste ) +{ +gdprintk(XENLOG_ERR, "idx=%d > max_l2_ste=%d\n", + idx, max_l2_ste); +return -EINVAL; +} +addr = l2ptr + idx * sizeof(*ste) * STRTAB_STE_DWORDS; +} +else +{ +strtab_size_shift = log2size + 5; +strtab_base = smmu->strtab_base & STRTAB_BASE_ADDR_MASK & + ~GENMASK_ULL(strtab_size_shift, 0); +addr = strtab_base + sid * sizeof(*ste) * STRTAB_STE_DWORDS; +} +ret = access_guest_memory_by_ipa(d, addr, ste, sizeof(*ste), false); +if ( ret ) +{ +gdprintk(XENLOG_ERR, +"Cannot fetch pte at address=0x%"PRIx64"\n", addr); +return -EINVAL; +} + +return 0; +} + +static int arm_vsmmu_decode_ste(struct virt_smmu *smmu, uint32_t sid, +struct arm_vsmmu_s1_trans_cfg *cfg, +uint64_t *ste) +{ +uint64_t val = ste[0]; + +if ( !(val & STRTAB_STE_0_V) ) +return -EAGAIN; + +switch ( FIELD_GET(STRTAB_STE_0_CFG, val) ) +{ +case STRTAB_STE_0_CFG_BYPASS: +cfg->bypassed = true; +return 0; +case STRTAB_STE_0_CFG_ABORT: +cfg->aborted = true; +return 0; +case STRTAB_STE_0_CFG_S1_TRANS: +break; +case STRTAB_STE_0_CFG_S2_TRANS: +gdprintk(XENLOG_ERR, "vSMMUv3 does not support stage 2 yet\n"); +go
[RFC PATCH 09/21] xen/arm: vsmmuv3: Add support for cmdqueue handling
Add support for virtual cmdqueue handling for guests Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/vsmmu-v3.c | 101 + 1 file changed, 101 insertions(+) diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index c3f99657e6..cc651a2dc8 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: (GPL-2.0-or-later OR BSD-2-Clause) */ +#include #include #include #include @@ -24,6 +25,26 @@ /* Struct to hold the vIOMMU ops and vIOMMU type */ extern const struct viommu_desc __read_mostly *cur_viommu; +/* SMMUv3 command definitions */ +#define CMDQ_OP_PREFETCH_CFG0x1 +#define CMDQ_OP_CFGI_STE0x3 +#define CMDQ_OP_CFGI_ALL0x4 +#define CMDQ_OP_CFGI_CD 0x5 +#define CMDQ_OP_CFGI_CD_ALL 0x6 +#define CMDQ_OP_TLBI_NH_ASID0x11 +#define CMDQ_OP_TLBI_NH_VA 0x12 +#define CMDQ_OP_TLBI_NSNH_ALL 0x30 +#define CMDQ_OP_CMD_SYNC0x46 + +/* Queue Handling */ +#define Q_BASE(q) ((q)->q_base & Q_BASE_ADDR_MASK) +#define Q_CONS_ENT(q) (Q_BASE(q) + Q_IDX(q, (q)->cons) * (q)->ent_size) +#define Q_PROD_ENT(q) (Q_BASE(q) + Q_IDX(q, (q)->prod) * (q)->ent_size) + +/* Helper Macros */ +#define smmu_get_cmdq_enabled(x)FIELD_GET(CR0_CMDQEN, x) +#define smmu_cmd_get_command(x) FIELD_GET(CMDQ_0_OP, x) + /* virtual smmu queue */ struct arm_vsmmu_queue { uint64_tq_base; /* base register */ @@ -48,8 +69,80 @@ struct virt_smmu { uint64_tgerror_irq_cfg0; uint64_tevtq_irq_cfg0; struct arm_vsmmu_queue evtq, cmdq; +spinlock_t cmd_queue_lock; }; +/* Queue manipulation functions */ +static bool queue_empty(struct arm_vsmmu_queue *q) +{ +return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) && + Q_WRP(q, q->prod) == Q_WRP(q, q->cons); +} + +static void queue_inc_cons(struct arm_vsmmu_queue *q) +{ +uint32_t cons = (Q_WRP(q, q->cons) | Q_IDX(q, q->cons)) + 1; +q->cons = Q_OVF(q->cons) | Q_WRP(q, cons) | Q_IDX(q, cons); +} + +static void dump_smmu_command(uint64_t *command) +{ 
+gdprintk(XENLOG_ERR, "cmd 0x%02llx: %016lx %016lx\n", + smmu_cmd_get_command(command[0]), command[0], command[1]); +} +static int arm_vsmmu_handle_cmds(struct virt_smmu *smmu) +{ +struct arm_vsmmu_queue *q = &smmu->cmdq; +struct domain *d = smmu->d; +uint64_t command[CMDQ_ENT_DWORDS]; +paddr_t addr; + +if ( !smmu_get_cmdq_enabled(smmu->cr[0]) ) +return 0; + +while ( !queue_empty(q) ) +{ +int ret; + +addr = Q_CONS_ENT(q); +ret = access_guest_memory_by_ipa(d, addr, command, + sizeof(command), false); +if ( ret ) +return ret; + +switch ( smmu_cmd_get_command(command[0]) ) +{ +case CMDQ_OP_CFGI_STE: +break; +case CMDQ_OP_PREFETCH_CFG: +case CMDQ_OP_CFGI_CD: +case CMDQ_OP_CFGI_CD_ALL: +case CMDQ_OP_CFGI_ALL: +case CMDQ_OP_CMD_SYNC: +break; +case CMDQ_OP_TLBI_NH_ASID: +case CMDQ_OP_TLBI_NSNH_ALL: +case CMDQ_OP_TLBI_NH_VA: +if ( !iommu_iotlb_flush_all(smmu->d, 1) ) +break; +default: +gdprintk(XENLOG_ERR, "vSMMUv3: unhandled command\n"); +dump_smmu_command(command); +break; +} + +if ( ret ) +{ +gdprintk(XENLOG_ERR, + "vSMMUv3: command error %d while handling command\n", + ret); +dump_smmu_command(command); +} +queue_inc_cons(q); +} +return 0; +} + static int vsmmuv3_mmio_write(struct vcpu *v, mmio_info_t *info, register_t r, void *priv) { @@ -103,9 +196,15 @@ static int vsmmuv3_mmio_write(struct vcpu *v, mmio_info_t *info, break; case VREG32(ARM_SMMU_CMDQ_PROD): +spin_lock(&smmu->cmd_queue_lock); reg32 = smmu->cmdq.prod; vreg_reg32_update(®32, r, info); smmu->cmdq.prod = reg32; + +if ( arm_vsmmu_handle_cmds(smmu) ) +gdprintk(XENLOG_ERR, "error handling vSMMUv3 commands\n"); + +spin_unlock(&smmu->cmd_queue_lock); break; case VREG32(ARM_SMMU_CMDQ_CONS): @@ -321,6 +420,8 @@ static int vsmmuv3_init_single(struct domain *d, paddr_t addr, paddr_t size) smmu->evtq.q_base = FIELD_PREP(Q_BASE_LOG2SIZE, SMMU_EVTQS); smmu->evtq.ent_size = EVTQ_ENT_DWORDS * DWORDS_BYTES; +spin_lock_init(&smmu->cmd_queue_lock); + register_mmio_handler(d, &vsmmuv3_mmio_handler, addr, size, smmu); /* 
Register the vIOMMU to be able to clean it up later. */ -- 2.25.1
[RFC PATCH 08/21] xen/arm: vsmmuv3: Add support for registers emulation
Add initial support for various emulated registers for virtual SMMUv3 for guests and also add support for virtual cmdq and eventq. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/vsmmu-v3.c | 281 + 1 file changed, 281 insertions(+) diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index e36f200ba5..c3f99657e6 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -3,25 +3,302 @@ #include #include #include +#include #include +#include + +#include "smmu-v3.h" + +/* Register Definition */ +#define ARM_SMMU_IDR2 0x8 +#define ARM_SMMU_IDR3 0xc +#define ARM_SMMU_IDR4 0x10 +#define IDR0_TERM_MODEL (1 << 26) +#define IDR3_RIL(1 << 10) +#define CR0_RESERVED0xFC20 +#define SMMU_IDR1_SIDSIZE 16 +#define SMMU_CMDQS 19 +#define SMMU_EVTQS 19 +#define DWORDS_BYTES8 /* Struct to hold the vIOMMU ops and vIOMMU type */ extern const struct viommu_desc __read_mostly *cur_viommu; +/* virtual smmu queue */ +struct arm_vsmmu_queue { +uint64_tq_base; /* base register */ +uint32_tprod; +uint32_tcons; +uint8_t ent_size; +uint8_t max_n_shift; +}; + struct virt_smmu { struct domain *d; struct list_head viommu_list; +uint8_t sid_split; +uint32_tfeatures; +uint32_tcr[3]; +uint32_tcr0ack; +uint32_tgerror; +uint32_tgerrorn; +uint32_tstrtab_base_cfg; +uint64_tstrtab_base; +uint32_tirq_ctrl; +uint64_tgerror_irq_cfg0; +uint64_tevtq_irq_cfg0; +struct arm_vsmmu_queue evtq, cmdq; }; static int vsmmuv3_mmio_write(struct vcpu *v, mmio_info_t *info, register_t r, void *priv) { +struct virt_smmu *smmu = priv; +uint64_t reg; +uint32_t reg32; + +switch ( info->gpa & 0x ) +{ +case VREG32(ARM_SMMU_CR0): +reg32 = smmu->cr[0]; +vreg_reg32_update(®32, r, info); +smmu->cr[0] = reg32; +smmu->cr0ack = reg32 & ~CR0_RESERVED; +break; + +case VREG32(ARM_SMMU_CR1): +reg32 = smmu->cr[1]; +vreg_reg32_update(®32, r, info); +smmu->cr[1] = reg32; +break; + +case VREG32(ARM_SMMU_CR2): +reg32 = smmu->cr[2]; 
+vreg_reg32_update(®32, r, info); +smmu->cr[2] = reg32; +break; + +case VREG64(ARM_SMMU_STRTAB_BASE): +reg = smmu->strtab_base; +vreg_reg64_update(®, r, info); +smmu->strtab_base = reg; +break; + +case VREG32(ARM_SMMU_STRTAB_BASE_CFG): +reg32 = smmu->strtab_base_cfg; +vreg_reg32_update(®32, r, info); +smmu->strtab_base_cfg = reg32; + +smmu->sid_split = FIELD_GET(STRTAB_BASE_CFG_SPLIT, reg32); +smmu->features |= STRTAB_BASE_CFG_FMT_2LVL; +break; + +case VREG32(ARM_SMMU_CMDQ_BASE): +reg = smmu->cmdq.q_base; +vreg_reg64_update(®, r, info); +smmu->cmdq.q_base = reg; +smmu->cmdq.max_n_shift = FIELD_GET(Q_BASE_LOG2SIZE, smmu->cmdq.q_base); +if ( smmu->cmdq.max_n_shift > SMMU_CMDQS ) +smmu->cmdq.max_n_shift = SMMU_CMDQS; +break; + +case VREG32(ARM_SMMU_CMDQ_PROD): +reg32 = smmu->cmdq.prod; +vreg_reg32_update(®32, r, info); +smmu->cmdq.prod = reg32; +break; + +case VREG32(ARM_SMMU_CMDQ_CONS): +reg32 = smmu->cmdq.cons; +vreg_reg32_update(®32, r, info); +smmu->cmdq.cons = reg32; +break; + +case VREG32(ARM_SMMU_EVTQ_BASE): +reg = smmu->evtq.q_base; +vreg_reg64_update(®, r, info); +smmu->evtq.q_base = reg; +smmu->evtq.max_n_shift = FIELD_GET(Q_BASE_LOG2SIZE, smmu->evtq.q_base); +if ( smmu->cmdq.max_n_shift > SMMU_EVTQS ) +smmu->cmdq.max_n_shift = SMMU_EVTQS; +break; + +case VREG32(ARM_SMMU_EVTQ_PROD): +reg32 = smmu->evtq.prod; +vreg_reg32_update(®32, r, info); +smmu->evtq.prod = reg32; +break; + +case VREG32(ARM_SMMU_EVTQ_CONS): +reg32 = smmu->evtq.cons; +vreg_reg32_update(®32, r, info); +smmu->evtq.cons = reg32; +break; + +case VREG32(ARM_SMMU_IRQ_CTRL): +reg32 = smmu->irq_ctrl; +vreg_reg32_update(®32, r, info); +smmu->irq_ctrl = reg32; +break; + +case VREG64(ARM_SMMU_GERROR_IRQ_CFG0): +reg = smmu->gerror_irq_cfg0; +vreg_reg64_update(®, r, info); +smmu->gerror_irq_cfg0 = reg; +break; + +case VREG64(ARM_SMMU_EVTQ_IRQ_CFG0): +reg = smmu->evtq_irq_cfg0; +vreg_reg64_update(®, r, info); +smmu->evtq_irq_cfg0 = reg; +bre
[RFC PATCH 07/21] xen/arm: vIOMMU: Add cmdline boot option "viommu = "
Add cmdline boot option "viommu = " to enable or disable the virtual iommu support for guests on ARM. Signed-off-by: Rahul Singh --- docs/misc/xen-command-line.pandoc | 7 +++ xen/arch/arm/include/asm/viommu.h | 11 +++ xen/drivers/passthrough/arm/viommu.c | 9 + xen/drivers/passthrough/arm/vsmmu-v3.c | 3 +++ 4 files changed, 30 insertions(+) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index 424b12cfb2..14a104f2b9 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -1896,6 +1896,13 @@ This option can be specified more than once (up to 8 times at present). Flag to enable or disable support for PCI passthrough +### viommu (arm) +> `= ` + +> Default: `false` + +Flag to enable or disable support for Virtual IOMMU for guests. + ### pcid (x86) > `= | xpti=` diff --git a/xen/arch/arm/include/asm/viommu.h b/xen/arch/arm/include/asm/viommu.h index 4785877e2a..4de4cceeda 100644 --- a/xen/arch/arm/include/asm/viommu.h +++ b/xen/arch/arm/include/asm/viommu.h @@ -10,6 +10,7 @@ #include extern struct list_head host_iommu_list; +extern bool viommu_enabled; /* data structure for each hardware IOMMU */ struct host_iommu { @@ -50,6 +51,11 @@ uint16_t viommu_get_type(void); void add_to_host_iommu_list(paddr_t addr, paddr_t size, const struct dt_device_node *node); +static always_inline bool is_viommu_enabled(void) +{ +return viommu_enabled; +} + #else static inline uint8_t viommu_get_type(void) @@ -76,6 +82,11 @@ static inline void add_to_host_iommu_list(paddr_t addr, paddr_t size, return; } +static always_inline bool is_viommu_enabled(void) +{ +return false; +} + #endif /* CONFIG_VIRTUAL_IOMMU */ #endif /* __ARCH_ARM_VIOMMU_H__ */ diff --git a/xen/drivers/passthrough/arm/viommu.c b/xen/drivers/passthrough/arm/viommu.c index 53ae46349a..a1d6a04ba9 100644 --- a/xen/drivers/passthrough/arm/viommu.c +++ b/xen/drivers/passthrough/arm/viommu.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include @@ 
-38,8 +39,16 @@ void add_to_host_iommu_list(paddr_t addr, paddr_t size, list_add_tail(&iommu_data->entry, &host_iommu_list); } +/* By default viommu is disabled. */ +bool __read_mostly viommu_enabled; +boolean_param("viommu", viommu_enabled); + int domain_viommu_init(struct domain *d, uint16_t viommu_type) { +/* Enable viommu when it has been enabled explicitly (viommu=on). */ +if ( !viommu_enabled ) +return 0; + if ( viommu_type == XEN_DOMCTL_CONFIG_VIOMMU_NONE ) return 0; diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c b/xen/drivers/passthrough/arm/vsmmu-v3.c index 6b4009e5ef..e36f200ba5 100644 --- a/xen/drivers/passthrough/arm/vsmmu-v3.c +++ b/xen/drivers/passthrough/arm/vsmmu-v3.c @@ -105,6 +105,9 @@ void __init vsmmuv3_set_type(void) { const struct viommu_desc *desc = &vsmmuv3_desc; +if ( !is_viommu_enabled() ) +return; + if ( cur_viommu && (cur_viommu != desc) ) { printk("WARNING: Cannot set vIOMMU, already set to a different value\n"); -- 2.25.1
[RFC PATCH 06/21] xen/domctl: Add XEN_DOMCTL_CONFIG_VIOMMU_* and viommu config param
Add new viommu_type field and field values XEN_DOMCTL_CONFIG_VIOMMU_NONE XEN_DOMCTL_CONFIG_VIOMMU_SMMUV3 in xen_arch_domainconfig to enable/disable vIOMMU support for domains. Also add viommu="N" parameter to xl domain configuration to enable the vIOMMU for the domains. Currently, only the "smmuv3" type is supported for ARM. Signed-off-by: Rahul Singh --- docs/man/xl.cfg.5.pod.in | 11 +++ tools/golang/xenlight/helpers.gen.go | 2 ++ tools/golang/xenlight/types.gen.go | 1 + tools/include/libxl.h| 5 + tools/libs/light/libxl_arm.c | 13 + tools/libs/light/libxl_types.idl | 6 ++ tools/xl/xl_parse.c | 9 + 7 files changed, 47 insertions(+) diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in index ec444fb2ba..5854d777ed 100644 --- a/docs/man/xl.cfg.5.pod.in +++ b/docs/man/xl.cfg.5.pod.in @@ -2870,6 +2870,17 @@ Currently, only the "sbsa_uart" model is supported for ARM. =back +=item B + +To enable viommu, user must specify the following option in the VM +config file: + +viommu = "smmuv3" + +Currently, only the "smmuv3" type is supported for ARM. 
+ +=back + =head3 x86 =over 4 diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go index cb1bdf9bdf..8b6d771fc7 100644 --- a/tools/golang/xenlight/helpers.gen.go +++ b/tools/golang/xenlight/helpers.gen.go @@ -1117,6 +1117,7 @@ default: return fmt.Errorf("invalid union key '%v'", x.Type)} x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version) x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart) +x.ArchArm.Viommu = ViommuType(xc.arch_arm.viommu) if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil { return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err) } @@ -1602,6 +1603,7 @@ default: return fmt.Errorf("invalid union key '%v'", x.Type)} xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion) xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart) +xc.arch_arm.viommu = C.libxl_viommu_type(x.ArchArm.Viommu) if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil { return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err) } diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go index 871576fb0e..16c835ebeb 100644 --- a/tools/golang/xenlight/types.gen.go +++ b/tools/golang/xenlight/types.gen.go @@ -531,6 +531,7 @@ TypeUnion DomainBuildInfoTypeUnion ArchArm struct { GicVersion GicVersion Vuart VuartType +Viommu ViommuType } ArchX86 struct { MsrRelaxed Defbool diff --git a/tools/include/libxl.h b/tools/include/libxl.h index d652895075..49563f57bd 100644 --- a/tools/include/libxl.h +++ b/tools/include/libxl.h @@ -278,6 +278,11 @@ */ #define LIBXL_HAVE_BUILDINFO_ARCH_ARM_TEE 1 +/* + * libxl_domain_build_info has the arch_arm.viommu_type field. 
+ */ +#define LIBXL_HAVE_BUILDINFO_ARM_VIOMMU 1 + /* * LIBXL_HAVE_SOFT_RESET indicates that libxl supports performing * 'soft reset' for domains and there is 'soft_reset' shutdown reason diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index cd84a7c66e..b8eff10a41 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -179,6 +179,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, return ERROR_FAIL; } +switch (d_config->b_info.arch_arm.viommu_type) { +case LIBXL_VIOMMU_TYPE_NONE: +config->arch.viommu_type = XEN_DOMCTL_CONFIG_VIOMMU_NONE; +break; +case LIBXL_VIOMMU_TYPE_SMMUV3: +config->arch.viommu_type = XEN_DOMCTL_CONFIG_VIOMMU_SMMUV3; +break; +default: +LOG(ERROR, "Unknown vIOMMU type %d", +d_config->b_info.arch_arm.viommu_type); +return ERROR_FAIL; +} + return 0; } diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl index 9e3d33cb5a..06ee5ac6ba 100644 --- a/tools/libs/light/libxl_types.idl +++ b/tools/libs/light/libxl_types.idl @@ -492,6 +492,11 @@ libxl_tee_type = Enumeration("tee_type", [ (1, "optee") ], init_val = "LIBXL_TEE_TYPE_NONE") +libxl_viommu_type = Enumeration("viommu_type", [ +(0, "none"), +(1, "smmuv3") +], init_val = "LIBXL_VIOMMU_TYPE_NONE") + libxl_rdm_reserve = Struct("rdm_reserve", [ ("strategy",libxl_rdm_reserve_strategy), ("policy", libxl_rdm_reserve_policy), @@ -658,6 +663,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("arch_arm", Struct(None, [("gic_version", libxl_gic_version), ("vuart", libxl_vuart_type), + (&q
[RFC PATCH 04/21] xen/arm: vIOMMU: add generic vIOMMU framework
This patch adds basic framework for vIOMMU. Signed-off-by: Rahul Singh --- xen/arch/arm/domain.c| 17 +++ xen/arch/arm/domain_build.c | 3 ++ xen/arch/arm/include/asm/viommu.h| 70 xen/drivers/passthrough/Kconfig | 6 +++ xen/drivers/passthrough/arm/Makefile | 1 + xen/drivers/passthrough/arm/viommu.c | 48 +++ xen/include/public/arch-arm.h| 4 ++ 7 files changed, 149 insertions(+) create mode 100644 xen/arch/arm/include/asm/viommu.h create mode 100644 xen/drivers/passthrough/arm/viommu.c diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 38e22f12af..2a85209736 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include "vpci.h" @@ -691,6 +692,13 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config) return -EINVAL; } +if ( config->arch.viommu_type != XEN_DOMCTL_CONFIG_VIOMMU_NONE ) +{ +dprintk(XENLOG_INFO, +"vIOMMU type requested not supported by the platform or Xen\n"); +return -EINVAL; +} + return 0; } @@ -783,6 +791,9 @@ int arch_domain_create(struct domain *d, if ( (rc = domain_vpci_init(d)) != 0 ) goto fail; +if ( (rc = domain_viommu_init(d, config->arch.viommu_type)) != 0 ) +goto fail; + return 0; fail: @@ -998,6 +1009,7 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list) enum { PROG_pci = 1, PROG_tee, +PROG_viommu, PROG_xen, PROG_page, PROG_mapping, @@ -1048,6 +1060,11 @@ int domain_relinquish_resources(struct domain *d) if (ret ) return ret; +PROGRESS(viommu): +ret = viommu_relinquish_resources(d); +if (ret ) +return ret; + PROGRESS(xen): ret = relinquish_memory(d, &d->xenpage_list); if ( ret ) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index bd30d3798c..abbaf37a2e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include @@ -3858,6 +3859,7 @@ void __init create_domUs(void) struct domain *d; struct 
xen_domctl_createdomain d_cfg = { .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, +.arch.viommu_type = viommu_get_type(), .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, /* * The default of 1023 should be sufficient for guests because @@ -4052,6 +4054,7 @@ void __init create_dom0(void) printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n"); dom0_cfg.arch.tee_type = tee_get_type(); dom0_cfg.max_vcpus = dom0_max_vcpus(); +dom0_cfg.arch.viommu_type = viommu_get_type(); if ( iommu_enabled ) dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; diff --git a/xen/arch/arm/include/asm/viommu.h b/xen/arch/arm/include/asm/viommu.h new file mode 100644 index 00..7cd3818a12 --- /dev/null +++ b/xen/arch/arm/include/asm/viommu.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: (GPL-2.0-or-later OR BSD-2-Clause) */ +#ifndef __ARCH_ARM_VIOMMU_H__ +#define __ARCH_ARM_VIOMMU_H__ + +#ifdef CONFIG_VIRTUAL_IOMMU + +#include +#include +#include + +struct viommu_ops { +/* + * Called during domain construction if toolstack requests to enable + * vIOMMU support. + */ +int (*domain_init)(struct domain *d); + +/* + * Called during domain destruction to free resources used by vIOMMU. + */ +int (*relinquish_resources)(struct domain *d); +}; + +struct viommu_desc { +/* vIOMMU domains init/free operations described above. */ +const struct viommu_ops *ops; + +/* + * ID of vIOMMU. Corresponds to xen_arch_domainconfig.viommu_type. 
+ * Should be one of XEN_DOMCTL_CONFIG_VIOMMU_xxx + */ +uint16_t viommu_type; +}; + +int domain_viommu_init(struct domain *d, uint16_t viommu_type); +int viommu_relinquish_resources(struct domain *d); +uint16_t viommu_get_type(void); + +#else + +static inline uint8_t viommu_get_type(void) +{ +return XEN_DOMCTL_CONFIG_VIOMMU_NONE; +} + +static inline int domain_viommu_init(struct domain *d, uint16_t viommu_type) +{ +if ( likely(viommu_type == XEN_DOMCTL_CONFIG_VIOMMU_NONE) ) +return 0; + +return -ENODEV; +} + +static inline int viommu_relinquish_resources(struct domain *d) +{ +return 0; +} + +#endif /* CONFIG_VIRTUAL_IOMMU */ + +#endif /* __ARCH_ARM_VIOMMU_H__ */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig index 479d7de57a..19924fa2de 100644 --- a/xen/drivers/passthrough/Kconfig +++ b/xen/drivers/pas
[RFC PATCH 03/21] xen/arm: smmuv3: Alloc io_domain for each device
In the current implementation, io_domain is allocated once per Xen domain, as stage-2 translation is common to all devices in the same Xen domain. Nested stage supports S1 and S2 configuration at the same time. Stage-1 translation will be different for each device, as the Linux kernel allocates a page table per device. Allocate an io_domain for each device so that each device can have different stage-1 and stage-2 configuration structures.

Signed-off-by: Rahul Singh
---
 xen/drivers/passthrough/arm/smmu-v3.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 866fe8de4d..9174d2dedd 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -2753,11 +2753,13 @@ static struct arm_smmu_device *arm_smmu_get_by_dev(struct device *dev)
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
 						struct device *dev)
 {
+	unsigned long flags;
 	struct iommu_domain *io_domain;
 	struct arm_smmu_domain *smmu_domain;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
 	struct arm_smmu_device *smmu = arm_smmu_get_by_dev(fwspec->iommu_dev);
+	struct arm_smmu_master *master;
 
 	if (!smmu)
 		return NULL;
@@ -2768,8 +2770,15 @@ static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
 	 */
 	list_for_each_entry(io_domain, &xen_domain->contexts, list) {
 		smmu_domain = to_smmu_domain(io_domain);
-		if (smmu_domain->smmu == smmu)
-			return io_domain;
+
+		spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+		list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+			if (master->dev == dev) {
+				spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+				return io_domain;
+			}
+		}
+		spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 	}
 	return NULL;
 }
-- 
2.25.1
[RFC PATCH 02/21] xen/arm: smmuv3: Add support for stage-1 and nested stage translation
Xen SMMUv3 driver only supports stage-2 translation. Add support for Stage-1 translation that is required to support nested stage translation. In true nested mode, both s1_cfg and s2_cfg will coexist. Let's remove the union. When nested stage translation is setup, both s1_cfg and s2_cfg are valid. We introduce a new smmu_domain abort field that will be set upon guest stage-1 configuration passing. If no guest stage-1 config has been attached, it is ignored when writing the STE. arm_smmu_write_strtab_ent() is modified to write both stage fields in the STE and deal with the abort field. Signed-off-by: Rahul Singh --- xen/drivers/passthrough/arm/smmu-v3.c | 94 +++ xen/drivers/passthrough/arm/smmu-v3.h | 9 +++ 2 files changed, 92 insertions(+), 11 deletions(-) diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c index cbef3f8b36..866fe8de4d 100644 --- a/xen/drivers/passthrough/arm/smmu-v3.c +++ b/xen/drivers/passthrough/arm/smmu-v3.c @@ -686,8 +686,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, * 3. 
Update Config, sync */ u64 val = le64_to_cpu(dst[0]); - bool ste_live = false; + bool s1_live = false, s2_live = false, ste_live = false; + bool abort, translate = false; struct arm_smmu_device *smmu = NULL; + struct arm_smmu_s1_cfg *s1_cfg = NULL; struct arm_smmu_s2_cfg *s2_cfg = NULL; struct arm_smmu_domain *smmu_domain = NULL; struct arm_smmu_cmdq_ent prefetch_cmd = { @@ -702,30 +704,54 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, smmu = master->smmu; } - if (smmu_domain) - s2_cfg = &smmu_domain->s2_cfg; + if (smmu_domain) { + switch (smmu_domain->stage) { + case ARM_SMMU_DOMAIN_NESTED: + s1_cfg = &smmu_domain->s1_cfg; + fallthrough; + case ARM_SMMU_DOMAIN_S2: + s2_cfg = &smmu_domain->s2_cfg; + break; + default: + break; + } + translate = !!s1_cfg || !!s2_cfg; + } if (val & STRTAB_STE_0_V) { switch (FIELD_GET(STRTAB_STE_0_CFG, val)) { case STRTAB_STE_0_CFG_BYPASS: break; + case STRTAB_STE_0_CFG_S1_TRANS: + s1_live = true; + break; case STRTAB_STE_0_CFG_S2_TRANS: - ste_live = true; + s2_live = true; + break; + case STRTAB_STE_0_CFG_NESTED: + s1_live = true; + s2_live = true; break; case STRTAB_STE_0_CFG_ABORT: - BUG_ON(!disable_bypass); break; default: BUG(); /* STE corruption */ } } + ste_live = s1_live || s2_live; + /* Nuke the existing STE_0 value, as we're going to rewrite it */ val = STRTAB_STE_0_V; /* Bypass/fault */ - if (!smmu_domain || !(s2_cfg)) { - if (!smmu_domain && disable_bypass) + if (!smmu_domain) + abort = disable_bypass; + else + abort = smmu_domain->abort; + + if (abort || !translate) { + if (abort) val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT); else val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); @@ -743,8 +769,39 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, return; } + if (ste_live) { + /* First invalidate the live STE */ + dst[0] = cpu_to_le64(STRTAB_STE_0_CFG_ABORT); + arm_smmu_sync_ste_for_sid(smmu, sid); + } + + if (s1_cfg) { + 
BUG_ON(s1_live); + dst[1] = cpu_to_le64( +FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | +FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | +FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | +FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | +FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1)); + + if (smmu->features & ARM_SMMU_FEAT_STALLS && + !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) + dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); + + val |= (s1_cfg->s1ctxptr & STRTAB_STE_0_S1CTXPTR_MASK)
[RFC PATCH 01/21] xen/arm: smmuv3: Maintain a SID->device structure
From: Jean-Philippe Brucker

Backport Linux commit cdf315f907d4. This is the clean backport without any changes.

When handling faults from the event or PRI queue, we need to find the struct device associated with a SID. Add a rb_tree to keep track of SIDs.

Acked-by: Jonathan Cameron
Reviewed-by: Eric Auger
Reviewed-by: Keqian Zhu
Signed-off-by: Jean-Philippe Brucker
Acked-by: Will Deacon
Link: https://lore.kernel.org/r/20210401154718.307519-8-jean-phili...@linaro.org
Signed-off-by: Joerg Roedel
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git cdf315f907d4
Signed-off-by: Rahul Singh
---
 xen/drivers/passthrough/arm/smmu-v3.c | 131 +-
 xen/drivers/passthrough/arm/smmu-v3.h | 13 ++-
 2 files changed, 118 insertions(+), 26 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 9c9f463009..cbef3f8b36 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -810,6 +810,27 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	return 0;
 }
 
+__maybe_unused
+static struct arm_smmu_master *
+arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
+{
+	struct rb_node *node;
+	struct arm_smmu_stream *stream;
+
+	node = smmu->streams.rb_node;
+	while (node) {
+		stream = rb_entry(node, struct arm_smmu_stream, node);
+		if (stream->id < sid)
+			node = node->rb_right;
+		else if (stream->id > sid)
+			node = node->rb_left;
+		else
+			return stream->master;
+	}
+
+	return NULL;
+}
+
 /* IRQ and event handlers */
 static void arm_smmu_evtq_tasklet(void *dev)
 {
@@ -1047,8 +1068,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
 	if (!master->ats_enabled)
 		return 0;
 
-	for (i = 0; i < master->num_sids; i++) {
-		cmd->atc.sid = master->sids[i];
+	for (i = 0; i < master->num_streams; i++) {
+		cmd->atc.sid = master->streams[i].id;
 		arm_smmu_cmdq_issue_cmd(master->smmu, cmd);
 	}
 
@@ -1276,13 +1297,13 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 	int i, j;
 	struct arm_smmu_device *smmu = master->smmu;
 
-	for (i = 0; i < master->num_sids; ++i) {
-		u32 sid = master->sids[i];
+	for (i = 0; i < master->num_streams; ++i) {
+		u32 sid = master->streams[i].id;
 		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
 		/* Bridged PCI devices may end up with duplicated IDs */
 		for (j = 0; j < i; j++)
-			if (master->sids[j] == sid)
+			if (master->streams[j].id == sid)
 				break;
 		if (j < i)
 			continue;
@@ -1489,12 +1510,86 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 	return sid < limit;
 }
 
+static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
+				  struct arm_smmu_master *master)
+{
+	int i;
+	int ret = 0;
+	struct arm_smmu_stream *new_stream, *cur_stream;
+	struct rb_node **new_node, *parent_node = NULL;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
+
+	master->streams = _xzalloc_array(sizeof(*master->streams),
+					 sizeof(void *), fwspec->num_ids);
+	if (!master->streams)
+		return -ENOMEM;
+	master->num_streams = fwspec->num_ids;
+
+	mutex_lock(&smmu->streams_mutex);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		u32 sid = fwspec->ids[i];
+
+		new_stream = &master->streams[i];
+		new_stream->id = sid;
+		new_stream->master = master;
+
+		/*
+		 * Check the SIDs are in range of the SMMU and our stream table
+		 */
+		if (!arm_smmu_sid_in_range(smmu, sid)) {
+			ret = -ERANGE;
+			break;
+		}
+
+		/* Ensure l2 strtab is initialised */
+		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+			ret = arm_smmu_init_l2_strtab(smmu, sid);
+			if (ret)
+				break;
+		}
+
+		/* Insert into SID tree */
+		new_node = &(smmu->streams.rb_node);
+		while (*new_node) {
+			cur_stream = rb_entry(*new_node, struct arm_smmu_stream,
+					      node);
[RFC PATCH 00/21] Add SMMUv3 Stage 1 Support for XEN guests
The SMMUv3 supports two stages of translation. Each stage can be independently enabled. An incoming address is logically translated from VA to IPA in stage 1; the IPA is then input to stage 2, which translates it to the output PA. Stage 1 is intended to be used by a software entity to provide isolation or translation to buffers within that entity, for example DMA isolation within an OS. Stage 2 is intended to be available in systems supporting the Virtualization Extensions and is intended to virtualize device DMA to guest VM address spaces. When both stage 1 and stage 2 are enabled, the translation configuration is called nested.

Stage 1 translation support is required to provide isolation between different devices within an OS. Xen already supports stage 2 translation, but there is no support for stage 1 translation. The goal of this work is to support stage 1 translation for Xen guests. Stage 1 has to be configured within the guest to provide isolation. We cannot trust the guest OS to control the SMMUv3 hardware directly, as a compromised guest OS could corrupt the SMMUv3 configuration and make the system vulnerable. The guest gets ownership of the stage 1 page tables and also owns the stage 1 configuration structures. Xen handles the root configuration structure (for security reasons), including the stage 2 configuration. Xen will emulate the SMMUv3 hardware and expose a virtual SMMUv3 to the guest. The guest can use its native SMMUv3 driver to configure the stage 1 translation. When the guest configures the SMMUv3 for stage 1, Xen will trap the access and configure the hardware:

SMMUv3 driver (guest OS) -> configure the stage 1 translation -> Xen traps access -> Xen SMMUv3 driver configures the HW.
The SMMUv3 driver has to be updated to support stage 1 translation, based on the work done by the KVM team to support nested stage translation:
https://github.com/eauger/linux/commits/v5.11-stallv12-2stage-v14
https://lwn.net/Articles/852299/

As the stage 1 translation is configured by Xen on behalf of the guest, translation faults encountered during the translation process need to be propagated up and re-injected into the guest. When the guest invalidates stage 1 related caches, the invalidations must be forwarded to the SMMUv3 hardware.

This patch series is sent as RFC to get initial feedback from the community. The series consists of 21 patches, which is a big number to review, but we thought of sending it whole so the feature can be understood end-to-end. Once we get initial feedback, we will split the series into smaller batches for review.

Jean-Philippe Brucker (1):
  xen/arm: smmuv3: Maintain a SID->device structure

Rahul Singh (20):
  xen/arm: smmuv3: Add support for stage-1 and nested stage translation
  xen/arm: smmuv3: Alloc io_domain for each device
  xen/arm: vIOMMU: add generic vIOMMU framework
  xen/arm: vsmmuv3: Add dummy support for virtual SMMUv3 for guests
  xen/domctl: Add XEN_DOMCTL_CONFIG_VIOMMU_* and viommu config param
  xen/arm: vIOMMU: Add cmdline boot option "viommu = "
  xen/arm: vsmmuv3: Add support for registers emulation
  xen/arm: vsmmuv3: Add support for cmdqueue handling
  xen/arm: vsmmuv3: Add support for command CMD_CFGI_STE
  xen/arm: vsmmuv3: Attach Stage-1 configuration to SMMUv3 hardware
  xen/arm: vsmmuv3: Add support for event queue and global error
  xen/arm: vsmmuv3: Add "iommus" property node for dom0 devices
  xen/arm: vIOMMU: IOMMU device tree node for dom0
  xen/arm: vsmmuv3: Emulated SMMUv3 device tree node for dom0less
  arm/libxl: vsmmuv3: Emulated SMMUv3 device tree node in libxl
  xen/arm: vsmmuv3: Alloc virq for virtual SMMUv3
  xen/arm: iommu: skip the iommu-map property for PCI devices
  xen/arm: vsmmuv3: Add support to send stage-1 event to guest
  libxl/arm: vIOMMU: Modify the partial device tree for iommus
  xen/arm: vIOMMU: Modify the partial device tree for dom0less

 docs/man/xl.cfg.5.pod.in               |  11 +
 docs/misc/xen-command-line.pandoc      |   7 +
 tools/golang/xenlight/helpers.gen.go   |   2 +
 tools/golang/xenlight/types.gen.go     |   1 +
 tools/include/libxl.h                  |   5 +
 tools/libs/light/libxl_arm.c           | 121 +++-
 tools/libs/light/libxl_types.idl       |   6 +
 tools/xl/xl_parse.c                    |   9 +
 xen/arch/arm/domain.c                  |  18 +
 xen/arch/arm/domain_build.c            | 213 +-
 xen/arch/arm/include/asm/domain.h      |   4 +
 xen/arch/arm/include/asm/viommu.h      | 102 +++
 xen/drivers/passthrough/Kconfig        |  14 +
 xen/drivers/passthrough/arm/Makefile   |   2 +
 xen/drivers/passthrough/arm/smmu-v3.c  | 370 +--
 xen/drivers/passthrough/arm/smmu-v3.h  |  43 +-
 xen/drivers/passthrough/arm/viommu.c   |  87 +++
 xen/drivers/passthrough/arm/vsmmu-v3.c | 887 +
 xen/drivers/passthrough/arm/vsmmu-v3.h |  32 +
 xen/include/public/a
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Julien, > On 27 Oct 2022, at 5:33 pm, Julien Grall wrote: > > On 27/10/2022 17:08, Rahul Singh wrote: >> Hi Julien, > > Hi Rahul, > >>> On 26 Oct 2022, at 8:48 pm, Julien Grall wrote: >>> >>> >>> >>> On 26/10/2022 15:33, Rahul Singh wrote: >>>> Hi Julien, >>> >>> Hi Rahul, >>> >>>>> On 26 Oct 2022, at 2:36 pm, Julien Grall wrote: >>>>> >>>>> >>>>> >>>>> On 26/10/2022 14:17, Rahul Singh wrote: >>>>>> Hi All, >>>>> >>>>> Hi Rahul, >>>>> >>>>>> At Arm, we started to implement the POC to support 2 levels of page >>>>>> tables/nested translation in SMMUv3. >>>>>> To support nested translation for guest OS Xen needs to expose the >>>>>> virtual IOMMU. If we passthrough the >>>>>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled >>>>>> for the guest there is a need to >>>>>> add IOMMU binding for the device in the passthrough node as per [1]. >>>>>> This email is to get an agreement on >>>>>> how to add the IOMMU binding for guest OS. >>>>>> Before I will explain how to add the IOMMU binding let me give a brief >>>>>> overview of how we will add support for virtual >>>>>> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3 Nested >>>>>> translation support. SMMUv3 hardware >>>>>> supports two stages of translation. Each stage of translation can be >>>>>> independently enabled. An incoming address is logically >>>>>> translated from VA to IPA in stage 1, then the IPA is input to stage 2 >>>>>> which translates the IPA to the output PA. Stage 1 is >>>>>> intended to be used by a software entity( Guest OS) to provide isolation >>>>>> or translation to buffers within the entity, for example, >>>>>> DMA isolation within an OS. Stage 2 is intended to be available in >>>>>> systems supporting the Virtualization Extensions and is >>>>>> intended to virtualize device DMA to guest VM address spaces. When both >>>>>> stage 1 and stage 2 are enabled, the translation >>>>>> configuration is called nesting. 
>>>>>> Stage 1 translation support is required to provide isolation between >>>>>> different devices within the guest OS. XEN already supports >>>>>> Stage 2 translation but there is no support for Stage 1 translation for >>>>>> guests. We will add support for guests to configure >>>>>> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU >>>>>> hardware and exposes the virtual SMMU to the guest. >>>>>> Guest can use the native SMMU driver to configure the stage 1 >>>>>> translation. When the guest configures the SMMU for Stage 1, >>>>>> XEN will trap the access and configure the hardware accordingly. >>>>>> Now back to the question of how we can add the IOMMU binding between the >>>>>> virtual IOMMU and the master devices so that >>>>>> guests can configure the IOMMU correctly. The solution that I am >>>>>> suggesting is as below: >>>>>> For dom0, while handling the DT node(handle_node()) Xen will replace the >>>>>> phandle in the "iommus" property with the virtual >>>>>> IOMMU node phandle. >>>>> Below, you said that each IOMMUs may have a different ID space. So >>>>> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the >>>>> user to specify the mapping? >>>> Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This >>>> also helps in the ACPI case >>>> where we don’t need to modify the tables to delete the pIOMMU entries and >>>> create one vIOMMU. >>>> In this case, no need to replace the phandle as Xen create the vIOMMU with >>>> the same pIOMMU >>>> phandle and same base address. >>>> For domU guests one vIOMMU per guest will be created. >>> >>> IIRC, the SMMUv3 is using a ring like the GICv3 ITS. I think we need to be >>> open here because this may end up to be tricky to security support it (we >>> have N guest ring that can write to M host ring
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Oleksandr, > On 26 Oct 2022, at 7:23 pm, Oleksandr Tyshchenko wrote: > > > > On Wed, Oct 26, 2022 at 8:18 PM Michal Orzel wrote: > Hi Rahul, > > > Hello all > > [sorry for the possible format issues] > > > On 26/10/2022 16:33, Rahul Singh wrote: > > > > > > Hi Julien, > > > >> On 26 Oct 2022, at 2:36 pm, Julien Grall wrote: > >> > >> > >> > >> On 26/10/2022 14:17, Rahul Singh wrote: > >>> Hi All, > >> > >> Hi Rahul, > >> > >>> At Arm, we started to implement the POC to support 2 levels of page > >>> tables/nested translation in SMMUv3. > >>> To support nested translation for guest OS Xen needs to expose the > >>> virtual IOMMU. If we passthrough the > >>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled > >>> for the guest there is a need to > >>> add IOMMU binding for the device in the passthrough node as per [1]. This > >>> email is to get an agreement on > >>> how to add the IOMMU binding for guest OS. > >>> Before I will explain how to add the IOMMU binding let me give a brief > >>> overview of how we will add support for virtual > >>> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3 Nested > >>> translation support. SMMUv3 hardware > >>> supports two stages of translation. Each stage of translation can be > >>> independently enabled. An incoming address is logically > >>> translated from VA to IPA in stage 1, then the IPA is input to stage 2 > >>> which translates the IPA to the output PA. Stage 1 is > >>> intended to be used by a software entity( Guest OS) to provide isolation > >>> or translation to buffers within the entity, for example, > >>> DMA isolation within an OS. Stage 2 is intended to be available in > >>> systems supporting the Virtualization Extensions and is > >>> intended to virtualize device DMA to guest VM address spaces. When both > >>> stage 1 and stage 2 are enabled, the translation > >>> configuration is called nesting. 
> >>> Stage 1 translation support is required to provide isolation between > >>> different devices within the guest OS. XEN already supports > >>> Stage 2 translation but there is no support for Stage 1 translation for > >>> guests. We will add support for guests to configure > >>> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU > >>> hardware and exposes the virtual SMMU to the guest. > >>> Guest can use the native SMMU driver to configure the stage 1 > >>> translation. When the guest configures the SMMU for Stage 1, > >>> XEN will trap the access and configure the hardware accordingly. > >>> Now back to the question of how we can add the IOMMU binding between the > >>> virtual IOMMU and the master devices so that > >>> guests can configure the IOMMU correctly. The solution that I am > >>> suggesting is as below: > >>> For dom0, while handling the DT node(handle_node()) Xen will replace the > >>> phandle in the "iommus" property with the virtual > >>> IOMMU node phandle. > >> Below, you said that each IOMMUs may have a different ID space. So > >> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the > >> user to specify the mapping? > > > > Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This > > also helps in the ACPI case > > where we don’t need to modify the tables to delete the pIOMMU entries and > > create one vIOMMU. > > In this case, no need to replace the phandle as Xen create the vIOMMU with > > the same pIOMMU > > phandle and same base address. > > > > For domU guests one vIOMMU per guest will be created. 
> > > >> > >>> For domU guests, when passthrough the device to the guest as per [2], > >>> add the below property in the partial device tree > >>> node that is required to describe the generic device tree binding for > >>> IOMMUs and their master(s) > >>> "iommus = < &magic_phandle 0xvMasterID> > >>> • magic_phandle will be the phandle ( vIOMMU phandle in xl) that > >>> will be documented so that the user can set that in partial DT node > >>> (0xfdea). > >> > >> Does this mean only one IOMMU will be supported in the guest? > > &g
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Michal, > On 26 Oct 2022, at 6:17 pm, Michal Orzel wrote: > > Hi Rahul, > > On 26/10/2022 16:33, Rahul Singh wrote: >> >> >> Hi Julien, >> >>> On 26 Oct 2022, at 2:36 pm, Julien Grall wrote: >>> >>> >>> >>> On 26/10/2022 14:17, Rahul Singh wrote: >>>> Hi All, >>> >>> Hi Rahul, >>> >>>> At Arm, we started to implement the POC to support 2 levels of page >>>> tables/nested translation in SMMUv3. >>>> To support nested translation for guest OS Xen needs to expose the virtual >>>> IOMMU. If we passthrough the >>>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled >>>> for the guest there is a need to >>>> add IOMMU binding for the device in the passthrough node as per [1]. This >>>> email is to get an agreement on >>>> how to add the IOMMU binding for guest OS. >>>> Before I will explain how to add the IOMMU binding let me give a brief >>>> overview of how we will add support for virtual >>>> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3 Nested >>>> translation support. SMMUv3 hardware >>>> supports two stages of translation. Each stage of translation can be >>>> independently enabled. An incoming address is logically >>>> translated from VA to IPA in stage 1, then the IPA is input to stage 2 >>>> which translates the IPA to the output PA. Stage 1 is >>>> intended to be used by a software entity( Guest OS) to provide isolation >>>> or translation to buffers within the entity, for example, >>>> DMA isolation within an OS. Stage 2 is intended to be available in systems >>>> supporting the Virtualization Extensions and is >>>> intended to virtualize device DMA to guest VM address spaces. When both >>>> stage 1 and stage 2 are enabled, the translation >>>> configuration is called nesting. >>>> Stage 1 translation support is required to provide isolation between >>>> different devices within the guest OS. XEN already supports >>>> Stage 2 translation but there is no support for Stage 1 translation for >>>> guests. 
We will add support for guests to configure >>>> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU >>>> hardware and exposes the virtual SMMU to the guest. >>>> Guest can use the native SMMU driver to configure the stage 1 translation. >>>> When the guest configures the SMMU for Stage 1, >>>> XEN will trap the access and configure the hardware accordingly. >>>> Now back to the question of how we can add the IOMMU binding between the >>>> virtual IOMMU and the master devices so that >>>> guests can configure the IOMMU correctly. The solution that I am >>>> suggesting is as below: >>>> For dom0, while handling the DT node(handle_node()) Xen will replace the >>>> phandle in the "iommus" property with the virtual >>>> IOMMU node phandle. >>> Below, you said that each IOMMUs may have a different ID space. So >>> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the >>> user to specify the mapping? >> >> Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This >> also helps in the ACPI case >> where we don’t need to modify the tables to delete the pIOMMU entries and >> create one vIOMMU. >> In this case, no need to replace the phandle as Xen create the vIOMMU with >> the same pIOMMU >> phandle and same base address. >> >> For domU guests one vIOMMU per guest will be created. >> >>> >>>> For domU guests, when passthrough the device to the guest as per [2], add >>>> the below property in the partial device tree >>>> node that is required to describe the generic device tree binding for >>>> IOMMUs and their master(s) >>>> "iommus = < &magic_phandle 0xvMasterID> >>>> • magic_phandle will be the phandle ( vIOMMU phandle in xl) that will >>>> be documented so that the user can set that in partial DT node (0xfdea). >>> >>> Does this mean only one IOMMU will be supported in the guest? >> >> Yes. >> >>> >>>> • vMasterID will be the virtual master ID that the user will provide. 
>>>> The partial device tree will look like this: >>>> /dts-v1/; >>>> / { >>>>/
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Julien, > On 26 Oct 2022, at 8:48 pm, Julien Grall wrote: > > > > On 26/10/2022 15:33, Rahul Singh wrote: >> Hi Julien, > > Hi Rahul, > >>> On 26 Oct 2022, at 2:36 pm, Julien Grall wrote: >>> >>> >>> >>> On 26/10/2022 14:17, Rahul Singh wrote: >>>> Hi All, >>> >>> Hi Rahul, >>> >>>> At Arm, we started to implement the POC to support 2 levels of page >>>> tables/nested translation in SMMUv3. >>>> To support nested translation for guest OS Xen needs to expose the virtual >>>> IOMMU. If we passthrough the >>>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled >>>> for the guest there is a need to >>>> add IOMMU binding for the device in the passthrough node as per [1]. This >>>> email is to get an agreement on >>>> how to add the IOMMU binding for guest OS. >>>> Before I will explain how to add the IOMMU binding let me give a brief >>>> overview of how we will add support for virtual >>>> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3 Nested >>>> translation support. SMMUv3 hardware >>>> supports two stages of translation. Each stage of translation can be >>>> independently enabled. An incoming address is logically >>>> translated from VA to IPA in stage 1, then the IPA is input to stage 2 >>>> which translates the IPA to the output PA. Stage 1 is >>>> intended to be used by a software entity( Guest OS) to provide isolation >>>> or translation to buffers within the entity, for example, >>>> DMA isolation within an OS. Stage 2 is intended to be available in systems >>>> supporting the Virtualization Extensions and is >>>> intended to virtualize device DMA to guest VM address spaces. When both >>>> stage 1 and stage 2 are enabled, the translation >>>> configuration is called nesting. >>>> Stage 1 translation support is required to provide isolation between >>>> different devices within the guest OS. XEN already supports >>>> Stage 2 translation but there is no support for Stage 1 translation for >>>> guests. 
We will add support for guests to configure >>>> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU >>>> hardware and exposes the virtual SMMU to the guest. >>>> Guest can use the native SMMU driver to configure the stage 1 translation. >>>> When the guest configures the SMMU for Stage 1, >>>> XEN will trap the access and configure the hardware accordingly. >>>> Now back to the question of how we can add the IOMMU binding between the >>>> virtual IOMMU and the master devices so that >>>> guests can configure the IOMMU correctly. The solution that I am >>>> suggesting is as below: >>>> For dom0, while handling the DT node(handle_node()) Xen will replace the >>>> phandle in the "iommus" property with the virtual >>>> IOMMU node phandle. >>> Below, you said that each IOMMUs may have a different ID space. So >>> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the >>> user to specify the mapping? >> Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This >> also helps in the ACPI case >> where we don’t need to modify the tables to delete the pIOMMU entries and >> create one vIOMMU. >> In this case, no need to replace the phandle as Xen create the vIOMMU with >> the same pIOMMU >> phandle and same base address. >> For domU guests one vIOMMU per guest will be created. > > IIRC, the SMMUv3 is using a ring like the GICv3 ITS. I think we need to be > open here because this may end up to be tricky to security support it (we > have N guest ring that can write to M host ring). If xl want to creates the one vIOMMU per pIOMMU for domU then xl needs to know the below information: - Find the number of holes in guest memory same as the number of vIOMMU that needs the creation to create the vIOMMU DT nodes. (Think about a big system that has 50+ IOMMUs) Yes, we will create vIOMMU for only those devices that are assigned to guests but still we need to find the hole in guest memory. 
- Find the pIOMMU attached to the assigned device and create a mapping b/w vIOMMU -> pIOMMU to register the MMIO handler. Either we need to modify the current hypercall or implement a new hypercall to find this information. Because of the above reasons I thought of creating one vIOMMU per domU. Yes
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Julien, > On 26 Oct 2022, at 2:36 pm, Julien Grall wrote: > > > > On 26/10/2022 14:17, Rahul Singh wrote: >> Hi All, > > Hi Rahul, > >> At Arm, we started to implement the POC to support 2 levels of page >> tables/nested translation in SMMUv3. >> To support nested translation for guest OS Xen needs to expose the virtual >> IOMMU. If we passthrough the >> device to the guest that is behind an IOMMU and virtual IOMMU is enabled for >> the guest there is a need to >> add IOMMU binding for the device in the passthrough node as per [1]. This >> email is to get an agreement on >> how to add the IOMMU binding for guest OS. >> Before I will explain how to add the IOMMU binding let me give a brief >> overview of how we will add support for virtual >> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3 Nested >> translation support. SMMUv3 hardware >> supports two stages of translation. Each stage of translation can be >> independently enabled. An incoming address is logically >> translated from VA to IPA in stage 1, then the IPA is input to stage 2 which >> translates the IPA to the output PA. Stage 1 is >> intended to be used by a software entity( Guest OS) to provide isolation or >> translation to buffers within the entity, for example, >> DMA isolation within an OS. Stage 2 is intended to be available in systems >> supporting the Virtualization Extensions and is >> intended to virtualize device DMA to guest VM address spaces. When both >> stage 1 and stage 2 are enabled, the translation >> configuration is called nesting. >> Stage 1 translation support is required to provide isolation between >> different devices within the guest OS. XEN already supports >> Stage 2 translation but there is no support for Stage 1 translation for >> guests. We will add support for guests to configure >> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU hardware >> and exposes the virtual SMMU to the guest. 
>> Guest can use the native SMMU driver to configure the stage 1 translation. >> When the guest configures the SMMU for Stage 1, >> XEN will trap the access and configure the hardware accordingly. >> Now back to the question of how we can add the IOMMU binding between the >> virtual IOMMU and the master devices so that >> guests can configure the IOMMU correctly. The solution that I am suggesting >> is as below: >> For dom0, while handling the DT node(handle_node()) Xen will replace the >> phandle in the "iommus" property with the virtual >> IOMMU node phandle. > Below, you said that each IOMMUs may have a different ID space. So shouldn't > we expose one vIOMMU per pIOMMU? If not, how do you expect the user to > specify the mapping? Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This also helps in the ACPI case where we don’t need to modify the tables to delete the pIOMMU entries and create one vIOMMU. In this case, no need to replace the phandle as Xen create the vIOMMU with the same pIOMMU phandle and same base address. For domU guests one vIOMMU per guest will be created. > >> For domU guests, when passthrough the device to the guest as per [2], add >> the below property in the partial device tree >> node that is required to describe the generic device tree binding for IOMMUs >> and their master(s) >> "iommus = < &magic_phandle 0xvMasterID> >> • magic_phandle will be the phandle ( vIOMMU phandle in xl) that will >> be documented so that the user can set that in partial DT node (0xfdea). > > Does this mean only one IOMMU will be supported in the guest? Yes. > >> • vMasterID will be the virtual master ID that the user will provide. 
>> The partial device tree will look like this: >> /dts-v1/; >> / { >> /* #*cells are here to keep DTC happy */ >> #address-cells = <2>; >> #size-cells = <2>; >> aliases { >> net = &mac0; >> }; >> passthrough { >> compatible = "simple-bus"; >> ranges; >> #address-cells = <2>; >> #size-cells = <2>; >> mac0: ethernet@1000 { >> compatible = "calxeda,hb-xgmac"; >> reg = <0 0x1000 0 0x1000>; >> interrupts = <0 80 4 0 81 4 0 82 4>; >>iommus = <0xfdea 0x01>; >> }; >> }; >> }; >> In xl.cfg we need to define a new option to inform Xen about vMasterId to >> pMasterId mapping and to whi
Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi All,

At Arm, we started to implement a POC to support two levels of page tables/nested translation in SMMUv3. To support nested translation for a guest OS, Xen needs to expose a virtual IOMMU. If we passthrough a device that is behind an IOMMU to a guest, and virtual IOMMU is enabled for the guest, there is a need to add the IOMMU binding for the device in the passthrough node as per [1]. This email is to get an agreement on how to add the IOMMU binding for the guest OS.

Before I explain how to add the IOMMU binding, let me give a brief overview of how we will add support for virtual IOMMU on Arm. In order to implement virtual IOMMU, Xen needs SMMUv3 nested translation support. SMMUv3 hardware supports two stages of translation. Each stage can be independently enabled. An incoming address is logically translated from VA to IPA in stage 1; the IPA is then input to stage 2, which translates it to the output PA. Stage 1 is intended to be used by a software entity (guest OS) to provide isolation or translation to buffers within the entity, for example DMA isolation within an OS. Stage 2 is intended to be available in systems supporting the Virtualization Extensions and is intended to virtualize device DMA to guest VM address spaces. When both stage 1 and stage 2 are enabled, the translation configuration is called nesting.

Stage 1 translation support is required to provide isolation between different devices within the guest OS. Xen already supports stage 2 translation, but there is no support for stage 1 translation for guests. We will add support for guests to configure the stage 1 translation via virtual IOMMU. Xen will emulate the SMMU hardware and expose the virtual SMMU to the guest. The guest can use the native SMMU driver to configure the stage 1 translation. When the guest configures the SMMU for stage 1, Xen will trap the access and configure the hardware accordingly.
Now back to the question of how we can add the IOMMU binding between the virtual IOMMU and the master devices so that guests can configure the IOMMU correctly. The solution I am suggesting is as below:

For dom0, while handling the DT node (handle_node()) Xen will replace the phandle in the "iommus" property with the virtual IOMMU node phandle.

For domU guests, when passing through a device to the guest as per [2], add the below property in the partial device tree node; it is required to describe the generic device tree binding for IOMMUs and their master(s):

"iommus = <&magic_phandle 0xvMasterID>"

  • magic_phandle will be the phandle (vIOMMU phandle in xl) that will be documented so that the user can set it in the partial DT node (0xfdea).
  • vMasterID will be the virtual master ID that the user will provide.

The partial device tree will look like this:

/dts-v1/;
/ {
    /* #*cells are here to keep DTC happy */
    #address-cells = <2>;
    #size-cells = <2>;

    aliases {
        net = &mac0;
    };

    passthrough {
        compatible = "simple-bus";
        ranges;
        #address-cells = <2>;
        #size-cells = <2>;

        mac0: ethernet@1000 {
            compatible = "calxeda,hb-xgmac";
            reg = <0 0x1000 0 0x1000>;
            interrupts = <0 80 4 0 81 4 0 82 4>;
            iommus = <0xfdea 0x01>;
        };
    };
};

In xl.cfg we need to define a new option to inform Xen about the vMasterId to pMasterId mapping, and to which IOMMU device the master device is connected, so that Xen can configure the right IOMMU. This is required if the system has devices that have the same master ID but are behind different IOMMUs:

iommu_devid_map = [ "PMASTER_ID[@VMASTER_ID],IOMMU_BASE_ADDRESS", "PMASTER_ID[@VMASTER_ID],IOMMU_BASE_ADDRESS" ]

  • PMASTER_ID is the physical master ID of the device from the physical DT.
  • VMASTER_ID is the virtual master ID that the user will configure in the partial device tree.
  • IOMMU_BASE_ADDRESS is the base address of the physical IOMMU device to which this device is connected.
Example: Let's say the user wants to assign the below physical device in DT to the guest.

iommu@4f00 {
    compatible = "arm,smmu-v3";
    interrupts = <0x00 0xe4 0xf04>;
    interrupt-parent = <0x01>;
    #iommu-cells = <0x01>;
    interrupt-names = "combined";
    reg = <0x00 0x4f00 0x00 0x4>;
    phandle = <0xfdeb>;
    name = "iommu";
};

test@1000 {
    compatible = "viommu-test";
    iommus = <0xfdeb 0x10>;
    interrupts = <0x00 0xff 0x04>;
    reg = <0x00 0x1000 0x00 0x1000>;
    name = "viommu-test";
};

The partial device tree node will be like this:

/ {
    /* #*cells are here to keep DTC happy */
    #address-cells = <2>;
    #size-cells = <2>;

    passthrough {
        compatible = "simple-bus";
        ranges;
        #address-cells = <2>;
        #size-cell
Re: [PATCH 0/2] xen/arm: static event channel
Hi All,

> On 26 Sep 2022, at 1:12 pm, Bertrand Marquis wrote:
>
> Hi Rahul,
>
> Please give the necessary justification for inclusion in 4.17:
> - severity of the bug fixed

The severity of the bug is high: without this fix, a system with ACPI support will fail to boot.

> - probability and impact of potential issues that the patch could add.

As we are not supporting the static event channel for ACPI, it is okay to move alloc_static_evtchn() under the acpi_disabled check.

Regards,
Rahul
Re: [PATCH 2/2] xen/arm: fix booting ACPI based system after static evtchn series
Hi Ayan, > On 23 Sep 2022, at 1:10 pm, Ayan Kumar Halder wrote: > > Hi Rahul, > > On 23/09/2022 12:02, Rahul Singh wrote: >> When ACPI is enabled and the system booted with ACPI, BUG() is observed >> after merging the static event channel series. As there is not DT when > [NIT] : s/not/no Ack. >> booted with ACPI there will be no chosen node because of that >> "BUG_ON(chosen == NULL)" will be hit. >> >> (XEN) Xen BUG at arch/arm/domain_build.c:3578 > Is the bug seen on the gitlab ci ? No, I found the issue while testing the ACPI boot. But going forward we will add this in our internal ci. >> >> Move call to alloc_static_evtchn() under acpi_disabled check to fix the >> issue. >> >> Fixes: 1fe16b3ed78a (xen/arm: introduce xen-evtchn dom0less property) >> Signed-off-by: Rahul Singh >> --- >> xen/arch/arm/setup.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c >> index 61b4f258a0..4395640019 100644 >> --- a/xen/arch/arm/setup.c >> +++ b/xen/arch/arm/setup.c >> @@ -1166,9 +1166,10 @@ void __init start_xen(unsigned long boot_phys_offset, >> printk(XENLOG_INFO "Xen dom0less mode detected\n"); >> >> if ( acpi_disabled ) >> +{ >> create_domUs(); >> - >> -alloc_static_evtchn(); >> +alloc_static_evtchn(); > > Can the code in alloc_static_evtchn() be guarded with "#ifndef CONFIG_ACPI > ... endif" ? Not required as acpi_disabled will take care of that. The acpi_disabled variable is used to avoid CONFIG_ACPI ifdefs. Regards, Rahul
[PATCH 2/2] xen/arm: fix booting ACPI based system after static evtchn series
When ACPI is enabled and the system booted with ACPI, BUG() is observed after merging the static event channel series. As there is no DT when booted with ACPI, there will be no chosen node and "BUG_ON(chosen == NULL)" will be hit. (XEN) Xen BUG at arch/arm/domain_build.c:3578 Move the call to alloc_static_evtchn() under the acpi_disabled check to fix the issue. Fixes: 1fe16b3ed78a (xen/arm: introduce xen-evtchn dom0less property) Signed-off-by: Rahul Singh --- xen/arch/arm/setup.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 61b4f258a0..4395640019 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -1166,9 +1166,10 @@ void __init start_xen(unsigned long boot_phys_offset, printk(XENLOG_INFO "Xen dom0less mode detected\n"); if ( acpi_disabled ) +{ create_domUs(); - -alloc_static_evtchn(); +alloc_static_evtchn(); +} /* * This needs to be called **before** heap_init_late() so modules -- 2.25.1
[PATCH 1/2] xen: Add static event channel in SUPPORT.md on ARM
Static event channel support is tech preview; this shall be documented in SUPPORT.md. Signed-off-by: Rahul Singh --- SUPPORT.md | 7 +++ 1 file changed, 7 insertions(+) diff --git a/SUPPORT.md b/SUPPORT.md index 8ebd63ad82..29f74ac506 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -922,6 +922,13 @@ bootscrub=off are passed as Xen command line parameters. (Memory should be scrubbed with bootscrub=idle.) No XSAs will be issued due to unscrubbed memory. +## Static Event Channel + +Allow to setup the static event channel on dom0less system, enabling domains +to send/receive notifications. + +Status, ARM: Tech Preview + # Format and definitions This file contains prose, and machine-readable fragments. -- 2.25.1
[PATCH 0/2] xen/arm: static event channel
This patch series fixes issues related to the static event channel series. Rahul Singh (2): xen: Add static event channel in SUPPORT.md on ARM xen/arm: fix booting ACPI based system after static evtchn series SUPPORT.md | 7 +++ xen/arch/arm/setup.c | 5 +++-- 2 files changed, 10 insertions(+), 2 deletions(-) -- 2.25.1
[PATCH v6 2/2] xen/pci: replace call to is_memory_hole to pci_check_bar
is_memory_hole was implemented for x86 and not for ARM when introduced. Replace the is_memory_hole call with pci_check_bar, as the function should check whether a device BAR is within a defined memory range. Also, add an implementation for ARM, which is required for PCI passthrough. On x86, pci_check_bar will call is_memory_hole, which checks that the BAR does not overlap with any memory region defined in the memory map. On ARM, pci_check_bar will go through the host bridge ranges and check whether the BAR falls within the defined ranges. Signed-off-by: Rahul Singh Acked-by: Jan Beulich --- Changes in v6: - change from unsigned long to paddr_t Changes in v5: - drop use of PFN_UP and PFN_DOWN in case addresses are not aligned. - As we drop the PFN_UP and PFN_DOWN we need to use mfn_to_maddr() to get the BAR address without the page shift. - Add TODO comment for address alignment check for ranges. - Added Jan's Acked-by for x86 and common code. Changes in v4: - check "s <= e" before callback - Add TODO comment for revisiting the function pci_check_bar() when ACPI PCI passthrough support is added. - Did not add Jan's Acked-by as the patch was modified. 
Changes in v3: - fix minor comments --- --- xen/arch/arm/include/asm/pci.h | 2 ++ xen/arch/arm/pci/pci-host-common.c | 54 ++ xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 70 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 80a2431804..8cb46f6b71 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -126,6 +126,8 @@ int pci_host_iterate_bridges_and_count(struct domain *d, int pci_host_bridge_mappings(struct domain *d); +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end); + #else /*!CONFIG_HAS_PCI*/ struct arch_pci_dev { }; diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index 89ef30028e..a8ece94303 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -24,6 +24,16 @@ #include +/* + * struct to hold pci device bar. + */ +struct pdev_bar_check +{ +paddr_t start; +paddr_t end; +bool is_valid; +}; + /* * List for all the pci host bridges. */ @@ -363,6 +373,50 @@ int __init pci_host_bridge_mappings(struct domain *d) return 0; } +/* + * TODO: BAR addresses and Root Complex window addresses are not guaranteed + * to be page aligned. We should check for alignment but this is not the + * right place for alignment check. + */ +static int is_bar_valid(const struct dt_device_node *dev, +paddr_t addr, paddr_t len, void *data) +{ +struct pdev_bar_check *bar_data = data; +paddr_t s = bar_data->start; +paddr_t e = bar_data->end; + +if ( (s >= addr) && (e <= (addr + len - 1)) ) +bar_data->is_valid = true; + +return 0; +} + +/* TODO: Revisit this function when ACPI PCI passthrough support is added. 
*/ +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end) +{ +int ret; +const struct dt_device_node *dt_node; +paddr_t s = mfn_to_maddr(start); +paddr_t e = mfn_to_maddr(end); +struct pdev_bar_check bar_data = { +.start = s, +.end = e, +.is_valid = false +}; + +if ( s >= e ) +return false; + +dt_node = pci_find_host_bridge_node(pdev); +if ( !dt_node ) +return false; + +ret = dt_for_each_range(dt_node, &is_bar_valid, &bar_data); +if ( ret < 0 ) +return false; + +return bar_data.is_valid; +} /* * Local variables: * mode: C diff --git a/xen/arch/x86/include/asm/pci.h b/xen/arch/x86/include/asm/pci.h index c8e1a9ecdb..f4a58c8acf 100644 --- a/xen/arch/x86/include/asm/pci.h +++ b/xen/arch/x86/include/asm/pci.h @@ -57,4 +57,14 @@ static always_inline bool is_pci_passthrough_enabled(void) void arch_pci_init_pdev(struct pci_dev *pdev); +static inline bool pci_check_bar(const struct pci_dev *pdev, + mfn_t start, mfn_t end) +{ +/* + * Check if BAR is not overlapping with any memory region defined + * in the memory map. + */ +return is_memory_hole(start, end); +} + #endif /* __X86_PCI_H__ */ diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index cdaf5c247f..149f68bb6e 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -304,8 +304,8 @@ static void check_pdev(const struct pci_dev *pdev) if ( rc < 0 ) /* Unable to size, better leave memory decoding disabled. */ return; -if ( size && !is_memory_hole(maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) +if ( size && !pci_check_bar(pdev, maddr_to_mfn(add
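Stripped of the Xen plumbing, the containment test that the ARM pci_check_bar() applies per host-bridge range can be modelled as below. This is a rough Python sketch for illustration only; function and parameter names are mine, not Xen's, and the real code walks the "ranges" property via dt_for_each_range() rather than a Python list.

```python
def bar_within_ranges(bar_start, bar_end, ranges):
    """Check whether [bar_start, bar_end] (inclusive addresses) lies
    entirely within any single (addr, length) host-bridge range,
    mirroring the containment test in is_bar_valid()."""
    if bar_start >= bar_end:          # pci_check_bar() rejects s >= e
        return False
    return any(bar_start >= addr and bar_end <= addr + length - 1
               for addr, length in ranges)
```

Note the BAR must fit inside one range; a BAR straddling two adjacent ranges is rejected, just like in the patch.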
[PATCH v6 1/2] xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev
Modify pci_find_host_bridge_node argument to const pdev to avoid converting the dev to pdev in pci_find_host_bridge_node and also constify the return. Signed-off-by: Rahul Singh Acked-by: Stefano Stabellini Reviewed-by: Oleksandr Tyshchenko --- Changes in v6: - no changes Changes in v5: - no changes Changes in v4: - no changes Changes in v3: - no changes --- --- xen/arch/arm/include/asm/pci.h | 3 ++- xen/arch/arm/pci/pci-host-common.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 7c7449d64f..80a2431804 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -106,7 +106,8 @@ bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d, struct pci_host_bridge *bridge, uint64_t addr); struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus); -struct dt_device_node *pci_find_host_bridge_node(struct device *dev); +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev); int pci_get_host_bridge_segment(const struct dt_device_node *node, uint16_t *segment); diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index fd8c0f837a..89ef30028e 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -243,10 +243,10 @@ err_exit: /* * Get host bridge node given a device attached to it. */ -struct dt_device_node *pci_find_host_bridge_node(struct device *dev) +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev) { struct pci_host_bridge *bridge; -struct pci_dev *pdev = dev_to_pci(dev); bridge = pci_find_host_bridge(pdev->seg, pdev->bus); if ( unlikely(!bridge) ) -- 2.25.1
[PATCH v6 0/2] xen/pci: implement is_memory_hole for ARM
This patch series implements an is_memory_hole-like check for ARM. Rahul Singh (2): xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev xen/pci: replace call to is_memory_hole to pci_check_bar xen/arch/arm/include/asm/pci.h | 5 ++- xen/arch/arm/pci/pci-host-common.c | 58 -- xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 74 insertions(+), 7 deletions(-) -- 2.25.1
Re: [PATCH v5 2/2] xen/pci: replace call to is_memory_hole to pci_check_bar
Hi Jan, > On 8 Sep 2022, at 1:03 pm, Jan Beulich wrote: > > On 08.09.2022 13:49, Rahul Singh wrote: >> is_memory_hole was implemented for x86 and not for ARM when introduced. >> Replace is_memory_hole call to pci_check_bar as function should check >> if device BAR is in defined memory range. Also, add an implementation >> for ARM which is required for PCI passthrough. >> >> On x86, pci_check_bar will call is_memory_hole which will check if BAR >> is not overlapping with any memory region defined in the memory map. >> >> On ARM, pci_check_bar will go through the host bridge ranges and check >> if the BAR is in the range of defined ranges. >> >> Signed-off-by: Rahul Singh >> Acked-by: Jan Beulich # x86, common > > FTAOD: I object to this tagging, and I did not provide the ack with > such tags. Quoting docs/process/sending-patches.pandoc: "The > `Acked-by:` tag can only be given by a **maintainer** of the modified > code, and it only covers the code the maintainer is responsible for." > The doc provides for tagging here, yes, but such should only be used > for the unusual case of an ack restricted to less than what a > person's maintainership covers. Otherwise we'd end up seeing overly > many tagged acks. (Recall that tagged R-b is also expected to be the > exception, not the common case.) > Ok. I will remove "# x86, common" if I get comments on this patch and there is a need for the next version; otherwise, I suggest that the committer can remove this while committing the patch. Regards, Rahul
[PATCH v5 2/2] xen/pci: replace call to is_memory_hole to pci_check_bar
is_memory_hole was implemented for x86 and not for ARM when introduced. Replace the is_memory_hole call with pci_check_bar, as the function should check whether a device BAR is within a defined memory range. Also, add an implementation for ARM, which is required for PCI passthrough. On x86, pci_check_bar will call is_memory_hole, which checks that the BAR does not overlap with any memory region defined in the memory map. On ARM, pci_check_bar will go through the host bridge ranges and check whether the BAR falls within the defined ranges. Signed-off-by: Rahul Singh Acked-by: Jan Beulich # x86, common --- Changes in v5: - drop use of PFN_UP and PFN_DOWN in case addresses are not aligned. - As we drop the PFN_UP and PFN_DOWN we need to use mfn_to_maddr() to get the BAR address without the page shift. - Add TODO comment for address alignment check for ranges. - Added Jan's Acked-by for x86 and common code. Changes in v4: - check "s <= e" before callback - Add TODO comment for revisiting the function pci_check_bar() when ACPI PCI passthrough support is added. - Did not add Jan's Acked-by as the patch was modified. 
Changes in v3: - fix minor comments --- xen/arch/arm/include/asm/pci.h | 2 ++ xen/arch/arm/pci/pci-host-common.c | 54 ++ xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 70 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 80a2431804..8cb46f6b71 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -126,6 +126,8 @@ int pci_host_iterate_bridges_and_count(struct domain *d, int pci_host_bridge_mappings(struct domain *d); +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end); + #else /*!CONFIG_HAS_PCI*/ struct arch_pci_dev { }; diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index 89ef30028e..d51cfdf352 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -24,6 +24,16 @@ #include +/* + * struct to hold pci device bar. + */ +struct pdev_bar_check +{ +unsigned long start; +unsigned long end; +bool is_valid; +}; + /* * List for all the pci host bridges. */ @@ -363,6 +373,50 @@ int __init pci_host_bridge_mappings(struct domain *d) return 0; } +/* + * TODO: BAR addresses and Root Complex window addresses are not guaranteed + * to be page aligned. We should check for alignment but this is not the + * right place for alignment check. + */ +static int is_bar_valid(const struct dt_device_node *dev, +uint64_t addr, uint64_t len, void *data) +{ +struct pdev_bar_check *bar_data = data; +unsigned long s = bar_data->start; +unsigned long e = bar_data->end; + +if ( (s >= addr) && (e <= (addr + len - 1)) ) +bar_data->is_valid = true; + +return 0; +} + +/* TODO: Revisit this function when ACPI PCI passthrough support is added. 
*/ +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end) +{ +int ret; +const struct dt_device_node *dt_node; +unsigned long s = mfn_to_maddr(start); +unsigned long e = mfn_to_maddr(end); +struct pdev_bar_check bar_data = { +.start = s, +.end = e, +.is_valid = false +}; + +if ( s >= e ) +return false; + +dt_node = pci_find_host_bridge_node(pdev); +if ( !dt_node ) +return false; + +ret = dt_for_each_range(dt_node, &is_bar_valid, &bar_data); +if ( ret < 0 ) +return false; + +return bar_data.is_valid; +} /* * Local variables: * mode: C diff --git a/xen/arch/x86/include/asm/pci.h b/xen/arch/x86/include/asm/pci.h index c8e1a9ecdb..f4a58c8acf 100644 --- a/xen/arch/x86/include/asm/pci.h +++ b/xen/arch/x86/include/asm/pci.h @@ -57,4 +57,14 @@ static always_inline bool is_pci_passthrough_enabled(void) void arch_pci_init_pdev(struct pci_dev *pdev); +static inline bool pci_check_bar(const struct pci_dev *pdev, + mfn_t start, mfn_t end) +{ +/* + * Check if BAR is not overlapping with any memory region defined + * in the memory map. + */ +return is_memory_hole(start, end); +} + #endif /* __X86_PCI_H__ */ diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index cdaf5c247f..149f68bb6e 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -304,8 +304,8 @@ static void check_pdev(const struct pci_dev *pdev) if ( rc < 0 ) /* Unable to size, better leave memory decoding disabled. */ return; -if ( size && !is_memory_hole(maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) +if ( size && !pci_check_bar(pdev, maddr_to_mfn(addr), +maddr_to_mfn(
[PATCH v5 1/2] xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev
Modify pci_find_host_bridge_node argument to const pdev to avoid converting the dev to pdev in pci_find_host_bridge_node and also constify the return. Signed-off-by: Rahul Singh Acked-by: Stefano Stabellini Reviewed-by: Oleksandr Tyshchenko --- Changes in v5: - no changes Changes in v4: - no changes Changes in v3: - no changes --- xen/arch/arm/include/asm/pci.h | 3 ++- xen/arch/arm/pci/pci-host-common.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 7c7449d64f..80a2431804 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -106,7 +106,8 @@ bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d, struct pci_host_bridge *bridge, uint64_t addr); struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus); -struct dt_device_node *pci_find_host_bridge_node(struct device *dev); +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev); int pci_get_host_bridge_segment(const struct dt_device_node *node, uint16_t *segment); diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index fd8c0f837a..89ef30028e 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -243,10 +243,10 @@ err_exit: /* * Get host bridge node given a device attached to it. */ -struct dt_device_node *pci_find_host_bridge_node(struct device *dev) +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev) { struct pci_host_bridge *bridge; -struct pci_dev *pdev = dev_to_pci(dev); bridge = pci_find_host_bridge(pdev->seg, pdev->bus); if ( unlikely(!bridge) ) -- 2.25.1
[PATCH v5 0/2] xen/pci: implement is_memory_hole for ARM
This patch series implements an is_memory_hole-like check for ARM. Rahul Singh (2): xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev xen/pci: replace call to is_memory_hole to pci_check_bar xen/arch/arm/include/asm/pci.h | 5 ++- xen/arch/arm/pci/pci-host-common.c | 58 -- xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 74 insertions(+), 7 deletions(-) -- 2.25.1
[PATCH v5 5/7] xen/evtchn: modify evtchn_bind_interdomain to support static evtchn
Static event channel support will be added for dom0less domains. Modify evtchn_bind_interdomain to support static evtchn. It is necessary to have access to the evtchn_bind_interdomain function to do that, so make evtchn_bind_interdomain global and also make it __must_check. Currently evtchn_bind_interdomain() always allocates the next available local port. Static event channel support for dom0less domains requires allocating a specified port. Modify evtchn_bind_interdomain to accept the port number as an argument and allocate the specified port if available. If the port number argument is zero, the next available port will be allocated. Currently evtchn_bind_interdomain() finds the local domain from the "current->domain" pointer. evtchn_bind_interdomain() will be called from Xen to create static event channels during domain creation. The "current" pointer is not valid at that time, therefore modify evtchn_bind_interdomain() to take the local domain as an argument. Signed-off-by: Rahul Singh Acked-by: Jan Beulich Reviewed-by: Julien Grall --- Changes in v5: - no changes Changes in v4: - no changes Changes in v3: - fix minor comments in commit msg Changes in v2: - Merged patches related to evtchn_bind_interdomain in one patch --- xen/common/event_channel.c | 20 ++-- xen/include/xen/event.h| 5 + 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index f546e81758..f5e0b12d15 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -381,11 +381,16 @@ static void double_evtchn_unlock(struct evtchn *lchn, struct evtchn *rchn) evtchn_write_unlock(rchn); } -static int evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) +/* + * If lport is zero get the next free port and allocate. If port is non-zero + * allocate the specified lport. 
+ */ +int evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind, struct domain *ld, +evtchn_port_t lport) { struct evtchn *lchn, *rchn; -struct domain *ld = current->domain, *rd; -intlport, rc; +struct domain *rd; +intrc; evtchn_port_t rport = bind->remote_port; domid_trdom = bind->remote_dom; @@ -405,8 +410,11 @@ static int evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) write_lock(&ld->event_lock); } -if ( (lport = get_free_port(ld)) < 0 ) -ERROR_EXIT(lport); +lport = rc = evtchn_get_port(ld, lport); +if ( rc < 0 ) +ERROR_EXIT(rc); +rc = 0; + lchn = evtchn_from_port(ld, lport); rchn = _evtchn_from_port(rd, rport); @@ -1239,7 +1247,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) struct evtchn_bind_interdomain bind_interdomain; if ( copy_from_guest(&bind_interdomain, arg, 1) != 0 ) return -EFAULT; -rc = evtchn_bind_interdomain(&bind_interdomain); +rc = evtchn_bind_interdomain(&bind_interdomain, current->domain, 0); if ( !rc && __copy_to_guest(arg, &bind_interdomain, 1) ) rc = -EFAULT; /* Cleaning up here would be a mess! */ break; diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index f31963703f..8eae9984a9 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -75,6 +75,11 @@ int evtchn_allocate_port(struct domain *d, unsigned int port); int __must_check evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, evtchn_port_t port); +/* Bind an event channel port to interdomain */ +int __must_check evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind, + struct domain *ld, + evtchn_port_t port); + /* Unmask a local event-channel port. */ int evtchn_unmask(unsigned int port); -- 2.25.1
[PATCH v5 7/7] xen/arm: introduce xen-evtchn dom0less property
Introduce a new sub-node under /chosen node to establish static event channel communication between domains on dom0less systems. An event channel will be created beforehand to allow the domains to send notifications to each other. Signed-off-by: Rahul Singh --- Changes in v5: - fix minor comments Changes in v4: - move documentation to common place for evtchn node in booting.txt - Add comment why we use dt_device_static_evtchn_created() - check if dt_get_parent() returns NULL - fold process_static_evtchn_node() in alloc_static_evtchn() Changes in v3: - use device-tree used_by to find the domain id of the evtchn node. - add new static_evtchn_create variable in struct dt_device_node to hold the information if evtchn is already created. - fix minor comments Changes in v2: - no change --- --- docs/misc/arm/device-tree/booting.txt | 98 + xen/arch/arm/domain_build.c | 147 ++ xen/arch/arm/include/asm/setup.h | 1 + xen/arch/arm/setup.c | 2 + xen/include/xen/device_tree.h | 16 +++ 5 files changed, 264 insertions(+) diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt index 47567b3906..e03e5e9e4c 100644 --- a/docs/misc/arm/device-tree/booting.txt +++ b/docs/misc/arm/device-tree/booting.txt @@ -382,3 +382,101 @@ device-tree: This will reserve a 512MB region starting at the host physical address 0x3000 to be exclusively used by DomU1. + +Static Event Channel + +The event channel communication will be established statically between two +domains (dom0 and domU also). Event channel connection information between +domains will be passed to Xen via the device tree node. The event channel +will be created and established in Xen before the domain started. The domain +does not need to do any operation to establish a connection. Domain only +needs hypercall EVTCHNOP_send(local port) to send notifications to the +remote guest. + +There is no need to describe the static event channel info in the domU device +tree. 
Static event channels are only useful in fully static configurations, +and in those configurations, the domU device tree dynamically generated by Xen +is not needed. + +To enable the event-channel interface for domU guests include the +xen,enhanced = "no-xenstore" property in the domU Xen device tree node. + +Under the "xen,domain" compatible node for domU, there needs to be sub-nodes +with compatible "xen,evtchn" that describe the event channel connection +between two domUs. For dom0, there needs to be sub-nodes with compatible +"xen,evtchn" under the chosen node. + +The static event channel node has the following properties: + +- compatible + +"xen,evtchn" + +- xen,evtchn + +The property is tuples of two numbers +(local-evtchn link-to-foreign-evtchn) where: + +local-evtchn is an integer value that will be used to allocate local port +for a domain to send and receive event notifications to/from the remote +domain. Maximum supported value is 2^17 for FIFO ABI and 4096 for 2L ABI. +It is recommended to use low event channel IDs. + +link-to-foreign-evtchn is a single phandle to a remote evtchn to which +local-evtchn will be connected. + +Example +=== + +chosen { + +/* One sub-node per local event channel. This sub-node is for Dom0. 
*/ +ec1: evtchn@1 { + compatible = "xen,evtchn-v1"; + /* local-evtchn link-to-foreign-evtchn */ + xen,evtchn = <0xa &ec2>; +}; + +domU1 { +compatible = "xen,domain"; +#address-cells = <0x2>; +#size-cells = <0x1>; +xen,enhanced = "no-xenstore"; + +/* One sub-node per local event channel */ +ec2: evtchn@2 { +compatible = "xen,evtchn-v1"; +/* local-evtchn link-to-foreign-evtchn */ +xen,evtchn = <0xa &ec1>; +}; + +ec3: evtchn@3 { +compatible = "xen,evtchn-v1"; +xen,evtchn = <0xb &ec5>; +}; + +ec4: evtchn@4 { +compatible = "xen,evtchn-v1"; +xen,evtchn = <0xc &ec6>; +}; +}; + +domU2 { +compatible = "xen,domain"; +#address-cells = <0x2>; +#size-cells = <0x1>; +xen,enhanced = "no-xenstore"; + +/* One sub-node per local event channel */ +ec5: evtchn@5 { +compatible = "xen,evtchn-v1"; +/* local-evtchn link-to-foreign-evtchn */ +xen,evtchn = <0xb &ec3>; +}; + +ec6: evtchn@6 { +compatible = "xen,evtchn-v1"; +xen,evtchn = <0xd &ec4>; +}; +}; +}; diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/d
[PATCH v5 6/7] xen/arm: introduce new xen,enhanced property value
Introduce a new "xen,enhanced" dom0less property value "no-xenstore" to disable xenstore interface for dom0less guests. Signed-off-by: Rahul Singh --- Changes in v5: - fix minor comments - change unit64_t to uint16_t for dom0less_feature Changes in v4: - Implement defines for dom0less features Changes in v3: - new patch in this version --- docs/misc/arm/device-tree/booting.txt | 4 xen/arch/arm/domain_build.c | 10 ++ xen/arch/arm/include/asm/kernel.h | 23 +-- 3 files changed, 31 insertions(+), 6 deletions(-) diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt index 98253414b8..47567b3906 100644 --- a/docs/misc/arm/device-tree/booting.txt +++ b/docs/misc/arm/device-tree/booting.txt @@ -204,6 +204,10 @@ with the following properties: - "disabled" Xen PV interfaces are disabled. +- "no-xenstore" +All default Xen PV interfaces, including grant-table will be enabled but +xenstore will be disabled for the VM. + If the xen,enhanced property is present with no value, it defaults to "enabled". If the xen,enhanced property is not present, PV interfaces are disabled. 
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 4664a8f961..580ed70b9c 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2891,7 +2891,7 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo) goto err; } -if ( kinfo->dom0less_enhanced ) +if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS ) { ret = make_hypervisor_node(d, kinfo, addrcells, sizecells); if ( ret ) @@ -3209,10 +3209,12 @@ static int __init construct_domU(struct domain *d, (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) ) { if ( hardware_domain ) -kinfo.dom0less_enhanced = true; +kinfo.dom0less_feature = DOM0LESS_ENHANCED; else -panic("Tried to use xen,enhanced without dom0\n"); +panic("At the moment, Xenstore support requires dom0 to be present\n"); } +else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") ) +kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS; if ( vcpu_create(d, 0) == NULL ) return -ENOMEM; @@ -3252,7 +3254,7 @@ static int __init construct_domU(struct domain *d, if ( rc < 0 ) return rc; -if ( kinfo.dom0less_enhanced ) +if ( kinfo.dom0less_feature & DOM0LESS_XENSTORE ) { ASSERT(hardware_domain); rc = alloc_xenstore_evtchn(d); diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h index c4dc039b54..f8bb85767b 100644 --- a/xen/arch/arm/include/asm/kernel.h +++ b/xen/arch/arm/include/asm/kernel.h @@ -9,6 +9,25 @@ #include #include +/* + * List of possible features for dom0less domUs + * + * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All the + * default features (excluding Xenstore) will be + * available. Note that an OS *must* not rely on the + * availability of Xen features if this is not set. + * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. This feature + * can't be enabled without the + * DOM0LESS_ENHANCED_NO_XS. + * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. 
All the + * default features (including Xenstore) will be + * available. Note that an OS *must* not rely on the + * availability of Xen features if this is not set. + */ +#define DOM0LESS_ENHANCED_NO_XS BIT(0, U) +#define DOM0LESS_XENSTOREBIT(1, U) +#define DOM0LESS_ENHANCED(DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE) + struct kernel_info { #ifdef CONFIG_ARM_64 enum domain_type type; @@ -36,8 +55,8 @@ struct kernel_info { /* Enable pl011 emulation */ bool vpl011; -/* Enable PV drivers */ -bool dom0less_enhanced; +/* Enable/Disable PV drivers interfaces */ +uint16_t dom0less_feature; /* GIC phandle */ uint32_t phandle_gic; -- 2.25.1
[PATCH v5 4/7] xen/evtchn: modify evtchn_alloc_unbound to allocate specified port
Currently evtchn_alloc_unbound() always allocates the next available port. Static event channel support for dom0less domains requires allocating a specified port. Modify the evtchn_alloc_unbound() to accept the port number as an argument and allocate the specified port if available. If the port number argument is zero, the next available port will be allocated. Signed-off-by: Rahul Singh Acked-by: Jan Beulich Reviewed-by: Julien Grall --- Changes in v5: - no changes Changes in v4: - no changes Changes in v3: - fix minor comments in commit msg Changes in v2: - fix minor comments --- xen/arch/arm/domain_build.c | 2 +- xen/common/event_channel.c | 17 - xen/include/xen/event.h | 3 ++- 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index e1f46308d9..4664a8f961 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3171,7 +3171,7 @@ static int __init alloc_xenstore_evtchn(struct domain *d) alloc.dom = d->domain_id; alloc.remote_dom = hardware_domain->domain_id; -rc = evtchn_alloc_unbound(&alloc); +rc = evtchn_alloc_unbound(&alloc, 0); if ( rc ) { printk("Failed allocating event channel for domain\n"); diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 565ab71881..f546e81758 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -317,11 +317,15 @@ static int evtchn_get_port(struct domain *d, evtchn_port_t port) return rc ?: port; } -int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) +/* + * If port is zero get the next free port and allocate. If port is non-zero + * allocate the specified port. 
+ */ +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, evtchn_port_t port) { struct evtchn *chn; struct domain *d; -int port, rc; +int rc; domid_t dom = alloc->dom; d = rcu_lock_domain_by_any_id(dom); @@ -330,8 +334,11 @@ int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) write_lock(&d->event_lock); -if ( (port = get_free_port(d)) < 0 ) -ERROR_EXIT_DOM(port, d); +port = rc = evtchn_get_port(d, port); +if ( rc < 0 ) +ERROR_EXIT(rc); +rc = 0; + chn = evtchn_from_port(d, port); rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); @@ -1222,7 +1229,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) struct evtchn_alloc_unbound alloc_unbound; if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 ) return -EFAULT; -rc = evtchn_alloc_unbound(&alloc_unbound); +rc = evtchn_alloc_unbound(&alloc_unbound, 0); if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) ) rc = -EFAULT; /* Cleaning up here would be a mess! */ break; diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index f3021fe304..f31963703f 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -72,7 +72,8 @@ void evtchn_free(struct domain *d, struct evtchn *chn); int evtchn_allocate_port(struct domain *d, unsigned int port); /* Allocate a new event channel */ -int __must_check evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc); +int __must_check evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, + evtchn_port_t port); /* Unmask a local event-channel port. */ int evtchn_unmask(unsigned int port); -- 2.25.1
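The new allocation rule — port 0 means "pick the next free port", a non-zero port means "reserve exactly that port" — can be sketched with a toy allocator (illustrative only, with a trivial bitmap in place of Xen's bucket structures; error handling reduced to -EEXIST):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the port-selection rule added to evtchn_alloc_unbound():
 * port == 0 -> allocate the next free port; port != 0 -> reserve exactly
 * that port, failing if it is already taken or out of range. */
#define MAX_PORTS 16
#define EEXIST 17

static bool in_use[MAX_PORTS];   /* port 0 is never handed out */

static int get_port(unsigned int port)
{
    if (port != 0) {
        if (port >= MAX_PORTS || in_use[port])
            return -EEXIST;
        in_use[port] = true;
        return (int)port;
    }
    for (unsigned int p = 1; p < MAX_PORTS; p++) {
        if (!in_use[p]) {
            in_use[p] = true;
            return (int)p;
        }
    }
    return -EEXIST;              /* no free port left */
}
```

Static event channels pass a specific port; the existing hypercall path keeps passing 0 and so keeps its old behaviour.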
[PATCH v5 3/7] xen/evtchn: restrict the maximum number of evtchn supported for domUs
Restrict the maximum number of evtchn supported for domUs to avoid allocating a large amount of memory in Xen. Set the default value of max_evtchn_port to 1023. The value of 1023 should be sufficient for guests because on ARM we don't bind physical interrupts to event channels. The only use of the evtchn port is inter-domain communications. Another reason why we choose the value of 1023 is to follow the default behavior of libxl. Signed-off-by: Rahul Singh Reviewed-by: Michal Orzel Acked-by: Julien Grall --- Changes in v5: - fix minor comments - Added Julien Acked-by Changes in v4: - fix minor comments in commit msg - Added Michal Reviewed-by Changes in v3: - added in commit msg why we set the max_evtchn_port value to 1023. - added the comment in code also why we set the max_evtchn_port to 1023 - remove the define and set the value to 1023 in code directly. Changes in v2: - new patch in the version --- xen/arch/arm/domain_build.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index b76a84e8f5..e1f46308d9 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3277,7 +3277,13 @@ void __init create_domUs(void) struct xen_domctl_createdomain d_cfg = { .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, -.max_evtchn_port = -1, +/* + * The default of 1023 should be sufficient for guests because + * on ARM we don't bind physical interrupts to event channels. + * The only use of the evtchn port is inter-domain communications. + * 1023 is also the default value used in libxl. + */ +.max_evtchn_port = 1023, .max_grant_frames = -1, .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), -- 2.25.1
[PATCH v5 2/7] xen/evtchn: Add a helper to reserve/allocate a port
From: Stanislav Kinsburskii In a follow-up patch we will want to either reserve or allocate a port for various event channel helpers. A new wrapper is introduced to either reserve a given port or allocate a fresh one if zero. Take the opportunity to replace the open-coded version in evtchn_bind_virq(). Signed-off-by: Stanislav Kinsburskii Signed-off-by: Julien Grall Signed-off-by: Rahul Singh Acked-by: Jan Beulich --- Changes in v5: - no changes Changes in v4: - Change the Author to Stanislav Kinsburskii Changes in v3: - minor comments in commit msg Changes in v2: - new patch in this version --- xen/common/event_channel.c | 29 - 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index f81c229358..565ab71881 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -305,6 +305,18 @@ void evtchn_free(struct domain *d, struct evtchn *chn) xsm_evtchn_close_post(chn); } +static int evtchn_get_port(struct domain *d, evtchn_port_t port) +{ +int rc; + +if ( port != 0 ) +rc = evtchn_allocate_port(d, port); +else +rc = get_free_port(d); + +return rc ?: port; +} + int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { struct evtchn *chn; @@ -462,19 +474,10 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port) if ( read_atomic(&v->virq_to_evtchn[virq]) ) ERROR_EXIT(-EEXIST); -if ( port != 0 ) -{ -if ( (rc = evtchn_allocate_port(d, port)) != 0 ) -ERROR_EXIT(rc); -} -else -{ -int alloc_port = get_free_port(d); - -if ( alloc_port < 0 ) -ERROR_EXIT(alloc_port); -port = alloc_port; -} +port = rc = evtchn_get_port(d, port); +if ( rc < 0 ) +ERROR_EXIT(rc); +rc = 0; chn = evtchn_from_port(d, port); -- 2.25.1
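The `return rc ?: port;` in evtchn_get_port() leans on the GNU `?:` extension with an omitted middle operand: when a specific port was requested, rc is 0 on success and the requested port is returned; when rc came from get_free_port(), rc already *is* the port (or a negative errno) and is returned as-is. A minimal standalone illustration (hypothetical helper name):

```c
#include <assert.h>

/* `a ?: b` (GNU extension) evaluates to a when a is non-zero, else b,
 * without evaluating a twice. */
static int get_port_result(int rc, int port)
{
    return rc ?: port;   /* rc if non-zero, otherwise port */
}
```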
[PATCH v5 1/7] xen/evtchn: Make sure all buckets below d->valid_evtchns are allocated
From: Julien Grall Since commit 01280dc19cf3 "evtchn: simplify port_is_valid()", the event channel code assumes that all the buckets below d->valid_evtchns are always allocated. This assumption holds in most situations because a guest is not allowed to choose the port; instead, it is given the first free port from port 0. When static event channel support is added for dom0less domains, the user will be able to request evtchn port numbers that are scattered in nature. The existing implementation of evtchn_allocate_port() is not able to deal with such a situation and will end up overwriting buckets and/or leaving some buckets unallocated. The latter will result in a crash if the event channel belongs to an unallocated bucket. This can be solved by making sure that all the buckets below d->valid_evtchns are allocated. There should be no impact in most situations, except LM/LU, as only one bucket would otherwise be allocated. For LM/LU, we may end up allocating multiple buckets if the ports in use are sparse. A potential alternative is to check that the bucket is valid in is_port_valid(). This should still be possible without taking the per-domain lock, but it would result in a couple more memory accesses. Signed-off-by: Julien Grall Signed-off-by: Rahul Singh Reviewed-by: Michal Orzel Reviewed-by: Jan Beulich --- Changes in v5: - Added Jan Reviewed-by Changes in v4: - fix comment to remove the reference to Guest Transparent Migration and Live Update - Added Michal Reviewed-by Changes in v3: - fix comments in commit msg.
- modify code related to d->valid_evtchns and {read,write}_atomic() Changes in v2: - new patch in this version to avoid the security issue --- xen/common/event_channel.c | 55 -- 1 file changed, 35 insertions(+), 20 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index c2c6f8c151..f81c229358 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -193,6 +193,15 @@ static struct evtchn *alloc_evtchn_bucket(struct domain *d, unsigned int port) return NULL; } +/* + * Allocate a given port and ensure all the buckets up to that port + * have been allocated. + * + * The last part is important because the rest of the event channel code + * relies on all the buckets up to d->valid_evtchns to be valid. However, + * event channels may be sparse when allocating the static evtchn port + * numbers that are scattered in nature. + */ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) { if ( port > d->max_evtchn_port || port >= max_evtchns(d) ) @@ -207,30 +216,36 @@ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) } else { -struct evtchn *chn; -struct evtchn **grp; +unsigned int alloc_port = read_atomic(&d->valid_evtchns); -if ( !group_from_port(d, port) ) +do { -grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); -if ( !grp ) -return -ENOMEM; -group_from_port(d, port) = grp; -} +struct evtchn *chn; +struct evtchn **grp; -chn = alloc_evtchn_bucket(d, port); -if ( !chn ) -return -ENOMEM; -bucket_from_port(d, port) = chn; +if ( !group_from_port(d, alloc_port) ) +{ +grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); +if ( !grp ) +return -ENOMEM; +group_from_port(d, alloc_port) = grp; +} -/* - * d->valid_evtchns is used to check whether the bucket can be - * accessed without the per-domain lock. Therefore, - * d->valid_evtchns should be seen *after* the new bucket has - * been setup.
- */ -smp_wmb(); -write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET); +chn = alloc_evtchn_bucket(d, alloc_port); +if ( !chn ) +return -ENOMEM; +bucket_from_port(d, alloc_port) = chn; + +/* + * d->valid_evtchns is used to check whether the bucket can be + * accessed without the per-domain lock. Therefore, + * d->valid_evtchns should be seen *after* the new bucket has + * been setup. + */ +smp_wmb(); +alloc_port += EVTCHNS_PER_BUCKET; +write_atomic(&d->valid_evtchns, alloc_port); +} while ( port >= alloc_port ); } write_atomic(&d->active_evtchns, d->active_evtchns + 1); -- 2.25.1
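The do/while in the reworked evtchn_allocate_port() keeps allocating buckets, advancing valid_evtchns by EVTCHNS_PER_BUCKET each iteration, until it covers the requested port. The loop-count invariant can be modelled standalone (the EVTCHNS_PER_BUCKET value here is illustrative, not Xen's):

```c
#include <assert.h>

#define EVTCHNS_PER_BUCKET 4

/* How many buckets the loop allocates so that valid_evtchns strictly
 * exceeds the requested port; 0 when the port is already covered. */
static unsigned int buckets_needed(unsigned int valid_evtchns,
                                   unsigned int port)
{
    unsigned int n = 0;

    while (port >= valid_evtchns) {      /* same condition as the do/while */
        valid_evtchns += EVTCHNS_PER_BUCKET;
        n++;
    }
    return n;
}
```

This is why a sparse static port may cost several buckets of memory, as the commit message notes, while the common sequential-allocation case still allocates exactly one.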
[PATCH v5 0/7] xen/evtchn: implement static event channel signaling
The purpose of this patch series is to add static event channel signaling support to Xen on Arm, based on the design doc [1]. [1] https://lists.xenproject.org/archives/html/xen-devel/2022-05/msg01160.html Julien Grall (1): xen/evtchn: Make sure all buckets below d->valid_evtchns are allocated Rahul Singh (5): xen/evtchn: restrict the maximum number of evtchn supported for domUs xen/evtchn: modify evtchn_alloc_unbound to allocate specified port xen/evtchn: modify evtchn_bind_interdomain to support static evtchn xen/arm: introduce new xen,enhanced property value xen/arm: introduce xen-evtchn dom0less property Stanislav Kinsburskii (1): xen/evtchn: Add a helper to reserve/allocate a port docs/misc/arm/device-tree/booting.txt | 102 xen/arch/arm/domain_build.c | 167 +- xen/arch/arm/include/asm/kernel.h | 23 +++- xen/arch/arm/include/asm/setup.h | 1 + xen/arch/arm/setup.c | 2 + xen/common/event_channel.c | 121 --- xen/include/xen/device_tree.h | 16 +++ xen/include/xen/event.h | 8 +- 8 files changed, 387 insertions(+), 53 deletions(-) -- 2.25.1
Re: [PATCH v4 3/7] xen/evtchn: restrict the maximum number of evtchn supported for domUs
Hi Julien, > On 7 Sep 2022, at 2:01 pm, Julien Grall wrote: > > > > On 06/09/2022 14:40, Rahul Singh wrote: >> Restrict the maximum number of evtchn supported for domUs to avoid >> allocating a large amount of memory in Xen. >> Set the default value of max_evtchn_port to 1023. The value of 1023 >> should be sufficient for domUs guests because on ARM we don't bind > > To me, domUs and guests mean the same. So s/guests// Ack. > >> physical interrupts to event channels. The only use of the evtchn port >> is inter-domain communications. Another reason why we choose the value >> of 1023 to follow the default behavior of libxl. >> Signed-off-by: Rahul Singh >> Reviewed-by: Michal Orzel >> --- >> Changes in v4: >> - fix minor comments in commit msg >> - Added Michal Reviewed-by >> Changes in v3: >> - added in commit msg why we set the max_evtchn_port value to 1023. >> - added the comment in code also why we set the max_evtchn_port to 1023 >> - remove the define and set the value to 1023 in code directly. >> Changes in v2: >> - new patch in the version >> --- >> xen/arch/arm/domain_build.c | 8 +++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index 3fd1186b53..fde133cd94 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -3277,7 +3277,13 @@ void __init create_domUs(void) >> struct xen_domctl_createdomain d_cfg = { >> .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, >> .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, >> -.max_evtchn_port = -1, >> +/* >> + * The default of 1023 should be sufficient for domUs guests > > To me, domUs and guests mean the same. So s/guests// > > Same here. With that: > > Acked-by: Julien Grall > > Cheers, Ack. Regards, Rahul
Re: [PATCH v4 6/7] xen/arm: introduce new xen,enhanced property value
Hi Julien, > On 7 Sep 2022, at 2:09 pm, Julien Grall wrote: > > Hi Rahul > > On 06/09/2022 14:40, Rahul Singh wrote: >> Introduce a new "xen,enhanced" dom0less property value "no-xenstore" to >> disable xenstore interface for dom0less guests. >> Signed-off-by: Rahul Singh >> --- >> Changes in v4: >> - Implement defines for dom0less features >> Changes in v3: >> - new patch in this version >> --- >> docs/misc/arm/device-tree/booting.txt | 4 >> xen/arch/arm/domain_build.c | 10 ++ >> xen/arch/arm/include/asm/kernel.h | 23 +-- >> 3 files changed, 31 insertions(+), 6 deletions(-) >> diff --git a/docs/misc/arm/device-tree/booting.txt >> b/docs/misc/arm/device-tree/booting.txt >> index 98253414b8..1b0dca1454 100644 >> --- a/docs/misc/arm/device-tree/booting.txt >> +++ b/docs/misc/arm/device-tree/booting.txt >> @@ -204,6 +204,10 @@ with the following properties: >> - "disabled" >> Xen PV interfaces are disabled. >> +- no-xenstore >> +Xen PV interfaces, including grant-table will be enabled but xenstore > > Please use "All default" in front. So it is clear that everything is enabled > but xenstore. Ack. > >> +will be disabled for the VM. >> + >> If the xen,enhanced property is present with no value, it defaults >> to "enabled". If the xen,enhanced property is not present, PV >> interfaces are disabled. 
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index 707e247f6a..0b164ef595 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -2891,7 +2891,7 @@ static int __init prepare_dtb_domU(struct domain *d, >> struct kernel_info *kinfo) >> goto err; >> } >> -if ( kinfo->dom0less_enhanced ) >> +if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS ) >> { >> ret = make_hypervisor_node(d, kinfo, addrcells, sizecells); >> if ( ret ) >> @@ -3209,10 +3209,12 @@ static int __init construct_domU(struct domain *d, >> (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) ) >> { >> if ( hardware_domain ) >> -kinfo.dom0less_enhanced = true; >> +kinfo.dom0less_feature = DOM0LESS_ENHANCED; >> else >> -panic("Tried to use xen,enhanced without dom0\n"); >> +panic("At the moment, Xenstore support requires dom0 to be >> present\n"); >> } >> +else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") ) >> +kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS; >>if ( vcpu_create(d, 0) == NULL ) >> return -ENOMEM; >> @@ -3252,7 +3254,7 @@ static int __init construct_domU(struct domain *d, >> if ( rc < 0 ) >> return rc; >> -if ( kinfo.dom0less_enhanced ) >> +if ( kinfo.dom0less_feature & DOM0LESS_XENSTORE ) >> { >> ASSERT(hardware_domain); >> rc = alloc_xenstore_evtchn(d); >> diff --git a/xen/arch/arm/include/asm/kernel.h >> b/xen/arch/arm/include/asm/kernel.h >> index c4dc039b54..ad240494ea 100644 >> --- a/xen/arch/arm/include/asm/kernel.h >> +++ b/xen/arch/arm/include/asm/kernel.h >> @@ -9,6 +9,25 @@ >> #include >> #include >> +/* >> + * List of possible features for dom0less domUs >> + * >> + * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All >> the >> + * default features (excluding Xenstore) will be >> + * available. Note that an OS *must* not rely on >> the >> + * availability of Xen features if this is not set. >> + * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. 
This >> feature >> + * can't be enabled without the >> + * DOM0LESS_ENHANCED_NO_XS. >> + * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All >> the >> + * default features (including Xenstore) will be >> + * available. Note that an OS *must* not rely on >> the >> + * availability of Xen features if this is not set. >> + */ >> +#define DOM0LESS_ENHANCED_NO_XS BIT(0, U) >> +#define DOM0LESS_XENSTOREBIT(1, U) >> +#define DOM0LESS_ENHANCED(DOM0LESS_ENHANCED_NO_XS | >> DOM0LESS_XENSTORE) >> + >> struct kernel_info { >> #ifdef CONFIG_ARM_64 >> enum domain_type type; >> @@ -36,8 +55,8 @@ struct kernel_info { >> /* Enable pl011 emulation */ >> bool vpl011; >> -/* Enable PV drivers */ >> -bool dom0less_enhanced; >> +/* Enable/Disable PV drivers interface,grant table, evtchn or xenstore >> */ > > The part after "," is technically wrong because it also affects other > interfaces. But I would drop it to avoid any stale comment (we may add new > one in the futures). Ok . I will remove and will comment like this: /* Enable/Disable PV drivers interfaces */ Regards, Rahul
[PATCH v4 1/2] xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev
Modify pci_find_host_bridge_node argument to const pdev to avoid converting the dev to pdev in pci_find_host_bridge_node and also constify the return. Signed-off-by: Rahul Singh Reviewed-by: Oleksandr Tyshchenko Acked-by: Stefano Stabellini --- Changes in v4: - no changes Changes in v3: - no changes --- xen/arch/arm/include/asm/pci.h | 3 ++- xen/arch/arm/pci/pci-host-common.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 7c7449d64f..80a2431804 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -106,7 +106,8 @@ bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d, struct pci_host_bridge *bridge, uint64_t addr); struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus); -struct dt_device_node *pci_find_host_bridge_node(struct device *dev); +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev); int pci_get_host_bridge_segment(const struct dt_device_node *node, uint16_t *segment); diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index fd8c0f837a..89ef30028e 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -243,10 +243,10 @@ err_exit: /* * Get host bridge node given a device attached to it. */ -struct dt_device_node *pci_find_host_bridge_node(struct device *dev) +const struct dt_device_node * +pci_find_host_bridge_node(const struct pci_dev *pdev) { struct pci_host_bridge *bridge; -struct pci_dev *pdev = dev_to_pci(dev); bridge = pci_find_host_bridge(pdev->seg, pdev->bus); if ( unlikely(!bridge) ) -- 2.25.1
[PATCH v4 2/2] xen/pci: replace call to is_memory_hole to pci_check_bar
is_memory_hole was implemented for x86 and not for ARM when introduced. Replace is_memory_hole call to pci_check_bar as function should check if device BAR is in defined memory range. Also, add an implementation for ARM which is required for PCI passthrough. On x86, pci_check_bar will call is_memory_hole which will check if BAR is not overlapping with any memory region defined in the memory map. On ARM, pci_check_bar will go through the host bridge ranges and check if the BAR is in the range of defined ranges. Signed-off-by: Rahul Singh --- Changes in v4: - check "s <= e" before callback - Add TODO comment for revisiting the function pci_check_bar() when ACPI PCI passthrough support is added. - Not Added the Jan Acked-by as patch is modified. Changes in v3: - fix minor comments --- xen/arch/arm/include/asm/pci.h | 2 ++ xen/arch/arm/pci/pci-host-common.c | 49 ++ xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 65 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/include/asm/pci.h b/xen/arch/arm/include/asm/pci.h index 80a2431804..8cb46f6b71 100644 --- a/xen/arch/arm/include/asm/pci.h +++ b/xen/arch/arm/include/asm/pci.h @@ -126,6 +126,8 @@ int pci_host_iterate_bridges_and_count(struct domain *d, int pci_host_bridge_mappings(struct domain *d); +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end); + #else /*!CONFIG_HAS_PCI*/ struct arch_pci_dev { }; diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c index 89ef30028e..13d419aa45 100644 --- a/xen/arch/arm/pci/pci-host-common.c +++ b/xen/arch/arm/pci/pci-host-common.c @@ -24,6 +24,16 @@ #include +/* + * struct to hold pci device bar. + */ +struct pdev_bar_check +{ +unsigned long start; +unsigned long end; +bool is_valid; +}; + /* * List for all the pci host bridges. 
*/ @@ -363,6 +373,45 @@ int __init pci_host_bridge_mappings(struct domain *d) return 0; } +static int is_bar_valid(const struct dt_device_node *dev, +uint64_t addr, uint64_t len, void *data) +{ +struct pdev_bar_check *bar_data = data; +unsigned long s = bar_data->start; +unsigned long e = bar_data->end; + +if ( (s >= PFN_DOWN(addr)) && (e <= PFN_UP(addr + len - 1)) ) +bar_data->is_valid = true; + +return 0; +} + +/* TODO: Revisit this function when ACPI PCI passthrough support is added. */ +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end) +{ +int ret; +const struct dt_device_node *dt_node; +unsigned long s = mfn_x(start); +unsigned long e = mfn_x(end); +struct pdev_bar_check bar_data = { +.start = s, +.end = e, +.is_valid = false +}; + +if ( s >= e ) +return false; + +dt_node = pci_find_host_bridge_node(pdev); +if ( !dt_node ) +return false; + +ret = dt_for_each_range(dt_node, &is_bar_valid, &bar_data); +if ( ret < 0 ) +return false; + +return bar_data.is_valid; +} /* * Local variables: * mode: C diff --git a/xen/arch/x86/include/asm/pci.h b/xen/arch/x86/include/asm/pci.h index c8e1a9ecdb..f4a58c8acf 100644 --- a/xen/arch/x86/include/asm/pci.h +++ b/xen/arch/x86/include/asm/pci.h @@ -57,4 +57,14 @@ static always_inline bool is_pci_passthrough_enabled(void) void arch_pci_init_pdev(struct pci_dev *pdev); +static inline bool pci_check_bar(const struct pci_dev *pdev, + mfn_t start, mfn_t end) +{ +/* + * Check if BAR is not overlapping with any memory region defined + * in the memory map. + */ +return is_memory_hole(start, end); +} + #endif /* __X86_PCI_H__ */ diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index cdaf5c247f..149f68bb6e 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -304,8 +304,8 @@ static void check_pdev(const struct pci_dev *pdev) if ( rc < 0 ) /* Unable to size, better leave memory decoding disabled. 
*/ return; -if ( size && !is_memory_hole(maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) +if ( size && !pci_check_bar(pdev, maddr_to_mfn(addr), +maddr_to_mfn(addr + size - 1)) ) { /* * Return without enabling memory decoding if BAR position is not @@ -331,8 +331,8 @@ static void check_pdev(const struct pci_dev *pdev) if ( rc < 0 ) return; -if ( size && !is_memory_hole(maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) +if ( size && !pci_check_bar(pdev, maddr_to_mfn(addr), +maddr_to_mfn(addr + size - 1)) ) {
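The range test performed by the ARM pci_check_bar() can be sketched standalone: a BAR, expressed in page frames [s, e], is accepted only if it lies entirely inside one of the host bridge's "ranges" windows. PAGE_SHIFT/PFN_DOWN/PFN_UP below mirror the Xen macros, and the helper name is illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_UP(x)   (((x) + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT)

/* One window of the containment check done per dt "ranges" entry:
 * the BAR [s_mfn, e_mfn] must fall fully inside [addr, addr + len). */
static bool bar_in_range(uint64_t s_mfn, uint64_t e_mfn,
                         uint64_t range_addr, uint64_t range_len)
{
    if (s_mfn >= e_mfn)          /* reject empty/inverted BARs up front */
        return false;
    return s_mfn >= PFN_DOWN(range_addr) &&
           e_mfn <= PFN_UP(range_addr + range_len - 1);
}
```

In the patch the `s >= e` check is hoisted out of the per-range callback, as requested in review, since it is invariant across ranges.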
[PATCH v4 0/2] xen/pci: implement is_memory_hole for ARM
This patch series is to implement something like is_memory_hole function for ARM. Rahul Singh (2): xen/arm: pci: modify pci_find_host_bridge_node argument to const pdev xen/pci: replace call to is_memory_hole to pci_check_bar xen/arch/arm/include/asm/pci.h | 5 ++- xen/arch/arm/pci/pci-host-common.c | 53 -- xen/arch/x86/include/asm/pci.h | 10 ++ xen/drivers/passthrough/pci.c | 8 ++--- 4 files changed, 69 insertions(+), 7 deletions(-) -- 2.25.1
Re: [PATCH v3 2/2] xen/pci: replace call to is_memory_hole to pci_check_bar
Hi Julien, > On 6 Sep 2022, at 10:53 am, Julien Grall wrote: > > > > On 06/09/2022 10:39, Rahul Singh wrote: >> Hi Julien, >>> On 3 Sep 2022, at 8:18 am, Julien Grall wrote: >>> >>> Hi Rahul, >>> >>> On 01/09/2022 10:29, Rahul Singh wrote: >>>> is_memory_hole was implemented for x86 and not for ARM when introduced. >>>> Replace is_memory_hole call to pci_check_bar as function should check >>>> if device BAR is in defined memory range. Also, add an implementation >>>> for ARM which is required for PCI passthrough. >>>> On x86, pci_check_bar will call is_memory_hole which will check if BAR >>>> is not overlapping with any memory region defined in the memory map. >>>> On ARM, pci_check_bar will go through the host bridge ranges and check >>>> if the BAR is in the range of defined ranges. >>>> Signed-off-by: Rahul Singh >>>> --- >>>> Changes in v3: >>>> - fix minor comments >>>> --- >>>> xen/arch/arm/include/asm/pci.h | 2 ++ >>>> xen/arch/arm/pci/pci-host-common.c | 43 ++ >>>> xen/arch/x86/include/asm/pci.h | 10 +++ >>>> xen/drivers/passthrough/pci.c | 8 +++--- >>>> 4 files changed, 59 insertions(+), 4 deletions(-) >>>> diff --git a/xen/arch/arm/include/asm/pci.h >>>> b/xen/arch/arm/include/asm/pci.h >>>> index 80a2431804..8cb46f6b71 100644 >>>> --- a/xen/arch/arm/include/asm/pci.h >>>> +++ b/xen/arch/arm/include/asm/pci.h >>>> @@ -126,6 +126,8 @@ int pci_host_iterate_bridges_and_count(struct domain >>>> *d, >>>>int pci_host_bridge_mappings(struct domain *d); >>>> +bool pci_check_bar(const struct pci_dev *pdev, mfn_t start, mfn_t end); >>>> + >>>> #else /*!CONFIG_HAS_PCI*/ >>>>struct arch_pci_dev { }; >>>> diff --git a/xen/arch/arm/pci/pci-host-common.c >>>> b/xen/arch/arm/pci/pci-host-common.c >>>> index 89ef30028e..0eb121666d 100644 >>>> --- a/xen/arch/arm/pci/pci-host-common.c >>>> +++ b/xen/arch/arm/pci/pci-host-common.c >>>> @@ -24,6 +24,16 @@ >>>>#include >>>> +/* >>>> + * struct to hold pci device bar. >>>> + */ >>> >>> I find this comment a bit misleading. 
What you are storing is a >>> candidate region. IOW, this may or may not be a PCI device bar. >>> >>> Given the current use below, I would rename the structure to something more >>> specific like: pdev_bar_check. >> Ack. >>> >>>> +struct pdev_bar >>>> +{ >>>> +mfn_t start; >>>> +mfn_t end; >>>> +bool is_valid; >>>> +}; >>>> + >>>> /* >>>> * List for all the pci host bridges. >>>> */ >>>> @@ -363,6 +373,39 @@ int __init pci_host_bridge_mappings(struct domain *d) >>>> return 0; >>>> } >>>> +static int is_bar_valid(const struct dt_device_node *dev, >>>> +uint64_t addr, uint64_t len, void *data) >>>> +{ >>>> +struct pdev_bar *bar_data = data; >>>> +unsigned long s = mfn_x(bar_data->start); >>>> +unsigned long e = mfn_x(bar_data->end); >>>> + >>>> +if ( (s <= e) && (s >= PFN_DOWN(addr)) && (e <= PFN_UP(addr + len - >>>> 1)) ) >>> >>> AFAICT 's' and 'e' are provided by pci_check_bar() and will never change. >>> So can we move the check 's <= e' outside of the callback? >> Yes, We can move the check outside the callback but I feel that if we check >> here then it is more >> readable that we are checking for all possible values in one statement. Let >> me know your view on this. > The readability is really a matter of taste here. But my point is more on the > number of time a check is done. > > It seems pointless to do the same check N times when you know the values are > not going to change. Admittedly, the operation is fast (this is a comparison) > and N should be small (?). > > However, I think it raises the question on where do you draw the line? > > Personally, I think all invariant should be checked outside of callbacks. So > the line is very clear. > I will move the check for "s <=e” outside the callback and will send it for review. Regards, Rahul
Re: [PATCH v4 6/7] xen/arm: introduce new xen,enhanced property value
Hi Stefano, > On 6 Sep 2022, at 11:12 pm, Stefano Stabellini wrote: > > On Tue, 6 Sep 2022, Rahul Singh wrote: >> Introduce a new "xen,enhanced" dom0less property value "no-xenstore" to >> disable xenstore interface for dom0less guests. >> >> Signed-off-by: Rahul Singh >> --- >> Changes in v4: >> - Implement defines for dom0less features >> Changes in v3: >> - new patch in this version >> --- >> docs/misc/arm/device-tree/booting.txt | 4 >> xen/arch/arm/domain_build.c | 10 ++ >> xen/arch/arm/include/asm/kernel.h | 23 +-- >> 3 files changed, 31 insertions(+), 6 deletions(-) >> >> diff --git a/docs/misc/arm/device-tree/booting.txt >> b/docs/misc/arm/device-tree/booting.txt >> index 98253414b8..1b0dca1454 100644 >> --- a/docs/misc/arm/device-tree/booting.txt >> +++ b/docs/misc/arm/device-tree/booting.txt >> @@ -204,6 +204,10 @@ with the following properties: >> - "disabled" >> Xen PV interfaces are disabled. >> >> +- no-xenstore >> +Xen PV interfaces, including grant-table will be enabled but xenstore >> +will be disabled for the VM. > > Please use "" for consistency: > >- "no-xenstore" > Ack. > >> If the xen,enhanced property is present with no value, it defaults >> to "enabled". If the xen,enhanced property is not present, PV >> interfaces are disabled. 
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index 707e247f6a..0b164ef595 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -2891,7 +2891,7 @@ static int __init prepare_dtb_domU(struct domain *d, >> struct kernel_info *kinfo) >> goto err; >> } >> >> -if ( kinfo->dom0less_enhanced ) >> +if ( kinfo->dom0less_feature & DOM0LESS_ENHANCED_NO_XS ) >> { >> ret = make_hypervisor_node(d, kinfo, addrcells, sizecells); >> if ( ret ) >> @@ -3209,10 +3209,12 @@ static int __init construct_domU(struct domain *d, >> (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) ) >> { >> if ( hardware_domain ) >> -kinfo.dom0less_enhanced = true; >> +kinfo.dom0less_feature = DOM0LESS_ENHANCED; >> else >> -panic("Tried to use xen,enhanced without dom0\n"); >> +panic("At the moment, Xenstore support requires dom0 to be >> present\n"); >> } >> +else if ( rc == 0 && !strcmp(dom0less_enhanced, "no-xenstore") ) >> +kinfo.dom0less_feature = DOM0LESS_ENHANCED_NO_XS; >> >> if ( vcpu_create(d, 0) == NULL ) >> return -ENOMEM; >> @@ -3252,7 +3254,7 @@ static int __init construct_domU(struct domain *d, >> if ( rc < 0 ) >> return rc; >> >> -if ( kinfo.dom0less_enhanced ) >> +if ( kinfo.dom0less_feature & DOM0LESS_XENSTORE ) >> { >> ASSERT(hardware_domain); >> rc = alloc_xenstore_evtchn(d); >> diff --git a/xen/arch/arm/include/asm/kernel.h >> b/xen/arch/arm/include/asm/kernel.h >> index c4dc039b54..ad240494ea 100644 >> --- a/xen/arch/arm/include/asm/kernel.h >> +++ b/xen/arch/arm/include/asm/kernel.h >> @@ -9,6 +9,25 @@ >> #include >> #include >> >> +/* >> + * List of possible features for dom0less domUs >> + * >> + * DOM0LESS_ENHANCED_NO_XS: Notify the OS it is running on top of Xen. All >> the >> + * default features (excluding Xenstore) will be >> + * available. Note that an OS *must* not rely on >> the >> + * availability of Xen features if this is not set. >> + * DOM0LESS_XENSTORE: Xenstore will be enabled for the VM. 
This >> feature >> + * can't be enabled without the >> + * DOM0LESS_ENHANCED_NO_XS. >> + * DOM0LESS_ENHANCED: Notify the OS it is running on top of Xen. All >> the >> + * default features (including Xenstore) will be >> + * available. Note that an OS *must* not rely on >> the >> + * availability of Xen features if this is not set. >> + */ >> +#define DOM0LESS_ENHANCED_NO_XS BIT(0, U) >> +#define DOM0LESS_XENSTOREBIT(1, U) >> +#define DOM0LESS_ENHANCED(DOM0LESS_ENHANCED_NO_XS | >> DOM0LESS_XENSTORE) >> + >> struct kernel_info { >> #ifdef CONFIG_ARM_64 >> enum domain_type type; >> @@ -36,8 +55,8 @@ struct kernel_info { >> /* Enable pl011 emulation */ >> bool vpl011; >> >> -/* Enable PV drivers */ >> -bool dom0less_enhanced; >> +/* Enable/Disable PV drivers interface,grant table, evtchn or xenstore >> */ > > missing a whitespace Ack. > > >> +uint32_t dom0less_feature; > > Given that we only really need 2 bits today, and given that uint8_t and > uint16_t are free but uint32_t increases the size of the struct, could > we just use uint16_t dom0less_feature ? Yes, I will change to uint16_t in next version. Regards, Rahul
Re: [PATCH v4 7/7] xen/arm: introduce xen-evtchn dom0less property
Hi Stefano,

> On 6 Sep 2022, at 11:22 pm, Stefano Stabellini wrote:
>
> On Tue, 6 Sep 2022, Rahul Singh wrote:
>> Introduce a new sub-node under /chosen node to establish static event
>> channel communication between domains on dom0less systems.
>>
>> An event channel will be created beforehand to allow the domains to
>> send notifications to each other.
>>
>> Signed-off-by: Rahul Singh
>> ---
>> Changes in v4:
>>  - move documentation to common place for evtchn node in booting.txt
>>  - Add comment why we use dt_device_static_evtchn_created()
>>  - check if dt_get_parent() returns NULL
>>  - fold process_static_evtchn_node() in alloc_static_evtchn()
>> Changes in v3:
>>  - use device-tree used_by to find the domain id of the evtchn node.
>>  - add new static_evtchn_create variable in struct dt_device_node to
>>    hold the information if evtchn is already created.
>>  - fix minor comments
>> Changes in v2:
>>  - no change
>> ---
>>  docs/misc/arm/device-tree/booting.txt |  98 +
>
> I have just reviewed the binding, only three minor comments below.
> Everything looks good.

Thanks for reviewing the code.

>
>>  xen/arch/arm/domain_build.c      | 147 ++
>>  xen/arch/arm/include/asm/setup.h |   1 +
>>  xen/arch/arm/setup.c             |   2 +
>>  xen/include/xen/device_tree.h    |  16 +++
>>  5 files changed, 264 insertions(+)
>>
>> diff --git a/docs/misc/arm/device-tree/booting.txt
>> b/docs/misc/arm/device-tree/booting.txt
>> index 1b0dca1454..c8329b73e5 100644
>> --- a/docs/misc/arm/device-tree/booting.txt
>> +++ b/docs/misc/arm/device-tree/booting.txt
>> @@ -382,3 +382,101 @@ device-tree:
>>
>>  This will reserve a 512MB region starting at the host physical address
>>  0x3000 to be exclusively used by DomU1.
>> +
>> +Static Event Channel
>> +
>> +The event channel communication will be established statically between two
>> +domains (dom0 and domU also). Event channel connection information between
>> +domains will be passed to Xen via the device tree node. The event channel
>> +will be created and established in Xen before the domain started. The domain
>> +doesn’t need to do any operation to establish a connection. Domain only
>
> doesn't
>
> better to use ASCII if possible

Ack.

>
>> +needs hypercall EVTCHNOP_send(local port) to send notifications to the
>> +remote guest.
>> +
>> +There is no need to describe the static event channel info in the domU device
>> +tree. Static event channels are only useful in fully static configurations,
>> +and in those configurations, the domU device tree dynamically generated by Xen
>> +is not needed.
>> +
>> +To enable the event-channel interface for domU guests include the
>> +"xen,enhanced = "no-xenstore"" property in the domU Xen device tree node.
>
> double ""

Ack.

>
>> +
>> +Under the "xen,domain" compatible node for domU, there needs to be sub-nodes
>> +with compatible "xen,evtchn" that describe the event channel connection
>> +between two domUs. For dom0, there needs to be sub-nodes with compatible
>> +"xen,evtchn" under the chosen node.
>> +
>> +The static event channel node has the following properties:
>> +
>> +- compatible
>> +
>> +    "xen,evtchn"
>> +
>> +- xen,evtchn
>> +
>> +    The property is tuples of two numbers
>> +    (local-evtchn link-to-foreign-evtchn) where:
>> +
>> +    local-evtchn is an integer value that will be used to allocate local port
>> +    for a domain to send and receive event notifications to/from the remote
>> +    domain. Maximum supported value is 2^17 for FIFO ABI and 4096 for 2L ABI.
>> +    It is recommended to use low event channel IDs.
>> +
>> +    link-to-foreign-evtchn is a single phandle to a remote evtchn to which
>> +    local-evtchn will be connected.
>> +
>> +Example
>> +===
>> +
>> +chosen {
>> +
>> +    /* one sub-node per local event channel */
>
> It would be good to say that this is for dom0 in the comment, e.g.:
>
> /* this is for Dom0 */

Ack.

Regards,
Rahul
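To illustrate the binding discussed above, here is a sketch of what a pair of connected static event channel nodes could look like (node names, labels and channel numbers are made up for illustration and are not taken from the patch):

```dts
chosen {
    /* this is for Dom0 */
    ec1: evtchn@a {
        compatible = "xen,evtchn";
        /* local-evtchn link-to-foreign-evtchn */
        xen,evtchn = <0xa &ec2>;
    };
};

domU1 {
    compatible = "xen,domain";
    xen,enhanced = "no-xenstore";
    ec2: evtchn@b {
        compatible = "xen,evtchn";
        xen,evtchn = <0xb &ec1>;
    };
};
```

Here Dom0's local port 0xa is connected to domU1's local port 0xb; either side would then call EVTCHNOP_send on its local port to notify the other.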
[PATCH v4 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
From: Zhou Wang

Backport Linux commit a76a3f2c. Introduce __iomb() in the smmu-v3.c file
with other Linux compatibility definitions.

Reading the 'prod' MMIO register in order to determine whether or not
there is valid data beyond 'cons' for a given queue does not provide
sufficient dependency ordering, as the resulting access is address
dependent only on 'cons' and can therefore be speculated ahead of time,
potentially allowing stale data to be read by the CPU.

Use readl() instead of readl_relaxed() when updating the shadow copy of
the 'prod' pointer, so that all speculated memory reads from the
corresponding queue can occur only from valid slots.

Signed-off-by: Zhou Wang
Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com
[will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
Signed-off-by: Will Deacon
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a3f2c
Signed-off-by: Rahul Singh
---
Changes in v4:
 - rename iomb() to __iomb()
Changes in v3:
 - rename __iomb() to iomb() and also move it from common file to smmu-v3.c file
Changes in v2:
 - fix commit msg
 - add __iomb changes also from the origin patch
---
 xen/drivers/passthrough/arm/smmu-v3.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 64d39bb4d3..229b9a4b0d 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -107,6 +107,8 @@ typedef paddr_t dma_addr_t;
 typedef paddr_t phys_addr_t;
 typedef unsigned int gfp_t;

+#define __iomb() dmb(osh)
+
 #define platform_device device

 #define GFP_KERNEL 0

@@ -951,7 +953,7 @@ static void queue_sync_cons_out(struct arm_smmu_queue *q)
 	 * Ensure that all CPU accesses (reads and writes) to the queue
 	 * are complete before we update the cons pointer.
 	 */
-	mb();
+	__iomb();
 	writel_relaxed(q->llq.cons, q->cons_reg);
 }

@@ -963,8 +965,15 @@ static void queue_inc_cons(struct arm_smmu_ll_queue *q)

 static int queue_sync_prod_in(struct arm_smmu_queue *q)
 {
+	u32 prod;
 	int ret = 0;
-	u32 prod = readl_relaxed(q->prod_reg);
+
+	/*
+	 * We can't use the _relaxed() variant here, as we must prevent
+	 * speculative reads of the queue before we have determined that
+	 * prod has indeed moved.
+	 */
+	prod = readl(q->prod_reg);

 	if (Q_OVF(prod) != Q_OVF(q->llq.prod))
 		ret = -EOVERFLOW;
--
2.25.1
[PATCH v4 02/10] xen/arm: smmuv3: Fix endianness annotations
From: Jean-Philippe Brucker

Backport Linux commit 376cdf66f624. This is the clean backport without
any changes.

When building with C=1, sparse reports some issues regarding endianness
annotations:

arm-smmu-v3.c:221:26: warning: cast to restricted __le64
arm-smmu-v3.c:221:24: warning: incorrect type in assignment (different base types)
arm-smmu-v3.c:221:24:    expected restricted __le64 [usertype]
arm-smmu-v3.c:221:24:    got unsigned long long [usertype]
arm-smmu-v3.c:229:20: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:229:20:    expected restricted __le64 [usertype] *[assigned] dst
arm-smmu-v3.c:229:20:    got unsigned long long [usertype] *ent
arm-smmu-v3.c:229:25: warning: incorrect type in argument 2 (different base types)
arm-smmu-v3.c:229:25:    expected unsigned long long [usertype] *[assigned] src
arm-smmu-v3.c:229:25:    got restricted __le64 [usertype] *
arm-smmu-v3.c:396:20: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:396:20:    expected restricted __le64 [usertype] *[assigned] dst
arm-smmu-v3.c:396:20:    got unsigned long long *
arm-smmu-v3.c:396:25: warning: incorrect type in argument 2 (different base types)
arm-smmu-v3.c:396:25:    expected unsigned long long [usertype] *[assigned] src
arm-smmu-v3.c:396:25:    got restricted __le64 [usertype] *
arm-smmu-v3.c:1349:32: warning: invalid assignment: |=
arm-smmu-v3.c:1349:32:    left side has type restricted __le64
arm-smmu-v3.c:1349:32:    right side has type unsigned long
arm-smmu-v3.c:1396:53: warning: incorrect type in argument 3 (different base types)
arm-smmu-v3.c:1396:53:    expected restricted __le64 [usertype] *dst
arm-smmu-v3.c:1396:53:    got unsigned long long [usertype] *strtab
arm-smmu-v3.c:1424:39: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:1424:39:    expected unsigned long long [usertype] *[assigned] strtab
arm-smmu-v3.c:1424:39:    got restricted __le64 [usertype] *l2ptr

While harmless, they are incorrect and could hide actual errors during
development. Fix them.

Signed-off-by: Jean-Philippe Brucker
Reviewed-by: Robin Murphy
Link: https://lore.kernel.org/r/20200918141856.629722-1-jean-phili...@linaro.org
Signed-off-by: Will Deacon
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 376cdf66f624
Signed-off-by: Rahul Singh
Acked-by: Stefano Stabellini
---
Changes in v4:
 - Move Stefano Acked-by after Signed-off
Changes in v3:
 - Added Stefano Acked-by
Changes in v2:
 - fix commit msg
---
 xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 340609264d..64d39bb4d3 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -1037,7 +1037,7 @@ static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
 	return 0;
 }

-static void queue_read(__le64 *dst, u64 *src, size_t n_dwords)
+static void queue_read(u64 *dst, __le64 *src, size_t n_dwords)
 {
 	int i;

@@ -1436,7 +1436,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
 }

-static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
+static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent)
 {
 	unsigned int i;
--
2.25.1