Re: [PATCH 6/9] swiotlb: refactor swiotlb_tbl_map_single
On Mon, Feb 22, 2021 at 02:29:37PM -0500, Konrad Rzeszutek Wilk wrote:
> > 'max_slots' should be 'unsigned long' here. Breaks SWIOTLB on RPi4.
> > Do you want me to send a fix or you prefer editing the patch?
>
> I can roll it in. Thx!

Thanks to both of you!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan
23.02.2021 05:13, Nicolin Chen wrote:
> Hi Dmitry,
>
> On Sat, Feb 20, 2021 at 08:16:22AM +0300, Dmitry Osipenko wrote:
>> 19.02.2021 01:07, Nicolin Chen wrote:
>>> Commit 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
>>> removed a certain hack in tegra_smmu_probe() by relying on the IOMMU core
>>> to of_xlate the SMMU's SID per device, so as to get rid of
>>> tegra_smmu_find() and tegra_smmu_configure(), which are typically done in
>>> the IOMMU core also.
>>>
>>> This approach works for both existing devices that have DT nodes and
>>> other devices (like PCI devices) that don't exist in DT, on Tegra210 and
>>> Tegra3 upon testing. However, Page Fault errors are reported on
>>> tegra124-Nyan:
>>>
>>>   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>>>       EMEM address decode error (SMMU translation error [--S])
>>>   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>>>       Page fault (SMMU translation error [--S])
>>>
>>> After debugging, I found that the mentioned commit changed some function
>>> callback sequence of tegra-smmu's, resulting in enabling the SMMU for
>>> the display client before the display driver gets initialized. I
>>> couldn't reproduce the exact same issue on Tegra210, as Tegra124
>>> (arm-32) differs at arch-level code.
>>
>> Hello Nicolin,
>>
>> Could you please explain in more detail what exactly makes the
>> difference for the callback sequence?
>
> Here is a log with 5.11.0-rc6:
> https://lava.collabora.co.uk/scheduler/job/3187849
> [dump_stack was added in some tegra-smmu functions]
>
> And here is a corresponding log with the original commit reverted:
> https://lava.collabora.co.uk/scheduler/job/3187851
>
> Here is a log with 5.11.0-rc7-next-20210210:
> https://lava.collabora.co.uk/scheduler/job/3210245
>
> And here is a corresponding log with the original commit reverted:
> https://lava.collabora.co.uk/scheduler/job/3210596
>
> Both failing logs show that the mc errors started right after the DC
> client got enabled by the ->attach_dev() callback, which in the passing
> logs was not called until Host1x driver init. And note that the two
> failing logs show that ->attach_dev() could be called from two different
> sources, of_dma_configure_id() or arch_setup_dma_ops().
>
> The reason why ->attach_dev() gets called is probably one of the
> following (sorry, I can't be 100% sure as I don't have a Tegra124 or
> other 32-bit Tegra board to test):
>
> 1) With the commit reverted, all clients are probed in the "arch" stage,
>    which is even prior to IOMMU core initialization -- including it
>    setting the default domain type. This probably mismatched the type of
>    the allocated domains against the default domain type. The internal
>    group is also somehow affected. So some condition check in the IOMMU
>    core failed and it then bypassed the ->attach_dev callback in the
>    really_probe stage, until the Host1x driver does attach_dev again.
>
> 2) 32-bit ARM's arch_setup_dma_ops() does an additional round of iommu
>    domain allocation + attach_dev(), after of_dma_configure_id() already
>    did one. This isn't reproducible for me on Tegra210.
>
> As debugging online isn't very efficient, and given that Thierry has
> been working on the linear mapping of the framebuffer carveout, I chose
> to partially revert as a quick fix.

The partial revert should be okay, but it's not clear to me what makes the
difference for T124, since I don't see that problem on T30, which also has
an active display at boot time.
Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan
Hi Dmitry,

On Sat, Feb 20, 2021 at 08:16:22AM +0300, Dmitry Osipenko wrote:
> 19.02.2021 01:07, Nicolin Chen wrote:
> > Commit 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
> > removed a certain hack in tegra_smmu_probe() by relying on the IOMMU core
> > to of_xlate the SMMU's SID per device, so as to get rid of
> > tegra_smmu_find() and tegra_smmu_configure(), which are typically done in
> > the IOMMU core also.
> >
> > This approach works for both existing devices that have DT nodes and
> > other devices (like PCI devices) that don't exist in DT, on Tegra210 and
> > Tegra3 upon testing. However, Page Fault errors are reported on
> > tegra124-Nyan:
> >
> >   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
> >       EMEM address decode error (SMMU translation error [--S])
> >   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
> >       Page fault (SMMU translation error [--S])
> >
> > After debugging, I found that the mentioned commit changed some function
> > callback sequence of tegra-smmu's, resulting in enabling the SMMU for
> > the display client before the display driver gets initialized. I
> > couldn't reproduce the exact same issue on Tegra210, as Tegra124
> > (arm-32) differs at arch-level code.
>
> Hello Nicolin,
>
> Could you please explain in more detail what exactly makes the
> difference for the callback sequence?

Here is a log with 5.11.0-rc6:
https://lava.collabora.co.uk/scheduler/job/3187849
[dump_stack was added in some tegra-smmu functions]

And here is a corresponding log with the original commit reverted:
https://lava.collabora.co.uk/scheduler/job/3187851

Here is a log with 5.11.0-rc7-next-20210210:
https://lava.collabora.co.uk/scheduler/job/3210245

And here is a corresponding log with the original commit reverted:
https://lava.collabora.co.uk/scheduler/job/3210596

Both failing logs show that the mc errors started right after the DC client
got enabled by the ->attach_dev() callback, which in the passing logs was
not called until Host1x driver init.

And note that the two failing logs show that ->attach_dev() could be called
from two different sources, of_dma_configure_id() or arch_setup_dma_ops().

The reason why ->attach_dev() gets called is probably one of the following
(sorry, I can't be 100% sure as I don't have a Tegra124 or other 32-bit
Tegra board to test):

1) With the commit reverted, all clients are probed in the "arch" stage,
   which is even prior to IOMMU core initialization -- including it setting
   the default domain type. This probably mismatched the type of the
   allocated domains against the default domain type. The internal group is
   also somehow affected. So some condition check in the IOMMU core failed
   and it then bypassed the ->attach_dev callback in the really_probe
   stage, until the Host1x driver does attach_dev again.

2) 32-bit ARM's arch_setup_dma_ops() does an additional round of iommu
   domain allocation + attach_dev(), after of_dma_configure_id() already
   did one. This isn't reproducible for me on Tegra210.

As debugging online isn't very efficient, and given that Thierry has been
working on the linear mapping of the framebuffer carveout, I chose to
partially revert as a quick fix.

Thanks
Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays
On Fri, 19 Feb 2021, Konrad Rzeszutek Wilk wrote:
> On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
> > > So one thing that has been on my mind for a while: I'd really like
> > > to kill the separate dma ops in Xen swiotlb. If we compare xen-swiotlb
> > > to swiotlb the main differences seem to be:
> > >
> > >  - additional reasons to bounce I/O vs the plain DMA capable
> > >  - the possibility to do a hypercall on arm/arm64
> > >  - an extra translation layer before doing the phys_to_dma and vice
> > >    versa
> > >  - a special memory allocator
> > >
> > > I wonder if in between a few jump labels or other no-overhead
> > > enablement options and possibly better use of the dma_range_map we
> > > could kill off most of swiotlb-xen instead of maintaining all this
> > > code duplication?
> >
> > So I looked at this a bit more.
> >
> > For x86 with XENFEAT_auto_translated_physmap (how common is that?)
>
> Juergen, Boris please correct me if I am wrong, but that
> XENFEAT_auto_translated_physmap only works for PVH guests?

ARM is always XENFEAT_auto_translated_physmap.

> > pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
> >
> > xen_arch_need_swiotlb always returns true for x86, and
> > range_straddles_page_boundary should never be true for the
> > XENFEAT_auto_translated_physmap case.
>
> Correct. The kernel should have no clue of what the real MFNs are
> for PFNs.

On ARM, Linux knows the MFNs because for local pages MFN == PFN and for
foreign pages it keeps track in arch/arm/xen/p2m.c. More on this below.

xen_arch_need_swiotlb only returns true on ARM in rare situations where
bouncing on swiotlb buffers is required. Today it only happens on old
versions of Xen that don't support the cache flushing hypercall, but there
could be more cases in the future.

> > So as far as I can tell the mapping fast path for the
> > XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
> >
> > That leaves us with the next more complicated case, x86 or fully cache
> > coherent arm{,64} without XENFEAT_auto_translated_physmap. In that case
> > we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
> > lookup, which could be done using alternatives or jump labels.
> > I think if that is done right we should also be able to let that cover
> > the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but in
> > the worst case that would need another alternative / jump label.
> >
> > For non-coherent arm{,64} we'd also need to use alternatives or jump
> > labels for the cache maintenance ops, but that isn't a hard problem
> > either.

With the caveat that ARM is always XENFEAT_auto_translated_physmap, what
you wrote looks correct. I am writing down a brief explanation of how
swiotlb-xen is used on ARM.

  pfn: address as seen by the guest, pseudo-physical address in ARM terminology
  mfn (or bfn): real address, physical address in ARM terminology

On ARM, dom0 is auto_translated (so Xen sets up the stage-2 translation in
the MMU) and the translation is 1:1, so pfn == mfn for dom0. However, when
another domain shares a page with dom0, that page is not 1:1. Swiotlb-xen
is used to retrieve the mfn for the foreign page at xen_swiotlb_map_page.
It does that with xen_phys_to_bus -> pfn_to_bfn, implemented with an rbtree
in arch/arm/xen/p2m.c.

In addition, swiotlb-xen is also used to cache-flush the page via hypercall
at xen_swiotlb_unmap_page. That is done because dev_addr is really the mfn
at unmap_page and we don't know the pfn for it. We can do pfn-to-mfn but we
cannot do mfn-to-pfn (there are good reasons for that, unfortunately). The
only way to cache-flush by mfn is by issuing a hypercall. The hypercall is
implemented in arch/arm/xen/mm.c.

The pfn != bfn and pfn_valid() checks are used to detect whether the page
is local (to dom0) or foreign; they work thanks to the fact that dom0 is
1:1 mapped.

Getting back to what you wrote: yes, if we had a way to do MFN lookups in
phys_to_dma, and a way to call the hypercall at unmap_page if the page is
foreign (e.g. if it fails a pfn_valid check), then I think we would be good
from an ARM perspective. The only exception is when xen_arch_need_swiotlb
returns true, in which case we need to actually bounce on swiotlb buffers.
Re: [PATCH 6/9] swiotlb: refactor swiotlb_tbl_map_single
> > +static int find_slots(struct device *dev, size_t alloc_size)
> > +{
> > +	unsigned long boundary_mask = dma_get_seg_boundary(dev);
> > +	dma_addr_t tbl_dma_addr =
> > +		phys_to_dma_unencrypted(dev, io_tlb_start) & boundary_mask;
> > +	unsigned int max_slots = get_max_slots(boundary_mask);
>
> 'max_slots' should be 'unsigned long' here. Breaks SWIOTLB on RPi4.
> Do you want me to send a fix or you prefer editing the patch?

I can roll it in. Thx!
Re: [PATCH 6/9] swiotlb: refactor swiotlb_tbl_map_single
Hi Christoph,

On Sun, 2021-02-07 at 17:03 +0100, Christoph Hellwig wrote:
> Split out a bunch of self-contained helpers to make the function easier
> to follow.
>
> Signed-off-by: Christoph Hellwig
> ---
>  kernel/dma/swiotlb.c | 179 +--
>  1 file changed, 89 insertions(+), 90 deletions(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index b38b1553c4663a..381c24ef1ac1d0 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -468,134 +468,133 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
>  	}
>  }
>
> -phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
> -		size_t mapping_size, size_t alloc_size,
> -		enum dma_data_direction dir, unsigned long attrs)
> -{
> -	dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start);
> -	unsigned long flags;
> -	phys_addr_t tlb_addr;
> -	unsigned int nslots, stride, index, wrap;
> -	int i;
> -	unsigned long mask;
> -	unsigned long offset_slots;
> -	unsigned long max_slots;
> -	unsigned long tmp_io_tlb_used;
> -
> -	if (no_iotlb_memory)
> -		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
> -
> -	if (mem_encrypt_active())
> -		pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
> +#define slot_addr(start, idx)	((start) + ((idx) << IO_TLB_SHIFT))
>
> -	if (mapping_size > alloc_size) {
> -		dev_warn_once(hwdev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
> -			      mapping_size, alloc_size);
> -		return (phys_addr_t)DMA_MAPPING_ERROR;
> -	}
> -
> -	mask = dma_get_seg_boundary(hwdev);
> +/*
> + * Carefully handle integer overflow which can occur when boundary_mask == ~0UL.
> + */
> +static inline unsigned long get_max_slots(unsigned long boundary_mask)
> +{
> +	if (boundary_mask == ~0UL)
> +		return 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
> +	return nr_slots(boundary_mask + 1);
> +}
>
> -	tbl_dma_addr &= mask;
> +static unsigned int wrap_index(unsigned int index)
> +{
> +	if (index >= io_tlb_nslabs)
> +		return 0;
> +	return index;
> +}
>
> -	offset_slots = nr_slots(tbl_dma_addr);
> +/*
> + * Find a suitable number of IO TLB entries size that will fit this request and
> + * allocate a buffer from that IO TLB pool.
> + */
> +static int find_slots(struct device *dev, size_t alloc_size)
> +{
> +	unsigned long boundary_mask = dma_get_seg_boundary(dev);
> +	dma_addr_t tbl_dma_addr =
> +		phys_to_dma_unencrypted(dev, io_tlb_start) & boundary_mask;
> +	unsigned int max_slots = get_max_slots(boundary_mask);

'max_slots' should be 'unsigned long' here. Breaks SWIOTLB on RPi4. Do you
want me to send a fix or you prefer editing the patch?

Regards,
Nicolas
Re: [git pull] IOMMU Updates for Linux v5.12
The pull request you sent on Mon, 22 Feb 2021 17:17:35 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-updates-v5.12

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/d652ea30ba32db12fe8365182fad5ba2e7c22822

Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [PATCH] iommu/amd: Fix event counter availability check
This fix has recently been accepted upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=x86/amd

Could you please give this a try?

Thanks,
Suravee

On 2/21/21 8:49 PM, Paul Menzel wrote:

  Dear Suravee,

  On 17.09.20 at 19:55, Alexander Monakov wrote:

    On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote:

      Instead of blindly moving the code around to a spot that would just
      work, I am trying to understand what might be required here. In this
      case, init_device_table_dma() should not be needed. I suspect it's
      the IOMMU invalidate-all command that's also needed here.

      I'm also checking with the HW and BIOS team. Meanwhile, could you
      please give the following change a try:

    Hello. Can you give any update please?

  […]

      Sorry for the late reply. I have a reproducer and am working with
      the HW team to understand the issue. I should be able to provide an
      update with a solution by the end of this week.

  Hello, hope you are doing well. Has this investigation found anything?

  I am wondering the same. It’d be great to have this fixed in the
  upstream Linux kernel.

  Kind regards,
  Paul
Re: [PATCH 2/4] iommu/vt-d: Enable write protect propagation from guest
Hi Kevin,

On Sat, 20 Feb 2021 02:38:02 +, "Tian, Kevin" wrote:
> > From: Jacob Pan
> > Sent: Saturday, February 20, 2021 1:09 AM
> >
> > Hi Kevin,
> >
> > On Fri, 19 Feb 2021 06:19:04 +, "Tian, Kevin" wrote:
> > > > From: Jacob Pan
> > > > Sent: Friday, February 19, 2021 5:31 AM
> > > >
> > > > The write-protect bit, when set, inhibits supervisor writes to
> > > > read-only pages. In guest supervisor shared virtual addressing
> > > > (SVA), write protect should be honored upon a guest bind supervisor
> > > > PASID request.
> > > >
> > > > This patch extends the VT-d portion of the IOMMU UAPI to include the
> > > > WP bit. The WPE bit of the supervisor PASID entry will be set to
> > > > match the CPU CR0.WP bit.
> > > >
> > > > Signed-off-by: Sanjay Kumar
> > > > Signed-off-by: Jacob Pan
> > > > ---
> > > >  drivers/iommu/intel/pasid.c | 5 +
> > > >  include/uapi/linux/iommu.h  | 3 ++-
> > > >  2 files changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> > > > index 0b7e0e726ade..c7a2ec930af4 100644
> > > > --- a/drivers/iommu/intel/pasid.c
> > > > +++ b/drivers/iommu/intel/pasid.c
> > > > @@ -763,6 +763,11 @@ intel_pasid_setup_bind_data(struct intel_iommu *iommu, struct pasid_entry *pte,
> > > >  		return -EINVAL;
> > > >  	}
> > > >  	pasid_set_sre(pte);
> > > > +	/* Enable write protect WP if guest requested */
> > > > +	if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_WPE) {
> > > > +		if (pasid_enable_wpe(pte))
> > > > +			return -EINVAL;
> > >
> > > We should call pasid_set_wpe directly, as this binding is about the
> > > guest page table, and we suppose the guest has done whatever check is
> > > required (e.g. gcr0.wp) before setting this bit. pasid_enable_wpe has
> > > an additional check on host cr0.wp and is thus logically incorrect
> > > here.
> >
> > If the host CPU does not support WP, can a guest vCPU still support WP?
> > If so, I agree.
>
> If you change 'support' to 'enable', then the answer is yes.

I agree, thanks for explaining. Will change it to pasid_set_wpe.

Thanks,
Jacob
Re: [PATCH 1/4] iommu/vt-d: Enable write protect for supervisor SVM
Hi Lu,

On Sat, 20 Feb 2021 09:56:26 +0800, Lu Baolu wrote:
> Hi Jacob and Sanjay,
>
> On 2/19/21 5:31 AM, Jacob Pan wrote:
> > The write-protect bit, when set, inhibits supervisor writes to
> > read-only pages. In supervisor shared virtual addressing (SVA), where
> > page tables are shared between CPU and DMA, the IOMMU PASID entry WPE
> > bit should match the CR0.WP bit in the CPU.
> > This patch sets the WPE bit for supervisor PASIDs if CR0.WP is set.
>
> From reading the commit message, the intention of this patch is to match
> the PASID entry WPE bit with CPU CR0.WP if 1) SRE is set (supervisor
> pasid); 2) the page table is shared between CPU and IOMMU. Do I
> understand it right?

Yes, that is my intention.

> But what the real code is doing is failing pasid entry setup for first
> level translation if CPU CR0.WP is not set. That's not consistent with
> what is described above.
>
> What I am thinking is that, as long as SRE is set, we should always set
> WPE in intel_pasid_setup_first_level(). For the supervisor SVA case, we
> should check CPU CR0.WP in intel_svm_bind_mm() and abort binding if
> CR0.WP is not set.
>
> Thought?

This code only affects supervisor SVA, since the PASID_FLAG_SUPERVISOR_MODE
flag is not set for FL IOVA.

> Best regards,
> baolu
>
> > Signed-off-by: Sanjay Kumar
> > Signed-off-by: Jacob Pan
> > ---
> >  drivers/iommu/intel/pasid.c | 26 ++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> > index 0cceaabc3ce6..0b7e0e726ade 100644
> > --- a/drivers/iommu/intel/pasid.c
> > +++ b/drivers/iommu/intel/pasid.c
> > @@ -410,6 +410,15 @@ static inline void pasid_set_sre(struct pasid_entry *pe)
> >  	pasid_set_bits(&pe->val[2], 1 << 0, 1);
> >  }
> >
> > +/*
> > + * Setup the WPE(Write Protect Enable) field (Bit 132) of a
> > + * scalable mode PASID entry.
> > + */
> > +static inline void pasid_set_wpe(struct pasid_entry *pe)
> > +{
> > +	pasid_set_bits(&pe->val[2], 1 << 4, 1 << 4);
> > +}
> > +
> >  /*
> >   * Setup the P(Present) field (Bit 0) of a scalable mode PASID
> >   * entry.
> > @@ -553,6 +562,20 @@ static void pasid_flush_caches(struct intel_iommu *iommu,
> >  	}
> >  }
> >
> > +static inline int pasid_enable_wpe(struct pasid_entry *pte)
> > +{
> > +	unsigned long cr0 = read_cr0();
> > +
> > +	/* CR0.WP is normally set but just to be sure */
> > +	if (unlikely(!(cr0 & X86_CR0_WP))) {
> > +		pr_err_ratelimited("No CPU write protect!\n");
> > +		return -EINVAL;
> > +	}
> > +	pasid_set_wpe(pte);
> > +
> > +	return 0;
> > +};
> > +
> >  /*
> >   * Set up the scalable mode pasid table entry for first only
> >   * translation type.
> > @@ -584,6 +607,9 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
> >  		return -EINVAL;
> >  	}
> >  	pasid_set_sre(pte);
> > +	if (pasid_enable_wpe(pte))
> > +		return -EINVAL;
> > +
> >  	}
> >
> >  	if (flags & PASID_FLAG_FL5LP) {

Thanks,
Jacob
[git pull] IOMMU Updates for Linux v5.12
Hi Linus,

The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:

  Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-updates-v5.12

for you to fetch changes up to 45e606f2726926b04094e1c9bf809bca4884c57f:

  Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next (2021-02-12 15:27:17 +0100)

IOMMU Updates for Linux v5.12

Including:

	- ARM SMMU and Mediatek updates from Will Deacon:
	  - Support for MT8192 IOMMU from Mediatek
	  - Arm v7s io-pgtable extensions for MT8192
	  - Removal of TLBI_ON_MAP quirk
	  - New Qualcomm compatible strings
	  - Allow SVA without hardware broadcast TLB maintenance on SMMUv3
	  - Virtualization Host Extension support for SMMUv3 (SVA)
	  - Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU
	- Some tidy-up in IOVA and core code
	- Conversion of the AMD IOMMU code to use the generic IO-page-table
	  framework
	- Intel VT-d updates from Lu Baolu:
	  - Audit capability consistency among different IOMMUs
	  - Add SATC reporting structure support
	  - Add iotlb_sync_map callback support
	- SDHI support for the Renesas IOMMU driver
	- Misc cleanups and other small improvements

Adrian Huang (1):
      iommu/amd: Remove unnecessary assignment

Bjorn Andersson (2):
      dt-bindings: arm-smmu-qcom: Add Qualcomm SC8180X compatible
      iommu/arm-smmu-qcom: Add Qualcomm SC8180X impl

Bjorn Helgaas (1):
      iommu/vt-d: Fix 'physical' typos

Colin Ian King (1):
      iommu/mediatek: Fix unsigned domid comparison with less than zero

Dan Carpenter (1):
      iommu/mediatek: Fix error code in probe()

Douglas Anderson (1):
      iommu: Properly pass gfp_t in _iommu_map() to avoid atomic sleeping

Isaac J. Manjarres (1):
      iommu/arm-smmu-qcom: Fix mask extraction for bootloader programmed SMRs

Jean-Philippe Brucker (3):
      iommu/arm-smmu-v3: Split arm_smmu_tlb_inv_range()
      iommu/arm-smmu-v3: Make BTM optional for SVA
      iommu/arm-smmu-v3: Add support for VHE

Joerg Roedel (2):
      Merge tag 'arm-smmu-updates' of git://git.kernel.org/.../will/linux into arm/smmu
      Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next

John Garry (7):
      iova: Make has_iova_flush_queue() private
      iova: Delete copy_reserved_iova()
      iova: Stop exporting some more functions
      iommu: Stop exporting iommu_map_sg_atomic()
      iommu: Delete iommu_domain_window_disable()
      iommu: Delete iommu_dev_has_feature()
      driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3

Kyung Min Park (2):
      iommu/vt-d: Audit IOMMU Capabilities and add helper functions
      iommu/vt-d: Move capability check code to cap_audit files

Lianbo Jiang (2):
      dma-iommu: use static-key to minimize the impact in the fast-path
      iommu: use the __iommu_attach_device() directly for deferred attach

Lu Baolu (7):
      iommu/vt-d: Consolidate duplicate cache invaliation code
      iommu/vt-d: Add qi_submit trace event
      iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
      iommu/vt-d: Clear PRQ overflow only when PRQ is empty
      iommu/vt-d: Use INVALID response code instead of FAILURE
      iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration]
      iommu/vt-d: Add iotlb_sync_map callback

Lukas Bulwahn (1):
      MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER

Robin Murphy (3):
      iommu/arm-smmu-v3: Remove the page 1 fixup
      iommu/msm: Hook up iotlb_sync_map
      iommu/io-pgtable: Remove TLBI_ON_MAP quirk

Suravee Suthikulpanit (14):
      iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
      iommu/amd: Prepare for generic IO page table framework
      iommu/amd: Move pt_root to struct amd_io_pgtable
      iommu/amd: Convert to using amd_io_pgtable
      iommu/amd: Declare functions as extern
      iommu/amd: Move IO page table related functions
      iommu/amd: Restructure code for freeing page table
      iommu/amd: Remove amd_iommu_domain_get_pgtable
      iommu/amd: Rename variables to be consistent with struct io_pgtable_ops
      iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
      iommu/amd: Introduce iommu_v1_iova_to_phys
      iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
      iommu/amd: Adopt IO page table framework for AMD IOMMU v1 page table
      iommu/amd: Fix performance counter initialization

Tom Rix (1):
      iommu/amd: remove h from printk format specifier

Vinod Koul (2):
      dt-bindings: arm-smmu: Add sm8
Re: [PATCH v11 01/13] vfio: VFIO_IOMMU_SET_PASID_TABLE
Hi Keqian,

On 2/22/21 1:20 PM, Keqian Zhu wrote:
> Hi Eric,
>
> On 2021/2/22 18:53, Auger Eric wrote:
>> Hi Keqian,
>>
>> On 2/2/21 1:34 PM, Keqian Zhu wrote:
>>> Hi Eric,
>>>
>>> On 2020/11/16 19:00, Eric Auger wrote:
>>>> From: "Liu, Yi L"
>>>>
>>>> This patch adds a VFIO_IOMMU_SET_PASID_TABLE ioctl which aims to pass
>>>> the virtual iommu guest configuration to the host. The latter takes
>>>> the form of the so-called PASID table.
>>>>
>>>> Signed-off-by: Jacob Pan
>>>> Signed-off-by: Liu, Yi L
>>>> Signed-off-by: Eric Auger
>>>>
>>>> ---
>>>> v11 -> v12:
>>>> - use iommu_uapi_set_pasid_table
>>>> - check SET and UNSET are not set simultaneously (Zenghui)
>>>>
>>>> v8 -> v9:
>>>> - Merge VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE into a single
>>>>   VFIO_IOMMU_SET_PASID_TABLE ioctl.
>>>>
>>>> v6 -> v7:
>>>> - add a comment related to VFIO_IOMMU_DETACH_PASID_TABLE
>>>>
>>>> v3 -> v4:
>>>> - restore ATTACH/DETACH
>>>> - add unwind on failure
>>>>
>>>> v2 -> v3:
>>>> - s/BIND_PASID_TABLE/SET_PASID_TABLE
>>>>
>>>> v1 -> v2:
>>>> - s/BIND_GUEST_STAGE/BIND_PASID_TABLE
>>>> - remove the struct device arg
>>>> ---
>>>>  drivers/vfio/vfio_iommu_type1.c | 65 +
>>>>  include/uapi/linux/vfio.h | 19 ++
>>>>  2 files changed, 84 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>>>> index 67e827638995..87ddd9e882dc 100644
>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>> @@ -2587,6 +2587,41 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
>>>>  	return ret;
>>>>  }
>>>>
>>>> +static void
>>>> +vfio_detach_pasid_table(struct vfio_iommu *iommu)
>>>> +{
>>>> +	struct vfio_domain *d;
>>>> +
>>>> +	mutex_lock(&iommu->lock);
>>>> +	list_for_each_entry(d, &iommu->domain_list, next)
>>>> +		iommu_detach_pasid_table(d->domain);
>>>> +
>>>> +	mutex_unlock(&iommu->lock);
>>>> +}
>>>> +
>>>> +static int
>>>> +vfio_attach_pasid_table(struct vfio_iommu *iommu, unsigned long arg)
>>>> +{
>>>> +	struct vfio_domain *d;
>>>> +	int ret = 0;
>>>> +
>>>> +	mutex_lock(&iommu->lock);
>>>> +
>>>> +	list_for_each_entry(d, &iommu->domain_list, next) {
>>>> +		ret = iommu_uapi_attach_pasid_table(d->domain, (void __user *)arg);
>>>
>>> This design is not very clear to me. This assumes all iommu_domains
>>> share the same pasid table.
>>>
>>> As I understand, it's reasonable when there is only one group in the
>>> domain, and only one domain in the vfio_iommu. If more than one group
>>> is in the vfio_iommu, the guest may put them into different guest
>>> iommu_domains, and then they have different pasid tables.
>>>
>>> Is this the use scenario?
>>
>> The vfio_iommu is attached to a container. All the groups within a
>> container share the same set of page tables (see
>> Documentation/driver-api/vfio.rst). So to me, if you want to use
>> different pasid tables, the groups need to be attached to different
>> containers. Does that make sense to you?
>
> OK, so this is what I understand about the design. A little question is
> that when we perform attach_pasid_table on a container, maybe we ought
> to do a sanity check to make sure that only one group is in this
> container, instead of iterating over all domains?
>
> To be frank, my main concern is that if we put each group into a
> different container under nested mode, then we give up the possibility
> that they can share stage-2 page tables, which saves host memory and
> reduces the time of preparing the environment for a VM.

Referring to the QEMU integration: when you use a virtual IOMMU, there is
generally one VFIO container per viommu-protected device (AddressSpace),
independently of whether nested stage is being used. I think the exception
is if you put 2 assigned devices behind a virtual PCIe-to-PCI bridge
(pcie-pci-bridge); in that case they have the same RID, they share the same
QEMU AddressSpace, and they are put in the same container if the kernel
does not reject it (the underlying pIOMMUs allow it). See QEMU
vfio_connect_container() in hw/vfio/common.c.

In that config, if the assigned devices belong to different groups, you may
end up with 2 groups set to the same container. But this case is not
supported by the guest kernel anyway (independently of the nested stage
integration). You hit a BUG_ON, as reported a long time ago in
https://www.mail-archive.com/qemu-devel@nongnu.org/msg608047.html

> To me, I'd like to understand the "container shares page table" to be:
> 1) share stage2 page table under nested mode.

Under nested mode they share S2, and with this design devices also share
the same PASID table, because on the guest they are in the same group.

> 2) share stage1 page table under non-nested mode.

In non-nested mode there is a single stage, by default S1.

> As when we perform "map" on a container:
> 1) under nested mode, we setup stage2 mapping.
> 2) under non-nested mode, we setup s
[RFC PATCH 5/5] KVM: arm64: Make sure pinned vmid is released on VM exit
Since the pinned VMID space is not recycled, we need to make sure that we
release the vmid back into the pool when we are done with it.

Signed-off-by: Shameer Kolothum
---
 arch/arm64/kvm/arm.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8955968be49f..d9900ffb88f4 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -181,8 +181,16 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_vgic_destroy(kvm);

 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
-		if (kvm->vcpus[i]) {
-			kvm_vcpu_destroy(kvm->vcpus[i]);
+		struct kvm_vcpu *vcpu = kvm->vcpus[i];
+
+		if (vcpu) {
+			struct kvm_vmid *vmid = &vcpu->arch.hw_mmu->vmid;
+
+			if (refcount_read(&vmid->pinned)) {
+				ida_free(&kvm_pinned_vmids, vmid->vmid);
+				refcount_set(&vmid->pinned, 0);
+			}
+			kvm_vcpu_destroy(vcpu);
 			kvm->vcpus[i] = NULL;
 		}
 	}
--
2.17.1
[RFC PATCH 3/5] KVM: ARM64: Add support for pinned VMIDs
On an ARM64 system with a SMMUv3 implementation that fully supports the Broadcast TLB Maintenance (BTM) feature, the CPU TLB invalidate instructions are received by the SMMU. This is very useful when the SMMU shares the page tables with the CPU (e.g. the Guest SVA use case). For this to work, the SMMU must use the same VMID that is allocated by KVM to configure the stage 2 translations. At present KVM VMID allocations are recycled on rollover and may change as a result. This will create issues if we have to share the KVM VMID with the SMMU. Hence, we split the KVM VMID space into two: the first half follows the normal recycle-on-rollover policy, while the second half of the VMID space is used to allocate pinned VMIDs. This feature is enabled based on a command line option "kvm-arm.pinned_vmid_enable". Signed-off-by: Shameer Kolothum --- arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/kvm/Kconfig | 1 + arch/arm64/kvm/arm.c | 104 +- 3 files changed, 106 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 0cd9f0f75c13..db6441c6a580 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -25,6 +25,7 @@ #include #include #include +#include #include #define __KVM_HAVE_ARCH_INTC_INITIALIZED @@ -65,6 +66,7 @@ struct kvm_vmid { /* The VMID generation used for the virt. memory system */ u64 vmid_gen; u32 vmid; + refcount_t pinned; }; struct kvm_s2_mmu { diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 043756db8f6e..c5c52953e842 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -40,6 +40,7 @@ menuconfig KVM select HAVE_KVM_VCPU_RUN_PID_CHANGE select TASKSTATS select TASK_DELAY_ACCT + select HAVE_KVM_PINNED_VMID help Support hosting virtualized guest machines. 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index c0ffb019ca8b..8955968be49f 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -56,6 +56,19 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1); static u32 kvm_next_vmid; static DEFINE_SPINLOCK(kvm_vmid_lock); +static bool kvm_pinned_vmid_enable; + +static int __init early_pinned_vmid_enable(char *buf) +{ + return strtobool(buf, &kvm_pinned_vmid_enable); +} + +early_param("kvm-arm.pinned_vmid_enable", early_pinned_vmid_enable); + +static DEFINE_IDA(kvm_pinned_vmids); +static u32 kvm_pinned_vmid_start; +static u32 kvm_pinned_vmid_end; + static bool vgic_present; static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled); @@ -475,6 +488,10 @@ void force_vm_exit(const cpumask_t *mask) static bool need_new_vmid_gen(struct kvm_vmid *vmid) { u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen); + + if (refcount_read(&vmid->pinned)) + return false; + smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */ return unlikely(READ_ONCE(vmid->vmid_gen) != current_vmid_gen); } @@ -485,6 +502,8 @@ static bool need_new_vmid_gen(struct kvm_vmid *vmid) */ static void update_vmid(struct kvm_vmid *vmid) { + unsigned int vmid_bits; + if (!need_new_vmid_gen(vmid)) return; @@ -521,7 +540,12 @@ static void update_vmid(struct kvm_vmid *vmid) vmid->vmid = kvm_next_vmid; kvm_next_vmid++; - kvm_next_vmid &= (1 << kvm_get_vmid_bits()) - 1; + if (kvm_pinned_vmid_enable) + vmid_bits = kvm_get_vmid_bits() - 1; + else + vmid_bits = kvm_get_vmid_bits(); + + kvm_next_vmid &= (1 << vmid_bits) - 1; smp_wmb(); WRITE_ONCE(vmid->vmid_gen, atomic64_read(&kvm_vmid_gen)); @@ -569,6 +593,71 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu) return ret; } +int kvm_arch_pinned_vmid_get(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu; + struct kvm_vmid *kvm_vmid; + int ret; + + if (!kvm_pinned_vmid_enable || !atomic_read(&kvm->online_vcpus)) + return -EINVAL; + + vcpu = kvm_get_vcpu(kvm, 0); + if (!vcpu) + return 
-EINVAL; + + kvm_vmid = &vcpu->arch.hw_mmu->vmid; + + spin_lock(&kvm_vmid_lock); + + if (refcount_inc_not_zero(&kvm_vmid->pinned)) { + spin_unlock(&kvm_vmid_lock); + return kvm_vmid->vmid; + } + + ret = ida_alloc_range(&kvm_pinned_vmids, kvm_pinned_vmid_start, + kvm_pinned_vmid_end, GFP_KERNEL); + if (ret < 0) { + spin_unlock(&kvm_vmid_lock); + return ret; + } + + force_vm_exit(cpu_all_mask); + kvm_call_hyp(__kvm_flush_vm_context); + + kvm_vmid->vmid = (u32)ret; + refcount_set(&kvm_vmid->pinned, 1); + spin_unlock(&kvm_vmid_lock); + + return ret; +} + +int kvm_arch_pinned_vmid_put(struct kvm *kvm) +{ + struc
[RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for NESTED stage with BTM
If the SMMU supports BTM and the device belongs to NESTED domain with shared pasid table, we need to use the VMID allocated by the KVM for the s2 configuration. Hence, request a pinned VMID from KVM. Signed-off-by: Shameer Kolothum --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 - 1 file changed, 47 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 26bf7da1bcd0..04f83f7c8319 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -28,6 +28,7 @@ #include #include #include +#include #include @@ -2195,6 +2196,33 @@ static void arm_smmu_bitmap_free(unsigned long *map, int idx) clear_bit(idx, map); } +static int arm_smmu_pinned_vmid_get(struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_master *master; + + master = list_first_entry_or_null(&smmu_domain->devices, + struct arm_smmu_master, domain_head); + if (!master) + return -EINVAL; + + return kvm_pinned_vmid_get(master->dev); +} + +static int arm_smmu_pinned_vmid_put(struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_master *master; + + master = list_first_entry_or_null(&smmu_domain->devices, + struct arm_smmu_master, domain_head); + if (!master) + return -EINVAL; + + if (smmu_domain->s2_cfg.vmid) + return kvm_pinned_vmid_put(master->dev); + + return 0; +} + static void arm_smmu_domain_free(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); @@ -2215,8 +2243,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) mutex_unlock(&arm_smmu_asid_lock); } if (s2_cfg->set) { - if (s2_cfg->vmid) - arm_smmu_bitmap_free(smmu->vmid_map, s2_cfg->vmid); + if (s2_cfg->vmid) { + if (!(smmu->features & ARM_SMMU_FEAT_BTM) && + smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED) + arm_smmu_bitmap_free(smmu->vmid_map, s2_cfg->vmid); + } } kfree(smmu_domain); @@ -3199,6 +3230,17 @@ static int arm_smmu_attach_pasid_table(struct 
iommu_domain *domain, !(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB)) goto out; + if (smmu->features & ARM_SMMU_FEAT_BTM) { + ret = arm_smmu_pinned_vmid_get(smmu_domain); + if (ret < 0) + goto out; + + if (smmu_domain->s2_cfg.vmid) + arm_smmu_bitmap_free(smmu->vmid_map, smmu_domain->s2_cfg.vmid); + + smmu_domain->s2_cfg.vmid = (u16)ret; + } + smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr; smmu_domain->s1_cfg.s1cdmax = cfg->pasid_bits; smmu_domain->s1_cfg.s1fmt = cfg->vendor_data.smmuv3.s1fmt; @@ -3221,6 +3263,7 @@ static int arm_smmu_attach_pasid_table(struct iommu_domain *domain, static void arm_smmu_detach_pasid_table(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_master *master; unsigned long flags; @@ -3237,6 +3280,8 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain) arm_smmu_install_ste_for_dev(master); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + if (smmu->features & ARM_SMMU_FEAT_BTM) + arm_smmu_pinned_vmid_put(smmu_domain); unlock: mutex_unlock(&smmu_domain->init_mutex); } -- 2.17.1
[RFC PATCH 2/5] KVM: Add generic infrastructure to support pinned VMIDs
Provide generic helper functions to get/put pinned VMIDs if the arch supports it. Signed-off-by: Shameer Kolothum --- include/linux/kvm_host.h | 17 + virt/kvm/Kconfig | 2 ++ virt/kvm/kvm_main.c | 25 + 3 files changed, 44 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7f2e2a09ebbd..25856db74a08 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -836,6 +836,8 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu); void kvm_vcpu_kick(struct kvm_vcpu *vcpu); int kvm_vcpu_yield_to(struct kvm_vcpu *target); void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool usermode_vcpu_not_eligible); +int kvm_pinned_vmid_get(struct device *dev); +int kvm_pinned_vmid_put(struct device *dev); void kvm_flush_remote_tlbs(struct kvm *kvm); void kvm_reload_remote_mmus(struct kvm *kvm); @@ -1478,4 +1480,19 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) } #endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ +#ifdef CONFIG_HAVE_KVM_PINNED_VMID +int kvm_arch_pinned_vmid_get(struct kvm *kvm); +int kvm_arch_pinned_vmid_put(struct kvm *kvm); +#else +static inline int kvm_arch_pinned_vmid_get(struct kvm *kvm) +{ + return -EINVAL; +} + +static inline int kvm_arch_pinned_vmid_put(struct kvm *kvm) +{ + return -EINVAL; +} +#endif + #endif diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index 1c37ccd5d402..bb55616c5616 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -63,3 +63,5 @@ config HAVE_KVM_NO_POLL config KVM_XFER_TO_GUEST_WORK bool +config HAVE_KVM_PINNED_VMID + bool diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2541a17ff1c4..632d391f0e34 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -51,6 +51,7 @@ #include #include #include +#include #include #include @@ -2849,6 +2850,30 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up); +int kvm_pinned_vmid_get(struct device *dev) +{ + struct kvm *kvm; + + kvm = vfio_kvm_get_from_dev(dev); + if (!kvm) + return -EINVAL; + + 
return kvm_arch_pinned_vmid_get(kvm); +} +EXPORT_SYMBOL_GPL(kvm_pinned_vmid_get); + +int kvm_pinned_vmid_put(struct device *dev) +{ + struct kvm *kvm; + + kvm = vfio_kvm_get_from_dev(dev); + if (!kvm) + return -EINVAL; + + return kvm_arch_pinned_vmid_put(kvm); +} +EXPORT_SYMBOL_GPL(kvm_pinned_vmid_put); + #ifndef CONFIG_S390 /* * Kick a sleeping VCPU, or a guest VCPU in guest mode, into host kernel mode. -- 2.17.1
[RFC PATCH 1/5] vfio: Add a helper to retrieve kvm instance from a dev
A device that belongs to a vfio_group has the kvm instance associated with it. Retrieve it. Signed-off-by: Shameer Kolothum --- drivers/vfio/vfio.c | 12 include/linux/vfio.h | 1 + 2 files changed, 13 insertions(+) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 2151bc7f87ab..d1968e4bf9f4 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -876,6 +876,18 @@ struct vfio_device *vfio_device_get_from_dev(struct device *dev) } EXPORT_SYMBOL_GPL(vfio_device_get_from_dev); +struct kvm *vfio_kvm_get_from_dev(struct device *dev) +{ + struct vfio_group *group; + + group = vfio_group_get_from_dev(dev); + if (!group) + return NULL; + + return group->kvm; +} +EXPORT_SYMBOL_GPL(vfio_kvm_get_from_dev); + static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group, char *buf) { diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 38d3c6a8dc7e..8716298fee27 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -56,6 +56,7 @@ extern void *vfio_del_group_dev(struct device *dev); extern struct vfio_device *vfio_device_get_from_dev(struct device *dev); extern void vfio_device_put(struct vfio_device *device); extern void *vfio_device_data(struct vfio_device *device); +extern struct kvm *vfio_kvm_get_from_dev(struct device *dev); /** * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks -- 2.17.1
[RFC PATCH 0/5] KVM/ARM64 Add support for pinned VMIDs
On an ARM64 system with a SMMUv3 implementation that fully supports the Broadcast TLB Maintenance (BTM) feature as part of the Distributed Virtual Memory (DVM) protocol, the CPU TLB invalidate instructions are received by the SMMUv3. This is very useful when the SMMUv3 shares the page tables with the CPU (e.g. the Guest SVA use case). For this to work, the SMMU must use the same VMID that is allocated by KVM to configure the stage 2 translations. At present KVM VMID allocations are recycled on rollover and may change as a result. This will create issues if we have to share the KVM VMID with the SMMU. Please see the discussion here, https://lore.kernel.org/linux-iommu/20200522101755.GA3453945@myrica/ This series proposes a way to share the VMID between KVM and the IOMMU driver by: 1. Splitting the KVM VMID space into two equal halves based on the command line option "kvm-arm.pinned_vmid_enable". 2. The first half of the VMID space follows the normal recycle-on-rollover policy. 3. The second half of the VMID space doesn't roll over and is used to allocate pinned VMIDs. 4. Providing a helper function to retrieve the KVM instance associated with a device (if it is part of a vfio group). 5. Introducing generic interfaces to get/put pinned KVM VMIDs. Open Items: 1. I couldn't figure out a way to determine whether a platform actually fully supports DVM/BTM or not. Not sure we can take a call based on the SMMUv3 BTM feature bit alone. Probably we can get it from firmware via IORT? 2. The current splitting of the VMID space is only one way to do this and probably not the best. Maybe we can follow the pinned ASID method used in the SVA code. Suggestions welcome here. 3. The detach_pasid_table() interface is not very clear to me as the current QEMU prototype is not using it. This requires fixing from my side. This is based on Jean-Philippe's SVA series[1] and Eric's SMMUv3 dual-stage support series[2]. 
The branch with the whole vSVA + BTM solution is here, https://github.com/hisilicon/kernel-dev/tree/5.10-rc4-2stage-v13-vsva-btm-rfc This is lightly tested on a HiSilicon D06 platform with the uacce/zip dev test tool, ./zip_sva_per -k tlb Thanks, Shameer 1. https://github.com/Linaro/linux-kernel-uadk/commits/uacce-devel-5.10 2. https://lore.kernel.org/linux-iommu/20201118112151.25412-1-eric.au...@redhat.com/T/ Shameer Kolothum (5): vfio: Add a helper to retrieve kvm instance from a dev KVM: Add generic infrastructure to support pinned VMIDs KVM: ARM64: Add support for pinned VMIDs iommu/arm-smmu-v3: Use pinned VMID for NESTED stage with BTM KVM: arm64: Make sure pinned vmid is released on VM exit arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/kvm/Kconfig | 1 + arch/arm64/kvm/arm.c | 116 +++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 - drivers/vfio/vfio.c | 12 ++ include/linux/kvm_host.h | 17 +++ include/linux/vfio.h | 1 + virt/kvm/Kconfig | 2 + virt/kvm/kvm_main.c | 25 + 9 files changed, 220 insertions(+), 5 deletions(-) -- 2.17.1
Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan
On 18/02/2021 22:07, Nicolin Chen wrote: > Commit 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()") > removed certain hack in the tegra_smmu_probe() by relying on IOMMU core to > of_xlate SMMU's SID per device, so as to get rid of tegra_smmu_find() and > tegra_smmu_configure() that are typically done in the IOMMU core also. > > This approach works for both existing devices that have DT nodes and other > devices (like PCI device) that don't exist in DT, on Tegra210 and Tegra3 > upon testing. However, Page Fault errors are reported on tegra124-Nyan: > > tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40: >EMEM address decode error (SMMU translation error [--S]) > tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40: >Page fault (SMMU translation error [--S]) > > After debugging, I found that the mentioned commit changed some function > callback sequence of tegra-smmu's, resulting in enabling SMMU for display > client before display driver gets initialized. I couldn't reproduce exact > same issue on Tegra210 as Tegra124 (arm-32) differs at arch-level code. > > Actually this Page Fault is a known issue, as on most of Tegra platforms, > display gets enabled by the bootloader for the splash screen feature, so > it keeps filling the framebuffer memory. A proper fix to this issue is to > 1:1 linear map the framebuffer memory to IOVA space so the SMMU will have > the same address as the physical address in its page table. Yet, Thierry > has been working on the solution above for a year, and it hasn't merged. > > Therefore, let's partially revert the mentioned commit to fix the errors. > > The reason why we do a partial revert here is that we can still set priv > in ->of_xlate() callback for PCI devices. Meanwhile, devices existing in > DT, like display, will go through tegra_smmu_configure() at the stage of > bus_set_iommu() when SMMU gets probed(), as what it did before we merged > the mentioned commit. 
> > Once we have the linear map solution for framebuffer memory, this change > can be cleaned away. > > [Big thank to Guillaume who reported and helped debugging/verification] > > Fixes: 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()") > Reported-by: Guillaume Tucker You're welcome. I would actually prefer to see it as reported by kernelci.org since I wouldn't have found it without the automated testing and bisection. If you're OK to change this trailer: Reported-by: "kernelci.org bot" > Signed-off-by: Nicolin Chen > --- > > Guillaume, would you please give a "Tested-by" to this change? Thanks! Sure. https://lava.collabora.co.uk/scheduler/job/3249387 Tested-by: Guillaume Tucker Thanks, Guillaume
Re: [PATCH v11 01/13] vfio: VFIO_IOMMU_SET_PASID_TABLE
Hi Eric, On 2021/2/22 18:53, Auger Eric wrote: > Hi Keqian, > > On 2/2/21 1:34 PM, Keqian Zhu wrote: >> Hi Eric, >> >> On 2020/11/16 19:00, Eric Auger wrote: >>> From: "Liu, Yi L" >>> >>> This patch adds an VFIO_IOMMU_SET_PASID_TABLE ioctl >>> which aims to pass the virtual iommu guest configuration >>> to the host. This latter takes the form of the so-called >>> PASID table. >>> >>> Signed-off-by: Jacob Pan >>> Signed-off-by: Liu, Yi L >>> Signed-off-by: Eric Auger >>> >>> --- >>> v11 -> v12: >>> - use iommu_uapi_set_pasid_table >>> - check SET and UNSET are not set simultaneously (Zenghui) >>> >>> v8 -> v9: >>> - Merge VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE into a single >>> VFIO_IOMMU_SET_PASID_TABLE ioctl. >>> >>> v6 -> v7: >>> - add a comment related to VFIO_IOMMU_DETACH_PASID_TABLE >>> >>> v3 -> v4: >>> - restore ATTACH/DETACH >>> - add unwind on failure >>> >>> v2 -> v3: >>> - s/BIND_PASID_TABLE/SET_PASID_TABLE >>> >>> v1 -> v2: >>> - s/BIND_GUEST_STAGE/BIND_PASID_TABLE >>> - remove the struct device arg >>> --- >>> drivers/vfio/vfio_iommu_type1.c | 65 + >>> include/uapi/linux/vfio.h | 19 ++ >>> 2 files changed, 84 insertions(+) >>> >>> diff --git a/drivers/vfio/vfio_iommu_type1.c >>> b/drivers/vfio/vfio_iommu_type1.c >>> index 67e827638995..87ddd9e882dc 100644 >>> --- a/drivers/vfio/vfio_iommu_type1.c >>> +++ b/drivers/vfio/vfio_iommu_type1.c >>> @@ -2587,6 +2587,41 @@ static int vfio_iommu_iova_build_caps(struct >>> vfio_iommu *iommu, >>> return ret; >>> } >>> >>> +static void >>> +vfio_detach_pasid_table(struct vfio_iommu *iommu) >>> +{ >>> + struct vfio_domain *d; >>> + >>> + mutex_lock(&iommu->lock); >>> + list_for_each_entry(d, &iommu->domain_list, next) >>> + iommu_detach_pasid_table(d->domain); >>> + >>> + mutex_unlock(&iommu->lock); >>> +} >>> + >>> +static int >>> +vfio_attach_pasid_table(struct vfio_iommu *iommu, unsigned long arg) >>> +{ >>> + struct vfio_domain *d; >>> + int ret = 0; >>> + >>> + mutex_lock(&iommu->lock); >>> + >>> + 
list_for_each_entry(d, &iommu->domain_list, next) { >>> + ret = iommu_uapi_attach_pasid_table(d->domain, (void __user >>> *)arg); >> This design is not very clear to me. This assumes all iommu_domains share >> the same pasid table. >> >> As I understand, it's reasonable when there is only one group in the domain, >> and only one domain in the vfio_iommu. >> If more than one group in the vfio_iommu, the guest may put them into >> different guest iommu_domain, then they have different pasid table. >> >> Is this the use scenario? > > the vfio_iommu is attached to a container. all the groups within a > container share the same set of page tables (linux > Documentation/driver-api/vfio.rst). So to me if you want to use > different pasid tables, the groups need to be attached to different > containers. Does that make sense to you? OK, so this is what I understand about the design. A little question is that when we perform attach_pasid_table on a container, maybe we ought to do a sanity check to make sure that only one group is in this container, instead of iterating all domain? To be frank, my main concern is that if we put each group into different container under nested mode, then we give up the possibility that they can share stage2 page tables, which saves host memory and reduces the time of preparing environment for VM. To me, I'd like to understand the "container shares page table" to be: 1) share stage2 page table under nested mode. 2) share stage1 page table under non-nested mode. As when we perform "map" on a container: 1) under nested mode, we setup stage2 mapping. 2) under non-nested mode, we setup stage1 mapping. Indeed, to realize stage2 mapping sharing, we should do much more work to refactor SMMU_DOMAIN... Hope you can consider this. 
:) Thanks, Keqian > > Thanks > > Eric >> >> Thanks, >> Keqian >> >>> + if (ret) >>> + goto unwind; >>> + } >>> + goto unlock; >>> +unwind: >>> + list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) { >>> + iommu_detach_pasid_table(d->domain); >>> + } >>> +unlock: >>> + mutex_unlock(&iommu->lock); >>> + return ret; >>> +} >>> + >>> static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu, >>>struct vfio_info_cap *caps) >>> { >>> @@ -2747,6 +2782,34 @@ static int vfio_iommu_type1_unmap_dma(struct >>> vfio_iommu *iommu, >>> -EFAULT : 0; >>> } >>> >>> +static int vfio_iommu_type1_set_pasid_table(struct vfio_iommu *iommu, >>> + unsigned long arg) >>> +{ >>> + struct vfio_iommu_type1_set_pasid_table spt; >>> + unsigned long minsz; >>> + int ret = -EINVAL; >>> + >>> + minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table, flags); >>> + >>> + if (copy_from_user(&spt, (void __user *)arg, minsz)) >>> + return -EFA
Re: [PATCH v11 01/13] vfio: VFIO_IOMMU_SET_PASID_TABLE
Hi Keqian, On 2/2/21 1:34 PM, Keqian Zhu wrote: > Hi Eric, > > On 2020/11/16 19:00, Eric Auger wrote: >> From: "Liu, Yi L" >> >> This patch adds an VFIO_IOMMU_SET_PASID_TABLE ioctl >> which aims to pass the virtual iommu guest configuration >> to the host. This latter takes the form of the so-called >> PASID table. >> >> Signed-off-by: Jacob Pan >> Signed-off-by: Liu, Yi L >> Signed-off-by: Eric Auger >> >> --- >> v11 -> v12: >> - use iommu_uapi_set_pasid_table >> - check SET and UNSET are not set simultaneously (Zenghui) >> >> v8 -> v9: >> - Merge VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE into a single >> VFIO_IOMMU_SET_PASID_TABLE ioctl. >> >> v6 -> v7: >> - add a comment related to VFIO_IOMMU_DETACH_PASID_TABLE >> >> v3 -> v4: >> - restore ATTACH/DETACH >> - add unwind on failure >> >> v2 -> v3: >> - s/BIND_PASID_TABLE/SET_PASID_TABLE >> >> v1 -> v2: >> - s/BIND_GUEST_STAGE/BIND_PASID_TABLE >> - remove the struct device arg >> --- >> drivers/vfio/vfio_iommu_type1.c | 65 + >> include/uapi/linux/vfio.h | 19 ++ >> 2 files changed, 84 insertions(+) >> >> diff --git a/drivers/vfio/vfio_iommu_type1.c >> b/drivers/vfio/vfio_iommu_type1.c >> index 67e827638995..87ddd9e882dc 100644 >> --- a/drivers/vfio/vfio_iommu_type1.c >> +++ b/drivers/vfio/vfio_iommu_type1.c >> @@ -2587,6 +2587,41 @@ static int vfio_iommu_iova_build_caps(struct >> vfio_iommu *iommu, >> return ret; >> } >> >> +static void >> +vfio_detach_pasid_table(struct vfio_iommu *iommu) >> +{ >> +struct vfio_domain *d; >> + >> +mutex_lock(&iommu->lock); >> +list_for_each_entry(d, &iommu->domain_list, next) >> +iommu_detach_pasid_table(d->domain); >> + >> +mutex_unlock(&iommu->lock); >> +} >> + >> +static int >> +vfio_attach_pasid_table(struct vfio_iommu *iommu, unsigned long arg) >> +{ >> +struct vfio_domain *d; >> +int ret = 0; >> + >> +mutex_lock(&iommu->lock); >> + >> +list_for_each_entry(d, &iommu->domain_list, next) { >> +ret = iommu_uapi_attach_pasid_table(d->domain, (void __user >> *)arg); > This design is not 
very clear to me. This assumes all iommu_domains share the > same pasid table. > > As I understand, it's reasonable when there is only one group in the domain, > and only one domain in the vfio_iommu. > If more than one group in the vfio_iommu, the guest may put them into > different guest iommu_domain, then they have different pasid table. > > Is this the use scenario? the vfio_iommu is attached to a container. all the groups within a container share the same set of page tables (linux Documentation/driver-api/vfio.rst). So to me if you want to use different pasid tables, the groups need to be attached to different containers. Does that make sense to you? Thanks Eric > > Thanks, > Keqian > >> +if (ret) >> +goto unwind; >> +} >> +goto unlock; >> +unwind: >> +list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) { >> +iommu_detach_pasid_table(d->domain); >> +} >> +unlock: >> +mutex_unlock(&iommu->lock); >> +return ret; >> +} >> + >> static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu, >> struct vfio_info_cap *caps) >> { >> @@ -2747,6 +2782,34 @@ static int vfio_iommu_type1_unmap_dma(struct >> vfio_iommu *iommu, >> -EFAULT : 0; >> } >> >> +static int vfio_iommu_type1_set_pasid_table(struct vfio_iommu *iommu, >> +unsigned long arg) >> +{ >> +struct vfio_iommu_type1_set_pasid_table spt; >> +unsigned long minsz; >> +int ret = -EINVAL; >> + >> +minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table, flags); >> + >> +if (copy_from_user(&spt, (void __user *)arg, minsz)) >> +return -EFAULT; >> + >> +if (spt.argsz < minsz) >> +return -EINVAL; >> + >> +if (spt.flags & VFIO_PASID_TABLE_FLAG_SET && >> +spt.flags & VFIO_PASID_TABLE_FLAG_UNSET) >> +return -EINVAL; >> + >> +if (spt.flags & VFIO_PASID_TABLE_FLAG_SET) >> +ret = vfio_attach_pasid_table(iommu, arg + minsz); >> +else if (spt.flags & VFIO_PASID_TABLE_FLAG_UNSET) { >> +vfio_detach_pasid_table(iommu); >> +ret = 0; >> +} >> +return ret; >> +} >> + >> static int 
vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, >> unsigned long arg) >> { >> @@ -2867,6 +2930,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, >> return vfio_iommu_type1_unmap_dma(iommu, arg); >> case VFIO_IOMMU_DIRTY_PAGES: >> return vfio_iommu_type1_dirty_pages(iommu, arg); >> +case VFIO_IOMMU_SET_PASID_TABLE: >> +return vfio_iommu_type1_set_pasid_table(iommu, arg); >> default: >> return -ENOTTY; >> } >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/v
Re: [PATCH 5/6] driver core: lift dma_default_coherent into common code
On Mon, 22 Feb 2021, Christoph Hellwig wrote: > > I haven't booted Linux on my Malta for a while now, but it turns out to > > work just fine, and your patch set does not regress it booting multi-user > > NFS-rooted over FDDI. > > > > I note however that the system does not reboot properly: > > > > sd 0:0:0:0: [sda] Synchronizing SCSI cache > > reboot: Restarting system > > Reboot failed -- System halted > > > > which is a regression, and also the MMIO-mapped discrete CBUS UART (ttyS2) > > does not sign in anymore either: > > Do you mean a regression with this series, or just compared to when you > last tested? When last tested. Years ago, so nothing for you to be concerned about. I'll look into it sometime unless someone beats me to it. Don't hold your breath though. Sorry to be unclear. Maciej
RE: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
> -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: 21 February 2021 18:21 > To: Shameerali Kolothum Thodi ; > eric.auger@gmail.com; iommu@lists.linux-foundation.org; > linux-ker...@vger.kernel.org; k...@vger.kernel.org; > kvm...@lists.cs.columbia.edu; w...@kernel.org; j...@8bytes.org; > m...@kernel.org; robin.mur...@arm.com; alex.william...@redhat.com > Cc: jean-phili...@linaro.org; zhangfei@linaro.org; > zhangfei@gmail.com; vivek.gau...@arm.com; > jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com; > nicoleots...@gmail.com; yuzenghui ; Zengtao (B) > ; linux...@openeuler.org > Subject: Re: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part) > > Hi Shameer, > On 1/8/21 6:05 PM, Shameerali Kolothum Thodi wrote: > > Hi Eric, > > > >> -Original Message- > >> From: Eric Auger [mailto:eric.au...@redhat.com] > >> Sent: 18 November 2020 11:22 > >> To: eric.auger@gmail.com; eric.au...@redhat.com; > >> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org; > >> k...@vger.kernel.org; kvm...@lists.cs.columbia.edu; w...@kernel.org; > >> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com; > >> alex.william...@redhat.com > >> Cc: jean-phili...@linaro.org; zhangfei@linaro.org; > >> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum > >> Thodi ; > >> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com; > >> nicoleots...@gmail.com; yuzenghui > >> Subject: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part) > >> > >> This series brings the IOMMU part of HW nested paging support > >> in the SMMUv3. The VFIO part is submitted separately. > >> > >> The IOMMU API is extended to support 2 new API functionalities: > >> 1) pass the guest stage 1 configuration > >> 2) pass stage 1 MSI bindings > >> > >> Then those capabilities gets implemented in the SMMUv3 driver. > >> > >> The virtualizer passes information through the VFIO user API > >> which cascades them to the iommu subsystem. 
This allows the guest > >> to own stage 1 tables and context descriptors (so-called PASID > >> table) while the host owns stage 2 tables and main configuration > >> structures (STE). > > > > I am seeing an issue with Guest testpmd run with this series. > > I have two different setups and testpmd works fine with the > > first one but not with the second. > > > > 1). Guest doesn't have kernel driver built-in for pass-through dev. > > > > root@ubuntu:/# lspci -v > > ... > > 00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev > 21) > > Subsystem: Huawei Technologies Co., Ltd. Device > > Flags: fast devsel > > Memory at 800010 (64-bit, prefetchable) [disabled] [size=64K] > > Memory at 80 (64-bit, prefetchable) [disabled] [size=1M] > > Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00 > > Capabilities: [a0] MSI-X: Enable- Count=67 Masked- > > Capabilities: [b0] Power Management version 3 > > Capabilities: [100] Access Control Services > > Capabilities: [300] Transaction Processing Hints > > > > root@ubuntu:/# echo vfio-pci > > /sys/bus/pci/devices/:00:02.0/driver_override > > root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe > > > > root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix > socket0 -l 0-1 -n 2 -- -i > > EAL: Detected 8 lcore(s) > > EAL: Detected 1 NUMA nodes > > EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket > > EAL: Selected IOVA mode 'VA' > > EAL: No available hugepages reported in hugepages-32768kB > > EAL: No available hugepages reported in hugepages-64kB > > EAL: No available hugepages reported in hugepages-1048576kB > > EAL: Probing VFIO support... 
> > EAL: VFIO support initialized > > EAL: Invalid NUMA socket, default to 0 > > EAL: using IOMMU type 1 (Type 1) > > EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket > 0) > > EAL: No legacy callbacks, legacy socket not created > > Interactive-mode selected > > testpmd: create a new mbuf pool : n=155456, > size=2176, socket=0 > > testpmd: preferred mempool ops selected: ring_mp_mc > > > > Warning! port-topology=paired and odd forward ports number, the last port > will pair with itself. > > > > Configuring Port 0 (socket 0) > > Port 0: 8E:A6:8C:43:43:45 > > Checking link statuses... > > Done > > testpmd> > > > > 2). Guest have kernel driver built-in for pass-through dev. > > > > root@ubuntu:/# lspci -v > > ... > > 00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev > 21) > > Subsystem: Huawei Technologies Co., Ltd. Device > > Flags: bus master, fast devsel, latency 0 > > Memory at 800010 (64-bit, prefetchable) [size=64K] > > Memory at 80 (64-bit, prefetchable) [size=1M] > > Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00 > > Capabilities: [a0] MSI-X: Enable+ Count=67 Masked- > > Capabilities: [b0] Power Management version 3 > > Capabilities: [100] Access Control Se