On Wed, 10 Feb 2021, Rahul Singh wrote: > > On 9 Feb 2021, at 8:36 pm, Stefano Stabellini <sstabell...@kernel.org> > > wrote: > > On Tue, 9 Feb 2021, Rahul Singh wrote: > >>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini <sstabell...@kernel.org> > >>> wrote: > >>> > >>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM. > >>> The offending chunk is: > >>> > >>> #define gnttab_need_iommu_mapping(d) \ > >>> - (is_domain_direct_mapped(d) && need_iommu(d)) > >>> + (is_domain_direct_mapped(d) && need_iommu_pt_sync(d)) > >>> > >>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is > >>> directly mapped and IOMMU is enabled for the domain, like the old check > >>> did, but the new check is always false. > >>> > >>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and > >>> need_sync is set as: > >>> > >>> if ( !is_hardware_domain(d) || iommu_hwdom_strict ) > >>> hd->need_sync = !iommu_use_hap_pt(d); > >>> > >>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the > >>> P2M. It is true on ARM. need_sync means that you have a separate IOMMU > >>> page-table and it needs to be updated for every change. need_sync is set > >>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too, > >>> which is wrong. > >>> > >>> As a consequence, when using PV network from a domU on a system where > >>> IOMMU is on from Dom0, I get: > >>> > >>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, > >>> iova=0x8424cb148, fsynr=0xb0001, cb=0 > >>> [ 68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK > >>> > >>> The fix is to go back to something along the lines of the old > >>> implementation of gnttab_need_iommu_mapping. > >>> > >>> Signed-off-by: Stefano Stabellini <stefano.stabell...@xilinx.com> > >>> Fixes: 91d4eca7add > >>> Backport: 4.12+ > >>> > >>> --- > >>> > >>> Given the severity of the bug, I would like to request this patch to be > >>> backported to 4.12 too, even if 4.12 is security-fixes only since Oct > >>> 2020. > >>> > >>> For the 4.12 backport, we can use iommu_enabled() instead of > >>> is_iommu_enabled() in the implementation of gnttab_need_iommu_mapping. > >>> > >>> Changes in v2: > >>> - improve commit message > >>> - add is_iommu_enabled(d) to the check > >>> --- > >>> xen/include/asm-arm/grant_table.h | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/xen/include/asm-arm/grant_table.h > >>> b/xen/include/asm-arm/grant_table.h > >>> index 6f585b1538..0ce77f9a1c 100644 > >>> --- a/xen/include/asm-arm/grant_table.h > >>> +++ b/xen/include/asm-arm/grant_table.h > >>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, > >>> mfn_t mfn, > >>> (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i]) > >>> > >>> #define gnttab_need_iommu_mapping(d) \ > >>> - (is_domain_direct_mapped(d) && need_iommu_pt_sync(d)) > >>> + (is_domain_direct_mapped(d) && is_iommu_enabled(d)) > >>> > >>> #endif /* __ASM_GRANT_TABLE_H__ */ > >> > >> I tested the patch and while creating the guest I observed the below > >> warning from Linux for block device. > >> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258 > > > > So you are creating a guest with "xl create" in dom0 and you see the > > warnings below printed by the Dom0 kernel? I imagine the domU has a > > virtual "disk" of some sort. > > Yes you are right I am trying to create the guest with "xl create” and before > that, I created the logical volume and trying to attach the logical volume > block to the domain with “xl block-attach”. I observed this error with the > "xl block-attach” command. > > This issue occurs after applying this patch as what I observed this patch > introduce the calls to iommu_legacy_{, un}map() to map the grant pages for > IOMMU that touches the page-tables. I am not sure but what I observed is that > something is written wrong when iomm_unmap calls unmap the pages > because of that issue is observed. > > You can reproduce the issue by just creating the dummy image file and try to > attach it with “xl block-attach” > > dd if=/dev/zero of=test.img bs=1024 count=0 seek=1024 > xl block-attach 0 phy:test.img xvda w > > Sequence of command that I follow is as follows to reproduce the issue: > > lvs vg-xen/myguest > lvcreate -y -L 4G -n myguest vg-xen > parted -s /dev/vg-xen/myguest mklabel msdos > parted -s /dev/vg-xen/myguest unit MB mkpart primary 1 4096 > sync > xl block-attach 0 phy:/dev/vg-xen/myguest xvda w
Ah! You are block-attaching a device in dom0 to dom0! So frontend and backend are both in the same domain, dom0. Yeah that is not supposed to work, and if it did before it was just by coincidence :-) There are a number of checks in Linux that wouldn't work as intended if the page is coming from itself. This is not an ARM-only issue either. I tried it with dom0, like you did, and I am seeing the same warning. I did try to do block-attach to a domU and it works fine. I don't think this is a concern.