RE: [EXTERNAL] Re: Question regarding VIOT proposal
Thank you Jean. Yinghan -Original Message- From: Jean-Philippe Brucker Sent: Friday, December 4, 2020 10:09 AM To: Al Stone Cc: Yinghan Yang ; iommu@lists.linux-foundation.org; Alexander Grest ; eric.au...@redhat.com; j...@8bytes.org; kevin.t...@intel.com; lorenzo.pieral...@arm.com; m...@redhat.com; Boeuf, Sebastien Subject: Re: [EXTERNAL] Re: Question regarding VIOT proposal Hi, On Thu, Dec 03, 2020 at 04:01:27PM -0700, Al Stone wrote: > On 03 Dec 2020 22:21, Yinghan Yang wrote: > > Hi Jean, > > > > I'm sorry for the delayed response. I think the new "PCI range node" > > description makes sense. Could you please make this change in the proposal? > > > > Other than that, the proposal looks good to go. Thanks for the feedback, I made the change > > > > Thanks, > > Yinghan > > Jean, were you going to update your existing doc first? If you do > that, then I can cut and paste the changes into the existing ASWG > proposal. Or do you need to send out an RFC to the mailing list first > and finalize it there? I updated the doc: https://jpbrucker.net/virtio-iommu/viot/viot-v9.pdf You can incorporate it into the ASWG proposal. Changes since v8: * One typo (s/programing/programming/) * Modified the PCI Range node to include a segment range. 
I also updated the Linux and QEMU implementations on branch virtio-iommu/devel in https://jpbrucker.net/git/linux/ and https://jpbrucker.net/git/qemu/ Thanks again for helping with this Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [EXTERNAL] Re: Question regarding VIOT proposal
On 04 Dec 2020 19:09, Jean-Philippe Brucker wrote: > Hi, > > On Thu, Dec 03, 2020 at 04:01:27PM -0700, Al Stone wrote: > > On 03 Dec 2020 22:21, Yinghan Yang wrote: > > > Hi Jean, > > > > > > I'm sorry for the delayed response. I think the new "PCI range node" > > > description makes sense. Could you please make this change in the > > > proposal? > > > > > > Other than that, the proposal looks good to go. > > Thanks for the feedback, I made the change > > > > > > > Thanks, > > > Yinghan > > > > Jean, were you going to update your existing doc first? If you > > do that, then I can cut and paste the changes into the existing > > ASWG proposal. Or do you need to send out an RFC to the mailing > > list first and finalize it there? > > I updated the doc: https://jpbrucker.net/virtio-iommu/viot/viot-v9.pdf > You can incorporate it into the ASWG proposal. > Changes since v8: > * One typo (s/programing/programming/) > * Modified the PCI Range node to include a segment range. > > I also updated the Linux and QEMU implementations on branch > virtio-iommu/devel in https://jpbrucker.net/git/linux/ and > https://jpbrucker.net/git/qemu/ > > Thanks again for helping with this > > Jean Perfect. Thanks. I'll update the ASWG info right away. -- ciao, al --- Al Stone Software Engineer Red Hat, Inc. a...@redhat.com ---
Re: [EXTERNAL] Re: Question regarding VIOT proposal
Hi, On Thu, Dec 03, 2020 at 04:01:27PM -0700, Al Stone wrote: > On 03 Dec 2020 22:21, Yinghan Yang wrote: > > Hi Jean, > > > > I'm sorry for the delayed response. I think the new "PCI range node" > > description makes sense. Could you please make this change in the proposal? > > > > Other than that, the proposal looks good to go. Thanks for the feedback, I made the change > > > > Thanks, > > Yinghan > > Jean, were you going to update your existing doc first? If you > do that, then I can cut and paste the changes into the existing > ASWG proposal. Or do you need to send out an RFC to the mailing > list first and finalize it there? I updated the doc: https://jpbrucker.net/virtio-iommu/viot/viot-v9.pdf You can incorporate it into the ASWG proposal. Changes since v8: * One typo (s/programing/programming/) * Modified the PCI Range node to include a segment range. I also updated the Linux and QEMU implementations on branch virtio-iommu/devel in https://jpbrucker.net/git/linux/ and https://jpbrucker.net/git/qemu/ Thanks again for helping with this Jean
Re: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8030000 flags=0x0008]
On Thu, Dec 3, 2020 at 1:18 AM Marc Smith wrote: > > Hi, > > First, I must preface this email by apologizing in advance for asking > about a distro kernel (RHEL in this case); so not truly reporting this > problem and requesting a fix here (I know this should be taken up with > the vendor), rather hoping someone can give me a few hints/pointers on > where to look next for debugging this issue. > > I'm using RHEL 7.8.2003 (CentOS) with a 3.10.0-1127.18.2.el7 kernel. > The systems use a Supermicro H12SSW-NT board (AMD), and we have the > IOMMU enabled along with SR-IOV. I have several virtual machines (QEMU > KVM) that run on these servers, and I'm passing PCIe end-points into > the VMs (in some cases the whole PCIe EP itself, and for some devices > I use SR-IOV and pass in the VFs to the VMs). The VM's run Linux as > their guest OS (a couple different distros). > > While the servers (VMs) are idle, I don't experience any problems. But > when I start doing a lot of I/O in the virtual machines (iSCSI across > Ethernet interfaces, disk I/O via SAS HBAs that are passed into the > VM, etc.) 
I notice the following after some time at the host layer > ("hypervisor"): > Nov 29 10:50:00 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=42:00.0 domain=0x005e address=0xfffdf803 flags=0x0008] > Nov 29 22:02:03 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=c8:02.1 domain=0x005f address=0xfffdf806 flags=0x0008] > Nov 30 02:13:54 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=42:00.0 domain=0x005e address=0xfffdf802 flags=0x0008] > Nov 30 02:28:44 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=c8:02.0 domain=0x005e address=0xfffdf802 flags=0x0008] > Nov 30 10:48:53 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=01:00.0 domain=0x005e address=0xfffdf804 flags=0x0008] > Dec 2 07:05:22 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT > device=c8:03.0 domain=0x005e address=0xfffdf801 flags=0x0008] > > These events happen to all PCIe devices that are passed into the VMs, > although not all at once... as you can see on the timestamps above, > they are not very frequent when under heavy load (in the log snippet > above, the system was doing a big workload over several days). For the > Ethernet devices that are passed into the VMs, I noticed that they > experience transmit hangs / resets in the virtual machines, and when > these occur, they correspond to a matching IO_PAGE_FAULT that belongs > to that PCI device. > > FWIW, those NIC hangs look like this (visible in the VM guest OS): > [17879.279091] NETDEV WATCHDOG: s1p1 (bnxt_en): transmit queue 2 timed out > [17879.279111] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:447 > dev_watchdog+0x121/0x17e > ... > [17879.279213] bnxt_en :01:09.0 s1p1: TX timeout detected, > starting reset task! > [17883.075299] bnxt_en :01:09.0 s1p1: Resp cmpl intr err msg: 0x51 > [17883.075302] bnxt_en :01:09.0 s1p1: hwrm_ring_free type 1 > failed. 
rc:fff0 err:0 > [17886.957100] bnxt_en :01:09.0 s1p1: Resp cmpl intr err msg: 0x51 > [17886.957103] bnxt_en :01:09.0 s1p1: hwrm_ring_free type 2 > failed. rc:fff0 err:0 > [17890.843023] bnxt_en :01:09.0 s1p1: Resp cmpl intr err msg: 0x51 > [17890.843025] bnxt_en :01:09.0 s1p1: hwrm_ring_free type 2 > failed. rc:fff0 err:0 > > We see these NIC hangs in the VMs occur with both Broadcom and > Mellanox Ethernet adapters that are passed into the VMs, so I don't > think it's the NICs causing the IO_PAGE_FAULT events observed in the > hypervisor. Plus we see IO_PAGE_FAULT's for devices other than > Ethernet adapters. > > > I have several of these same servers (all using the same motherboard, > processor, memory, BIOS, etc.) and they all experience this behavior > with the IO_PAGE_FAULT events, so I don't believe it to be any one > faulty server / component. I guess my question is I'm not sure where > to dig/push next. Is this perhaps an issue with the BIOS/firmware on > these motherboards? Something with the chipset (AMD IOMMU)? A > colleague has suggested that even the AGESA may be related. Or should > I be focusing on the Linux kernel, the AMD IOMMU driver (software)? > > I've been poking around other similar bug reports, and I see the > IO_PAGE_FAULT and NIC reset / transmit hang seem to be related in > other posts. This commit looked promising: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e50ce03976fbc8ae995a000c4b10c737467beaa > > But I see RH has already back-ported it into their > 3.10.0-1127.18.2.el7 kernel source. I'm open to trying a newer Linux > vanilla kernel (eg, 5.4.x) but would prefer to resolve this in the > RHEL kernel I'm using now. I'll take a look at this next, although due > to the complex nature of this hypervisor/VM setup, it's a bit tedious > to test. > > > Kernel messages from boot (using the amd_iommu_dump=1 parameter): > ... > [0.214395] AMD-Vi: Usi
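To make the pattern in the host log easier to see, the IO_PAGE_FAULT entries quoted in the report can be tallied per PCI device. The following is only a minimal sketch, not something from the original report: the names `FAULT_RE`, `parse_faults`, `faults_per_device` and `SAMPLE` are hypothetical, the regular expression assumes the event format exactly as it appears in the quoted lines, and the sample reuses the six entries as they are shown in this archive (with the `> ` quote markers and mail line wrapping removed).

```python
import re
from collections import Counter

# Assumed format, taken from the log lines quoted in the report:
# "AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=... flags=0x0008]"
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT\s+device=(?P<device>[0-9a-fA-F:.]+)\s+"
    r"domain=(?P<domain>0x[0-9a-fA-F]+)\s+"
    r"address=(?P<address>0x[0-9a-fA-F]+)\s+"
    r"flags=(?P<flags>0x[0-9a-fA-F]+)"
)


def parse_faults(text):
    """Return one dict (device, domain, address, flags) per IO_PAGE_FAULT event."""
    return [match.groupdict() for match in FAULT_RE.finditer(text)]


def faults_per_device(text):
    """Count IO_PAGE_FAULT events per PCI device (bus:device.function)."""
    return Counter(fault["device"] for fault in parse_faults(text))


# The six entries from the report, copied as they appear in the archive.
SAMPLE = """\
Nov 29 10:50:00 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffdf803 flags=0x0008]
Nov 29 22:02:03 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:02.1 domain=0x005f address=0xfffdf806 flags=0x0008]
Nov 30 02:13:54 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffdf802 flags=0x0008]
Nov 30 02:28:44 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:02.0 domain=0x005e address=0xfffdf802 flags=0x0008]
Nov 30 10:48:53 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x005e address=0xfffdf804 flags=0x0008]
Dec  2 07:05:22 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:03.0 domain=0x005e address=0xfffdf801 flags=0x0008]
"""

if __name__ == "__main__":
    for device, count in faults_per_device(SAMPLE).most_common():
        print(f"{device}: {count} fault(s)")
```

Feeding the real host log (for example, saved `dmesg` or syslog output) to `faults_per_device` instead of `SAMPLE` gives a per-device count that can be lined up against the timestamps of the NIC resets seen in the guests.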
Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
Hi Shameer, Jean-Philippe, On 12/4/20 11:20 AM, Shameerali Kolothum Thodi wrote: > Hi Jean, > >> -Original Message- >> From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org] >> Sent: 04 December 2020 09:54 >> To: Shameerali Kolothum Thodi >> Cc: Auger Eric ; wangxingang >> ; Xieyingtai ; >> k...@vger.kernel.org; m...@kernel.org; j...@8bytes.org; w...@kernel.org; >> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org; >> vivek.gau...@arm.com; alex.william...@redhat.com; >> zhangfei@linaro.org; robin.mur...@arm.com; >> kvm...@lists.cs.columbia.edu; eric.auger@gmail.com; Zengtao (B) >> ; qubingbing >> Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with >> unmanaged ASIDs >> >> Hi Shameer, >> >> On Thu, Dec 03, 2020 at 06:42:57PM +, Shameerali Kolothum Thodi wrote: >>> Hi Jean/zhangfei, >>> Is it possible to have a branch with minimum required SVA/UACCE related >> patches >>> that are already public and can be a "stable" candidate for future respin of >> Eric's series? >>> Please share your thoughts. >> >> By "stable" you mean a fixed branch with the latest SVA/UACCE patches >> based on mainline? > > Yes. > > The uacce-devel branches from >> https://github.com/Linaro/linux-kernel-uadk do provide this at the moment >> (they track the latest sva/zip-devel branch >> https://jpbrucker.net/git/linux/ which is roughly based on mainline.) > > Thanks. > > Hi Eric, > > Could you please take a look at the above branches and see whether it make > sense > to rebase on top of either of those? > > From vSVA point of view, it will be less rebase hassle if we can do that. Sure. I will rebase on top of this ;-) Thanks Eric > > Thanks, > Shameer > >> Thanks, >> Jean >
RE: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
Hi Jean, > -Original Message- > From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org] > Sent: 04 December 2020 09:54 > To: Shameerali Kolothum Thodi > Cc: Auger Eric ; wangxingang > ; Xieyingtai ; > k...@vger.kernel.org; m...@kernel.org; j...@8bytes.org; w...@kernel.org; > iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org; > vivek.gau...@arm.com; alex.william...@redhat.com; > zhangfei@linaro.org; robin.mur...@arm.com; > kvm...@lists.cs.columbia.edu; eric.auger@gmail.com; Zengtao (B) > ; qubingbing > Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with > unmanaged ASIDs > > Hi Shameer, > > On Thu, Dec 03, 2020 at 06:42:57PM +, Shameerali Kolothum Thodi wrote: > > Hi Jean/zhangfei, > > Is it possible to have a branch with minimum required SVA/UACCE related > patches > > that are already public and can be a "stable" candidate for future respin of > Eric's series? > > Please share your thoughts. > > By "stable" you mean a fixed branch with the latest SVA/UACCE patches > based on mainline? Yes. The uacce-devel branches from > https://github.com/Linaro/linux-kernel-uadk do provide this at the moment > (they track the latest sva/zip-devel branch > https://jpbrucker.net/git/linux/ which is roughly based on mainline.) Thanks. Hi Eric, Could you please take a look at the above branches and see whether it make sense to rebase on top of either of those? From vSVA point of view, it will be less rebase hassle if we can do that. Thanks, Shameer > Thanks, > Jean
Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
Hi Shameer, On Thu, Dec 03, 2020 at 06:42:57PM +, Shameerali Kolothum Thodi wrote: > Hi Jean/zhangfei, > Is it possible to have a branch with minimum required SVA/UACCE related > patches > that are already public and can be a "stable" candidate for future respin of > Eric's series? > Please share your thoughts. By "stable" you mean a fixed branch with the latest SVA/UACCE patches based on mainline? The uacce-devel branches from https://github.com/Linaro/linux-kernel-uadk do provide this at the moment (they track the latest sva/zip-devel branch https://jpbrucker.net/git/linux/ which is roughly based on mainline.) Thanks, Jean