> -----Original Message-----
> From: Hyong Youb Kim (hyonkim) <hyon...@cisco.com>
> Sent: Tuesday, July 16, 2019 1:19 PM
> To: Jerin Jacob Kollanukkaran <jer...@marvell.com>; David Marchand
> <david.march...@redhat.com>; Thomas Monjalon
> <tho...@monjalon.net>; Ferruh Yigit <ferruh.yi...@intel.com>; Alejandro
> Lucero <alejandro.luc...@netronome.com>; Anatoly Burakov
> <anatoly.bura...@intel.com>
> Cc: dev@dpdk.org; John Daley (johndale) <johnd...@cisco.com>; Shahed
> Shaikh <shsha...@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpu...@marvell.com>
> Subject: RE: [RFC PATCH] vfio: avoid re-installing irq handler
>
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> [...]
> > > > > A rough patch for the approach mentioned earlier. It is only for
> > > > > discussion.
> > > > > http://mails.dpdk.org/archives/dev/2019-July/138113.html
> > > > >
> > > > > To try this out, first revert the following then apply.
> > > > > commit 89aac60e0be9 ("vfio: fix interrupts race condition")
> > > >
> > > > Yes. This patch has to be to reverted. It changes the existing
> > > > interrupt behavior and does not address the MSIX case as well.
> > > >
> > > > I think, The clean fix would be to introduce rte_intr_mask() and
> > > > rte_intr_unmask() by abstracting the INTX and MSIX differences And
> > > > let qede driver call it as needed.
> > > >
> > > > Thoughts?
> > >
> > > Hi,
> >
> > Hi Hyong,
> >
> > >
> > > You are proposing these?
> > > - Add rte_intr_mask_intx, rte_intr_unmask_intx.
> > >   No APIs for masking MSI/MSI-X as vfio-pci does not support that.
> > > - Modify PMD irq handlers to use rte_intr_unmask_intx as necessary.
> >
> > No, introduce the rte_intr_mask() and rte_intr_unmask().
> > For MSIX + Linux VFIO, That API can return -ENOSUP as Linux VFIO+MSIX
> > is not supporting.
> > Another platform/eal may support it.
> >
>
> These generic names would invite people to use API, only to see it fail, since
> it only works with INTx..

It works for all non VFIO MSIx cases. In the VFIO MSIx case it is a NOP
(yes, no need to return an error in that case).

>
> > Mask and unmask is operation is known to all IRQ controllers.
> > So, IMO, As far as abstraction is concerned it will be good fit.
> >
> > > That might be too intrusive. And too much work for the sake of INTx..
> > > Anyone really using/needing INTx these days? :-)
> >
> > Yup. Mask needs to called only for only qede INTx. Looks like qede Has
> > MSIX and INTX separate handler. So this mask can go to qede INTx
> >
> > >
> > > The following drivers call rte_intr_enable from their irq handlers.
> > > So with explicit rte_intr_unmask_intx, all these would need to do
> > > "if using intx, unmask"?
> > >
> > > atlantic, avp, axgbe, bnx2x, e1000, fm10k, ice, ixgbe, nfp, qede,
> > > sfc,
> > > vmxnet3
> >
> > No change on these PMDs.
> >
>
> Why is that?
>
> These drivers potentially have the same "lost" interrupt issue mentioned in
> the original redhat bz (qede + MSI). I *think* this observation led David to
> address them all through vfio changes, rather than fixing qede alone.
>
> You want to introduce unmask API and use it only for qede in this cycle, and
> ask respective maintainers to fix their drivers in 19.11?

Changing rte_intr_enable() to rte_intr_unmask() is trivial in the places
where existing drivers use enable as unmask.

If we understand it correctly:

In case of non VFIO MSIX (i.e. INTx) and UIO
-------------------------------------------------
AK1) Kernel receives the interrupt
AK2) Kernel _masks_ the interrupt
AK3) Kernel notifies user space

In userspace:
AU1) Driver specific interrupt handler is invoked
AU2) Handle the driver specific interrupt
AU3) Call rte_intr_enable(), which will in turn call VFIO_IRQ_SET_ACTION_UNMASK
using VFIO_DEVICE_SET_IRQS to unmask the interrupt.

In case of VFIO MSIX
------------------------------------
BK1) Kernel receives the interrupt.
BK2) Kernel notifies user space

In userspace:
BU1) Driver specific interrupt handler is invoked
BU2) Handle the driver specific interrupt
BU3) Call rte_intr_enable(), which will in turn call VFIO_IRQ_SET_ACTION_TRIGGER
using VFIO_DEVICE_SET_IRQS to unmask the interrupt.

VFIO_IRQ_SET_ACTION_TRIGGER is the nasty one: it frees the existing interrupt
handler, requests a new handler, etc., which is the source of the actual race
condition. Ideally BU3 can be just a NOP.

Since we need to keep the same interrupt handler for both UIO and MSIX
(I *think*), DPDK tends to use rte_intr_enable(), which works for AU3/BU3 as
well.

So we need a lightweight primitive which undoes the AK2 mask in the case of
VFIO INTx, without overriding the meaning of the normal rte_intr_enable(),
which is not supposed to be used on an MSIX interrupt that is in action, due
to the racy behavior of VFIO_IRQ_SET_ACTION_TRIGGER.

Replacing AU3 and BU3 with rte_intr_unmask() would fix the problem, where
rte_intr_unmask() for BU3 is just a NOP.
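
To make the A/B flows above concrete, here is a toy model in plain C. It is
only a sketch for discussion, not EAL code: every sketch_* name is made up for
this mail, nothing here touches a real VFIO fd, and the "masked" and
"handler_reinstalls" fields merely stand in for what
VFIO_IRQ_SET_ACTION_UNMASK and VFIO_IRQ_SET_ACTION_TRIGGER would do.

```c
#include <stddef.h>

/* Hypothetical handle kinds, one per case discussed in this mail. */
enum sketch_intr_type {
    SKETCH_INTR_UIO,
    SKETCH_INTR_VFIO_INTX,
    SKETCH_INTR_VFIO_MSIX,
};

/* Hypothetical handle state; not the real struct rte_intr_handle. */
struct sketch_intr_handle {
    enum sketch_intr_type type;
    int masked;             /* 1 once the kernel masked the line (AK2) */
    int handler_reinstalls; /* counts TRIGGER-style handler re-installs */
};

/* Kernel side: AK1-AK3 mask before notifying, BK1-BK2 do not mask. */
void sketch_kernel_deliver(struct sketch_intr_handle *h)
{
    if (h->type != SKETCH_INTR_VFIO_MSIX)
        h->masked = 1;
}

/* Models the current behaviour of calling enable from the irq handler. */
int sketch_intr_enable(struct sketch_intr_handle *h)
{
    if (h == NULL)
        return -1;
    if (h->type == SKETCH_INTR_VFIO_MSIX) {
        /* Stand-in for ACTION_TRIGGER: handler is freed and re-requested. */
        h->handler_reinstalls++;
        return 0;
    }
    h->masked = 0; /* stand-in for ACTION_UNMASK (AU3) */
    return 0;
}

/* Models the proposed lightweight primitive. */
int sketch_intr_unmask(struct sketch_intr_handle *h)
{
    if (h == NULL)
        return -1;
    switch (h->type) {
    case SKETCH_INTR_UIO:
    case SKETCH_INTR_VFIO_INTX:
        h->masked = 0; /* undo AK2, nothing else */
        return 0;
    case SKETCH_INTR_VFIO_MSIX:
        return 0;      /* BU3 becomes a NOP, no error returned */
    }
    return -1;
}
```

In this model a handler that calls sketch_intr_unmask() at AU3/BU3 clears the
mask for INTx and UIO and never bumps handler_reinstalls for MSIX, while
sketch_intr_enable() keeps its existing meaning for the initial setup path.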