On Fri, 2015-21-08 at 07:25:15 UTC, Daniel Axtens wrote:
> cxl_reset currently PERSTs the slot, and then repeatedly tries to
> read MMIO space in order to kick off EEH.
>
> There are 2 problems with this: it's unnecessary, and it's racy.
>
> It's unnecessary because the PERST will bring down the PHB link.
> That will be picked up by the CAPP, which will send out an HMI.
> Skiboot, noticing an HMI from the CAPP, will send an OPAL
> notification to the kernel, which will trigger EEH recovery.
>
> It's also racy: the EEH recovery triggered by the CAPP will
> eventually cause the MMIO space to have its mapping invalidated
> and the pointer NULLed out. This races with our attempt to read
> the MMIO space. This is causing OOPSes in testing.
>
> Simply drop all the attempts to force EEH detection, and trust
> that Skiboot will send the notification and that we'll act on it.
> The Skiboot code to send the EEH notification has been in Skiboot
> for as long as CAPP recovery has been supported, so we don't need
> to worry about breaking obscure setups with ancient firmware.
>
> Cc: Ryan Grimm <[email protected]>
> Cc: [email protected]
> Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card")
> Signed-off-by: Daniel Axtens <[email protected]>
> Acked-by: Ian Munsie <[email protected]>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9d8e27673c45927fee9e7d89

cheers
