> From: Bjorn Helgaas <helg...@kernel.org>
> Sent: Tuesday, October 8, 2019 12:56 PM
> ...
> Wordsmithing nit: what the patch does is not "fix the error message";
> what it does is fix the *problem*, i.e., the fact that we can't
> operate the device because we can't enable MSI-X.  The message is only
> a symptom.

I totally agree. :-)

> IIUC the relevant part of the system hibernation sequence is:
> 
>   pci_pm_freeze_noirq
>   pci_pm_thaw_noirq
>   pci_pm_thaw
> 
> And the execution flow is:
> 
>   pci_pm_freeze_noirq
>     if (pci_has_legacy_pm_support(pci_dev)) # true for mlx4
>       pci_legacy_suspend_late(dev, PMSG_FREEZE)
>       pci_pm_set_unknown_state
>         dev->current_state = PCI_UNKNOWN  # <---
>   pci_pm_thaw_noirq
>     if (pci_has_legacy_pm_support(pci_dev)) # true
>       pci_legacy_resume_early(dev)          # noop; mlx4 doesn't implement
>   pci_pm_thaw                               # returns -95 EOPNOTSUPP
>     if (pci_has_legacy_pm_support(pci_dev)) # true
>       pci_legacy_resume
>       drv->resume
>         mlx4_resume                       # mlx4_driver.resume (legacy)
>           mlx4_load_one
>             mlx4_enable_msi_x
>               pci_enable_msix_range
>                 __pci_enable_msix_range
>                   __pci_enable_msix
>                     if (!pci_msi_supported())
>                       if (dev->current_state != PCI_D0)  # <---
>                         return 0
>                       return -EINVAL
>               err = -EOPNOTSUPP
>               "INTx is not supported ..."
> 
> (These are just my notes; you don't need to put them all into the
> commit message.  I'm just sharing them in case I'm not understanding
> correctly.)

Yes, these notes are accurate.
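
To make the state transition concrete, here is a minimal user-space sketch
of the flow above. It is not kernel code: the struct and every model_*
name are hypothetical stand-ins that only track current_state, and the
early return in model_thaw_noirq_old() mirrors the pre-patch ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PCI power states; values are arbitrary. */
enum model_state { MODEL_D0, MODEL_UNKNOWN };

struct model_dev {
	enum model_state current_state;
	bool legacy_pm;		/* models pci_has_legacy_pm_support() */
};

/* Models pci_pm_freeze_noirq(): the legacy path leaves the state unknown. */
static void model_freeze_noirq(struct model_dev *dev)
{
	if (dev->legacy_pm)
		dev->current_state = MODEL_UNKNOWN;
}

/* Models the pre-patch pci_pm_thaw_noirq(): legacy drivers return early. */
static int model_thaw_noirq_old(struct model_dev *dev)
{
	if (dev->legacy_pm)
		return 0;	/* resume_early is a no-op for mlx4 */

	dev->current_state = MODEL_D0;
	return 0;
}

/* Models the D0 check reached from pci_enable_msix_range(). */
static int model_enable_msix(const struct model_dev *dev)
{
	if (dev->current_state != MODEL_D0)
		return -EINVAL;
	return 0;
}

/* Replays freeze_noirq -> thaw_noirq -> MSI-X enable for a legacy driver. */
int model_replay_old(void)
{
	struct model_dev dev = { MODEL_D0, true };

	model_freeze_noirq(&dev);
	model_thaw_noirq_old(&dev);

	/* Nothing moved the device back to D0 on the legacy path. */
	assert(dev.current_state == MODEL_UNKNOWN);

	return model_enable_msix(&dev);
}
```

In this model, model_replay_old() returns -EINVAL, which corresponds to
the point in the trace where mlx4_enable_msi_x() gives up on MSI-X.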

> > > > > > When the system starts again, a fresh kernel starts to run, and when
> > > > > > the kernel detects that a hibernation image was saved, the kernel
> > > > > > "quiesces" the devices, and then "restores" the devices from the
> > > > > > saved image. In this path:
> > > > > > device_resume_noirq() -> ... ->
> > > > > >    pci_pm_restore_noirq() ->
> > > > > >      pci_pm_default_resume_early() ->
> > > > > >        pci_power_up() moves the device states back to PCI_D0. This
> > > > > > path is not broken and doesn't need my patch.
> > > > >
> 
> The cc list suggests that this might be a fix for a user-reported
> problem.  Is there a launchpad or similar link you could include here?

I guess I'm the first one to notice the issue, and AFAIK there is no bug
link.

The hibernation process usually saves the state to a local disk (before the
system is powered off), and the Mellanox NIC is not needed during that process,
so it's not a real issue that the NIC cannot work between pci_pm_thaw() and
power_down(). This may explain why nobody else noticed the issue. I happened
to see the error message, and hence investigated it.

> Should this be marked for stable?

I think we should do it.
 
> > > > > --- a/drivers/pci/pci-driver.c
> > > > > +++ b/drivers/pci/pci-driver.c
> > > > > @@ -1074,15 +1074,16 @@ static int pci_pm_thaw_noirq(struct device
> > > > *dev)
> > > > >                       return error;
> > > > >       }
> > > > >
> > > > > -     if (pci_has_legacy_pm_support(pci_dev))
> > > > > -             return pci_legacy_resume_early(dev);
> > > > > -
> > > > >       /*
> > > > >        * pci_restore_state() requires the device to be in D0 (because
> of MSI
> > > > >        * restoration among other things), so force it into D0 in case
> the
> > > > >        * driver's "freeze" callbacks put it into a low-power state
> directly.
> > > > >        */
> > > > >       pci_set_power_state(pci_dev, PCI_D0);
> > > > > +
> > > > > +     if (pci_has_legacy_pm_support(pci_dev))
> > > > > +             return pci_legacy_resume_early(dev);
> > > > > +
> > > > >       pci_restore_state(pci_dev);
> > > > >
> > > > >       if (drv && drv->pm && drv->pm->thaw_noirq)
> > > > > --
> > > > > 2.19.1
> > > > >
> > The patch looks reasonable to me, but the comment above the
> > pci_set_power_state() call needs to be updated too IMO.
> 
> Hmm.
> 
> 1) pci_restore_state() mainly writes config space, which doesn't
> require the device to be in D0.  The only thing I see that would
> require D0 is the MSI-X MMIO space, so to be more specific, the
> comment could say "restoring the MSI-X *MMIO* state requires the
> device to be in D0".
> 
> But I think you meant some other comment change.  Did you mean
> something along the lines of "a legacy drv->resume_early() callback
> and pci_restore_state() both require the device to be in D0"?
> 
> If something else, maybe you could propose some text?
> 
> 2) I assume pci_pm_thaw_noirq() should leave the device in a
> functionally equivalent state, whether it uses legacy PM or not.  Do
> we want something like the patch below instead?  If we *do* want to
> skip pci_restore_state() for legacy PM, maybe we should add a comment.
> 
> 3) Documentation/power/pci.rst says:
> 
>   ... devices have to be brought back to the fully functional
>   state ...
> 
>   pci_pm_thaw_noirq() ... doesn't put the device into the full power
>   state and doesn't attempt to restore its standard configuration
>   registers.
> 
> That doesn't seem consistent, and it looks like pci_pm_thaw_noirq()
> actually *does* put the device in full power (D0) state and restore
> config registers.

I would leave these questions to Rafael.
 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index a8124e47bf6e..30c721fd6bcf 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1068,7 +1068,7 @@ static int pci_pm_thaw_noirq(struct device *dev)
>  {
>       struct pci_dev *pci_dev = to_pci_dev(dev);
>       struct device_driver *drv = dev->driver;
> -     int error = 0;
> +     int error;
> 
>       if (pcibios_pm_ops.thaw_noirq) {
>               error = pcibios_pm_ops.thaw_noirq(dev);
> @@ -1076,9 +1076,6 @@ static int pci_pm_thaw_noirq(struct device *dev)
>                       return error;
>       }
> 
> -     if (pci_has_legacy_pm_support(pci_dev))
> -             return pci_legacy_resume_early(dev);
> -
>       /*
>        * pci_restore_state() requires the device to be in D0 (because of MSI
>        * restoration among other things), so force it into D0 in case the
> @@ -1087,10 +1084,13 @@ static int pci_pm_thaw_noirq(struct device *dev)
>       pci_set_power_state(pci_dev, PCI_D0);
>       pci_restore_state(pci_dev);
> 
> +     if (pci_has_legacy_pm_support(pci_dev))
> +             return pci_legacy_resume_early(dev);
> +
>       if (drv && drv->pm && drv->pm->thaw_noirq)
> -             error = drv->pm->thaw_noirq(dev);
> +             return drv->pm->thaw_noirq(dev);
> 
> -     return error;
> +     return 0;
>  }
> 
>  static int pci_pm_thaw(struct device *dev)

The only real difference from my patch is that you moved

 +      if (pci_has_legacy_pm_support(pci_dev))
 +              return pci_legacy_resume_early(dev);

to after the line "pci_restore_state(pci_dev);"

This change looks good to me, and should also resolve the error message I saw.
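
As a sanity check of the ordering, here is a minimal user-space sketch of
the reordered pci_pm_thaw_noirq(). It is only a model under stated
assumptions: the sketch_* names and the struct are hypothetical, the
device is assumed to enter thaw with an unknown power state, and the
driver callbacks are reduced to no-ops.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PCI power states; values are arbitrary. */
enum sketch_state { SKETCH_D0, SKETCH_UNKNOWN };

struct sketch_dev {
	enum sketch_state current_state;
	bool legacy_pm;		/* models pci_has_legacy_pm_support() */
	bool state_restored;	/* set once pci_restore_state() is modeled */
};

/* Models the reordered thaw_noirq: power up and restore before any branch. */
static int sketch_thaw_noirq(struct sketch_dev *dev)
{
	dev->current_state = SKETCH_D0;	/* pci_set_power_state(PCI_D0) */
	dev->state_restored = true;	/* pci_restore_state() */

	if (dev->legacy_pm)
		return 0;	/* pci_legacy_resume_early(); no-op here */

	return 0;		/* drv->pm->thaw_noirq() would run here */
}

/* Models the D0 check that enabling MSI-X depends on. */
static int sketch_enable_msix(const struct sketch_dev *dev)
{
	return dev->current_state == SKETCH_D0 ? 0 : -EINVAL;
}

/* Replays thaw_noirq followed by MSI-X enable for either kind of driver. */
int sketch_replay(bool legacy_pm)
{
	struct sketch_dev dev = { SKETCH_UNKNOWN, legacy_pm, false };
	int err = sketch_thaw_noirq(&dev);

	/* Both paths now see restored config state. */
	assert(dev.state_restored);

	return err ? err : sketch_enable_msix(&dev);
}
```

In this model, sketch_replay(true) and sketch_replay(false) both return 0,
i.e. the legacy and non-legacy paths leave the device in the same state
before the driver's resume code tries to enable MSI-X.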

Thanks,
-- Dexuan
