On Tuesday, August 6, 2019 3:36:38 PM CEST Bjorn Helgaas wrote:
> On Mon, Aug 05, 2019 at 11:02:51PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Aug 5, 2019 at 10:52 PM Bjorn Helgaas <helg...@kernel.org> wrote:
> > >
> > > pci_check_pme_status() reads the Power Management capability to determine
> > > whether a device has generated a PME.  The capability is in config space,
> > > which is accessible in D0, D1, D2, and D3hot, but not in D3cold.
> > >
> > > If we call pci_check_pme_status() on a device that's in D3cold, config
> > > reads fail and return ~0 data, which we erroneously interpreted as "the
> > > device has generated a PME".
> > >
> > > 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > > avoided many of these problems by avoiding pci_check_pme_status() if we
> > > think the device is in D3cold.  However, it is not a complete fix because
> > > the device may go to D3cold after we check its power state but before
> > > pci_check_pme_status() reads the Power Management Status Register.
> > >
> > > Return false ("device has not generated a PME") if we get an error 
> > > response
> > > reading the Power Management Status Register.
> > >
> > > Fixes: 000dd5316e1c ("PCI: Do not poll for PME if the device is in 
> > > D3cold")
> > > Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port")
> > > Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
> > > ---
> > >  drivers/pci/pci.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 984171d40858..af6a97d7012b 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -2008,6 +2008,9 @@ bool pci_check_pme_status(struct pci_dev *dev)
> > >
> > >         pmcsr_pos = dev->pm_cap + PCI_PM_CTRL;
> > >         pci_read_config_word(dev, pmcsr_pos, &pmcsr);
> > > +       if (pmcsr == (u16) PCI_ERROR_RESPONSE)
> > > +               return false;
> > 
> > No, sorry.
> > 
> > We tried that and it didn't work.
> > 
> > pcie_pme_handle_request() depends on this returning "true" for all
> > bits set, as from its perspective "device is not accessible" may very
> > well mean "device may have signaled PME".
> 
> Right, it's obviously wrong in the case of devices that advertise
> D3cold in PME_Support, i.e., devices that can generate PME even with
> main power off.  Also, we may want to try to wake up devices if the
> config read fails for a reason other than the device being in D3cold.
> 
> What I don't like about the current code is that it checks
> PCI_PM_CTRL_PME_STATUS in data that may be completely bogus.

Whether or not the other bits in the register make sense doesn't
matter here.  Only the PME_STATUS bit matters.

> Do you think it would be better to do something like this:
> 
>   pci_read_config_word(dev, pmcsr_pos, &pmcsr);
>   if (pmcsr == (u16) PCI_ERROR_RESPONSE) {
>     if (pci_pme_capable(dev, PCI_PM_CAP_PME_D3cold))
>       return true;
>     return false;
>   }
> 
> or maybe this:
> 
>   pci_read_config_word(dev, pmcsr_pos, &pmcsr);
>   if (pmcsr == (u16) PCI_ERROR_RESPONSE)
>     return true;

In this case it still would be prudent to check PME_ENABLE before
returning true and so there is no practical difference between
ERROR_RESPONSE and the valid data with PME_STATUS set.

Except that in the ERROR_RESPONSE case we may as well avoid the
PMCSR write which is not going to make a difference.

> We should get PCI_ERROR_RESPONSE pretty reliably from devices in
> D3cold, so the first possibility would cover that case.
>
> But since pci_check_pme_status() basically returns a hint ("true"
> means a device *may* have generated a PME), and even if the hint is
> wrong, the worst that happens is an unnecessary wakeup, maybe the
> second possibility is safer.
> 
> What do you think?

So if you really want to avoid the PMCSR write in the ERROR_RESPONSE case,
something like this can be done IMO:

                        return false;
 
        /* Clear PME status. */
-       pmcsr |= PCI_PM_CTRL_PME_STATUS;
        if (pmcsr & PCI_PM_CTRL_PME_ENABLE) {
+               if (pmcsr == (u16) PCI_ERROR_RESPONSE)
+                       return true;
+
                /* Disable PME to avoid interrupt flood. */
                pmcsr &= ~PCI_PM_CTRL_PME_ENABLE;
                ret = true;




Reply via email to