On 22/11/2017 at 13:17, Vaibhav Jain wrote:
During an eeh a kernel-oops is reported if no vPHB to allocated to the

Typo: "to allocated" (presumably "is allocated"). Not important, but...

AFU. This happens as during AFU init, an error in creation of vPHB is
a non-fatal error. Hence afu->phb should always be checked for NULL
before iterating over it for the virtual AFU pci devices.

This patch fixes the kenel-oops by adding a NULL pointer check for
afu->phb before it is dereferenced.

Fixes: 9e8df8a2196("cxl: EEH support")
Cc: sta...@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
---
Changelog:

Resend -> Added the 'Fixes' info and marking the patch to stable tree [Mpe]
v2     -> Added the vphb NULL check to cxl_vphb_error_detected() [Andrew]
---
  drivers/misc/cxl/pci.c | 12 ++++++++++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index bb7fd3f4edab..18773343ab3e 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -2083,6 +2083,9 @@ static pci_ers_result_t cxl_vphb_error_detected(struct cxl_afu *afu,
        /* There should only be one entry, but go through the list
         * anyway
         */
+       if (afu->phb == NULL)
+               return result;
+
        list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
                if (!afu_dev->driver)
                        continue;
@@ -2124,8 +2127,7 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
                         * Tell the AFU drivers; but we don't care what they
                         * say, we're going away.
                         */
-                       if (afu->phb != NULL)
-                               cxl_vphb_error_detected(afu, state);
+                       cxl_vphb_error_detected(afu, state);
                }
                return PCI_ERS_RESULT_DISCONNECT;
        }
@@ -2265,6 +2267,9 @@ static pci_ers_result_t cxl_pci_slot_reset(struct pci_dev *pdev)
                if (cxl_afu_select_best_mode(afu))
                        goto err;

+               if (afu->phb == NULL)
+                       continue;
+
                list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
                        /* Reset the device context.
                         * TODO: make this less disruptive
@@ -2327,6 +2332,9 @@ static void cxl_pci_resume(struct pci_dev *pdev)
        for (i = 0; i < adapter->slices; i++) {
                afu = adapter->afu[i];

+               if (afu->phb != NULL)
+                       continue;
+

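... that one is more annoying: the test looks inverted.
Shouldn't it be afu->phb == NULL? As written, AFUs that do have a vPHB
are skipped, while those without one still reach the
list_for_each_entry() that follows and dereference a NULL afu->phb.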
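
To illustrate the check I would expect here, a minimal userspace sketch
follows. The sketch_* names are hypothetical stand-ins, not the real cxl
structures, and the device count replaces the actual list walk; it only
shows the guard direction (skip the slice when phb is NULL).

```c
#include <stddef.h>

/* Hypothetical stand-in for the vPHB: only a device count, no real bus. */
struct sketch_phb {
	int ndevices;		/* virtual AFU PCI devices behind this vPHB */
};

/* Hypothetical stand-in for struct cxl_afu. */
struct sketch_afu {
	struct sketch_phb *phb;	/* NULL when vPHB creation failed at init */
};

/*
 * Walk every slice and "resume" the devices behind its vPHB.
 * Returns the number of devices visited.
 */
int sketch_resume(struct sketch_afu **afus, int slices)
{
	int i, visited = 0;

	for (i = 0; i < slices; i++) {
		struct sketch_afu *afu = afus[i];

		/* Skip slices that have no vPHB: note ==, not != */
		if (afu->phb == NULL)
			continue;

		/* Only reached with a valid phb, so the dereference is safe. */
		visited += afu->phb->ndevices;
	}

	return visited;
}
```

With the test written as != instead, the slice that has a vPHB would be
skipped and the one without would hit the dereference.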

  Fred

                list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) {
                        if (afu_dev->driver && afu_dev->driver->err_handler &&
                            afu_dev->driver->err_handler->resume)

