On Fri, Oct 30, 2015 at 08:40:52AM -0700, Alexander Duyck wrote:
>On 10/29/2015 08:48 PM, Wei Yang wrote:
>>On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote:
>>>ines: 115
>>>
>>>From: Alexander Duyck <[email protected]>
>>>
>>>The enumeration path should leave NumVFs set to zero.  But after
>>>4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
>>>we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
>>>This NumVFs change is visible via lspci and sysfs until a driver enables
>>>SR-IOV.
>>>
>>>Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
>>>computing the maximum number of buses.  Validate offset and stride in
>>>the loop, so we can test it at every possible NumVFs setting.  Rename
>>>virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
>>>side effect of updating iov->max_VF_buses.
>>>
>>>[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
>>>reverse sense of error path]
>>>Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for 
>>>VFs")
>>>Based-on-patch-by: Ethan Zhao <[email protected]>
>>>Signed-off-by: Alexander Duyck <[email protected]>
>>>Signed-off-by: Bjorn Helgaas <[email protected]>
>>>---
>>>drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
>>>1 file changed, 22 insertions(+), 19 deletions(-)
>>>
>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>index 0cdb2d1..1b1acc2 100644
>>>--- a/drivers/pci/iov.c
>>>+++ b/drivers/pci/iov.c
>>>@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev 
>>>*dev, int nr_virtfn)
>>>  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
>>>  * determine how many additional bus numbers will be consumed by VFs.
>>>  *
>>>- * Iterate over all valid NumVFs and calculate the maximum number of bus
>>>- * numbers that could ever be required.
>>>+ * Iterate over all valid NumVFs, validate offset and stride, and calculate
>>>+ * the maximum number of bus numbers that could ever be required.
>>>  */
>>>-static inline u8 virtfn_max_buses(struct pci_dev *dev)
>>>+static int compute_max_vf_buses(struct pci_dev *dev)
>>>{
>>>     struct pci_sriov *iov = dev->sriov;
>>>-    int nr_virtfn;
>>>-    u8 max = 0;
>>>-    int busnr;
>>>+    int nr_virtfn, busnr, rc = 0;
>>>
>>>-    for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
>>>+    for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
>>Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the
>>original logic, before my patch.
>>
>>
>>>             pci_iov_set_numvfs(dev, nr_virtfn);
>>>+            if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
>>                                      ^^^
>>
>>Should be this?
>>                 if (!iov->offset || (iov->total_VFs > 1 && !iov->stride))
>
>The problem is the spec says stride and offset can change depending on the
>value of NumVFs.  We end up testing all values from TotalVFs to 1.  The spec
>states that stride is unused if NumVFs is 1, not TotalVFs.
>

Yes, I checked the SPEC again, and found this change is correct.

>>
>>>+                    rc = -EIO;
>>>+                    goto out;
>>>+            }
>>>+
>>>             busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>-            if (busnr > max)
>>>-                    max = busnr;
>>>+            if (busnr > iov->max_VF_buses)
>>>+                    iov->max_VF_buses = busnr;
>>>     }
>>>
>>>-    return max;
>>>+out:
>>>+    pci_iov_set_numvfs(dev, 0);
>>>+    return rc;
>>>}
>>>
>>>static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
>>>@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>>     int rc;
>>>     int nres;
>>>     u32 pgsz;
>>>-    u16 ctrl, total, offset, stride;
>>>+    u16 ctrl, total;
>>>     struct pci_sriov *iov;
>>>     struct resource *res;
>>>     struct pci_dev *pdev;
>>>@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>>
>>>found:
>>>     pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
>>>-    pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
>>>-    pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
>>>-    pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
>>>-    if (!offset || (total > 1 && !stride))
>>>-            return -EIO;
>>>
>>Original code will return when it found this condition, so that we don't need
>>to bother to do those initialization and then fall back.
>>
>>So I suggest to keep it here.
>
>The problem is this code isn't valid as per the spec.  The offset and stride
>are considered unused when numvfs is 0.  That is why this has to be dropped.
>
>>>     pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
>>>     if (!total)
>>>@@ -456,8 +456,6 @@ found:
>>>     iov->nres = nres;
>>>     iov->ctrl = ctrl;
>>>     iov->total_VFs = total;
>>>-    iov->offset = offset;
>>>-    iov->stride = stride;
>>>     iov->pgsz = pgsz;
>>>     iov->self = dev;
>>>     pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>>>@@ -474,10 +472,15 @@ found:
>>>
>>>     dev->sriov = iov;
>>>     dev->is_physfn = 1;
>>>-    iov->max_VF_buses = virtfn_max_buses(dev);
>>>+    rc = compute_max_vf_buses(dev);
>>>+    if (rc)
>>>+            goto fail_max_buses;
>>>
>>>     return 0;
>>>
>>>+fail_max_buses:
>>>+    dev->sriov = NULL;
>>The dev->sriov is allocated with kzalloc(), seems we forget to release it?
>
>No, if you check at the end of the function we release it via a kfree(iov).
>

Right.

>>>+    dev->is_physfn = 0;
>>>failed:
>>>     for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>>>             res = &dev->resource[i + PCI_IOV_R

-- 
Richard Yang
Help you, Help me

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to