On Mon, Sep 8, 2014 at 10:38 PM, Bjorn Helgaas <bhelg...@google.com> wrote:
> On Tue, Sep 02, 2014 at 04:26:00PM -0700, Rajat Jain wrote:
>>
>> As per the PCIe spec, an endpoint may return the configuration cycles
>> with CRS if it is not yet fully ready to be accessed. If the CRS visibility
>> is not enabled at the root port, the spec leaves the retry behaviour open
>> to implementation in such a case. The Intel root ports have chosen to retry
>> endlessly in this situation. Thus, the root controller may "hang" (repeatedly
>> retrying the configuration requests until it gets a status other than CRS) if
>> a device returns CRS for a long time. This can cause a broken endpoint to 
>> bring
>> down the whole PCI hierarchy.
>>
>> This was recently known to cause problems on Intel systems and
>> was discussed here:
>> http://marc.info/?t=140926298500002&r=1&w=2
>>
>> Ref1:
>> https://www.pcisig.com/specifications/pciexpress/ECN_CRS_Software_Visibility_No27.pdf
>>
>> Ref2:
>> PCIe spec V3.0, pg119, pg127 for "Configuration Request Retry Status"
>>
>> Thus enable the CRS visibility for the root ports that support it. This
>> patch reverts the following commit, but enables CRS visibility only
>> when the root port supports it:
>>
>> ad7edfe04908 ("[PCI] Do not enable CRS Software Visibility by default")
>>
>> (Linus' response: http://marc.info/?l=linux-pci&m=140968622520192&w=2)
>>
>> Signed-off-by: Rajat Jain <rajatxj...@gmail.com>
>> Signed-off-by: Rajat Jain <rajatj...@juniper.net>
>> Signed-off-by: Guenter Roeck <gro...@juniper.net>
>
> I put this and the "only look at Vendor ID" patch on a pci/enumeration
> branch [1].  I rewrote the changelogs to reflect my understanding of what's
> going on, so probably the real truth is somewhere between your original and
> mine.  Let me know what should be fixed.
>
> We should figure out an easy way for Josh to test these.  Ideally, he could
> test the second patch by itself first, then both together.

I just provided the 2 images to Josh:

1) An image with both the patches.
2) An image with just the patch 2 i.e. "Enable CRS visibility" patch.

Lets hope that results tally with our expectations.

Thanks,

Rajat


>
> [1] 
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/enumeration
>
>> ---
>> v2: Remove the white list, that was enabling the CRS for certain known Intel 
>> systems.
>>     Rather now enable it for all systems that support this capability.
>> v1: Enable CRS for only some given Intel systems (maintain a whitelist)
>>
>>  drivers/pci/probe.c           |   13 +++++++++++++
>>  include/uapi/linux/pci_regs.h |    1 +
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index e3cf8a2..3c4c35c 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -740,6 +740,17 @@ struct pci_bus *pci_add_new_bus(struct pci_bus *parent, 
>> struct pci_dev *dev,
>>  }
>>  EXPORT_SYMBOL(pci_add_new_bus);
>>
>> +static void pci_enable_crs(struct pci_dev *pdev)
>> +{
>> +     u16 root_cap = 0;
>> +
>> +     /* Enable CRS Software visibility if supported */
>> +     pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
>> +     if (root_cap & PCI_EXP_RTCAP_CRSVIS)
>> +             pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
>> +                                      PCI_EXP_RTCTL_CRSSVE);
>> +}
>> +
>>  /*
>>   * If it's a bridge, configure it and scan the bus behind it.
>>   * For CardBus bridges, we don't scan behind as the devices will
>> @@ -787,6 +798,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev 
>> *dev, int max, int pass)
>>       pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
>>                             bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
>>
>> +     pci_enable_crs(dev);
>> +
>>       if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
>>           !is_cardbus && !broken) {
>>               unsigned int cmax;
>> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
>> index 30db069..a75106d 100644
>> --- a/include/uapi/linux/pci_regs.h
>> +++ b/include/uapi/linux/pci_regs.h
>> @@ -552,6 +552,7 @@
>>  #define  PCI_EXP_RTCTL_PMEIE 0x0008  /* PME Interrupt Enable */
>>  #define  PCI_EXP_RTCTL_CRSSVE        0x0010  /* CRS Software Visibility 
>> Enable */
>>  #define PCI_EXP_RTCAP                30      /* Root Capabilities */
>> +#define  PCI_EXP_RTCAP_CRSVIS        0x0001  /* CRS software visibility 
>> capability */
>>  #define PCI_EXP_RTSTA                32      /* Root Status */
>>  #define PCI_EXP_RTSTA_PME    0x00010000 /* PME status */
>>  #define PCI_EXP_RTSTA_PENDING        0x00020000 /* PME pending */
>> --
>> 1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to