As you pointed out, the PCIe link might not have completed training yet. Polling the link training bit until it clears, with a timeout, could be a better solution than a fixed delay.
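To illustrate the idea outside of coreboot, here is a minimal stand-alone sketch of polling with a timeout. The fake_port struct, fake_read_lnksta() and fake_udelay() are made up for the example and only simulate a port whose Link Training bit clears after a few reads; PCI_EXP_LNKSTA_LT is bit 11 of the PCIe Link Status register. Note that the Link Training bit is only defined for downstream ports, so real code would presumably read it from the bridge above the bus rather than from the endpoint.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 11 of the PCIe Link Status register: link training in progress. */
#define PCI_EXP_LNKSTA_LT 0x0800

/* Hypothetical stand-in for a port: the LT bit stays set for N reads. */
struct fake_port {
	unsigned int reads_until_trained;
	unsigned int delay_calls;
};

/* Simulated Link Status read (not a coreboot API). */
static uint16_t fake_read_lnksta(struct fake_port *p)
{
	if (p->reads_until_trained > 0) {
		p->reads_until_trained--;
		return PCI_EXP_LNKSTA_LT;
	}
	return 0;
}

/* Simulated delay: only counts calls instead of sleeping. */
static void fake_udelay(struct fake_port *p, unsigned int usecs)
{
	(void)usecs;
	p->delay_calls++;
}

/* Returns true once the LT bit clears, false if max_tries polls all saw it set. */
static bool wait_for_link_training(struct fake_port *p, unsigned int max_tries)
{
	for (unsigned int attempt = 0; attempt < max_tries; attempt++) {
		if (!(fake_read_lnksta(p) & PCI_EXP_LNKSTA_LT))
			return true;
		fake_udelay(p, 100);
	}
	return false;
}
```

Returning a bool lets the caller tell a timeout apart from a trained link, so a port that never finishes training can be logged or skipped instead of being scanned anyway.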
Try the diff below:

diff --git a/src/device/pci_device.c b/src/device/pci_device.c
index 4b5e73b806..c96e9f1e9d 100644
--- a/src/device/pci_device.c
+++ b/src/device/pci_device.c
@@ -1213,6 +1213,7 @@ static void pci_scan_hidden_device(struct device *dev)
  * @param min_devfn Minimum devfn to look at in the scan, usually 0x00.
  * @param max_devfn Maximum devfn to look at in the scan, usually 0xff.
  */
+#define PCIE_TRAIN_RETRY 10000
 void pci_scan_bus(struct bus *bus, unsigned int min_devfn,
 		  unsigned int max_devfn)
 {
@@ -1254,6 +1255,14 @@ void pci_scan_bus(struct bus *bus, unsigned int min_devfn,
 			continue;
 		}

+		/* Wait for training to complete */
+		u16 lnk, try, cap = pci_find_capability(dev, PCI_CAP_ID_PCIE);
+		for (try = PCIE_TRAIN_RETRY; try > 0; try--) {
+			lnk = pci_read_config16(dev, cap + PCI_EXP_LNKSTA);
+			if (!(lnk & PCI_EXP_LNKSTA_LT))
+				break;
+			udelay(100);
+		}
 		/* See if a device is present and setup the device structure. */
 		dev = pci_probe_dev(dev, bus, devfn);


Regards,
Naresh

On Tue, Sep 21, 2021 at 4:48 PM Sumo <kingsu...@gmail.com> wrote:

> Hi,
>
> Is there anything else that can be tried instead of using simple delays?
> Perhaps during the enumeration for some devices a delay is really
> required, for the NVMe case the vendor/device ID pair isn't detected at
> all... I'm not sure if the PCIe link is still training (I don't have an
> analyzer) or if the device is still booting...
>
> Kind regards,
> Sumo
>
> On Tue, Aug 17, 2021 at 8:34 AM Sumo <kingsu...@gmail.com> wrote:
>
>> Hi,
>>
>> I have managed to disable the UART console and collect the logs via cbmem
>> tool. Therefore it will not add any additional delay even with debug logs
>> enabled so we can compare the logs with and without the delay.
>>
>> Here are my findings:
>> - without the delay in dev_enumerate() the NVMe isn't detected at
>> all (i.e. the PCIe device isn't shown in the bus):
>> PCI: 00:0b.0 scanning...
>> do_pci_scan_bridge for PCI: 00:0b.0
>> PCI: 00:0b.0: Enabled LTR
>> PCI: pci_scan_bus for bus 04
>> POST: 0x24
>> *PCI: Static device PCI: 04:00.0 not found, disabling it.*
>> POST: 0x25
>> PCI: Leftover static devices:
>> PCI: 04:00.0
>> PCI: Check your devicetree.cb.
>> POST: 0x55
>> scan_bus: bus PCI: 00:0b.0 finished in 0 msecs
>>
>> - by adding the delay the device is detected and initialized:
>> PCI: 00:0b.0 scanning...
>> do_pci_scan_bridge for PCI: 00:0b.0
>> PCI: 00:0b.0: Enabled LTR
>> PCI: pci_scan_bus for bus 04
>> POST: 0x24
>> *PCI: 04:00.0 [1987/5012] enabled*
>> POST: 0x25
>> POST: 0x55
>> Enabling Common Clock Configuration
>> PCIE CLK PM is not supported by endpoint
>> L1 Sub-State supported from root port 11
>> L1 Sub-State Support = 0xf
>> CommonModeRestoreTime = 0x28
>> Power On Value = 0x6, Power On Scale = 0x1
>> ASPM: Enabled L1
>> PCIe: Max_Payload_Size adjusted to 256
>> PCI: 04:00.0: Enabled LTR
>> PCI: 04:00.0: Programmed LTR max latencies
>> scan_bus: bus PCI: 00:0b.0 finished in 0 msecs
>>
>> Also, another device is failing if I remove the delay - it's an I211
>> gigabit ethernet controller:
>> PCI: 00:0f.0 scanning...
>> do_pci_scan_bridge for PCI: 00:0f.0
>> PCI: 00:0f.0: Enabled LTR
>> PCI: pci_scan_bus for bus 05
>> POST: 0x24
>> *PCI: Static device PCI: 05:00.0 not found, disabling it.*
>> POST: 0x25
>> PCI: Leftover static devices:
>> PCI: 05:00.0
>> PCI: Check your devicetree.cb.
>> POST: 0x55
>> scan_bus: bus PCI: 00:0f.0 finished in 0 msecs
>>
>> With the delay the I211 is detected:
>> PCI: 00:0f.0 scanning...
>> do_pci_scan_bridge for PCI: 00:0f.0
>> PCI: 00:0f.0: Enabled LTR
>> PCI: pci_scan_bus for bus 05
>> POST: 0x24
>> *PCI: 05:00.0 [8086/1539] enabled*
>> POST: 0x25
>> POST: 0x55
>> Enabling Common Clock Configuration
>> PCIE CLK PM is not supported by endpoint
>> ASPM: Enabled L1
>> PCIe: Max_Payload_Size adjusted to 256
>> PCI: 05:00.0: No LTR support
>> scan_bus: bus PCI: 00:0f.0 finished in 0 msecs
>>
>> Full logs are attached.
>>
>> Kind regards,
>> Sumo
>>
>> On Mon, Aug 16, 2021 at 8:01 PM Sumo <kingsu...@gmail.com> wrote:
>>
>>> Hi Paul,
>>>
>>> When logs are (almost) disabled the error isn't shown, so if I add the
>>> delay with logs disabled the log output will have almost no difference at
>>> all.
>>>
>>> Following are the logs, including a log with Coreboot debug enabled + no
>>> delay. For all logs FSP loglevel is set to NoDebug:
>>> - nvme-err.log : no delay; coreboot debug_level=Error; NVMe error: at
>>> the end of the log is shown the error in the UEFI FW:
>>> ERROR: C40000002:V02010007 I0 93B80004-9FB3-11D4-9A3A-0090273FC14D
>>> 7E90A998;
>>> - nvme-ok-delay.log : 20ms delay; coreboot debug_level=Error; NVMe ok;
>>> - nvme-ok.log : no delay; coreboot debug_level=Spew; NVMe ok: the
>>> coreboot log output is enough to make NVMe work properly;
>>>
>>> The NVMe is in the root port 00:0b.0, it is shown as 04:00.0
>>>
>>> Thanks,
>>> Sumo
>>>
>>> On Mon, Aug 16, 2021 at 2:57 PM Paul Menzel <pmen...@molgen.mpg.de>
>>> wrote:
>>>
>>>> Dear Sumo,
>>>>
>>>>
>>>> On 16.08.21 at 18:38, Sumo wrote:
>>>>
>>>> > The NVMe is not detected when serial console logs are disabled, I mean by
>>>> > setting both Coreboot log_level=Error (or less) and FSP
>>>> > PcdFspDebugPrintErrorLevel=NoDebug. Looks like the enumeration fails then
>>>> > further on the device is not listed in the UEFI FW (same issue shown in
>>>> > either CorebootPayloadPkg or UefiPayloadPkg). When Linux boots the device
>>>> > appears normally.
>>>> >
>>>> > The problem is fixed by adding a small delay inside dev_enumerate() - a
>>>> > 20ms delay at the very beginning of the function is enough. I'm wondering
>>>> > if there is a better solution for this, the device is already defined in
>>>> > the devicetree.cb (set as on). Maybe coreboot is too fast and the NVMe is
>>>> > still booting up - or the PCIe link is still training, not sure. Coreboot
>>>> > doesn't retry if the device is not detected right away?
>>>>
>>>> Please share the logs without and with the delay.
>>>>
>>>>
>>>> Kind regards,
>>>>
>>>> Paul
>>>>
>>> _______________________________________________
> coreboot mailing list -- coreboot@coreboot.org
> To unsubscribe send an email to coreboot-le...@coreboot.org
>

--
Best regards,
Naresh G. Solanki