On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <k...@intricatesoftware.com> wrote: > > On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: > > > > 在 2020/6/2 9:59, Kever Yang 写道: > > > > > > Hi Kurt, > > > > > > On 2020/6/2 上午4:30, Kurt Miller wrote: > > > > > > > > On at least the RockPro64, many cards will trip a > > > > synchronous abort when first accessing PCIe config space > > > > during bus scanning. A delay after link training allows > > > > some of these cards to function. > > > > > > > > Signed-off-by: Kurt Miller <k...@intricatesoftware.com> > > > > --- > > > > On the RockPro64, some pci cards trip a synchronous abort when > > > > scanning the > > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on > > > > Marvell 88SE9230 I see this: > > > > > > > > => pci > > > > "Synchronous Abort" handler, esr 0x96000210 > > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > > > > elr: 00000000f4568034 lr : 00000000f4567fd0 > > > > x0 : 0000000000100000 x1 : 00000000f8000000 > > > > x2 : 0000000000000000 x3 : 0000000000100000 > > > > x4 : 00000000f2559290 x5 : 0000000000000000 > > > > x6 : 0000000000000001 x7 : 00000000f2559860 > > > > x8 : 0000000000000030 x9 : 0000000000000008 > > > > x10: 0000000000000010 x11: 00000000f251fd1c > > > > x12: 0000000000001421 x13: 0000000000001468 > > > > x14: 00000000f251fd4c x15: 00000000ffffffff > > > > x16: 0000000000060001 x17: 000000000000001f > > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > > > > x20: 0000000000000001 x21: 0000000000000000 > > > > x22: 0000000000010000 x23: 00000000f45d4000 > > > > x24: 0000000000000000 x25: 00000000f45bc000 > > > > x26: 0000000000000000 x27: 0000000000000000 > > > > x28: 00000000f2541440 x29: 00000000f251fc20 > > > > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > > > > Resetting CPU ... > > > > > > > > Adding a delay after link training works-around the problem. I added > > > > this > > > > delay to the OpenBSD rkpcie driver as well: > > > > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 > > > > > > > > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield > > > > SAS9211-4i > > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > > > > > > > The delay work-around was originally discovered by nuumio: > > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee > > > > > > > > > > > > drivers/pci/pcie_rockchip.c | 8 ++++++++ > > > > 1 file changed, 8 insertions(+) > > > > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > > > > index 0edc2464a8..51cfbf6b18 100644 > > > > --- a/drivers/pci/pcie_rockchip.c > > > > +++ b/drivers/pci/pcie_rockchip.c > > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice > > > > *dev) > > > > goto err_power_off_phy; > > > > } > > > > + /* > > > > + * XXX: On at least the RockPro64, many cards will trip a > > > > + * synchronous abort when first accessing PCIe config space > > > > + * during bus scanning. A delay after link training allows > > > > + * some of these cards to function. > > > > + */ > > > > + mdelay(2000); > > > I don't understand what kind of delay for init needs 2 Seconds, root > > > cause will preferred. > > Hi Kever, > > While working on this issue for the OpenBSD PCIe driver I was not > able to determine the root cause. I tested the following adapters: > > ROCKPro64 2 Port SATA > StarTech PEXSAT32 2 Port SATA > Samsung 970 Evo NVMe w/m.2 adapter > IO CREST SI-PEX40148 2 Port SATA > IO CREST SI-PEX40057 4 port > HighPoint Rocket Raid 640L > Crossfield SAS9211-4i > Del PERC H200 > Dell PERC 6/i > Intel Gigabit VT Quad Port Server > > All of the above adapters successfully link trained, however > three of them would panic upon the first read of the PCI config > space. nuumio's work-around of placing a delay after link-training > allows these cards to work. > > HighPoint Rocket Raid 640L ~1.75 sec delay > Crossfield SAS9211-4i ~1 sec delay > Dell PERC H200 ~1 sec delay > > In attempt to determine if there was a way to detect how long > to wait, I compared every status and debug register documented > in the rk3399 TRM part 2 for the PCI controller. I compared the > values pre-delay and post-delay. I was not able to find a value > that would indicate it was safe to proceed. > > > Strictly speaking, how long should we need for this had been provided, > > see this: > > > > https://patchwork.kernel.org/patch/11561977/ > > > > If you need more delay, I highly suspect you should check if > > the power supply is stable before enabling training. > > Hi Shawn, > > Thank you for pointing out the need for the 100ms delay before > beginning link-training. I believe this is to correct link > training failures, while the delay after link training is > to work-around post link training reading of PCI config > space on certain cards.
There are similar issues on the Linux driver with various cards randomly throwing an abort. If it is power, a 2 second wait to allow cards to stabilize after power on might be wise for the Linux driver as well. I imagine some cards take longer to complete power on reset than others, and attempting to read them immediately after powering them up could be the issue here. > > In my testing I discussed above, I suspected that power > was a likely cause. The HighPoint Rocket Raid 640L is > a good test card because it documents its 3.3v power at > 0.7 watts: > > https://highpoint-tech.com/USA_new/series_r600-specifications.htm > > How can I check if the power supply is stable? > > Regards, > -Kurt > > > > > > > > > > > > > Thanks, > > > > > > - Kever > > > > > > > > > > > + > > > > /* Initialize Root Complex registers. */ > > > > writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + > > > > PCIE_LM_VENDOR_ID); > > > > writel(PCI_CLASS_BRIDGE_PCI << 16, > > > > > > > > >