Hi Kurt,

On 2020/6/4 5:17 AM, Peter Geis wrote:

On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <k...@intricatesoftware.com> wrote:
On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote:
On 2020/6/2 9:59, Kever Yang wrote:
Hi Kurt,

On 2020/6/2 4:30 AM, Kurt Miller wrote:
On at least the RockPro64, many cards will trip a
synchronous abort when first accessing PCIe config space
during bus scanning. A delay after link training allows
some of these cards to function.

Signed-off-by: Kurt Miller <k...@intricatesoftware.com>
---
On the RockPro64, some PCI cards trip a synchronous abort when scanning
the PCI bus. For example, with the HighPoint Rocket Raid 640L, which is
based on the Marvell 88SE9230, I see this:

=> pci
"Synchronous Abort" handler, esr 0x96000210
elr: 000000000022d034 lr : 000000000022cfd0 (reloc)
elr: 00000000f4568034 lr : 00000000f4567fd0
x0 : 0000000000100000 x1 : 00000000f8000000
x2 : 0000000000000000 x3 : 0000000000100000
x4 : 00000000f2559290 x5 : 0000000000000000
x6 : 0000000000000001 x7 : 00000000f2559860
x8 : 0000000000000030 x9 : 0000000000000008
x10: 0000000000000010 x11: 00000000f251fd1c
x12: 0000000000001421 x13: 0000000000001468
x14: 00000000f251fd4c x15: 00000000ffffffff
x16: 0000000000060001 x17: 000000000000001f
x18: 00000000f2532dc0 x19: 00000000f251fcd0
x20: 0000000000000001 x21: 0000000000000000
x22: 0000000000010000 x23: 00000000f45d4000
x24: 0000000000000000 x25: 00000000f45bc000
x26: 0000000000000000 x27: 0000000000000000
x28: 00000000f2541440 x29: 00000000f251fc20

Code: 540000c1 350000a5 93407c00 f9400081 (b8616800)
Resetting CPU ...
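
For readers who want to interpret the esr value in the dump above, here is a minimal sketch that splits it into its architectural fields. The field positions (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], DFSC in ISS bits [5:0]) follow the AArch64 ESR_ELx layout; the struct and function names are hypothetical and are not part of U-Boot.

```c
#include <stdint.h>

/* Fields of the AArch64 Exception Syndrome Register (ESR_ELx). */
struct esr_fields {
	uint32_t ec;   /* exception class, bits [31:26] */
	uint32_t il;   /* instruction length bit, bit 25 */
	uint32_t iss;  /* instruction specific syndrome, bits [24:0] */
	uint32_t dfsc; /* data fault status code, ISS bits [5:0] */
};

/* Hypothetical helper: split a raw ESR value into its fields. */
static struct esr_fields esr_decode(uint32_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.iss = esr & 0x1ffffff;
	f.dfsc = f.iss & 0x3f;
	return f;
}

/*
 * Hypothetical helper: true when the ESR describes a data abort
 * (EC 0x24 or 0x25) whose fault status code is 0x10, a synchronous
 * external abort that is not on a translation table walk.
 */
static int esr_is_sync_external_data_abort(uint32_t esr)
{
	struct esr_fields f = esr_decode(esr);

	return (f.ec == 0x24 || f.ec == 0x25) && f.dfsc == 0x10;
}
```

Applied to 0x96000210, this gives EC 0x25 (data abort taken without a change in exception level) and DFSC 0x10, which fits a config space read that the bus answered with an error rather than a translation or permission fault.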

Adding a delay after link training works around the problem. I added this
delay to the OpenBSD rkpcie driver as well:

https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87


HighPoint Rocket Raid 640L needs a 1.75 second delay and Crossfield
SAS9211-4i needs a 1 second delay, so I arbitrarily decided on 2 seconds.

The delay work-around was originally discovered by nuumio:
https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee


   drivers/pci/pcie_rockchip.c | 8 ++++++++
   1 file changed, 8 insertions(+)

diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c
index 0edc2464a8..51cfbf6b18 100644
--- a/drivers/pci/pcie_rockchip.c
+++ b/drivers/pci/pcie_rockchip.c
@@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice *dev)
           goto err_power_off_phy;
       }
+    /*
+     * XXX: On at least the RockPro64, many cards will trip a
+     * synchronous abort when first accessing PCIe config space
+     * during bus scanning. A delay after link training allows
+     * some of these cards to function.
+     */
+    mdelay(2000);
I don't understand what kind of init delay needs 2 seconds; a root
cause would be preferred.
Hi Kever,

While working on this issue for the OpenBSD PCIe driver I was not
able to determine the root cause. I tested the following adapters:

ROCKPro64 2 Port SATA
StarTech PEXSAT32 2 Port SATA
Samsung 970 Evo NVMe w/m.2 adapter
IO CREST SI-PEX40148 2 Port SATA
IO CREST SI-PEX40057 4 port
HighPoint Rocket Raid 640L
Crossfield SAS9211-4i
Dell PERC H200
Dell PERC 6/i
Intel Gigabit VT Quad Port Server

All of the above adapters successfully link trained; however,
three of them would panic upon the first read of the PCI config
space. nuumio's work-around of placing a delay after link-training
allows these cards to work.

HighPoint Rocket Raid 640L ~1.75 sec delay
Crossfield SAS9211-4i ~1 sec delay
Dell PERC H200 ~1 sec delay

In an attempt to determine if there was a way to detect how long
to wait, I compared every status and debug register documented
in the rk3399 TRM part 2 for the PCI controller. I compared the
values pre-delay and post-delay. I was not able to find a value
that would indicate it was safe to proceed.
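
The pre-delay versus post-delay comparison described above could be sketched roughly as follows. This is only an illustration, not the code used in the testing: it assumes the registers of interest have already been read into two arrays of 32-bit values, and the names reg_diff and reg_snapshot_diff are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One register whose value changed between the two snapshots. */
struct reg_diff {
	size_t index;    /* position in the snapshot arrays */
	uint32_t before; /* value captured before the delay */
	uint32_t after;  /* value captured after the delay */
};

/*
 * Compare two equally sized register snapshots and record up to
 * max_out differences in out. Returns the total number of registers
 * that differ, which may be larger than max_out.
 */
static size_t reg_snapshot_diff(const uint32_t *before,
				const uint32_t *after, size_t count,
				struct reg_diff *out, size_t max_out)
{
	size_t i, ndiff = 0;

	for (i = 0; i < count; i++) {
		if (before[i] == after[i])
			continue;
		if (ndiff < max_out) {
			out[ndiff].index = i;
			out[ndiff].before = before[i];
			out[ndiff].after = after[i];
		}
		ndiff++;
	}
	return ndiff;
}
```

A return value of zero for every documented status and debug register would match the result reported above: nothing observable changes that could be polled in place of a fixed delay.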

Strictly speaking, how long we need to wait for this has already been
specified; see this:

https://patchwork.kernel.org/patch/11561977/

I can accept the same fix as the kernel, which is 100 ms, but 2 seconds is really too much for most cases.


Thanks,

- Kever


If you need more delay, I highly suspect you should check whether
the power supply is stable before enabling training.
Hi Shawn,

Thank you for pointing out the need for the 100ms delay before
beginning link-training. I believe this is to correct link
training failures, while the delay after link training is
to work around aborts when reading PCI config space after
link training on certain cards.
There are similar issues in the Linux driver, with various cards
randomly throwing an abort.
If it is power, a 2 second wait to allow cards to stabilize after
power on might be wise for the Linux driver as well.

I imagine some cards take longer to complete power on reset than
others, and attempting to read them immediately after powering them up
could be the issue here.

In the testing I discussed above, I suspected that power
was a likely cause. The HighPoint Rocket Raid 640L is
a good test card because it documents its 3.3 V power at
0.7 watts:

https://highpoint-tech.com/USA_new/series_r600-specifications.htm

How can I check if the power supply is stable?

Regards,
-Kurt



Thanks,

- Kever

+
       /* Initialize Root Complex registers. */
       writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + PCIE_LM_VENDOR_ID);
       writel(PCI_CLASS_BRIDGE_PCI << 16,


