On Tue, Oct 16, 2018 at 03:03:07PM -0700, Florian Fainelli wrote:
> On 10/16/2018 02:36 PM, Daniel Walker wrote:
> > Hi,
> >
> > I would like to report an issue in the gianfar driver. The issue is as
> > follows.
> >
> > We have a P2020 board that uses the gianfar driver, and we have a m88e1101
> > PHY connect. When the interface is initially brought up traffic flows as
> > normal. If you take the interface down then bring it back up traffic stops
> > flowing. If you do this sequence over and over up/down/up we find that the
> > interface will allow traffic to flow at a low percentage.
> >
> > In v4.9 interface allows traffic about %10 of the time.
> >
> > In v4.19-rc8 the allows traffic %30 of the time.
> >
> > After bisecting I found that in v3.14 the interface was rock solid and
> > never did we see this issue. However, v3.15 we started to see this issue.
> > After bisecting I found the following change is the first one which
> > causes the issue,
> >
> > a328ac9 gianfar: Implement MAC reset and reconfig procedure
> >
> > I was able to revert this in v3.15 , however with later development a revert
> > doesn't appear to be possible. We have no fix for this currently.
> >
> > I can do testing if you have an idea what might cause the issue.
>
> What we have seen being typically the problem is that when you have a
> PHY connection whereby the PHY provides the RX clock to the MAC (e.g:
> RGMII), it is very easy to get in a situation where the PHY clock is
> stopped, and the MAC is asked to be reset, but the HW design does not
> like that at all since it e.g: stops on packet boundaries and need some
> clock cycles to do that, and that results in all sorts of issues (in our
> case it was some FIFO corruption). We solved that in bcmgenet.c with
> looping internally the TX clock to the RX clock to make sure the
> Ethernet MAC (UniMAC in our designs) was successfully reset:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=28c2d1a7a0bfdf3617800d2beae1c67983c03d15
>
> Could that somehow be the problem here?

A little more context on this issue after some debugging. The patch which I
quoted above adds a line into int startup_gfar() which does,

	gfar_mac_reset(priv);

If this line is removed then everything starts working again (this is
debugging at the v3.15 source level).

On further inspection, the block of code inside gfar_mac_reset() that causes
the problem is this one,

	/* Initialize MACCFG2. */
	tempval = MACCFG2_INIT_SETTINGS;
	if (gfar_has_errata(priv, GFAR_ERRATA_74))
		tempval |= MACCFG2_HUGEFRAME | MACCFG2_LENGTHCHECK;
	gfar_write(&regs->maccfg2, tempval);

and if you change this block to this,

	tempval = gfar_read(&regs->maccfg2);
	if (gfar_has_errata(priv, GFAR_ERRATA_74))
		tempval |= MACCFG2_HUGEFRAME | MACCFG2_LENGTHCHECK;
	gfar_write(&regs->maccfg2, tempval);

then everything starts working. At least on my hardware, doing a gfar_read()
when the hardware first comes up doesn't cause any issues; however, I don't
know about other hardware.

It would seem that MACCFG2_INIT_SETTINGS is not set up correctly or
shouldn't be used in this context.

Daniel
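
To make the difference between the two variants concrete, here is a minimal
userspace sketch, not driver code. Everything prefixed demo_ or DEMO_ is
hypothetical, the bit values are illustrative placeholders rather than values
checked against gianfar.h, and the "register" is just a struct field. It only
shows that writing the init constant discards whatever was programmed earlier
(here, an interface-mode field), while the read-modify-write variant keeps it.
Whether that is the actual cause on the P2020 is an assumption.

```c
#include <stdint.h>

/* Illustrative placeholder bits; NOT verified against gianfar.h. */
#define DEMO_INIT_SETTINGS 0x00007205u
#define DEMO_IF_MODE_MASK  0x00000300u
#define DEMO_IF_MODE_MII   0x00000100u
#define DEMO_IF_MODE_GMII  0x00000200u
#define DEMO_HUGEFRAME     0x00000020u
#define DEMO_LENGTHCHECK   0x00000010u

/* Hypothetical stand-in for the MAC register block. */
struct demo_regs {
	uint32_t maccfg2;
};

/* Stand-ins for the MMIO accessors; plain memory access here. */
static uint32_t demo_read(const uint32_t *reg)
{
	return *reg;
}

static void demo_write(uint32_t *reg, uint32_t val)
{
	*reg = val;
}

/* Variant 1: start from the init constant, as in the original block. */
void demo_reset_overwrite(struct demo_regs *regs, int has_errata)
{
	uint32_t tempval = DEMO_INIT_SETTINGS;

	if (has_errata)
		tempval |= DEMO_HUGEFRAME | DEMO_LENGTHCHECK;
	demo_write(&regs->maccfg2, tempval);
}

/* Variant 2: start from the current register value (read-modify-write). */
void demo_reset_preserve(struct demo_regs *regs, int has_errata)
{
	uint32_t tempval = demo_read(&regs->maccfg2);

	if (has_errata)
		tempval |= DEMO_HUGEFRAME | DEMO_LENGTHCHECK;
	demo_write(&regs->maccfg2, tempval);
}
```

If the register held an MII interface mode before the reset, variant 1 leaves
it with whatever mode the init constant encodes, and variant 2 leaves the MII
setting untouched. Note that variant 2 also depends on the register already
holding sane contents at first bring-up, which is the open question about
other hardware.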