Hi Stefan, On Mon, Apr 25, 2022 at 4:18 AM Stefan Roese <s...@denx.de> wrote: > > Hi Tony, > > On 4/25/22 11:33, Tony Dinh wrote: > > Hi Stefan, > > > > On Sun, Apr 24, 2022 at 11:00 PM Stefan Roese <s...@denx.de> wrote: > >> > >> Hi Tony, > >> > >> On 4/23/22 04:15, Tony Dinh wrote: > >>> Hi Stefan, > >>> > >>> Please see my various comments below. And my thoughts at the end. > >>> > >>> On Thu, Apr 21, 2022 at 11:15 PM Stefan Roese <s...@denx.de> wrote: > >>>> > >>>> Hi Tony, > >>>> > >>>> On 4/21/22 23:21, Tony Dinh wrote: > >>>> > >>>> <snip> > >>>> > >>>>>> What really puzzles me is, why the page address is set to a non-zero > >>>>>> value at all at this early boot stage? Could you perhaps add some > >>>>>> debugging code, to check, if and where the page address gets set? > >>>>>> I find it hard to belief, that this starts with non-zero after > >>>>>> powering up the device / PHY. > >>>>>> > >>>>>> Or did I miss something? > >>>>> > >>>>> Other Kirkwood boards behave correctly (such as the Sheevaplug, > >>>>> Dreamplug, and Dell Kace M300). The Page Select register (22) contains > >>>>> 0 in these boards, and all have PHY id 1410e90. The NSA310s also has > >>>>> PHY id 1410e90. > >>>> > >>>> Yes. I'm pretty sure that the page select register is set to 0 upon > >>>> PHY startup. Even though there might be some strapping possibilities > >>>> to configure some PHY registers, the page select is most likely always > >>>> 0 after power-up. So if nobody writes to this reg, then is should be 0. > >>> > >>> Agree. All other Kirkwood boards execute the same code so I think we > >>> would see if somebody writes to this register. > >>> > >>>>> But I could not find in the uclass MVGBE where the Page Select > >>>>> register is set before phy_connect is called. So my guess is that > >>>>> memory location just happens to be zero in other boards but not in > >>>>> this NSA310S board. Perhaps the memory location needs to be set to > >>>>> zero, to make sure all registers point to page 0 in the beginning. > >>>> > >>>> Please see above. > >>>> > >>>>> Possibly, it is here that the Page Select register should be zero out > >>>>> after the priv data is copied: dev_get_priv(). mvgbe_of_to_plat() is > >>>>> called very early on (during the uclass MVGBE creation). > >>>>> > >>>>> static int mvgbe_of_to_plat(struct udevice *dev) > >>>>> { > >>>>> struct eth_pdata *pdata = dev_get_plat(dev); > >>>>> struct mvgbe_device *dmvgbe = dev_get_priv(dev); > >>>> > >>>> Possibly. Again my suggestion is to add some debug code to check at > >>>> different boot times, which value is currently set in the page select > >>>> register. By just reading is out and printing it's value. You might need > >>>> to add some "special code" at the early code paths, as the MDIO driver > >>>> is not started there. > >>>> > >>>> Another idea is, if it's possible to issue a HW-reset to the PHY on the > >>>> NSA310 board. Do you know if some GPIO is connected to the PHY's reset > >>>> pin which could be toggled by the SoC? > >>> > >>> I don't think there is a GPIO that does. I looked at the GPL source > >>> code for this board from way back, and created the DTS for this based > >>> on info in that GPL source. I don't recall that it was available. > >>> > >>>> Note: We could of course just add the reset to 0 as you have done in the > >>>> MAC driver or some board specific code. But I really would like to > >>>> understand why the page select reg is non-zero in this case. > >>> > >>> My conclusion is the register was polluted by something in the board > >>> hardware. This is the comments (paraphrase) we got from this board > >>> GPL source: > >>> /* The ZyXEL NSA310S uses the 88E1310S Alaska (interface > >>> identical to 88E1318) */ > >>> /* and has an MCU attached to the LED[2] via tristate interrupt > >>> */ > >>> > >>> I've rebuilt and rerun the tests for both the Sheevaplug and NSA310S. > >>> Please see the debug log from mvgbe_probe. This is as early as we can > >>> see the uclass initializing. > >>> > >>> ==== NSA310S boot log > >>> U-Boot 2022.04-00569-gca51a8dc04-dirty (Apr 22 2022 - 16:49:50 -0700) > >>> ZyXEL NSA310S/320S 1/2-Bay Power Media Server > >>> > >>> SoC: Kirkwood 88F6281_A1 > >>> DRAM: 256 MiB > >>> Core: 7 devices, 5 uclasses, devicetree: separate > >>> NAND: 128 MiB > >>> Loading Environment from NAND... OK > >>> In: serial > >>> Out: serial > >>> Err: serial > >>> > >>> Net: mvgbe_of_to_plat called > >>> mvgbe_of_to_plat phy-mode 7 > >>> mvgbe_of_to_plat phy addr 1 > >>> mvgbe_probe called > >>> smi_reg_read: phy_addr 1 reg_ofs 22 devad -1 > >>> __mvgbe_mdio_read:(adr 1, off 22) value= 0011 > >>> eth0: ethernet-controller@72000 > >>> Hit any key to stop autoboot: 0 > >>> > >>> ===== Sheevaplug boot log > >>> > >>> U-Boot 2022.04-00569-gca51a8dc04-dirty (Apr 22 2022 - 18:16:25 -0700) > >>> Marvell-Sheevaplug > >>> > >>> SoC: Kirkwood 88F6281_A1 > >>> DRAM: 512 MiB > >>> Core: 9 devices, 7 uclasses, devicetree: separate > >>> NAND: 512 MiB > >>> MMC: mvsdio@90000: 0 > >>> Loading Environment from NAND... OK > >>> In: serial > >>> Out: serial > >>> Err: serial > >>> > >>> Net: mvgbe_of_to_plat called > >>> mvgbe_of_to_plat phy-mode 1 > >>> mvgbe_of_to_plat phy addr 0 > >>> mvgbe_probe called > >>> smi_reg_read: phy_addr 0 reg_ofs 22 devad -1 > >>> __mvgbe_mdio_read:(adr 0, off 22) value= 0000 > >>> eth0: ethernet-controller@72000 > >>> Hit any key to stop autoboot: 0 > >> > >> Still very strange. Perhaps really some HW pin strapping responsible > >> for this difference? > > > > It could be. The Zyxel Kirkwood NAS series NSA310s, 320, and 325 all > > behave like this. The Sheevaplug and Dreamplug don't have this > > behavior, while using the same network chip (I've confirmed that the > > detected PHY id is the same, it is 1410e90). > > > >> > >>> ======== > >>> > >>> My thoughts: > >>> > >>> I think we probably need to refactor some constants in the uclass > >>> drivers/net/mvgbe.c and/or create a helper in the Marvell PHY driver > >>> drivers/net/phy/marvell.c. > >>> > >>> The mvgbe uclass is a generic Ethernet class for all Kirkwood and > >>> Orion-5x boards (CONFIG_ARCH_KIRKWOOD and CONFIG_ARCH_ORION5X). > >>> However, as of right now, it lacks knowledge about specific things > >>> such as the PHY Page Select register for a specific network chip. > >> > >> Yes. And the MAC driver should not know anything about any PHY > >> specific registers IMHO, such as PHY page or whatever. This needs > >> to be done in the PHY driver instead. > >> > >>> Those constants are defined only in drivers/net/phy/marvell.c. For > >>> example, > >>> > >>> #define MII_MARVELL_PHY_PAGE 22 > >>> #define MIIM_88E1121_PHY_PAGE 22 > >>> #define MIIM_88E1145_PHY_PAGE 29 > >>> #define MIIM_88E1310_PHY_PAGE 22 > >>> > >>> When the mvgbe uclass calls mvgbe_of_to_plat() during initialization, > >>> it extracts several parameters from the DTS, such as PHY address, but > >>> has no way to "reset" PHY page in a general way so it will work for > >>> all network chips used in Kirkwood and Orion-5x boards. > >>> > >>> I think my hack to reset the PHY page to 0 would not work for the > >>> 88E1145 chip as seen above. But none of the Kirkwood boards uses this > >>> chip, AFAIK. > >>> > >>> What do you think? Would the patch be OK as is, or perhaps we need to > >>> add a helper to drivers/net/phy/marvell.c to get the PHY Page Select > >>> register? or perhaps you have other thoughts about the solution! > >> > >> One more question: I know that currently phy_connect() does not work on > >> this board, since the PHY page register is non-zero. Is the PHY not > >> detected in this case (PHY ID matching etc)? > > > > Right, the phy_find_by_mask() in phy.c failed to create the PHY, since > > it cannot match with anything. > > Too bad. Newer PHYs seem to mirror the PHY ID registers on all pages, > so at least identifying does work there, regardless of the configured > PHY page address. The 88e1310 seems to be inferior here unfortunately.
> > Note that it is the cold start that > > failed. During a warm start, the PHY is always found as an existing > > PHY (if the network is working already in the kernel, and a warm > > reboot is executed). > > > > Here is the log for the failure to create PHY: > > > > phy.c phy_connect: > > phy.c phy_find_by_mask 2 > > phy.c get_phy_device_by_mask: mask: 0x2 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad -1 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad -1 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad 1 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad 1 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad 2 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad 2 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad 3 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad 3 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad 4 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad 4 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad 30 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0000 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad 30 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0000 > > phy.c get_phy_device_by_mask: not found and could not create PHY for > > ethernet-controller@72000 > > Could not get PHY for ethernet-controller@72000: addr 1 > > phy_connect failed > > ping failed; host 192.168.0.224 is not alive > > > > While in the successful case the log is like this: > > > > phy.c phy_connect: > > phy.c phy_find_by_mask 2 > > phy.c get_phy_device_by_mask: mask: 0x2 > > create_phy_by_mask phy mask 0x2 > > smi_reg_read: phy_addr 1 reg_ofs 2 devad -1 > > __mvgbe_mdio_read:(adr 1, off 2) value= 0141 > > smi_reg_read: phy_addr 1 reg_ofs 3 devad -1 > > __mvgbe_mdio_read:(adr 1, off 3) value= 0e90 > > phy.c phy_device_create phy_id 21040784 > > phy.c get_phy_driver PHY driver found - PHY id 1410e90 > > > >> > >> Did you try calling miiphy_reset() directly before calling phy_connect? > >> Like this: > >> > >> /* Set phy address of the port */ > >> miiphy_write(dev->name, MV_PHY_ADR_REQUEST, MV_PHY_ADR_REQUEST, > >> phyid); > >> > >> + /* Soft-reset the PHY, especially needed on the NSA310s */ > >> + miiphy_reset(dev->name, phyid); > >> + > >> phydev = phy_connect(bus, phyid, dev, phy_interface); > >> if (!phydev) { > >> printf("phy_connect failed\n"); > >> return NULL; > >> } > >> > >> Does this work? Might be I missed something. > > > > No, it did not work. I've mentioned previously that I tried this > > miiphy_reset at this location, and the board just hung. I needed to > > recycle the power to reboot. I also tried the soft reset (phy_reset) > > and it also hung. > > Thanks for clarifying. > > >> If this still does not work then yes, we might need some generic PHY > >> page reset function. BTW, I introduced a marvell_write_page() function > >> with this patch, which is still pending: > >> > >> https://www.mail-archive.com/u-boot@lists.denx.de/msg436238.html > > > > Oh. I already tried this patch and have not mentioned that to you yet. > > This reg-init occurs too late in the execution path (in Marvel PHY > > driver config function). It would have been possibly a great solution > > to set up the register 22 earlier for this problem. > > > > While I think you're absolutely right above about the MAC driver being > > separated from the PHY driver, I think there is probably a need for > > some additional PHY initialization when necessary. Currently, very > > early on, we have several phy_register calls to register all the > > Marvell drivers in phy_marvell_init(), but that's the only thing it > > does. > > > > I think we might be lacking a small but necessary glue between the > > generic PHY driver (phy.c), mvgbe uclass, and the Marvell PHY driver. > > That might very well be the case. Perhaps one of the net custodians > whats to chime in. Joe, Ramon? > > >> I still need to re-work this a bit. > >> > >> And also I just noticed that the Kernel has a .write_page PHY function > >> and exports this via the common function phy_select_page(). > > > > Cool! sounds like something promising. > > Yes, a very handy addition. Still, somebody needs to work on this and > also make sure, that this "page access" support does not bloat the image > size on all platforms, not in need of this "feature". > > > I also thought of another approach we can take is to have a weak > > function in the mvgbe.c like, > > __weak void __mvgbe_board_eth_init(const char *name, int phyaddr) {} > > The mvgbe uclass can call it at the end of __mvgbe_init(). And this > > NSA310S board file can define the concrete function to set the PHY > > page. > > This is definitely easier than the generic approach envisioned above. > But there weak functions do not scale well. Frankly, I tend a just go > with your current approach with setting the PHY page address in this > MAC driver and be done with it. One reason being, that this driver is > pretty outdated and we will very unlikely see many other new Kirwood > and/or Orion SoC based boards come up in the U-Boot source tree that > need support non-Marvell PHYs. I'm not totally happy with this version > but perhaps it's just "good enough" for these platforms at this stage. > > Comment welcome. I think so too. It is good enough for old boards like the Kirkwood SoCs. Not likely we will see these boards with non-Marvell PHYs. We can go with this version to set the PHY page in this MAC driver. When you have the marvell_write_page function already in the tree and exported in the header file, I could try that, too. It's all academic, but I think it will make it a bit better, because the info about the page register (MII_MARVELL_PHY_PAGE) is encapsulated in marvell.c. Thanks, Tony > > > But this weak-function approach is too much a cop-out. If we can solve > > this nicely within the current design it would be much better. > > Agreed. > > Thanks, > Stefan